Astropods Spec
Abstract
The AstroAI Spec defines a declarative YAML format for describing the topology of an AI agent — its container, model dependencies, knowledge stores, tool services, integrations, and data ingestion pipelines. The spec is consumed by build tools and deployment servers; it intentionally excludes runtime, orchestration, and deployment-environment concerns. At deploy time, the platform combines this spec with runtime configuration (credentials, interfaces, schedules) to produce a resolved deployment spec, which is then translated into infrastructure manifests.
Conventions
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
1. Introduction
An AstroAI Spec file (astropods.yml) is a YAML document that declares:
- The agent’s container image (pre-built or build-from-source).
- Components the agent depends on — models, knowledge stores, and tools — each supplied by either a platform-managed provider or a user-managed container.
- Custom providers for external API services that require credential injection.
- Data ingestion pipelines with trigger semantics.
- Local development overrides.
A central concept is provider binding. Components that an agent depends on — models, knowledge stores, and tools — are each declared as either a provider reference or a container definition:
- A provider is a named, platform-known service (e.g. ollama, anthropic, qdrant). The platform resolves each provider to one of two kinds:
  - Self-hosted: the platform deploys and manages a container on the agent’s behalf.
  - Cloud: the platform injects credentials for an external API.
- A container gives the user full control over the image, port, and configuration.
This design lets authors mix managed and custom components freely within a single spec.
The spec does not cover: resource limits (CPU/memory), observability, rate limits, budgets, security policies, deployment region, or interface routing (Slack, web). These are deployment-time concerns configured separately.
The document format is YAML. Implementations MUST accept files named astropods.yml.
2. Top-Level Structure
A conforming document MUST contain the following top-level fields:
Map keys serve as entry names and are used in credential injection (see Section 8).
2.1 Meta
3. Agent
The agent object defines the agent’s primary service — its container image or build configuration.
An agent entry MUST specify exactly one of image or build. Providing both or neither is invalid.
3.1 BuildConfig
BuildSecret
3.2 Healthcheck
Applies to the agent definition and to any ContainerConfig.healthcheck in component sections.
Implementations SHOULD support both test (exec-based) and path (HTTP-based) health checks. When path is provided, the implementation SHOULD generate an equivalent test command.
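The spec does not say what the generated test command looks like. The sketch below shows one possible translation from a path health check to an exec-style command. The function name, the use of curl, the localhost target, and the port parameter are all assumptions made for illustration and are not defined by this spec.

```python
def healthcheck_test_from_path(path: str, port: int) -> list[str]:
    """Build an exec-style health check command for an HTTP path.

    Assumptions (not from the spec): curl exists in the container image,
    and the service listens on localhost at the given port.
    """
    # Normalize the path so it always starts with a single leading slash.
    if not path.startswith("/"):
        path = "/" + path
    # curl -f exits non-zero on HTTP error responses (status >= 400).
    return ["curl", "-f", f"http://localhost:{port}{path}"]
```

An implementation that targets images without curl would need a different probe; the point is only that a path check can be reduced to a test check mechanically.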
4. Component Sections: Models, Knowledge, Tools
Models, knowledge stores, and tools share a unified provider binding scheme. Each entry operates in exactly one of two modes:
- Provider mode: the entry specifies a provider string. The platform resolves this to either a self-hosted provider (deploys a container from its registry) or a cloud provider (injects credentials).
- Container mode: the entry specifies a container object. The user manages the image, port, and configuration.
These modes are mutually exclusive: an entry MUST specify exactly one of provider or container. Providing both or neither is invalid.
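The mutual-exclusion rule can be checked mechanically. The following is a minimal sketch, assuming each entry has already been parsed from YAML into a Python dict; the function name entry_mode is hypothetical.

```python
def entry_mode(entry: dict) -> str:
    """Return "provider" or "container" for a models/knowledge/tools entry.

    Raises ValueError when the entry specifies both keys or neither,
    which this section declares invalid.
    """
    has_provider = "provider" in entry
    has_container = "container" in entry
    if has_provider and has_container:
        raise ValueError("entry specifies both provider and container")
    if not has_provider and not has_container:
        raise ValueError("entry specifies neither provider nor container")
    return "provider" if has_provider else "container"
```

For example, entry_mode({"provider": "anthropic"}) returns "provider", while an empty entry raises ValueError.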
4.1 Models
The models section declares AI models the agent consumes — LLMs (e.g. Claude, GPT, Llama), embedding models, or any model served behind an inference API. Each entry in the models map:
4.2 Knowledge
Each entry in the knowledge map:
When persistent is true, the platform SHOULD provision durable storage for the entry regardless of mode.
4.3 Tools
The tools section declares services the agent invokes to perform actions or retrieve data. Tools can be HTTP APIs, MCP (Model Context Protocol) servers, or any service exposed over a network port. Each entry in the tools map:
For external API services that need credentials but no platform-managed container, define a custom provider in the providers section (see Section 5) instead.
4.4 ContainerConfig
Used by container-mode entries and by ingestion containers.
A ContainerConfig SHOULD specify at least one of image or build.
GPUConfig
4.5 Input
An Input declares a user-supplied value that the platform prompts for at deploy time and injects as an environment variable into the target container. The name is used directly as the env var key. See Section 8.4 for injection targets.
The datatype field controls validation and type coercion applied before injection. secret is orthogonal to datatype — it controls storage and logging: when true, the platform stores the value securely and never logs it. The display-as field controls UI rendering: short-text renders a single-line field, long-text a multi-line field, and select a dropdown using options.
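The spec names the datatypes but does not define the coercion rules. The sketch below shows one plausible interpretation, assuming the platform receives each value as a string from the deploy-time prompt. The function name coerce_input, the accepted boolean spellings, and the use of JSON for array and object values are assumptions, not part of this spec.

```python
import json


def coerce_input(raw: str, datatype: str):
    """Validate and coerce a prompted string according to an Input datatype.

    Assumed rules (not normative): booleans are "true"/"false" in any case,
    numbers are anything float() accepts, arrays and objects are JSON text.
    """
    if datatype == "string":
        return raw
    if datatype == "boolean":
        lowered = raw.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a boolean: {raw!r}")
    if datatype == "number":
        value = float(raw)  # raises ValueError on non-numeric text
        return int(value) if value.is_integer() else value
    if datatype in ("array", "object"):
        parsed = json.loads(raw)  # JSONDecodeError is a ValueError subclass
        expected = list if datatype == "array" else dict
        if not isinstance(parsed, expected):
            raise ValueError(f"expected JSON {datatype}, got: {raw!r}")
        return parsed
    raise ValueError(f"unknown datatype: {datatype}")
```

Note that secret plays no role here: it affects how the value is stored and logged, not how it is validated.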
5. Custom Providers
The providers section extends the platform’s built-in provider registry with user-defined entries. A custom provider declares the variables it requires so the platform can prompt for them at deploy time. Custom providers behave like cloud providers (§8.1): they inject credentials into the agent, not connection details.
Each variable’s name is a suffix. The full env var key is formed as {UPPER(provider)}_{varName}, following the same rule as §8.1. Duplicate-entry handling also mirrors §8.1: when multiple entries reference the same custom provider, each entry gets a qualified key; the primary entry also gets the bare key.
Custom providers can be referenced by name from the models, knowledge, and tools sections, just like built-in providers. The scope field controls which sections are allowed to reference the provider — the platform MUST reject references from sections not listed in scope.
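The scope rule can be sketched as a single pass over the three component sections. This example assumes the document has been parsed into a dict and that providers is a map keyed by provider name; that layout, the function name, and the sample names in the usage note are assumptions for illustration.

```python
def check_provider_scope(doc: dict) -> list[str]:
    """Return one error per component entry that references a custom
    provider from a section missing from that provider's scope."""
    errors = []
    custom = doc.get("providers") or {}
    for section in ("models", "knowledge", "tools"):
        for name, entry in (doc.get(section) or {}).items():
            provider = entry.get("provider")
            # Built-in providers are not in the custom map and are skipped.
            if provider in custom and section not in custom[provider].get("scope", []):
                errors.append(
                    f"{section}.{name}: provider {provider!r} is not scoped to {section}"
                )
    return errors
```

With a hypothetical custom provider weather whose scope is [tools], a tools entry referencing it passes, and a models entry referencing it produces one error.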
6. Ingestion
The ingestion section declares data ingestion pipelines. Each entry is a container that runs on a trigger.
IngestionTrigger
Trigger type semantics:
- schedule: runs on a cron schedule. The cron expression is supplied at deploy time.
- startup: runs once automatically at deploy time.
- manual: runs on demand via API invocation.
- webhook: deploys as a long-running service that receives incoming HTTP requests. The container SHOULD declare a port when using this trigger type.
7. Dev
The dev section provides local development overrides consumed by astro dev. These fields are deployment concerns that do not belong in the normative agent topology.
DevOverrides
8. Environment Variable Injection Model
The platform automatically injects environment variables into the agent to wire it to its dependencies. The injection model differs by entry mode: cloud providers inject credentials, self-hosted providers inject connection details, and container-mode entries inject generic connection details.
8.1 Cloud Provider Credentials
Cloud providers (in models, knowledge, tools) require user-provided credentials at deploy time. The env var key is derived from the provider name, not the entry name:
Single entry for a provider: the key is {UPPER(provider)}_{suffix}.
Example: one anthropic model entry → ANTHROPIC_API_KEY.
Multiple entries for the same provider (duplicate handling):
Each entry gets a name-qualified key of the form {UPPER(provider)}_{UPPER(name)}_{suffix}.
Additionally, a “primary” entry also receives the bare {UPPER(provider)}_{suffix} key for convenience. The primary entry is the one whose name matches the provider (e.g. an entry named anthropic using provider: anthropic); if no entry name matches, the first alphabetically is primary.
When the entry name equals the provider name, the redundant qualified form (e.g. ANTHROPIC_ANTHROPIC_API_KEY) is omitted — only the bare key is produced.
Examples (single entry):
- models.primary with provider: anthropic → ANTHROPIC_API_KEY
- tools.github with provider: github → GITHUB_TOKEN
- knowledge.vectors with provider: pinecone → PINECONE_API_KEY
Examples (duplicate entries, two anthropic models):
- models.anthropic + models.sonnet, both with provider: anthropic:
  - anthropic (name matches provider, primary) → ANTHROPIC_API_KEY
  - sonnet → ANTHROPIC_SONNET_API_KEY
Cloud provider credentials are always required.
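The key-derivation rules in this section can be sketched as follows. The function name is hypothetical, and the qualified form {UPPER(provider)}_{UPPER(name)}_{suffix} is inferred from the ANTHROPIC_SONNET_API_KEY example. For brevity the sketch only uppercases names and assumes they need no further sanitization (see Section 8.5).

```python
def cloud_credential_keys(provider: str, entry_names: list[str], suffix: str) -> dict[str, list[str]]:
    """Map each entry name to the env var keys it receives for one cloud provider."""
    if not entry_names:
        return {}
    prefix = provider.upper()
    bare = f"{prefix}_{suffix}"
    # A single entry only ever gets the bare key.
    if len(entry_names) == 1:
        return {entry_names[0]: [bare]}
    # Primary: the entry named after the provider, else the first alphabetically.
    primary = provider if provider in entry_names else sorted(entry_names)[0]
    keys = {}
    for name in entry_names:
        entry_keys = []
        # The redundant qualified form (e.g. ANTHROPIC_ANTHROPIC_API_KEY) is omitted.
        if name != provider:
            entry_keys.append(f"{prefix}_{name.upper()}_{suffix}")
        if name == primary:
            entry_keys.append(bare)
        keys[name] = entry_keys
    return keys
```

Calling it with provider anthropic, entries anthropic and sonnet, and suffix API_KEY reproduces the duplicate-entry example above. With entries opus and sonnet instead, opus is primary by alphabetical order and receives both ANTHROPIC_OPUS_API_KEY and ANTHROPIC_API_KEY.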
8.2 Self-Hosted Provider Connection Details
Self-hosted providers deploy a container. The platform injects connection env vars using the provider’s env prefix:
Single entry for a provider:
Example: one qdrant knowledge entry → QDRANT_HOST, QDRANT_PORT, QDRANT_URL.
Multiple entries for the same self-hosted provider:
Each entry gets name-qualified keys; the first alphabetically also gets bare keys:
Model providers additionally inject {EnvPrefix}_BASE_URL (with /api appended) and {EnvPrefix}_MODEL when a model name is specified.
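For the single-entry case, the injected variables can be sketched as below. The function name, the http scheme, and the host and port arguments are assumptions; the spec names the keys but not how their values are built. The sketch also reads the sentence above as making only the _MODEL variable conditional on a model name, and it treats /api as a suffix on the URL value.

```python
from typing import Optional


def self_hosted_env(env_prefix: str, host: str, port: int,
                    model_provider: bool = False,
                    model: Optional[str] = None) -> dict[str, str]:
    """Connection env vars for a single self-hosted provider entry."""
    url = f"http://{host}:{port}"  # assumed URL shape
    env = {
        f"{env_prefix}_HOST": host,
        f"{env_prefix}_PORT": str(port),
        f"{env_prefix}_URL": url,
    }
    if model_provider:
        # Model providers also get a base URL with /api appended.
        env[f"{env_prefix}_BASE_URL"] = f"{url}/api"
        if model:
            env[f"{env_prefix}_MODEL"] = model
    return env
```

For a knowledge entry with prefix QDRANT this yields QDRANT_HOST, QDRANT_PORT, and QDRANT_URL, matching the example above.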
8.3 Container-Mode Connection Details
Container-mode entries (no provider) receive generic section-prefixed env vars:
- Models: MODEL_{UPPER(name)}_HOST, MODEL_{UPPER(name)}_PORT, MODEL_{UPPER(name)}_URL
- Knowledge: KNOWLEDGE_{UPPER(name)}_HOST, KNOWLEDGE_{UPPER(name)}_PORT
- Tools: TOOL_{UPPER(name)}_HOST, TOOL_{UPPER(name)}_PORT, TOOL_{UPPER(name)}_URL
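The section-prefixed keys listed above can be generated from two small lookup tables. This is an illustrative sketch; the function and table names are hypothetical, and the entry name is assumed to be already sanitized (see Section 8.5).

```python
# Section name -> env var prefix, as listed in this section.
SECTION_PREFIX = {"models": "MODEL", "knowledge": "KNOWLEDGE", "tools": "TOOL"}

# Knowledge entries receive no _URL variable in the list above.
SECTION_SUFFIXES = {
    "models": ("HOST", "PORT", "URL"),
    "knowledge": ("HOST", "PORT"),
    "tools": ("HOST", "PORT", "URL"),
}


def container_env_keys(section: str, name: str) -> list[str]:
    """Return the env var keys injected for a container-mode entry."""
    prefix = SECTION_PREFIX[section]
    return [f"{prefix}_{name.upper()}_{suffix}" for suffix in SECTION_SUFFIXES[section]]
```

For a tools entry named search, this returns TOOL_SEARCH_HOST, TOOL_SEARCH_PORT, and TOOL_SEARCH_URL.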
8.4 Inputs
Inputs are user-supplied values prompted at deploy time. Each input’s name is used directly as the env var key (no prefix) in the target container:
providers[].variables is a template only — it declares what variables a provider requires so the platform can prompt for them at deploy time.
Example: inputs: [{name: OPENAI_API_KEY, datatype: string, secret: true}] → OPENAI_API_KEY in the target container.
8.5 Name Sanitization
Entry names used in env var keys are sanitized in the following order: the name is converted to lowercase; hyphens and dots are replaced with underscores; any remaining characters other than letters, digits, and underscores are removed; consecutive underscores are collapsed; and the result is uppercased. For example, entry name my-model sanitizes to my_model, then uppercases to MY_MODEL.
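The sanitization steps translate directly into a few regular-expression substitutions. The sketch below assumes that "alphanumeric" means ASCII letters and digits and that underscores survive the removal step (otherwise the my-model example could not produce MY_MODEL). The function name is hypothetical.

```python
import re


def sanitize_name(name: str) -> str:
    """Sanitize an entry name for use inside an env var key."""
    s = name.lower()
    s = re.sub(r"[-_.]", "_", s)      # hyphens, underscores, dots -> underscore
    s = re.sub(r"[^a-z0-9_]", "", s)  # drop everything else non-alphanumeric
    s = re.sub(r"_+", "_", s)         # collapse runs of underscores
    return s.upper()
```

sanitize_name("my-model") returns "MY_MODEL", matching the example above.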
9. Validation Rules
Implementations MUST enforce the following validation rules:
1. spec MUST be a non-empty string.
2. name MUST be a non-empty string.
3. agent MUST specify exactly one of image or build. If build is present, build.context and build.dockerfile are REQUIRED.
4. For each entry in models: provider and container are mutually exclusive. Exactly one MUST be present.
5. For each entry in knowledge: provider and container are mutually exclusive. Exactly one MUST be present.
6. For each entry in tools: provider and container are mutually exclusive. Exactly one MUST be present.
7. For each entry in providers: scope MUST be present and contain one or more of models, knowledge, tools. variables MUST be present and MUST contain at least one element. Each variable MUST have a non-empty name and a valid datatype.
   7a. When a component entry references a custom provider by name, the referencing section MUST be listed in that provider’s scope.
8. For each entry in ingestion: both container and trigger are REQUIRED. trigger.type MUST be one of schedule, startup, manual, webhook.
9. When a BuildConfig is provided (in agent.build, container.build), context and dockerfile are REQUIRED.
10. When gpu.runtime is provided, it MUST be one of cuda or rocm.
11. When an input’s display-as is select, options MUST be present and non-empty.
12. Each Input in any context MUST have a non-empty name, and datatype MUST be one of string, boolean, number, array, object.
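A subset of these rules can be sketched as a validator over a parsed document. The example covers the spec, name, and agent rules, the ingestion rule, and the two Input rules. It assumes the document is a dict, that ingestion is a map of entry names to entries, and that build and trigger are maps when present. All function names are hypothetical, and errors are collected as strings rather than raised.

```python
VALID_TRIGGERS = {"schedule", "startup", "manual", "webhook"}
VALID_DATATYPES = {"string", "boolean", "number", "array", "object"}


def check_input(inp: dict) -> list[str]:
    """Validate a single Input object."""
    errors = []
    if not inp.get("name"):
        errors.append("input name MUST be non-empty")
    if inp.get("datatype") not in VALID_DATATYPES:
        errors.append(f"invalid datatype: {inp.get('datatype')!r}")
    if inp.get("display-as") == "select" and not inp.get("options"):
        errors.append("options MUST be present and non-empty for select")
    return errors


def check_document(doc: dict) -> list[str]:
    """Validate top-level fields, the agent object, and ingestion entries."""
    errors = []
    for field in ("spec", "name"):
        value = doc.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"{field} MUST be a non-empty string")
    agent = doc.get("agent") or {}
    if ("image" in agent) == ("build" in agent):  # both present or both absent
        errors.append("agent MUST specify exactly one of image or build")
    if "build" in agent:
        build = agent["build"] or {}
        for field in ("context", "dockerfile"):
            if not build.get(field):
                errors.append(f"agent.build.{field} is REQUIRED")
    for name, entry in (doc.get("ingestion") or {}).items():
        if "container" not in entry:
            errors.append(f"ingestion.{name}: container is REQUIRED")
        trigger = entry.get("trigger")
        if trigger is None:
            errors.append(f"ingestion.{name}: trigger is REQUIRED")
        elif trigger.get("type") not in VALID_TRIGGERS:
            errors.append(f"ingestion.{name}: invalid trigger.type")
    return errors
```

Collecting every error in one pass, instead of stopping at the first, lets a build tool report all problems in a file at once; whether implementations should do so is a design choice this spec leaves open.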
Appendix A: Provider Registries (Non-Normative)
The following tables document the platform’s built-in provider registries as of this specification version. Implementations MAY extend these registries.
A.1 Model Providers
Self-Hosted
When model is specified for a self-hosted provider, the platform sets {ENV_PREFIX}_MODEL (e.g. OLLAMA_MODEL=llama3.2).
Cloud
A.2 Knowledge Providers
Self-Hosted
Cloud
A.3 Tool Providers
Cloud
Appendix B: JSON Schema
A machine-readable JSON Schema for this specification is maintained at astropods.schema.json in the astro-spec package. The schema is generated from the normative type definitions and MAY be used for editor autocompletion and pre-validation.
Schema ID: https://astropods.ai/schema/package.json