Skip to content

openarmature.prompts

Prompt-management capability; fetch, render, and trace named prompts.

PromptBackend

Bases: Protocol

Backend protocol; implementations and sibling packages plug into this.

A PromptBackend exposes one operation: fetch a prompt by name and label. Backends do NOT render; rendering is the manager's concern.

Operation semantics:

  • fetch() MUST be reentrant: multiple concurrent calls on the same backend are permitted.
  • fetch() does NOT render or otherwise mutate the template.
  • fetch() MUST raise PromptNotFound when no prompt matches (name, label).
  • fetch() MUST raise PromptStoreUnavailable when the backend is unreachable (network failure, filesystem I/O error, vendor API timeout).

Backends MAY cache their own results internally. When a backend serves a cached result, the returned Prompt's template_hash MUST still be correct for the served template (caching MUST NOT break content-addressing), and fetched_at MUST reflect the original fetch time, not the cache hit time.

fetch async

fetch(
    name: str,
    label: str = "production",
    *,
    cache_ttl_seconds: int | None = None
) -> Prompt

Return the prompt registered as (name, label).

label defaults to "production". Raises PromptNotFound if no prompt matches, and PromptStoreUnavailable if the backing store is unreachable. The returned Prompt carries its raw template plus metadata; rendering is the manager's job, not the backend's.

cache_ttl_seconds is a read-side cache control: None preserves the backend's current behavior, 0 forces a fresh read past any client-side cache, and N > 0 bounds a served cached entry's staleness to N seconds. Cacheless backends ignore it; caching backends honor it.

FilesystemPromptBackend

FilesystemPromptBackend(
    root: Path,
    *,
    layout: Literal["per-label", "flat"] = "per-label",
    sampling_source: Literal[
        "none", "per-prompt-sidecar", "unified"
    ] = "none"
)

Reads prompts from a directory tree.

Two layouts are supported via the constructor:

  • layout="per-label" (default): <root>/<label>/<name>.j2. The label subdirectory keeps name-collisions across labels distinct (e.g., prompts/production/greeting.j2 and prompts/staging/greeting.j2). A filesystem backend may interpret label as a subdirectory or filename suffix; this is the subdirectory variant.
  • layout="flat": <root>/<name>.j2. The same template is returned regardless of which label was requested; the Prompt's label field is the requested label verbatim. Useful when label-based A/B routing is driven by a :class:~openarmature.prompts.label_resolver.LabelResolver rather than a directory tree.

The version field is derived from the template content hash (first 16 hex chars of the SHA-256, ~64 bits) so two file contents map deterministically to two distinct version strings without needing a sidecar metadata file. The 16-char prefix puts the birthday-paradox collision boundary at ~4B distinct templates, well past any realistic single-backend exposure.

Optional sampling_source populates Prompt.sampling from a sidecar file, per the informative filesystem convention:

  • "none" (default): never populate sampling.
  • "per-prompt-sidecar": read <name>.config.json from the same directory as the template (i.e., <root>/<label>/<name>.config.json under per-label layout, <root>/<name>.config.json under flat). A missing sidecar leaves sampling = None.
  • "unified": read <root>/prompt_configs.json at backend construction time and key into it by prompt name. A name not in the unified map leaves sampling = None. Construction raises :class:PromptStoreUnavailable if the file exists but cannot be parsed.

This backend reads templates from disk on every fetch; no caching.

fetch async

fetch(
    name: str,
    label: str = "production",
    *,
    cache_ttl_seconds: int | None = None
) -> Prompt

Read the prompt template and (optionally) its sidecar sampling config.

Returns a Prompt whose version is the leading 16 hex chars of the template's SHA-256 and template_hash is the full digest. Raises PromptNotFound when the template is missing and PromptStoreUnavailable on other I/O errors.

The filesystem backend is cacheless, so cache_ttl_seconds is accepted for protocol conformance and ignored.

PromptError

Bases: Exception

Base for prompt-management errors. Subclasses set category to one of the canonical identifier strings.

PromptNotFound

PromptNotFound(
    *args: Any,
    name: str,
    label: str,
    backend: str | None = None
)

Bases: PromptError

Raised when no prompt matches (name, label).

Non-transient: retrying the same name + label will not succeed without changing the backends or the prompt store contents.

PromptRenderError

PromptRenderError(
    *args: Any,
    name: str,
    version: str,
    label: str,
    variables: dict[str, Any],
    description: str
)

Bases: PromptError

Raised when render fails: undefined variable under strict handling, template parse error, or variable-coercion failure.

Carries the source prompt's identity plus the variable mapping and a description of the render failure.

Non-transient: retrying the same render with the same prompt + variables will not succeed. Callers whose backend serves a fixed template later should re-fetch + re-render rather than relying on retry-middleware to auto-retry the failed render.

PromptStoreUnavailable

PromptStoreUnavailable(
    *args: Any,
    name: str | None = None,
    label: str | None = None,
    backends_tried: list[str] | None = None,
    causes: list[BaseException] | None = None
)

Bases: PromptError

Raised when backend infrastructure fails: network unreachable, filesystem I/O error, vendor API 5xx, vendor API timeout.

Transient: the same fetch may succeed when the backend recovers. PromptManager.fetch raises this only after ALL composed backends raise it; in that aggregate case backends_tried lists the backends consulted (in order) and causes carries the per-backend exceptions index-aligned to backends_tried so operators can distinguish "backend A 503 + backend B 503" from "backend A 503 + backend B OSError". The __cause__ chain still points at the last unavailable for stack-trace continuity.

PromptGroup

Bases: BaseModel

An ordered N≥2 sequence of PromptResult instances under one logical observability grouping.

The group is a structural hint to observability, not a control-flow primitive. User code is responsible for executing each member's LLM call. The group's contribution is the group_name that observability propagates onto every member call's span so trace UIs can render them as one unit.

Attributes:

Name Type Description
group_name str

Stable identifier for this group pattern.

members list[PromptResult]

Ordered sequence of at least two PromptResult instances. Order matches the application's intended call sequence; sequential execution is not required.

LabelResolver

Bases: Protocol

Resolves a prompt name to the label to fetch under.

Implementations MUST follow the fallback chain in :meth:resolve: per-name override > default override > the "production" fallback.

resolve

resolve(name: str) -> str

Return the label to fetch name under.

Synchronous; deterministic for given resolver state.

MappingLabelResolver

MappingLabelResolver(mapping: Mapping[str, str])

Reference resolver backed by a static name → label mapping.

The mapping recognizes one reserved key, "default", as the resolver's default-override; every other key is a per-name override. Construct from a literal dict in code or from a parsed JSON file at startup; the resolver is immutable after construction.

>>> r = MappingLabelResolver({"default": "production", "experimental": "staging"})
>>> r.resolve("experimental")
'staging'
>>> r.resolve("anything-else")
'production'

PromptManager

PromptManager(
    *backends: PromptBackend,
    label_resolver: LabelResolver | None = None,
    jinja_undefined: type[Undefined] = StrictUndefined
)

Composes one or more PromptBackends and exposes fetch + render.

Users interact with the manager; backends are an implementation detail of construction. The manager owns:

  • fetch: consults backends in order with fallback semantics.
  • render: synchronous local string transform; produces a PromptResult.
  • get: convenience: render(await fetch(...), variables).

Constructor knobs:

  • label_resolver: optional LabelResolver consulted by :meth:fetch / :meth:get when no explicit label argument is supplied (step 2 of the fallback chain).
  • jinja_undefined: Jinja Undefined subclass for render-time variable resolution. Default StrictUndefined for strict-by-default rendering; pass jinja2.ChainableUndefined or any other Undefined subclass to opt out.

fetch async

fetch(
    name: str,
    label: str | None = None,
    *,
    cache_ttl_seconds: int | None = None
) -> Prompt

Consult composed backends in order, applying the fallback chain.

Label is resolved by a three-step chain: explicit argument > configured LabelResolver > the "production" fallback.

  • First successful fetch wins; further backends are not consulted.
  • PromptNotFound from any backend STOPS the chain: the error propagates. Logical absence MUST NOT silently substitute a stale alternative.
  • PromptStoreUnavailable from a backend continues to the next. After ALL backends are exhausted with unavailable failures, the manager raises PromptStoreUnavailable.

cache_ttl_seconds is a read-side cache control forwarded to each backend's fetch: None keeps current behavior, 0 forces a fresh read, N > 0 bounds a served entry's staleness to N seconds; a negative value is rejected. Cacheless backends ignore it.

render

render(
    prompt: Prompt,
    variables: Mapping[str, Any] | None = None,
    *,
    placeholders: (
        Mapping[str, Sequence[Message]] | None
    ) = None
) -> PromptResult

Apply variables (and optionally placeholders) and return a PromptResult.

Render is synchronous; no I/O. Variables are strict by default: a template reference to a name not in variables raises PromptRenderError.

For a :class:TextPrompt, placeholders is ignored ("a Text-prompt renders to exactly one Message with role: "user" and content equal to the rendered template text"). Implementations MUST NOT raise on a non-empty placeholders mapping passed alongside a Text prompt.

For a :class:ChatPrompt, the chat_template is rendered segment-by-segment — content segments substitute variables into the text (or per-block content) and produce one Message per segment; placeholder segments inject the caller-supplied list[Message] from placeholders[<name>]. An empty injected list is valid (the chat-history "first turn" case); an unfilled placeholder name raises prompt_render_error.

get async

get(
    name: str,
    label: str | None = None,
    variables: Mapping[str, Any] | None = None,
    *,
    placeholders: (
        Mapping[str, Sequence[Message]] | None
    ) = None,
    cache_ttl_seconds: int | None = None
) -> PromptResult

Convenience equivalent to render(await fetch(name, label), variables).

label follows the same three-step resolution as :meth:fetch. placeholders is forwarded to :meth:render. cache_ttl_seconds is forwarded to :meth:fetch (the read-side cache control).

ChatPrompt

Bases: _PromptBase

A role-tagged, multi-segment chat prompt.

chat_template is an ordered list of :class:ChatSegment entries — content segments carrying a role + content (text template or content-blocks template) and placeholder segments carrying a name that the caller fills at render time with a list[Message]. The rendered :class:PromptResult.messages is the in-order concatenation per segment.

ContentSegment

Bases: BaseModel

One role-tagged content segment of a chat prompt.

role is one of the three canonical authoring roles from the Message shape; the fourth role ("tool") is intentionally excluded — tool-result messages have a distinct per-message shape that doesn't map to a template-author surface. Tool-loop content flows through placeholder segments instead.

content is either a single text template (the common case) or an ordered non-empty list of :class:ContentBlockTemplate entries for multimodal user messages (text + image). Image blocks are user-only — a non-user role with an image-block-containing list raises prompt_render_error at render time. Construction-time validation here surfaces the same condition earlier for ergonomic feedback.

ImageInlineBlockTemplate

Bases: BaseModel

Inline base64 image content block template. Renders to an llm-provider inline image block; base64_data and media_type are variable-substituted.

ImageURLBlockTemplate

Bases: BaseModel

URL image content block template. Renders to an llm-provider URL image block; url is variable-substituted.

PlaceholderSegment

Bases: BaseModel

A placeholder slot in a chat prompt. At render time the caller supplies a list[Message] to inject in place of this segment; an empty list injects zero messages (valid; the first-turn case), while an absent mapping entry raises prompt_render_error.

The placeholder name MUST match [A-Za-z_][A-Za-z0-9_]* — ASCII identifier shape — to avoid collision with backend placeholder syntax.

PromptResult

Bases: BaseModel

The rendered output of applying variables to a prompt.

Carries the rendered Message sequence (ready to pass to Provider.complete()) plus the source prompt's identity metadata and a rendered_hash that captures the rendered content.

The rendered_hash is the cache-key value most useful to downstream consumers: two renders with the same template AND the same variables produce the same hash.

Attributes:

Name Type Description
name str

Propagated from the source Prompt.

version str

Propagated from the source Prompt.

label str

Propagated from the source Prompt.

template_hash str

Propagated from the source Prompt.

rendered_hash str

SHA-256 of the canonical serialization of the rendered messages list.

messages list[Message]

Ordered non-empty sequence of Message records.

variables dict[str, Any]

Variable mapping used to render. v1 policy: pass-through unchanged (no automatic redaction). Keys are always preserved; future redaction policies would redact values, never strip keys.

fetched_at datetime

Propagated from the source Prompt.

rendered_at datetime

Time this PromptResult was rendered. Distinct from fetched_at: a single fetched prompt may render many times.

SamplingConfig

Bases: RuntimeConfig

Per-prompt sampling configuration. Shape-compatible with RuntimeConfig.

TextBlockTemplate

Bases: BaseModel

Text content block template. Renders to an llm-provider text block carrying the variable-substituted text.

TextPrompt

Bases: _PromptBase

An unrendered single-string template plus identity metadata.

Renders to a single :class:UserMessage carrying the substituted template text. Text-prompts render to exactly one Message with role: "user"; multi-message and multimodal prompts go through :class:ChatPrompt.

placeholders passed to PromptManager.render are ignored for Text-prompt rendering.

current_prompt_group

current_prompt_group() -> PromptGroup | None

Return the innermost active PromptGroup, or None.

current_prompt_result

current_prompt_result() -> PromptResult | None

Return the innermost active PromptResult, or None.

with_active_prompt

with_active_prompt(result: PromptResult) -> Iterator[None]

Mark result as the active prompt for downstream LLM calls.

When the observability extra is installed and an LLM call fires inside this context, the OTel observer surfaces openarmature.prompt.name / version / label / template_hash / rendered_hash on the LLM-call span.

Nesting is innermost-wins.

with_active_prompt_group

with_active_prompt_group(
    group: PromptGroup,
) -> Iterator[None]

Mark group as the active prompt group for downstream LLM calls.

When an LLM call fires inside this context, the OTel observer surfaces openarmature.prompt.group_name on the LLM-call span, alongside any per-prompt attributes from a concurrently active with_active_prompt.

Nesting is innermost-wins.

compute_rendered_hash

compute_rendered_hash(messages: list[Message]) -> str

SHA-256 over a canonical JSON serialization of messages.

Preserves message boundaries, roles, content (including content-block structure), and tool_calls. json.dumps(sort_keys=True, separators=(",", ":")) over the per-message model_dump(mode="json") is deterministic across runs; datetimes serialize as ISO-8601 strings.

compute_template_hash

compute_template_hash(template_source: str) -> str

SHA-256 over the UTF-8 bytes of the raw template source.