
openarmature.prompts

Prompt-management capability — fetch, render, and trace named prompts.

PromptBackend

Bases: Protocol

Backend protocol — implementations and sibling packages plug into this.

A PromptBackend exposes one operation: fetch a prompt by name and label. Backends do NOT render; rendering is the manager's concern.

Operation semantics:

  • fetch() MUST be reentrant: multiple concurrent calls on the same backend are permitted.
  • fetch() does NOT render or otherwise mutate the template.
  • fetch() MUST raise PromptNotFound when no prompt matches (name, label).
  • fetch() MUST raise PromptStoreUnavailable when the backend is unreachable (network failure, filesystem I/O error, vendor API timeout).

Backends MAY cache their own results internally. When a backend serves a cached result, the returned Prompt's template_hash MUST still be correct for the served template (caching MUST NOT break content-addressing), and fetched_at MUST reflect the original fetch time, not the cache hit time.
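
A minimal backend sketch against these semantics. The in-memory store, the backend name, and the async fetch signature (mirroring PromptManager.fetch below) are assumptions for illustration, as is importing Prompt, PromptNotFound, and compute_template_hash from openarmature.prompts:

import hashlib
from datetime import datetime, timezone

from openarmature.prompts import Prompt, PromptNotFound, compute_template_hash

class InMemoryPromptBackend:
    # Conforms to PromptBackend structurally (it is a Protocol); no inheritance needed.

    def __init__(self, templates: dict[tuple[str, str], str]):
        self._templates = templates  # keyed by (name, label)

    async def fetch(self, name: str, label: str = "production") -> Prompt:
        try:
            source = self._templates[(name, label)]
        except KeyError:
            # Logical absence: no prompt matches (name, label).
            raise PromptNotFound(name=name, label=label, backend="in-memory") from None
        return Prompt(
            name=name,
            label=label,
            version=hashlib.sha256(source.encode("utf-8")).hexdigest()[:16],
            template=source,
            template_hash=compute_template_hash(source),
            fetched_at=datetime.now(timezone.utc),
            metadata=None,
        )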

FilesystemPromptBackend

FilesystemPromptBackend(root: Path)

Reads prompts from a directory tree.

Layout convention: <root>/<label>/<name>.j2. The label subdirectory keeps same-named prompts under different labels distinct (e.g., prompts/production/greeting.j2 vs. prompts/staging/greeting.j2). Spec §5 permits filesystem backends to interpret label as "a subdirectory or filename suffix"; this backend uses a subdirectory.

The version field is derived from the template content hash (first 16 hex chars of the SHA-256, ~64 bits) so two file contents map deterministically to two distinct version strings without needing a sidecar metadata file. Per spec §3, this satisfies the "stable identifier" requirement. The 16-char prefix puts the birthday-paradox collision boundary at ~4B distinct templates — well past any realistic single-backend exposure. Higher-scale backends should widen further or pick a different stable identifier (semver from a sidecar metadata file, git short-SHAs, etc.).
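
A sketch of that derivation; the helper name is illustrative, not part of the backend's API:

import hashlib

def derive_version(template_source: str) -> str:
    # First 16 hex characters (~64 bits) of the SHA-256 of the raw template bytes.
    return hashlib.sha256(template_source.encode("utf-8")).hexdigest()[:16]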

This backend reads from disk on every fetch — no caching. A caching backend (e.g., openarmature-langfuse) that returns cached results MUST preserve the original fetched_at on the returned Prompt, not the cache-hit time, per spec §3.

PromptError

Bases: Exception

Base for prompt-management errors. Subclasses set category to one of the canonical identifier strings.

PromptNotFound

PromptNotFound(
    *args: Any,
    name: str,
    label: str,
    backend: str | None = None
)

Bases: PromptError

Raised when no prompt matches (name, label).

Non-transient: retrying the same name + label will not succeed without changing the backends or the prompt store contents.

PromptRenderError

PromptRenderError(
    *args: Any,
    name: str,
    version: str,
    label: str,
    variables: dict[str, Any],
    description: str
)

Bases: PromptError

Raised when render fails: undefined variable under strict handling, template parse error, or variable-coercion failure.

Carries the source prompt's identity plus the variable mapping and a description of the render failure.

Non-transient per spec §10: retrying the same render with the same prompt + variables will not succeed. Callers whose backend later serves a fixed template should re-fetch and re-render rather than relying on retry middleware to auto-retry the failed render.
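
A recovery sketch, assuming a manager and an earlier-fetched prompt from the surrounding code (inside an async context); the prompt name and variables are placeholders:

try:
    result = manager.render(prompt, {"user": "Ada"})
except PromptRenderError:
    # Retrying this render cannot succeed. Once the backend publishes a fixed
    # template under the same label, recovery is a fresh fetch plus a fresh render.
    prompt = await manager.fetch("greeting", label="production")
    result = manager.render(prompt, {"user": "Ada"})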

PromptStoreUnavailable

PromptStoreUnavailable(
    *args: Any,
    name: str | None = None,
    label: str | None = None,
    backends_tried: list[str] | None = None,
    causes: list[BaseException] | None = None
)

Bases: PromptError

Raised when backend infrastructure fails: network unreachable, filesystem I/O error, vendor API 5xx, vendor API timeout.

Transient: the same fetch may succeed once the backend recovers. PromptManager.fetch raises this only after ALL composed backends raise it. In that aggregate case backends_tried lists the backends consulted (in order) and causes carries the per-backend exceptions, index-aligned with backends_tried, so operators can distinguish "backend A 503 + backend B 503" from "backend A 503 + backend B OSError". The __cause__ chain still points at the last backend's failure for stack-trace continuity.
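
A handling sketch for the aggregate case (inside an async context); the logger, prompt name, and surrounding call are illustrative:

import logging

logger = logging.getLogger(__name__)

try:
    prompt = await manager.fetch("greeting", label="production")
except PromptStoreUnavailable as exc:
    # Transient infrastructure failure: record which backend failed with what,
    # then retry later or surface the error. backends_tried and causes are
    # index-aligned, so zip pairs each backend with its own exception.
    for backend_name, cause in zip(exc.backends_tried or [], exc.causes or []):
        logger.warning("prompt backend %s unavailable: %r", backend_name, cause)
    raise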

PromptGroup

Bases: BaseModel

An ordered N≥2 sequence of PromptResult instances under one logical observability grouping.

The group is a structural hint to observability, not a control-flow primitive. User code is responsible for executing each member's LLM call. The group's contribution is the group_name that observability propagates onto every member call's span so trace UIs can render them as one unit.

Attributes:

  • group_name (str): Stable identifier for this group pattern.
  • members (list[PromptResult]): Ordered sequence of at least two PromptResult instances. Order matches the application's intended call sequence; the spec does not require sequential execution.
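
A construction sketch, assuming a manager from the surrounding code (inside an async context); the prompt names, group name, and variables are illustrative:

# Two independently rendered prompts the trace UI should present as one unit.
outline = manager.render(await manager.fetch("essay-outline"), {"topic": "geese"})
draft = manager.render(await manager.fetch("essay-draft"), {"topic": "geese"})

group = PromptGroup(group_name="essay-pipeline", members=[outline, draft])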

PromptManager

PromptManager(*backends: PromptBackend)

Composes one or more PromptBackends and exposes fetch + render.

Users interact with the manager; backends are an implementation detail of construction. The manager owns:

  • fetch — consults backends in order per §8 fallback semantics.
  • render — synchronous local string transform; produces a PromptResult.
  • get — convenience: render(await fetch(...), variables).
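
A composition sketch; the directory path, prompt name, and variables are illustrative, and both names are assumed importable from openarmature.prompts:

import asyncio
from pathlib import Path

from openarmature.prompts import FilesystemPromptBackend, PromptManager

async def main() -> None:
    manager = PromptManager(
        FilesystemPromptBackend(root=Path("prompts")),
        # A second backend here (e.g., a vendor-hosted store) would be consulted
        # only if this one raises PromptStoreUnavailable, per the §8 fallback
        # semantics described under fetch below.
    )
    result = await manager.get("greeting", label="production", variables={"user": "Ada"})
    print(result.messages)

asyncio.run(main())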

fetch async

fetch(name: str, label: str = 'production') -> Prompt

Consult composed backends in order, applying §8 fallback.

  • First successful fetch wins; further backends are not consulted.
  • PromptNotFound from any backend STOPS the chain — the error propagates. Logical absence MUST NOT silently substitute a stale alternative.
  • PromptStoreUnavailable from a backend continues to the next. After ALL backends are exhausted with unavailable failures, the manager raises PromptStoreUnavailable.

render

render(
    prompt: Prompt, variables: dict[str, Any] | None = None
) -> PromptResult

Apply variables to prompt.template and return a PromptResult.

Render is synchronous — no I/O. Variables are strict by default per §7: a template reference to a name not in variables raises PromptRenderError.

In v1 the render output is always a single UserMessage carrying the rendered text. Multi-message decomposition (system + user split) is deferred to a follow-on; callers that need it today should fetch the raw template and construct the messages list manually.
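
A sketch of the strict behavior, assuming a fetched prompt whose template references {{ user }} (inside an async context); the template text is illustrative:

prompt = await manager.fetch("greeting")           # template e.g. "Hello, {{ user }}!"

result = manager.render(prompt, {"user": "Ada"})   # one UserMessage: "Hello, Ada!"

try:
    manager.render(prompt, {})                     # {{ user }} referenced but not supplied
except PromptRenderError:
    ...                                            # strict handling per §7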

get async

get(
    name: str,
    label: str = "production",
    variables: dict[str, Any] | None = None,
) -> PromptResult

Convenience equivalent to render(await fetch(name, label), variables).

Prompt

Bases: BaseModel

An unrendered template plus identity metadata.

A prompt carries enough information to be rendered, traced, and content-addressed without a backend round-trip. template is the raw template source string (Jinja2 syntax in Python); compilation happens on render so Prompt stays serializable and engine-agnostic.

Attributes:

  • name (str): Stable identifier within the backend.
  • version (str): Backend-defined version string. Two distinct version strings denote distinct prompt contents.
  • label (str): The label under which this prompt was fetched (e.g., "production", "latest", "variant-a").
  • template (str): Raw template source.
  • template_hash (str): SHA-256 of the raw template source. Format "sha256:<hex>".
  • fetched_at (datetime): Time the prompt was fetched from its backend. When a caching backend serves a cached result, fetched_at MUST reflect the original fetch time, not the cache hit time.
  • metadata (dict[str, Any] | None): Optional backend-supplied metadata.

PromptResult

Bases: BaseModel

The rendered output of applying variables to a prompt.

Carries the rendered Message sequence (ready to pass to Provider.complete()) plus the source prompt's identity metadata and a rendered_hash that captures the rendered content.

The rendered_hash is the cache-key value most useful to downstream consumers: two renders with the same template AND the same variables produce the same hash.
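
A sketch of that use (inside an async context), with a plain dict standing in for a real response cache and call_llm as a placeholder for the application's LLM call:

# completion_cache maps rendered_hash -> a previously obtained completion.
completion_cache: dict[str, str] = {}

result = await manager.get("greeting", variables={"user": "Ada"})
if result.rendered_hash not in completion_cache:
    completion_cache[result.rendered_hash] = await call_llm(result.messages)
completion = completion_cache[result.rendered_hash]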

Attributes:

  • name (str): Propagated from the source Prompt.
  • version (str): Propagated from the source Prompt.
  • label (str): Propagated from the source Prompt.
  • template_hash (str): Propagated from the source Prompt.
  • rendered_hash (str): SHA-256 of the canonical serialization of the rendered messages list.
  • messages (list[Message]): Ordered non-empty sequence of Message records.
  • variables (dict[str, Any]): Variable mapping used to render. v1 policy: pass-through unchanged (no automatic redaction). Keys are always preserved; future redaction policies would redact values, never strip keys.
  • fetched_at (datetime): Propagated from the source Prompt.
  • rendered_at (datetime): Time this PromptResult was rendered. Distinct from fetched_at: a single fetched prompt may render many times.

current_prompt_group

current_prompt_group() -> PromptGroup | None

Return the innermost active PromptGroup, or None.

current_prompt_result

current_prompt_result() -> PromptResult | None

Return the innermost active PromptResult, or None.

with_active_prompt

with_active_prompt(result: PromptResult) -> Iterator[None]

Mark result as the active prompt for downstream LLM calls.

When the observability extra is installed and an LLM call fires inside this context, the OTel observer surfaces openarmature.prompt.name / version / label / template_hash / rendered_hash on the LLM-call span.

Nesting is innermost-wins.
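
A usage sketch (inside an async context), reusing the manager from the earlier sketches; provider.complete stands in for whatever LLM call the application makes, and with_active_prompt is assumed importable from openarmature.prompts:

from openarmature.prompts import with_active_prompt

result = await manager.get("greeting", variables={"user": "Ada"})

with with_active_prompt(result):
    # Any LLM call in here gets the openarmature.prompt.* attributes on its span
    # when the observability extra is installed.
    response = await provider.complete(result.messages)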

with_active_prompt_group

with_active_prompt_group(
    group: PromptGroup,
) -> Iterator[None]

Mark group as the active prompt group for downstream LLM calls.

When an LLM call fires inside this context, the OTel observer surfaces openarmature.prompt.group_name on the LLM-call span, alongside any per-prompt attributes from a concurrently active with_active_prompt.

Nesting is innermost-wins.
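
A sketch combining the two contexts, reusing the group constructed under PromptGroup above; executing each member's call remains user code:

with with_active_prompt_group(group):
    for member in group.members:
        with with_active_prompt(member):
            # Span carries openarmature.prompt.group_name plus this member's
            # per-prompt attributes.
            await provider.complete(member.messages)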

compute_rendered_hash

compute_rendered_hash(messages: list[Message]) -> str

SHA-256 over a canonical JSON serialization of messages.

Preserves message boundaries, roles, content (including content-block structure per llm-provider §3.1), and tool_calls. json.dumps(sort_keys=True, separators=(",", ":")) over the per-message model_dump(mode="json") is deterministic across runs; datetimes serialize as ISO-8601 strings.
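
A sketch of that canonicalization; the "sha256:<hex>" prefix mirrors template_hash and is an assumption here:

import hashlib
import json

def rendered_hash_sketch(messages) -> str:
    # messages: the rendered list[Message]; each item is a pydantic model.
    canonical = json.dumps(
        [message.model_dump(mode="json") for message in messages],
        sort_keys=True,
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()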

compute_template_hash

compute_template_hash(template_source: str) -> str

SHA-256 over the UTF-8 bytes of the raw template source.
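
Equivalent, as a sketch; the "sha256:" prefix follows the format documented on Prompt.template_hash:

import hashlib

def template_hash_sketch(template_source: str) -> str:
    # Hex digest over the UTF-8 bytes of the raw template source.
    return "sha256:" + hashlib.sha256(template_source.encode("utf-8")).hexdigest()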