openarmature.prompts¶
Prompt-management capability — fetch, render, and trace named prompts.
PromptBackend ¶
Bases: Protocol
Backend protocol — implementations and sibling packages plug into this.
A PromptBackend exposes one operation: fetch a prompt by name
and label. Backends do NOT render; rendering is the manager's
concern.
Operation semantics:
- `fetch()` MUST be reentrant: multiple concurrent calls on the same backend are permitted.
- `fetch()` does NOT render or otherwise mutate the template.
- `fetch()` MUST raise `PromptNotFound` when no prompt matches `(name, label)`.
- `fetch()` MUST raise `PromptStoreUnavailable` when the backend is unreachable (network failure, filesystem I/O error, vendor API timeout).
Backends MAY cache their own results internally. When a backend
serves a cached result, the returned Prompt's template_hash
MUST still be correct for the served template (caching MUST NOT
break content-addressing), and fetched_at MUST reflect the
original fetch time, not the cache hit time.
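As an illustration of these semantics, a minimal in-memory backend might look like the sketch below. The `fetch()` signature is assumed to mirror `PromptManager.fetch`, and the import path and `Prompt` constructor usage are assumptions based on this page.

```python
from datetime import datetime, timezone

from openarmature.prompts import (  # import path assumed from this page
    Prompt,
    PromptNotFound,
    compute_template_hash,
)


class InMemoryPromptBackend:
    """Illustrative backend: serves prompts from a dict keyed by (name, label)."""

    def __init__(self, prompts: dict[tuple[str, str], str]) -> None:
        self._prompts = prompts  # (name, label) -> raw template source

    async def fetch(self, name: str, label: str = "production") -> Prompt:
        try:
            template = self._prompts[(name, label)]
        except KeyError:
            # Logical absence MUST raise PromptNotFound, never a stale substitute.
            raise PromptNotFound(f"no prompt {name!r} under label {label!r}") from None
        return Prompt(
            name=name,
            label=label,
            template=template,
            template_hash=compute_template_hash(template),
            version=compute_template_hash(template)[:16],  # content-derived version
            fetched_at=datetime.now(timezone.utc),
        )
```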
FilesystemPromptBackend ¶
Reads prompts from a directory tree.
Layout convention: <root>/<label>/<name>.j2. The label
subdirectory keeps name-collisions across labels distinct
(e.g., prompts/production/greeting.j2 and
prompts/staging/greeting.j2). Spec §5 permits filesystem
backends to interpret label as "a subdirectory or filename
suffix"; this backend picks subdirectory.
The version field is derived from the template content hash
(first 16 hex chars of the SHA-256, ~64 bits) so two file
contents map deterministically to two distinct version strings
without needing a sidecar metadata file. Per spec §3, this
satisfies the "stable identifier" requirement. The 16-char
prefix puts the birthday-paradox collision boundary at ~4B
distinct templates — well past any realistic single-backend
exposure. Higher-scale backends should widen further or pick a
different stable identifier (semver from a sidecar metadata
file, git short-SHAs, etc.).
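The derivation amounts to roughly the following (a sketch; the backend's actual helper may differ):

```python
import hashlib


def _derive_version(template: str) -> str:
    """Version = first 16 hex chars (~64 bits) of the template's SHA-256."""
    return hashlib.sha256(template.encode("utf-8")).hexdigest()[:16]
```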
This backend reads from disk on every fetch — no caching. A
caching backend (e.g., openarmature-langfuse) that returns
cached results MUST preserve the original fetched_at on the
returned Prompt, not the cache-hit time, per spec §3.
PromptError ¶
Bases: Exception
Base for prompt-management errors. Subclasses set category
to one of the canonical identifier strings.
PromptNotFound ¶
Bases: PromptError
Raised when no prompt matches (name, label).
Non-transient: retrying the same name + label will not succeed without changing the backends or the prompt store contents.
PromptRenderError ¶
PromptRenderError(
*args: Any,
name: str,
version: str,
label: str,
variables: dict[str, Any],
description: str
)
Bases: PromptError
Raised when render fails: undefined variable under strict handling, template parse error, or variable-coercion failure.
Carries the source prompt's identity plus the variable mapping and a description of the render failure.
Non-transient per spec §10: retrying the same render with the same prompt + variables will not succeed. Callers whose backend later serves a fixed template should re-fetch and re-render rather than relying on retry middleware to auto-retry the failed render.
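One way to apply that guidance is sketched below; the helper name is illustrative, and it assumes the prompt store may have been fixed between fetches.

```python
from typing import Any

from openarmature.prompts import (  # import path assumed
    PromptManager,
    PromptRenderError,
    PromptResult,
)


async def render_with_refetch(
    manager: PromptManager, name: str, variables: dict[str, Any]
) -> PromptResult:
    prompt = await manager.fetch(name)
    try:
        return manager.render(prompt, variables)
    except PromptRenderError:
        # Non-transient for this prompt + variables; only a changed template can help.
        fresh = await manager.fetch(name)
        if fresh.template_hash == prompt.template_hash:
            raise  # same template content: re-rendering cannot succeed
        return manager.render(fresh, variables)
```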
PromptStoreUnavailable ¶
PromptStoreUnavailable(
*args: Any,
name: str | None = None,
label: str | None = None,
backends_tried: list[str] | None = None,
causes: list[BaseException] | None = None
)
Bases: PromptError
Raised when backend infrastructure fails: network unreachable, filesystem I/O error, vendor API 5xx, vendor API timeout.
Transient: the same fetch may succeed when the backend recovers.
PromptManager.fetch raises this only after ALL composed
backends raise it; in that aggregate case `backends_tried`
lists the backends consulted (in order) and `causes` carries
the per-backend exceptions, index-aligned to `backends_tried`,
so operators can distinguish "backend A 503 + backend B 503"
from "backend A 503 + backend B OSError". The `__cause__` chain
still points at the last backend's `PromptStoreUnavailable` for stack-trace continuity.
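An operator-facing caller might unpack the aggregate like this (a sketch; the logging and calling context are illustrative):

```python
import logging

from openarmature.prompts import Prompt, PromptManager, PromptStoreUnavailable  # path assumed

log = logging.getLogger(__name__)


async def fetch_or_report(manager: PromptManager, name: str) -> Prompt:
    try:
        return await manager.fetch(name)
    except PromptStoreUnavailable as exc:
        # backends_tried and causes are index-aligned, so zip pairs each
        # backend with the exception it raised.
        for backend_name, cause in zip(exc.backends_tried or [], exc.causes or []):
            log.warning("prompt backend %s unavailable: %r", backend_name, cause)
        raise
```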
PromptGroup ¶
Bases: BaseModel
An ordered N≥2 sequence of PromptResult instances under one logical observability grouping.
The group is a structural hint to observability, not a control-flow
primitive. User code is responsible for executing each member's
LLM call. The group's contribution is the group_name that
observability propagates onto every member call's span so trace
UIs can render them as one unit.
Attributes:

| Name | Type | Description |
|---|---|---|
| `group_name` | str | Stable identifier for this group pattern. |
| `members` | list[PromptResult] | Ordered sequence of at least two PromptResult instances. Order matches the application's intended call sequence; the spec does not require sequential execution. |
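For example, a two-step draft-then-critique pattern might be grouped like this; the prompt names and variables are illustrative:

```python
from openarmature.prompts import PromptGroup, PromptManager  # import path assumed


async def build_group(manager: PromptManager) -> PromptGroup:
    # Render both members up front; the application still drives each LLM call.
    draft = await manager.get("draft", variables={"topic": "pricing"})
    critique = await manager.get("critique", variables={"topic": "pricing"})
    return PromptGroup(group_name="draft-then-critique", members=[draft, critique])
```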
PromptManager ¶
PromptManager(*backends: PromptBackend)
Composes one or more PromptBackends and exposes fetch + render.
Users interact with the manager; backends are an implementation detail of construction. The manager owns:
- `fetch` — consults backends in order per §8 fallback semantics.
- `render` — synchronous local string transform; produces a `PromptResult`.
- `get` — convenience: `render(await fetch(...), variables)`.
fetch async ¶
fetch(name: str, label: str = 'production') -> Prompt
Consult composed backends in order, applying §8 fallback.
- First successful fetch wins; further backends are not consulted.
- `PromptNotFound` from any backend STOPS the chain — the error propagates. Logical absence MUST NOT silently substitute a stale alternative.
- `PromptStoreUnavailable` from a backend continues to the next. After ALL backends are exhausted with unavailable failures, the manager raises `PromptStoreUnavailable`.
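Composition and fallback look roughly like this; `LangfusePromptBackend` stands in for any sibling-package backend and is hypothetical, and the `FilesystemPromptBackend` root argument is assumed:

```python
from openarmature.prompts import (  # import path assumed
    FilesystemPromptBackend,
    Prompt,
    PromptManager,
)
from openarmature_langfuse import LangfusePromptBackend  # hypothetical sibling-package backend


async def load_prompt() -> Prompt:
    manager = PromptManager(
        LangfusePromptBackend(),              # remote backend, consulted first
        FilesystemPromptBackend("prompts/"),  # local fallback (constructor argument assumed)
    )
    # If the remote backend raises PromptStoreUnavailable, the filesystem backend
    # is consulted; PromptNotFound from either backend propagates immediately.
    return await manager.fetch("greeting", label="staging")
```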
render ¶
render(
prompt: Prompt, variables: dict[str, Any] | None = None
) -> PromptResult
Apply variables to prompt.template and return a PromptResult.
Render is synchronous — no I/O. Variables are strict by
default per §7: a template reference to a name not in
variables raises PromptRenderError.
The render output is always a single UserMessage carrying
the rendered text in v1. Multi-message decomposition (system
+ user split) is deferred to a follow-on; callers needing
that today fetch the raw template and construct the messages
list manually.
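Strict handling in practice, sketched with a hand-built Prompt (the `content` attribute on the rendered UserMessage is an assumption):

```python
from datetime import datetime, timezone

from openarmature.prompts import (  # import path assumed
    Prompt,
    PromptManager,
    compute_template_hash,
)


def demo_strict_render(manager: PromptManager) -> None:
    template = "Hello {{ user_name }}!"
    prompt = Prompt(
        name="greeting",
        version="abc123",
        label="production",
        template=template,
        template_hash=compute_template_hash(template),
        fetched_at=datetime.now(timezone.utc),
    )

    result = manager.render(prompt, {"user_name": "Ada"})
    assert result.messages[0].content == "Hello Ada!"  # a single UserMessage in v1

    manager.render(prompt, {})  # raises PromptRenderError: 'user_name' is undefined
```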
get async ¶
get(
name: str,
label: str = "production",
variables: dict[str, Any] | None = None,
) -> PromptResult
Convenience equivalent to render(await fetch(name, label), variables).
Prompt ¶
Bases: BaseModel
An unrendered template plus identity metadata.
A prompt carries enough information to be rendered, traced, and
content-addressed without a backend round-trip. template is
the raw template source string (Jinja2 syntax in Python);
compilation happens on render so Prompt stays serializable
and engine-agnostic.
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | str | Stable identifier within the backend. |
| `version` | str | Backend-defined version string. Two distinct version strings denote distinct prompt contents. |
| `label` | str | The label under which this prompt was fetched (e.g., "production", "latest", "variant-a"). |
| `template` | str | Raw template source. |
| `template_hash` | str | SHA-256 of the raw template source. |
| `fetched_at` | datetime | Time the prompt was fetched from its backend. When a caching backend serves a cached result, this reflects the original fetch time, not the cache-hit time. |
| `metadata` | dict[str, Any] \| None | Optional backend-supplied metadata. |
PromptResult ¶
Bases: BaseModel
The rendered output of applying variables to a prompt.
Carries the rendered Message sequence (ready to pass to
Provider.complete()) plus the source prompt's identity
metadata and a rendered_hash that captures the rendered
content.
The rendered_hash is the cache-key value most useful to
downstream consumers: two renders with the same template AND
the same variables produce the same hash.
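For instance, a consumer could memoize completions on `rendered_hash` (a sketch; `Provider.complete()` is referenced per the llm-provider spec, and its return handling here is illustrative):

```python
from openarmature.prompts import PromptResult  # import path assumed

_completion_cache: dict[str, object] = {}


async def complete_cached(provider, result: PromptResult):
    # Same template AND same variables -> same rendered_hash -> cache hit.
    if result.rendered_hash not in _completion_cache:
        _completion_cache[result.rendered_hash] = await provider.complete(result.messages)
    return _completion_cache[result.rendered_hash]
```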
Attributes:

| Name | Type | Description |
|---|---|---|
| `name` | str | Propagated from the source Prompt. |
| `version` | str | Propagated from the source Prompt. |
| `label` | str | Propagated from the source Prompt. |
| `template_hash` | str | Propagated from the source Prompt. |
| `rendered_hash` | str | SHA-256 of the canonical serialization of the rendered messages list. |
| `messages` | list[Message] | Ordered non-empty sequence of rendered Message instances. |
| `variables` | dict[str, Any] | Variable mapping used to render. v1 policy: pass-through unchanged (no automatic redaction). Keys are always preserved; future redaction policies would redact values, never strip keys. |
| `fetched_at` | datetime | Propagated from the source Prompt. |
| `rendered_at` | datetime | Time this PromptResult was rendered. Distinct from `fetched_at`, which is propagated from the source Prompt. |
current_prompt_group ¶
current_prompt_group() -> PromptGroup | None
Return the innermost active PromptGroup, or None.
current_prompt_result ¶
current_prompt_result() -> PromptResult | None
Return the innermost active PromptResult, or None.
with_active_prompt ¶
with_active_prompt(result: PromptResult) -> Iterator[None]
Mark result as the active prompt for downstream LLM calls.
When the observability extra is installed and an LLM call fires
inside this context, the OTel observer surfaces
openarmature.prompt.name / version / label /
template_hash / rendered_hash on the LLM-call span.
Nesting is innermost-wins.
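Typical usage wraps the downstream LLM call in the context manager; the provider object and its `complete()` call are illustrative:

```python
from openarmature.prompts import PromptManager, with_active_prompt  # import path assumed


async def greet(manager: PromptManager, provider) -> None:
    result = await manager.get("greeting", variables={"user_name": "Ada"})
    with with_active_prompt(result):
        # The observer stamps openarmature.prompt.* attributes onto this call's span.
        await provider.complete(result.messages)
```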
with_active_prompt_group ¶
with_active_prompt_group(
group: PromptGroup,
) -> Iterator[None]
Mark group as the active prompt group for downstream LLM calls.
When an LLM call fires inside this context, the OTel observer
surfaces openarmature.prompt.group_name on the LLM-call
span, alongside any per-prompt attributes from a concurrently
active with_active_prompt.
Nesting is innermost-wins.
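Combined with `with_active_prompt`, each member call carries both the group name and its own prompt attributes (a sketch; the provider call is illustrative):

```python
from openarmature.prompts import (  # import path assumed
    PromptGroup,
    with_active_prompt,
    with_active_prompt_group,
)


async def run_group(provider, group: PromptGroup) -> None:
    with with_active_prompt_group(group):
        for member in group.members:
            with with_active_prompt(member):
                # Span gets group_name plus this member's name/version/label/hashes.
                await provider.complete(member.messages)
```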
compute_rendered_hash ¶
SHA-256 over a canonical JSON serialization of messages.
Preserves message boundaries, roles, content (including
content-block structure per llm-provider §3.1), and tool_calls.
json.dumps(sort_keys=True, separators=(",", ":")) over the
per-message model_dump(mode="json") is deterministic across
runs; datetimes serialize as ISO-8601 strings.
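The canonicalization described above amounts to roughly the following (a sketch; the actual helper may differ in details):

```python
import hashlib
import json


def rendered_hash_of(messages) -> str:
    canonical = json.dumps(
        [m.model_dump(mode="json") for m in messages],  # datetimes become ISO-8601 strings
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```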
compute_template_hash ¶
SHA-256 over the UTF-8 bytes of the raw template source.