Metadata-Version: 2.4
Name: arbis-llmwrap
Version: 0.3.7
Summary: Decorator to wrap LLM calls for production use with flexible prompt binding.
License: MIT
Keywords: llm,decorator,prompt,logging,cython
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.31
Requires-Dist: cryptography>=42.0
Provides-Extra: dev
Requires-Dist: cython>=3.0; extra == "dev"
Requires-Dist: wheel; extra == "dev"
Provides-Extra: integration
Requires-Dist: openai<2.31.0,>=2.30.0; extra == "integration"
Requires-Dist: jiter<0.14.0,>=0.13.0; extra == "integration"
Dynamic: license-file

# llmwrap

Usage guide for `llmwrap` with **eight distinct integration patterns**. Each pattern appears as a pair: `wrap_llm_call` (decorator) and `wrap_llm_line` (inline) implement the same behavior; only prompt binding differs. Additional permutations (OpenRouter-only snippets, naming variants, LangChain-style flows) remain in `tests/testing_different_interfaces.py`.

This document intentionally covers API usage only. Internal algorithm details are not disclosed.

## Table of Contents

- [Install](#install)
- [Public APIs](#public-apis)
- [Config Fields](#config-fields)
  - [Workflow graph example (linear chain)](#workflow-graph-example-linear-chain)
- [Distinct example pairs](#distinct-example-pairs)
  - [1. Chat Completions](#1-chat-completions)
  - [2. Responses API](#2-responses-api)
  - [3. Multipart assistant content](#3-multipart-assistant-content)
  - [4. Tool calls with passthrough](#4-tool-calls-with-passthrough)
  - [5. Nested wrapped calls](#5-nested-wrapped-calls)
  - [6. Custom response extractor](#6-custom-response-extractor)
  - [7. Non-reconstructable model output](#7-non-reconstructable-model-output)
  - [8. Tool agent with wrapped LLM tools](#8-tool-agent-with-wrapped-llm-tools)
- [Console output](#console-output)
- [Notes](#notes)
- [License](#license)

## Install

```bash
pip install arbis-llmwrap
```

## Public APIs

```python
from llmwrap import wrap_llm_call, wrap_llm_line, openai_sdk_result_text
```

## Config Fields

### Core Mental Model

For each wrapped LLM call, you are describing where this call sits in your orchestration graph.

- `workflow_group_name` = top-most business/process container
- `Workflow_name` = a workflow inside that group
- `agent_name` = the exact executing step/agent inside that workflow
- `Run_Id` = one end-to-end execution id tying all related calls together

So for one user request that traverses many agents, all calls share the same `Run_Id`, but `agent_name` changes per step.

**Golden rule:** one `Run_Id` = one user query/request. Reuse it across all hops for that query; allocate a new one for the next query.
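
A minimal sketch of this rule (field values are illustrative, reusing the example names from the field descriptions below; the workflow graph example later allocates the id from the run-id API instead of `uuid4`):

```python
import uuid

# One Run_Id per user query, reused on every wrapped call for that query.
run_id = str(uuid.uuid4())

common = {
    "company_name": "ArbisAI",                # constant per organization
    "project_name": "support-copilot",        # constant per app
    "workflow_group_name": "customer_support",
    "Run_Id": run_id,                         # same id on every hop of this query
}

# Per-step fields change while Run_Id does not:
step_1 = {**common, "Workflow_name": "triage_flow", "agent_name": "Retriever"}
step_2 = {**common, "Workflow_name": "triage_flow", "agent_name": "Writer",
          "agent_parent": ["Retriever"]}
```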

### Shared Fields (`wrap_llm_call` and `wrap_llm_line`)

#### `company_name: str` (Required)
What it is: Organization identifier.  
Why it matters: High-level tenant/org partition for ingest and analytics.  
How to fill: Stable canonical company name, e.g. `"ArbisAI"`.  
Best practice: Keep consistent casing/spelling across all services.

#### `project_name: str` (Required)
What it is: Product/application name inside the company.  
Why it matters: Separates telemetry for different apps under same company.  
How to fill: Stable app id, e.g. `"Arbis-Decorator"` or `"support-copilot"`.  
Best practice: Do not use random env-specific names unless intentional.

#### `agent_name: str` (Required)
What it is: The current executing agent/step label for this call.  
Why it matters: This is the node identifier in your execution chain.  
How to fill: Step-specific name, e.g. `"Agent-1"`, `"Retriever"`, `"Writer"`.  
Best practice: Human-readable and unique enough within the workflow.

#### `Run_Id: str | None` (Optional, strongly recommended)
What it is: Correlation id for one complete run/user query/session.  
Why it matters: Lets you reconstruct the full path across agents/workflows.  
How to fill: Generate once at run start (`uuid4`) and reuse on every call in that run.  
Example: All A1 -> A2 -> B3 -> B4 calls carry same `Run_Id`.

#### `Workflow_name: str | None` (Optional, recommended)
What it is: Workflow containing this agent.  
Why it matters: Groups agent nodes into workflow-level segments.  
How to fill: Use logical workflow id, e.g. `"A"`, `"B"`, `"triage_flow"`.  
Best practice: Stable naming so workflow-level analytics remain clean.

#### `workflow_group_name: str | None` (Optional, recommended)
What it is: Top-level grouping for one or more workflows.  
Why it matters: Organizes related workflows under a single umbrella.  
How to fill: Product/process group, e.g. `"Workflow Group 1"` or `"customer_support"`.  
Best practice: This should typically remain constant while `Workflow_name` may change within the run.

#### `agent_parent: list[str] | None` (Optional)
What it is: Upstream agent lineage for current agent call.  
Why it matters: Captures agent-to-agent dependency edges.  
How to fill:

- Root/first step: `None`
- Next step consuming Agent-1 output: `["Agent-1"]`
- For deeper nesting, include lineage order if you track ancestry.

Best practice: At minimum include immediate parent when there is one.

#### `Workflow_parent_name: list[str] | None` (Optional)
What it is: Parent workflow lineage when transitioning across workflows.  
Why it matters: Captures workflow-level dependency (A -> B, etc.).  
How to fill:

- Within same workflow: often `None`
- First step in child workflow B after A: `["A"]`

Best practice: Set it at workflow boundary transitions for clear graph reconstruction.

#### `metadata: dict | None` (Optional)
What it is: Arbitrary JSON-serializable context for user/request/business metadata.  
Why it matters: Adds business observability and audit dimensions.  
How to fill: Include only safe, needed keys. Typical:

- `user_id`
- `role`
- `tenant_id`
- `request_id`
- `session_id`
- `channel`

Best practice:

- Avoid secrets/PII unless policy permits
- Keep schema consistent across calls
- You can enrich this per step if needed, but keep core keys stable per run
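
A minimal sketch of such a dict (key set from the list above; values illustrative):

```python
METADATA = {
    "user_id": "user-42",
    "role": "analyst",
    "tenant_id": "acme",
    "request_id": "req-8f2c",
    "session_id": "sess-11",
    "channel": "web",
}
# Reuse the same dict on every call in the run; add step-specific keys only if needed.
```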

#### `secret_key: str` (Required)
What it is: Wrapper authentication/authorization key used by ingest path.  
Why it matters: Required for secure wrapper operations.  
How to fill: From env var (`WRAP_SECRET_KEY`), never hardcoded.  
Best practice: Rotate regularly and keep out of logs.

#### `max_tries: int = 1` (Optional)
What it is: Retry budget for wrapped execution pipeline.  
Why it matters: Improves resilience against transient errors.  
How to fill: Integer `>= 1`; often 1 to 3.  
Best practice: Use higher values only where retries are safe and expected.

#### `response_extractor: Callable[[Any], str] | None` (Optional)
What it is: Custom function that extracts answer text from raw model output.  
Why it matters: Needed when return shape is custom/non-standard.  
How to fill: Provide function returning `str`, e.g. `lambda obj: obj["raw_text"]`.  
Best practice: Pair with merge/writeback settings if preserving object shape.

#### `prompt_json_pointer: str | None` (Optional)
What it is: RFC 6901 pointer to prompt field inside structured prompt payload.  
Why it matters: Wrap only one field instead of whole JSON payload.  
How to fill: Example `"/messages/0/content"` or `"/query"`.  
Best practice: Validate pointer path exists in your payload schema.
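
A minimal sketch of what such a pointer selects (payload and model id are illustrative):

```python
payload = {
    "model": "gpt-4o-mini",  # illustrative model id
    "messages": [{"role": "user", "content": "Summarize the report."}],
}

# prompt_json_pointer="/messages/0/content" targets exactly this field:
assert payload["messages"][0]["content"] == "Summarize the report."
# Only that string is wrapped; the rest of the payload reaches your llm_call unchanged.
```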

#### `passthrough_when: Callable[[Any], bool] | None` (Optional)
What it is: Predicate to bypass parse/ingest flow and return raw output unchanged.  
Why it matters: Useful for tool-calls or intermediate SDK objects.  
How to fill: Function returning `True` when output should pass through.  
Example: detect `tool_calls` and skip final merge logic.

#### `return_merger: Callable[[Any, str], Any] | None` (Optional)
What it is: Hook to combine original output object with extracted answer text.  
Why it matters: Gives full control over final response shape.  
How to fill: `(base_output, answer_text) -> desired_output`.  
Best practice: Use when default merge behavior does not match your API contract.
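
A minimal sketch of a merger, assuming a dict-shaped base output (shapes and key names are illustrative, not a required format):

```python
def merge_answer(base_output, answer_text):
    # Combine the original output with the extracted answer text into the
    # shape your API contract expects.
    merged = dict(base_output)
    merged["final_answer"] = answer_text
    return merged

# Passed to the wrapper as: return_merger=merge_answer
```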

#### `response_answer_json_pointer: str | None` (Optional)
What it is: Explicit pointer for where extracted answer should be written back.  
Why it matters: Deterministic answer placement in complex return objects.  
How to fill: RFC 6901 path like `"/choices/0/message/content"`.  
Best practice: Prefer an explicit pointer for non-standard or ambiguous structures.

### `wrap_llm_call`-Specific Field

#### `prompt_arg: str = "prompt"` (Optional param with default, functionally required correctness)
What it is: Name of decorated function argument containing prompt payload.  
Why it matters: Wrapper must know which argument to wrap.  
How to fill: Match actual function signature, e.g. `"messages"`, `"question"`.  
Best practice: Always set this explicitly if your prompt argument is not named `prompt`.

### `wrap_llm_line`-Specific Fields

#### `llm_call: Callable[[Any], Any]` (Required)
What it is: Callable that executes model request using wrapped prompt payload.  
Why it matters: This is the actual invocation path for line-level API.  
How to fill: `lambda prompt: client.chat.completions.create(...)` etc.  
Best practice: Keep it deterministic and side-effect-free apart from the model call itself.

#### `prompt: Any` (Required)
What it is: Raw prompt input to be wrapped.  
Why it matters: Source content entering wrapper pipeline.  
How to fill:

- plain string prompt, or
- dict/JSON string when using `prompt_json_pointer`

Best practice: Ensure shape matches what `llm_call` expects after wrapping.

### Practical Fill Pattern (for your hierarchy)

For one incoming user query:

1. Generate `run_id` once.
2. Keep `workflow_group_name` constant for entire orchestration.
3. Set `Workflow_name` based on current workflow segment.
4. Set `agent_name` for current step.
5. Set lineage:
   - first agent: `agent_parent=None`, `Workflow_parent_name=None`
   - downstream in same workflow: `agent_parent=[prev_agent]`
   - first step in new workflow B after A: `Workflow_parent_name=["A"]`
6. Attach `metadata` with user context (`user_id`, `role`, etc.).

### Workflow graph example (linear chain)

This matches the topology and field usage in `tests/test_linear_workflow_group_chain.py`: **four agents**, **two workflows** inside **one workflow group**, with prompt text handed off along the chain.

**Graph (who talks to whom)**

Data flow: **Agent-1 → Agent-2 → Agent-3 → Agent-4**.

```text
workflow_group_name = "Workflow Group 1"     ← same on every call

Workflow_name = "A"
  Agent-1   (root: no parents)
    ↓
  Agent-2   (agent_parent = ["Agent-1"])

Workflow_name = "B"   (starts after A’s last agent)
  Agent-3   (agent_parent = ["Agent-2"], Workflow_parent_name = ["A"])
    ↓
  Agent-4   (agent_parent = ["Agent-3"])
```

**How each field builds the graph**

- **`workflow_group_name`** — Set to **`"Workflow Group 1"`** (or your real group id) on **all** steps so ingest knows every call belongs to the same product/process bucket.
- **`Workflow_name`** — **`"A"`** for Agent-1 and Agent-2; **`"B"`** for Agent-3 and Agent-4. This splits the run into two workflow segments under that group.
- **`agent_name`** — The **current node**: `"Agent-1"` … `"Agent-4"`. Must be distinct per step so each hop is identifiable.
- **`Run_Id`** — **One** id reused on **all four** `wrap_llm_line` calls so analytics can stitch them into a single user/query trace (**one session = one Run_Id**). In this repo, the id is allocated from the VeryTrace run-id API on `WRAP_ARBIS_BASE_URL` and then reused for every agent hop.
- **`agent_parent`** — **Immediate upstream agent name(s)**. `None` only for the first agent. Agent-2 points at Agent-1; Agent-3 at Agent-2; Agent-4 at Agent-3. That encodes the **agent-level** edge list.
- **`Workflow_parent_name`** — **Workflow-level** lineage when you **enter a new workflow** that continues after another. Here workflow **B** follows **A**, so **only Agent-3** (the first step in B) sets `Workflow_parent_name=["A"]`. Agent-4 stays in B with no new workflow transition, so it uses `None` (same as agents fully inside A).

**Optional `metadata`** — Use the **same** dict on each call for fields like `user_id` and `role` so every step in the run carries the same user context; you can add step-specific keys if needed.

**Run_Id API (session id allocation)**

In `tests/test_linear_workflow_group_chain.py`, the session id comes from the run-id API before the first agent call:

- Endpoint: `POST {WRAP_ARBIS_BASE_URL}/api/unique-run-id`
- Request JSON: `{"secret_key": WRAP_SECRET_KEY, "company_name": WRAP_COMPANY_NAME}`
- Response JSON: `{"run_id": "..."}`

Then that `run_id` is passed as `Run_Id` to Agent-1, Agent-2, Agent-3, and Agent-4. This is what links all hops into one session in the graph.

**Step-by-step summary**

| Step | `agent_name` | `Workflow_name` | `workflow_group_name` | `agent_parent` | `Workflow_parent_name` |
|------|--------------|-----------------|------------------------|----------------|-------------------------|
| 1 | `Agent-1` | `A` | `Workflow Group 1` | `None` | `None` |
| 2 | `Agent-2` | `A` | `Workflow Group 1` | `["Agent-1"]` | `None` |
| 3 | `Agent-3` | `B` | `Workflow Group 1` | `["Agent-2"]` | `["A"]` |
| 4 | `Agent-4` | `B` | `Workflow Group 1` | `["Agent-3"]` | `None` |

**Code shape** (prompt strings omitted—see the test file for the full handoff text; `CFG` and `openai_client` here are your app's config object and OpenAI client, as described under the examples convention below):

```python
import requests

from llmwrap import openai_sdk_result_text, wrap_llm_line

WORKFLOW_GROUP = "Workflow Group 1"
WORKFLOW_A = "A"
WORKFLOW_B = "B"

# One Run_Id for the whole chain (one session id for all agent hops).
def allocate_run_id_from_api() -> str:
    base = CFG.wrap_arbis_base_url.rstrip("/")
    resp = requests.post(
        f"{base}/api/unique-run-id",
        json={
            "secret_key": CFG.key,
            "company_name": CFG.company,
        },
        timeout=30,
    )
    resp.raise_for_status()
    return str(resp.json()["run_id"])


run_id = allocate_run_id_from_api()

METADATA = {"user_id": "user-42", "role": "analyst"}  # optional; same dict every hop


def run_agent(*, client, model, agent_name, workflow_name, agent_parent, workflow_parent_name, prompt):
    return wrap_llm_line(
        llm_call=lambda p: client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": p}],
            temperature=0,
        ),
        prompt=prompt,
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name=agent_name,
        Run_Id=run_id,
        Workflow_name=workflow_name,
        workflow_group_name=WORKFLOW_GROUP,
        agent_parent=agent_parent,
        Workflow_parent_name=workflow_parent_name,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=3,
    )


# Agent-1: root of the agent graph, workflow A, no workflow parent.
out1 = run_agent(
    client=openai_client,
    model=CFG.model,
    agent_name="Agent-1",
    workflow_name=WORKFLOW_A,
    agent_parent=None,
    workflow_parent_name=None,
    prompt='...',  # e.g. ask for a single marked line — see test_linear_workflow_group_chain.py
)
text1 = openai_sdk_result_text(out1).strip()

# Agent-2: still workflow A; parent agent is Agent-1.
out2 = run_agent(
    client=openai_client,
    model=CFG.model,
    agent_name="Agent-2",
    workflow_name=WORKFLOW_A,
    agent_parent=["Agent-1"],
    workflow_parent_name=None,
    prompt=f"The previous agent output is:\n{text1}\n...",
)
text2 = openai_sdk_result_text(out2).strip()

# Agent-3: first step in workflow B; consumes Agent-2; workflow B follows A.
out3 = run_agent(
    client=openai_client,
    model=CFG.model,
    agent_name="Agent-3",
    workflow_name=WORKFLOW_B,
    agent_parent=["Agent-2"],
    workflow_parent_name=[WORKFLOW_A],
    prompt=f"Previous workflow A ended with:\n{text2}\n...",
)
text3 = openai_sdk_result_text(out3).strip()

# Agent-4: still workflow B; parent agent Agent-3 only.
out4 = run_agent(
    client=openai_client,
    model=CFG.model,
    agent_name="Agent-4",
    workflow_name=WORKFLOW_B,
    agent_parent=["Agent-3"],
    workflow_parent_name=None,
    prompt=f"Previous agent in workflow B produced:\n{text3}\n...",
)
```

## Distinct example pairs

**Convention.** Snippets assume `CFG`, `openai_client`, `RUN_ID`, `WORKFLOW_NAME`, `WORKFLOW_GROUP_NAME`, `AGENT_PARENT`, `WORKFLOW_PARENT_NAME`, `METADATA`, and (where needed) `TOOLS` / `MESSAGES` exist in your app. The same tracking kwargs appear in every call so ingest can correlate steps.

**How to read this section.** Each numbered scenario is one *behavioral class* for the wrapper. For each class we show the same behavior twice: `wrap_llm_call` binds the prompt via a decorated function argument; `wrap_llm_line` passes `llm_call` and `prompt` at the call site. Ingest and response handling are the same; only binding style differs.

**Quick contrast**

| # | Scenario | What makes it different from the others |
|---|----------|----------------------------------------|
| 1 | Chat Completions | Normal chat completion object (or dict with the same `choices` / `message` / `content` layout). OpenRouter uses this same shape via an OpenAI-compatible client. |
| 2 | Responses API | Uses `client.responses.create` and string `input`, not `chat.completions`. |
| 3 | Multipart assistant content | Assistant `content` is a **list** of parts (e.g. multiple `{"type":"text",...}`), not a single string. |
| 4 | Tool calls + passthrough | `passthrough_when` returns the raw SDK object on tool-call turns so you can run another round trip before a final wrapped answer. |
| 5 | Nested wrapped calls | A parent function calls other wrapped functions; each invocation is a separate wrapped LLM call with its own `agent_name`. |
| 6 | Custom `response_extractor` | Answer lives in a custom field (here `raw_text`); without `return_merger` / answer pointer you often get a **plain string** back. |
| 7 | Non-reconstructable return | Object cannot be deep-copied or rebuilt; the wrapper uses `model_dump()`-style serialization and returns a **dict tree** instead of the original type. |
| 8 | Tool agent + wrapped LLM tools | Top chat turn uses `tools=` and `passthrough_when` (same idea as row 4), but **each tool name is implemented by a separate wrapped LLM function**, so every tool invocation is its own ingest (`agent_name` per tool). |

### 1. Chat Completions

**What this demonstrates.** The common Chat Completions path: `messages` in, completion object out. The library can detect the answer slot and preserve the SDK return shape when possible.

**How it differs.** This is the baseline. **OpenRouter** (or any OpenAI-compatible base URL) is not a separate pattern: use the same `chat.completions.create` call with a different client and model id.
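
A minimal sketch of the OpenRouter variant, assuming the standard OpenAI Python SDK; the env var name and model id are illustrative:

```python
import os

from openai import OpenAI

# OpenAI-compatible client pointed at OpenRouter.
openrouter_client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],  # illustrative env var name
)
# Use openrouter_client and an OpenRouter model id (e.g. "openai/gpt-4o-mini")
# in place of openai_client / CFG.model below; nothing else in the snippets changes.
```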

**`wrap_llm_call`** — prompt is the `messages` argument of your function.

```python
@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex1_chat_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    max_tries=1,
)
def one_turn(messages):
    return openai_client.chat.completions.create(
        model=CFG.model, messages=messages, temperature=0
    )
```

**`wrap_llm_line`** — prompt lives inside a dict; `prompt_json_pointer` selects the user text to wrap.

```python
payload = {
    "model": CFG.model,
    "messages": [{"role": "user", "content": "one sentence"}],
}
out = wrap_llm_line(
    llm_call=lambda p: openai_client.chat.completions.create(
        model=p["model"], messages=p["messages"], temperature=0
    ),
    prompt=payload,
    prompt_json_pointer="/messages/0/content",
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex1_chat_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 2. Responses API

**What this demonstrates.** The OpenAI **Responses** API: a string (or structured) `input` and `responses.create`, which returns a different object shape than chat completions.

**How it differs.** From example 1: different method, different prompt parameter name (`input` vs `messages`), different default extraction rules inside the wrapper.

**`wrap_llm_call`**

```python
@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex2_responses_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def ask(question: str):
    return openai_client.responses.create(model=CFG.model, input=question)
```

**`wrap_llm_line`**

```python
out = wrap_llm_line(
    llm_call=lambda prompt: openai_client.responses.create(
        model=CFG.model, input=prompt
    ),
    prompt="Return one sentence.",
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex2_responses_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 3. Multipart assistant content

**What this demonstrates.** The assistant message `content` is a **list** of segments (multipart), not a single string. The wrapper still extracts or merges answer text from that structure.

**How it differs.** Unlike examples 1–2, the assistant `content` under `choices[0].message` is a list, which exercises a different merge path than a scalar string.

**`wrap_llm_call`**

```python
@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex3_multipart_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    max_tries=1,
)
def one_turn(messages):
    return {
        "choices": [{
            "message": {
                "role": "assistant",
                "content": [
                    {"type": "text", "text": "Part A"},
                    {"type": "text", "text": "Part B"},
                ],
            }
        }]
    }
```

**`wrap_llm_line`**

```python
payload = {"messages": [{"role": "user", "content": "multipart"}]}
out = wrap_llm_line(
    llm_call=lambda _p: {
        "choices": [{
            "message": {
                "role": "assistant",
                "content": [
                    {"type": "text", "text": "Part A"},
                    {"type": "text", "text": "Part B"},
                ],
            }
        }]
    },
    prompt=payload,
    prompt_json_pointer="/messages/0/content",
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex3_multipart_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 4. Tool calls with passthrough

**What this demonstrates.** When the model returns **tool calls**, you usually want the **raw** completion object back so your app can execute tools and call the model again. `passthrough_when` does that: no parse/ingest on those turns; later turns without tool calls follow the normal wrapped path.

**How it differs.** From examples 1–3: introduces **branching behavior** based on the raw return value and requires **tools** in the request.

**Related pattern.** Example 8 uses the same passthrough mechanism when **each named tool is backed by its own wrapped LLM** (see `test_top_level_tool_agent_with_two_real_tools_and_nested_wrapped_llm_tools` in `tests/testing_different_interfaces.py`).

**`wrap_llm_call`**

```python
def has_tool_calls(raw):
    try:
        return bool(raw.choices[0].message.tool_calls)
    except Exception:
        return False

@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex4_tools_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    passthrough_when=has_tool_calls,
    max_tries=1,
)
def run_turn(messages):
    return openai_client.chat.completions.create(
        model=CFG.model, messages=messages, tools=TOOLS, tool_choice="auto"
    )
```

**`wrap_llm_line`**

```python
out = wrap_llm_line(
    llm_call=lambda p: openai_client.chat.completions.create(
        model=p["model"],
        messages=p["messages"],
        tools=p["tools"],
        tool_choice="auto",
    ),
    prompt={"model": CFG.model, "messages": MESSAGES, "tools": TOOLS},
    prompt_json_pointer="/messages/1/content",
    passthrough_when=has_tool_calls,
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex4_tools_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 5. Nested wrapped calls

**What this demonstrates.** Composition: a **top-level** wrapped function calls **other** wrapped functions. Each inner call is a full wrapper invocation (its own `agent_name`, same or different `Run_Id` depending on how you thread context). This replaces the older README variants that repeated the same idea under different names (separate tools vs “manager” vs “hierarchy”).

**How it differs.** From examples 1–4: not about return shape; about **call graph** and multiple ingest events per user request.

**`wrap_llm_call`**

```python
@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex5_subagent_a",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def subagent_a(question: str):
    return openai_client.responses.create(model=CFG.model, input=question)

@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex5_subagent_b",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def subagent_b(question: str):
    return openai_client.responses.create(model=CFG.model, input=question)

@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex5_top_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def top_agent(question: str):
    return subagent_a(question) + "\n" + subagent_b(question)
```

**`wrap_llm_line`**

```python
def top_agent(question: str):
    a = wrap_llm_line(
        llm_call=lambda q: openai_client.responses.create(model=CFG.model, input=q),
        prompt=question,
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name="ex5_subagent_a",
        Run_Id=RUN_ID,
        Workflow_name=WORKFLOW_NAME,
        workflow_group_name=WORKFLOW_GROUP_NAME,
        agent_parent=AGENT_PARENT,
        Workflow_parent_name=WORKFLOW_PARENT_NAME,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=1,
    )
    b = wrap_llm_line(
        llm_call=lambda q: openai_client.responses.create(model=CFG.model, input=q),
        prompt=question,
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name="ex5_subagent_b",
        Run_Id=RUN_ID,
        Workflow_name=WORKFLOW_NAME,
        workflow_group_name=WORKFLOW_GROUP_NAME,
        agent_parent=AGENT_PARENT,
        Workflow_parent_name=WORKFLOW_PARENT_NAME,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=1,
    )
    return f"{a}\n{b}"
```

### 6. Custom response extractor

**What this demonstrates.** Your model function returns a **custom** object shape. You supply `response_extractor` so the wrapper knows where the answer text lives for ingest.

**How it differs.** From examples 1–5: answer is **not** in the default chat/response slots; if you do not also supply `return_merger` or `response_answer_json_pointer`, the wrapper may give you a **plain string** instead of preserving the original object.

**`wrap_llm_call`**

```python
@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex6_extractor_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    response_extractor=lambda obj: obj["raw_text"],
    max_tries=1,
)
def one_turn(messages):
    return {"raw_text": "Model content here", "meta": {"id": "abc"}}
```

**`wrap_llm_line`**

```python
out = wrap_llm_line(
    llm_call=lambda _prompt: {"raw_text": "model answer", "meta": {"id": "abc"}},
    prompt="hello",
    response_extractor=lambda obj: obj["raw_text"],
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex6_extractor_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 7. Non-reconstructable model output

**What this demonstrates.** Some return values cannot be cloned or rebuilt into the same Python type. The wrapper still needs a structured view for merge/ingest, so it falls back to a **dict** produced from `model_dump()`-style data.

**How it differs.** Focuses on **serialization / copy** failure, not on chat tools or custom extractors.

**`wrap_llm_call`**

```python
class NonCopyable:
    def model_dump(self):
        return {
            "choices": [{"message": {"role": "assistant", "content": "hello"}}]
        }

    def __deepcopy__(self, memo):
        raise RuntimeError("cannot deepcopy")

@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex7_noncopy_call",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    max_tries=1,
)
def one_turn(messages):
    return NonCopyable()
```

**`wrap_llm_line`**

```python
class NonCopyable:
    def model_dump(self):
        return {
            "choices": [{"message": {"role": "assistant", "content": "hello"}}]
        }

    def __deepcopy__(self, memo):
        raise RuntimeError("cannot deepcopy")

out = wrap_llm_line(
    llm_call=lambda _prompt: NonCopyable(),
    prompt="one sentence",
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex7_noncopy_line",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    max_tries=1,
)
```

### 8. Tool agent with wrapped LLM tools

**Plain summary.** The coordinator model gets a **tool menu** via `chat.completions.create(..., tools=...)`. When it returns `tool_calls`, your code runs **real Python** for each name. If a “tool” is actually another LLM call, wrap that call too: then ingest sees one event per tool LLM (its own `agent_name`) and separate events for the coordinator.

**What goes in `tools`?**  
`tools` is a **list of JSON descriptions** for the API only. Each entry is usually `{"type": "function", "function": {...}}` with:

- **`name`**: string the model will put in `tool_calls[].function.name`
- **`description`**: free text shown to the model
- **`parameters`**: a small JSON Schema object (`type`, `properties`, `required`, …) describing the `arguments` JSON the model should output

It is **not** a list of Python callables. The provider uses this list so the model knows what to ask for; **you** map `name` → your own functions (or wrapped LLM helpers) after the response comes back.

**Do I need something like a `run_coordinator` function?**  
**llmwrap does not require it.** Chat Completions tool use is **multi-step**: one response is either “here are `tool_calls`” or “here is final text”. After tool calls you append the assistant turn and one `{"role": "tool", ...}` message per call, then **call the coordinator again**. Any name (`run_coordinator`, your agent framework, inline code) is fine—the README uses one small driver so the flow is obvious.

**Runnable reference.** `test_top_level_tool_agent_with_two_real_tools_and_nested_wrapped_llm_tools` and `test_top_level_tool_agent_with_two_real_tools_and_nested_wrapped_llm_tools_line_wrapper` in `tests/testing_different_interfaces.py`.

**How it differs from example 4.** Same `passthrough_when` on the coordinator; example 8 adds **wrapped LLM functions** as the bodies behind specific tool names.

**`wrap_llm_call`**

```python
import json

from llmwrap import openai_sdk_result_text, wrap_llm_call

# Declares two callable tools the MODEL may request (API schema, not Python functions).
TOOL_DEFINITIONS_FOR_API = [
    {
        "type": "function",
        "function": {
            "name": "get_operational_fact",
            "description": "Return one short operational fact about a topic.",
            "parameters": {
                "type": "object",
                "properties": {"topic": {"type": "string", "description": "Subject to summarize"}},
                "required": ["topic"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_risk_fact",
            "description": "Return one short risk-oriented fact about a topic.",
            "parameters": {
                "type": "object",
                "properties": {"topic": {"type": "string"}},
                "required": ["topic"],
            },
        },
    },
]


def has_tool_calls(raw):
    try:
        return bool(raw.choices[0].message.tool_calls)
    except Exception:
        return False


@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex8_weather_tool",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def weather_tool_llm(question: str):
    return openai_client.chat.completions.create(
        model=CFG.model,
        messages=[{"role": "user", "content": question}],
        temperature=0,
    )


@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex8_risk_tool",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="question",
    max_tries=1,
)
def risk_tool_llm(question: str):
    return openai_client.chat.completions.create(
        model=CFG.model,
        messages=[{"role": "user", "content": question}],
        temperature=0,
    )


@wrap_llm_call(
    company_name=CFG.company,
    project_name=CFG.project,
    agent_name="ex8_tool_coordinator",
    Run_Id=RUN_ID,
    Workflow_name=WORKFLOW_NAME,
    workflow_group_name=WORKFLOW_GROUP_NAME,
    agent_parent=AGENT_PARENT,
    Workflow_parent_name=WORKFLOW_PARENT_NAME,
    metadata=METADATA,
    secret_key=CFG.key,
    prompt_arg="messages",
    passthrough_when=has_tool_calls,
    max_tries=1,
)
def coordinator_turn(messages: list, tool_definitions: list):
    return openai_client.chat.completions.create(
        model=CFG.model,
        messages=messages,
        tools=tool_definitions,
        tool_choice="auto",
        temperature=0,
    )


# App-side driver (name arbitrary): repeat coordinator → execute tools → append results.
def run_until_final_answer(user_text: str) -> str:
    messages = [
        {
            "role": "system",
            "content": "Use the tools when needed, then answer in plain language.",
        },
        {"role": "user", "content": user_text},
    ]
    for _ in range(8):
        out = coordinator_turn(messages, TOOL_DEFINITIONS_FOR_API)
        msg = out.choices[0].message
        if not msg.tool_calls:
            return openai_sdk_result_text(out)
        messages.append(msg.model_dump(exclude_none=True))
        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments or "{}")
            topic = str(args.get("topic", ""))
            if tc.function.name == "get_operational_fact":
                fact = openai_sdk_result_text(
                    weather_tool_llm(f"One operational fact about {topic}")
                )
            elif tc.function.name == "get_risk_fact":
                fact = openai_sdk_result_text(
                    risk_tool_llm(f"One risk fact about {topic}")
                )
            else:
                fact = "unknown tool"
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": json.dumps({"fact": fact}),
                }
            )
    raise RuntimeError("Tool loop limit exceeded")
```

**`wrap_llm_line`**

Same `TOOL_DEFINITIONS_FOR_API` and the same driver idea: **coordinator** = one `wrap_llm_line` with `passthrough_when` and `prompt_json_pointer` on the user message; **each tool** = `wrap_llm_line` on a small payload (here a dict with `model` and `question` keys).

```python
import json

from llmwrap import openai_sdk_result_text, wrap_llm_line

# TOOL_DEFINITIONS_FOR_API: identical list as in the wrap_llm_call example above.


def weather_tool_llm(question: str):
    payload = {"model": CFG.model, "question": question}
    return wrap_llm_line(
        llm_call=lambda p: openai_client.chat.completions.create(
            model=p["model"],
            messages=[{"role": "user", "content": p["question"]}],
            temperature=0,
        ),
        prompt=payload,
        prompt_json_pointer="/question",
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name="ex8_weather_tool",
        Run_Id=RUN_ID,
        Workflow_name=WORKFLOW_NAME,
        workflow_group_name=WORKFLOW_GROUP_NAME,
        agent_parent=AGENT_PARENT,
        Workflow_parent_name=WORKFLOW_PARENT_NAME,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=1,
    )


def risk_tool_llm(question: str):
    payload = {"model": CFG.model, "question": question}
    return wrap_llm_line(
        llm_call=lambda p: openai_client.chat.completions.create(
            model=p["model"],
            messages=[{"role": "user", "content": p["question"]}],
            temperature=0,
        ),
        prompt=payload,
        prompt_json_pointer="/question",
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name="ex8_risk_tool",
        Run_Id=RUN_ID,
        Workflow_name=WORKFLOW_NAME,
        workflow_group_name=WORKFLOW_GROUP_NAME,
        agent_parent=AGENT_PARENT,
        Workflow_parent_name=WORKFLOW_PARENT_NAME,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=1,
    )


def has_tool_calls(raw):
    try:
        return bool(raw.choices[0].message.tool_calls)
    except Exception:
        return False


def coordinator_turn(messages: list, tool_definitions: list):
    return wrap_llm_line(
        llm_call=lambda p: openai_client.chat.completions.create(
            model=p["model"],
            messages=p["messages"],
            tools=p["tools"],
            tool_choice="auto",
            temperature=0,
        ),
        prompt={"model": CFG.model, "messages": messages, "tools": tool_definitions},
        prompt_json_pointer="/messages/1/content",
        passthrough_when=has_tool_calls,
        company_name=CFG.company,
        project_name=CFG.project,
        agent_name="ex8_tool_coordinator",
        Run_Id=RUN_ID,
        Workflow_name=WORKFLOW_NAME,
        workflow_group_name=WORKFLOW_GROUP_NAME,
        agent_parent=AGENT_PARENT,
        Workflow_parent_name=WORKFLOW_PARENT_NAME,
        metadata=METADATA,
        secret_key=CFG.key,
        max_tries=1,
    )


def run_until_final_answer(user_text: str) -> str:
    messages = [
        {"role": "system", "content": "Use the tools when needed, then answer."},
        {"role": "user", "content": user_text},
    ]
    for _ in range(8):
        out = coordinator_turn(messages, TOOL_DEFINITIONS_FOR_API)
        msg = out.choices[0].message
        if not msg.tool_calls:
            return openai_sdk_result_text(out)
        messages.append(msg.model_dump(exclude_none=True))
        for tc in msg.tool_calls:
            args = json.loads(tc.function.arguments or "{}")
            topic = str(args.get("topic", ""))
            if tc.function.name == "get_operational_fact":
                fact = openai_sdk_result_text(
                    weather_tool_llm(f"One operational fact about {topic}")
                )
            elif tc.function.name == "get_risk_fact":
                fact = openai_sdk_result_text(
                    risk_tool_llm(f"One risk fact about {topic}")
                )
            else:
                fact = "unknown tool"
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": json.dumps({"fact": fact}),
                }
            )
    raise RuntimeError("Tool loop limit exceeded")
```

## Console output

Each wrapped call may emit **one line to stdout** when ingest succeeds or when ingest / logs reporting fails. The line starts with **`[Arbis-Wrapper]`**, followed by a **single JSON object** (no pretty-printing). If **stdout is a TTY**, that line is wrapped in **green** (success) or **red** (failure) ANSI sequences; if stdout is not a TTY (pipes, CI, log capture), the same JSON is printed **without** color codes.

Successful POSTs to **`/api/logs`** do not print anything to the console.

### Success (after ingest HTTP 2xx)

Emitted when encrypted **`POST …/api/ingest`** returns a success status code (after the model output was parsed successfully).

Expected shape (values are examples):

```text
[Arbis-Wrapper] {"message": "Query processed successfully; ingest accepted.", "api_status": 200, "path": "/api/ingest", "query": "<original logical prompt passed to the wrapper>"}
```

- **`message`**: Fixed success text.
- **`api_status`**: HTTP status from the ingest response (typically `200`).
- **`path`**: `"/api/ingest"` for the normal ingest path.
- **`query`**: The wrapper’s original prompt string used for correlation (the same logical input ingest uses as `prompt`, not the augmented “wrapped” block sent to the model).

If the pipeline used the **fallback** ingest instead (`POST …/ingest/fallback` after parse retries were exhausted), a success line looks the same except **`path`** is `"/ingest/fallback"` and **`query`** is still the logical prompt:

```text
[Arbis-Wrapper] {"message": "Query processed successfully; ingest accepted.", "api_status": 200, "path": "/ingest/fallback", "query": "<original logical prompt>"}
```

### Failure

Emitted when ingest or fallback ingest fails, or when **`POST …/api/logs`** fails after all retries. Same `[Arbis-Wrapper]` prefix and single JSON object; **red** when stdout is a TTY.

**Main ingest failed** (model output parsed, but `/api/ingest` did not return 2xx, or the request raised before a response):

```text
[Arbis-Wrapper] {"message": "Ingest request failed.", "http_status": 401, "error": "<API error body or short reason>"}
```

**Fallback ingest failed** (`/ingest/fallback`):

```text
[Arbis-Wrapper] {"message": "Ingest fallback request failed.", "http_status": 503, "error": "<API error body or short reason>"}
```

**Wrapper logs failed** (all attempts to `/api/logs` failed):

```text
[Arbis-Wrapper] {"message": "Failed to record wrapper logs to the server.", "http_status": 500, "error": "<API error body or exception text>"}
```

When there is **no HTTP response** (timeouts, connection errors, etc.), **`http_status`** is JSON **`null`**:

```text
[Arbis-Wrapper] {"message": "Ingest request failed.", "http_status": null, "error": "<exception message>"}
```

If the library has no usable error text, **`error`** may be **`"(no detailed error available)"`**.
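
A minimal sketch of consuming a captured line in your own log processing (the prefix and JSON keys follow the description above; the sample line is illustrative):

```python
import json

PREFIX = "[Arbis-Wrapper] "

def parse_wrapper_line(line):
    # Captured non-TTY output has no ANSI color codes, so a prefix check suffices.
    if not line.startswith(PREFIX):
        return None
    return json.loads(line[len(PREFIX):])

event = parse_wrapper_line(
    '[Arbis-Wrapper] {"message": "Ingest request failed.", "http_status": null, "error": "timeout"}'
)
if event is not None and "api_status" not in event:
    # Failure lines carry "http_status" (null when no HTTP response was received);
    # success lines carry "api_status", "path", and "query" instead.
    print("wrapper problem:", event["message"], event.get("http_status"))
```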

## Notes

- Keep credentials in environment variables (`.env`) and never hardcode production keys.
- Use distinct `agent_name` values per workflow for clean tracking.
- Pass `Run_Id`, workflow fields, and `metadata` when you need ingest to group steps or attach user context (`user_id`, `role`, and so on); omit any argument you do not use.
- For custom return shapes, pair `response_extractor` with `response_answer_json_pointer` or `return_merger` when needed.
- Exhaustive runnable variants (including redundant naming patterns) are in `tests/testing_different_interfaces.py`. For a linear workflow-group chain demo, see `tests/test_linear_workflow_group_chain.py`.

## License

MIT. See [LICENSE](LICENSE).
