Metadata-Version: 2.4
Name: fusion-ai-sdk
Version: 1.0.2
Summary: Unified AI SDK for making LLM calls to the OpenAI, Google Gemini, and Ollama APIs
Project-URL: Homepage, https://github.com/shreyasbgr/fusion-ai-sdk
Project-URL: Repository, https://github.com/shreyasbgr/fusion-ai-sdk.git
Author-email: Shreyas Banagar <shreyasbanagar7@gmail.com>, Ayishwarya Swami <swamiayishwarya@gmail.com>
License: MIT
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.12
Requires-Dist: google-genai>=1.57.0
Requires-Dist: ollama>=0.6.1
Requires-Dist: openai>=2.15.0
Requires-Dist: pydantic>=2.12.5
Requires-Dist: python-dotenv>=1.2.1
Description-Content-Type: text/markdown

# Fusion AI SDK

A unified Python SDK to interact with **OpenAI**, **Google Gemini**, and **Ollama** through a single, consistent interface.

[![PyPI version](https://badge.fury.io/py/fusion-ai-sdk.svg)](https://badge.fury.io/py/fusion-ai-sdk)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)

---

## Features

- 🔀 **Single interface** for OpenAI, Google Gemini, and Ollama
- 🧠 **Reasoning / Thinking support** for o-series, Gemini 2.5, DeepSeek, and QwQ models
- 📐 **Structured outputs** via Pydantic models (all providers)
- 🛠️ **Tool / Function calling** with built-in parallel `ToolExecutor`
- 🖼️ **Multimodal inputs** — pass images via URL, local path, or base64
- 📊 **Standardized response** including token usage, latency, and finish reason

---

## Installation

```bash
pip install fusion-ai-sdk
```

or with `uv`:

```bash
uv add fusion-ai-sdk
```

---

## Quick Start

The same code works for any provider — just swap `provider` and `model_name`:

```python
from fusion_ai_sdk.llm_sdk import LLMInvoker
from dotenv import load_dotenv

load_dotenv()

invoker = LLMInvoker()

llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",              # "openai" | "gemini" | "ollama"
    model_name="gpt-4o-mini",       # Any model for the chosen provider
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)

response = invoker.chat_completion(llm_input)
print(response.response)  # "Paris"
```

**Provider / model examples:**

| Provider | `provider` value | Example `model_name` |
|----------|-----------------|----------------------|
| OpenAI   | `"openai"`      | `"gpt-4o"`, `"gpt-4o-mini"`, `"o4-mini"` |
| Gemini   | `"gemini"`      | `"gemini-2.5-flash"`, `"gemini-2.0-flash"` |
| Ollama   | `"ollama"`      | `"llama3.2"`, `"deepseek-r1:7b"`, `"qwen2.5"` |

---

## Providers & API Keys

Set your API keys via environment variables (or a `.env` file with `python-dotenv`):

| Provider | Environment Variable | Notes |
|----------|---------------------|-------|
| OpenAI   | `OPENAI_API_KEY`    | Standard OpenAI or any OpenAI-compatible endpoint |
| Gemini   | `GEMINI_API_KEY`    | Google Gemini via `google-genai` |
| Ollama   | *(none)*            | Requires `host_url` pointing to your Ollama server |
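
For OpenAI and Gemini, a minimal sketch of picking keys up from a `.env` file, or overriding the environment variable for a single call via `api_key` (the variable name `MY_CUSTOM_OPENAI_KEY` below is purely illustrative):

```python
import os

from dotenv import load_dotenv
from fusion_ai_sdk.llm_sdk import LLMInvoker

# Load OPENAI_API_KEY / GEMINI_API_KEY from a local .env file, if present.
load_dotenv()

# Or override the environment variable for this call only.
llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",
    model_name="gpt-4o-mini",
    api_key=os.getenv("MY_CUSTOM_OPENAI_KEY"),  # illustrative variable name
    messages=[{"role": "user", "content": "Hello!"}],
)
```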

> **Ollama only**: You must pass `host_url` in every request:
> ```python
> LLMInvoker.ChatCompletionInput(
>     provider="ollama",
>     host_url="http://localhost:11434",
>     model_name="llama3.2",
>     ...
> )
> ```

---

## `ChatCompletionInput` — Full Parameter Reference

```python
LLMInvoker.ChatCompletionInput(
    provider="openai",              # Required: "openai" | "gemini" | "ollama"
    model_name="gpt-4o",           # Required: Model identifier string
    messages=[...],                 # Required: List of message dicts (see Messages Format below)
    api_key=None,                   # Optional: Overrides env var API key for this call
    host_url=None,                  # Optional: Ollama host URL or custom OpenAI-compatible base URL
    response_schema=None,           # Optional: Pydantic BaseModel class for structured output
    temperature=None,               # Optional: Float 0.0–2.0
    max_output_tokens=None,         # Optional: Maximum tokens to generate
    reasoning_effort=None,          # Optional: "none" | "low" | "medium" | "high" | "dynamic"
    reasoning_summary=None,         # Optional: "none" | "concise" | "detailed" | "auto"
    gemini_thinking_budget=None,    # Optional: Int 0–24576 — Gemini 2.5 only
    tools=None,                     # Optional: List of tool dicts in OpenAI format
    tool_choice=None,               # Optional: "auto" | "none" | "required" | {"type": "function", "function": {"name": "..."}}
)
```

**Provider-specific parameter notes:**

| Parameter | OpenAI | Gemini | Ollama |
|-----------|--------|--------|--------|
| `temperature` | ✅ (not on o1/o3/o4/gpt-5) | ✅ | ✅ |
| `reasoning_effort` | ✅ (o1/o3/o4/gpt-5 only) | ✅ (gemini-2.5+) | ✅ (deepseek-r1, qwen2.5, qwq, llama3.1 70B+) |
| `reasoning_summary` | ❌ not supported | ✅ | ✅ (returns thinking block) |
| `gemini_thinking_budget` | ❌ | ✅ gemini-2.5 only | ❌ |
| `host_url` | Optional (custom endpoint) | ❌ | Required |

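To make the provider-specific notes above concrete, here is a brief sketch (model names and prompts are illustrative) of a Gemini request using the Gemini-only `reasoning_summary`, and an Ollama request, which always needs `host_url`:

```python
from fusion_ai_sdk.llm_sdk import LLMInvoker

# Gemini: `reasoning_summary` (and `gemini_thinking_budget`) are Gemini-only.
gemini_input = LLMInvoker.ChatCompletionInput(
    provider="gemini",
    model_name="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Summarize the theory of relativity."}],
    reasoning_effort="medium",
    reasoning_summary="concise",
)

# Ollama: `host_url` is required; reasoning works with models such as deepseek-r1.
ollama_input = LLMInvoker.ChatCompletionInput(
    provider="ollama",
    model_name="deepseek-r1:7b",
    host_url="http://localhost:11434",
    messages=[{"role": "user", "content": "Summarize the theory of relativity."}],
    reasoning_effort="high",
)
```
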
---

## Messages Format

Messages follow the **OpenAI message format** and are compatible across all providers.

### Text messages

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there! How can I help?"},
]
```

Valid roles: `"system"`, `"user"`, `"assistant"`, `"tool"`

> **OpenAI only**: The `"developer"` role is also accepted (treated the same as `"system"`).

### Multimodal messages (text + image)

Images can be passed as a **local file path**, a **public URL**, or a **base64 data URI**. This format works across all providers.

```python
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    # Option 1: Local file path
                    "url": "/path/to/image.png"

                    # Option 2: Public URL
                    # "url": "https://example.com/image.jpg"

                    # Option 3: Base64 data URI
                    # "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
                }
            },
        ],
    }
]
```

> **Ollama**: Use a vision-capable model such as `llama3.2-vision` or `llava`. Remote URLs are not supported for Ollama — use local paths or base64 instead.
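
For example, a minimal sketch of an Ollama vision request using a local file path (model name, host, and path are placeholders):

```python
from fusion_ai_sdk.llm_sdk import LLMInvoker

# Ollama requires host_url and a vision-capable model for image inputs.
llm_input = LLMInvoker.ChatCompletionInput(
    provider="ollama",
    model_name="llama3.2-vision",
    host_url="http://localhost:11434",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "/path/to/image.png"}},
            ],
        }
    ],
)
```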

---

## Response Format

Every call returns a `ChatCompletionResponse` object, consistent across all providers.

```python
response = invoker.chat_completion(llm_input)

response.response          # str | Pydantic model | None — main text/parsed output
response.success           # bool — True if call succeeded
response.error_message     # str | None — populated on failure
response.reasoning_summary # str | None — thinking/reasoning trace (if requested)
response.tool_calls        # list | None — tool calls requested by the model
response.finish_reason     # str | None — e.g. "stop", "length", "tool_calls"

response.usage.input_tokens      # int | None
response.usage.output_tokens     # int | None
response.usage.reasoning_tokens  # int | None
response.usage.cached_tokens     # int | None
response.usage.total_tokens      # int | None

response.timing.start_time   # ISO timestamp str
response.timing.end_time     # ISO timestamp str
response.timing.latency_ms   # float
```

---

## Structured Outputs (Pydantic)

Pass a Pydantic `BaseModel` as `response_schema` to get a fully typed, validated Python object back — works across all providers.

```python
from pydantic import BaseModel
from fusion_ai_sdk.llm_sdk import LLMInvoker

class MovieReview(BaseModel):
    movie_name: str
    rating_out_of_10: int
    pros: list[str]
    cons: list[str]
    one_sentence_summary: str

invoker = LLMInvoker()

llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",              # Works the same for "gemini" or "ollama"
    model_name="gpt-4o-mini",
    messages=[{"role": "user", "content": "Review the movie Inception"}],
    response_schema=MovieReview,
)

response = invoker.chat_completion(llm_input)

review: MovieReview = response.response  # Fully typed Pydantic object
print(review.movie_name)         # "Inception"
print(review.rating_out_of_10)   # 9
print(review.pros)               # ["mind-bending plot", ...]
```

---

## Tool / Function Calling

### Step 1 — Define tools in OpenAI format

Tools are plain Python dictionaries. This format is compatible across all providers.

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    # Required parameter — model MUST provide this
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. 'Paris'",
                    },
                    # Optional parameter — model may omit this
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit. Defaults to celsius if not specified.",
                    },
                },
                "required": ["location"],  # Only list truly required parameters here
            },
        },
    }
]
```

> **OpenAI only**: You can add `"strict": True` to the function definition to enforce exact schema adherence. When enabled, **every key in `properties` must also appear in `required`** — there are no optional parameters. Omit `strict` if you have optional parameters.
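
For reference, a sketch of the same tool rewritten for strict mode (OpenAI only). Note that `unit` now also appears in `required`; the `additionalProperties: false` entry is an assumption based on OpenAI's strict-schema rules rather than something documented here:

```python
tools_strict = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "strict": True,  # OpenAI only: enforce exact schema adherence
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name, e.g. 'Paris'"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                # With strict mode, every key in properties must also be listed here.
                "required": ["location", "unit"],
                "additionalProperties": False,
            },
        },
    }
]
```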

> **Ollama**: Tool calling is only supported by specific models: `llama3.1`, `llama3.2`, `mistral-nemo`, `qwen2.5`, `command-r`, `command-r-plus`, `firefunction-v2`. Other models will ignore tools and generate a plain text response.

### Step 2 — Map your actual Python functions

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Your real implementation here
    return {"temperature": 22, "condition": "Sunny", "unit": unit}

tools_map = {
    "get_weather": get_weather,
}
```

### Step 3 — Run with automatic tool execution loop

`chat_completion_with_tools` automatically calls your functions and feeds results back to the model until it produces a final text answer. Works across all providers.

```python
invoker = LLMInvoker()
executor = LLMInvoker.ToolExecutor(tools_map)

llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",              # Works the same for "gemini" or "ollama"
    model_name="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather like in Paris?"}],
    tools=tools,
    tool_choice="auto",             # "auto" | "none" | "required"
)

response = invoker.chat_completion_with_tools(
    input_data=llm_input,
    tool_executor=executor,
    max_iterations=5,               # Safety cap on tool→model loops
)

print(response.response)
```

### Manual tool call handling

If you need full control, use `chat_completion` directly:

```python
response = invoker.chat_completion(llm_input)

if response.tool_calls:
    for tc in response.tool_calls:
        print(tc["function"]["name"])       # e.g. "get_weather"
        print(tc["function"]["arguments"])  # JSON string of args
        print(tc["id"])                     # tool_call_id to include in the tool reply message
```
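
From there you can execute the tools yourself and send the results back. A minimal sketch of closing the loop, assuming `ChatCompletionInput.messages` can be extended in place and that OpenAI-style assistant/tool messages (as described in Messages Format above) are accepted:

```python
import json

if response.tool_calls:
    # Echo the assistant turn that requested the tool calls.
    llm_input.messages.append(
        {"role": "assistant", "content": None, "tool_calls": response.tool_calls}
    )
    for tc in response.tool_calls:
        # Run the matching Python function with the model-provided arguments.
        args = json.loads(tc["function"]["arguments"])
        result = tools_map[tc["function"]["name"]](**args)
        # Reply with a tool message keyed by the original tool_call_id.
        llm_input.messages.append(
            {
                "role": "tool",
                "tool_call_id": tc["id"],
                "content": json.dumps(result),
            }
        )
    # Ask the model for its final answer now that tool results are available.
    final = invoker.chat_completion(llm_input)
    print(final.response)
```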

---

## Reasoning & Thinking

All three providers support reasoning via the same `reasoning_effort` parameter. The SDK maps this to each provider's native API internally.

```python
llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",              # Works the same for "gemini" or "ollama"
    model_name="o4-mini",
    messages=[{"role": "user", "content": "Solve step by step: 2x + 5 = 13"}],
    reasoning_effort="high",        # "low" | "medium" | "high" | "dynamic"
)

response = invoker.chat_completion(llm_input)
print(response.response)            # Final answer
print(response.reasoning_summary)   # Thinking trace (if supported and requested)
```

**Reasoning support by provider:**

| Provider | Supported models | `reasoning_summary` |
|----------|-----------------|---------------------|
| OpenAI   | `o1`, `o3`, `o4-mini`, `gpt-5` | ❌ not exposed by API |
| Gemini   | `gemini-2.5-flash`, `gemini-2.5-pro` | ✅ `"concise"` / `"detailed"` / `"auto"` |
| Ollama   | `deepseek-r1`, `qwen2.5`, `qwq`, `llama3.1` (70B+) | ✅ returns thinking block |

> **Gemini only**: You can set an explicit token budget instead of an effort level using `gemini_thinking_budget` (0–24576). This takes precedence over `reasoning_effort`.
> ```python
> gemini_thinking_budget=8192
> ```

> **OpenAI only**: `temperature` is not supported for reasoning models (o1, o3, o4, gpt-5). The parameter is silently ignored if passed.

---

## Custom OpenAI-Compatible Endpoint

Point the SDK at any server that speaks the OpenAI API (e.g. LM Studio, vLLM, Together AI) using `host_url`:

```python
llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",
    model_name="meta-llama/Llama-3-8b-chat-hf",
    api_key="your-provider-api-key",
    host_url="https://api.together.xyz/v1",  # Replace with your endpoint
    messages=[{"role": "user", "content": "Hello!"}],
)
```

---

## Checking Model Info

```python
invoker = LLMInvoker()

# OpenAI — uses a built-in static mapping
info = invoker.get_model_info(provider="openai", model_name="gpt-4o")

# Gemini — fetches live from the Gemini API
info = invoker.get_model_info(provider="gemini", model_name="gemini-2.5-flash")

# Ollama — fetches live from your local Ollama server
info = invoker.get_model_info(
    provider="ollama",
    model_name="llama3.2",
    host_url="http://localhost:11434",
)

print(info.context_window)    # e.g. 128000
print(info.max_output_tokens) # e.g. 16384
```

---

## Full Example — Tools + Structured Output

```python
import os
from pydantic import BaseModel
from fusion_ai_sdk.llm_sdk import LLMInvoker
from dotenv import load_dotenv

load_dotenv()

class WeatherReport(BaseModel):
    city: str
    temperature_celsius: float
    condition: str
    advice: str

def get_weather(location: str) -> dict:
    # Replace with a real weather API call
    return {"temperature": 22, "condition": "Sunny"}

invoker = LLMInvoker()
executor = LLMInvoker.ToolExecutor({"get_weather": get_weather})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"],
        },
    },
}]

llm_input = LLMInvoker.ChatCompletionInput(
    provider="openai",              # Swap to "gemini" or "ollama" as needed
    model_name="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a weather assistant."},
        {"role": "user", "content": "Give me a weather report for Tokyo."},
    ],
    tools=tools,
    response_schema=WeatherReport,
    temperature=0.3,
)

response = invoker.chat_completion_with_tools(llm_input, executor)

if response.success:
    report: WeatherReport = response.response
    print(f"City: {report.city}")
    print(f"Temp: {report.temperature_celsius}°C")
    print(f"Condition: {report.condition}")
    print(f"Advice: {report.advice}")
    print(f"\nLatency: {response.timing.latency_ms:.0f}ms")
    print(f"Tokens used: {response.usage.total_tokens}")
else:
    print(f"Error: {response.error_message}")
```

---

## License

MIT © [Shreyas Banagar](https://github.com/shreyasbgr), [Ayishwarya Swami](https://github.com/aishwarya123mathpati)
