Metadata-Version: 2.4
Name: invoicedataextraction-sdk
Version: 0.1.0
Summary: Official Python SDK for Invoice Data Extraction.
Author-email: Invoice Data Extraction <developers@invoicedataextraction.com>
License-Expression: MIT
Project-URL: Homepage, https://invoicedataextraction.com
Project-URL: Documentation, https://invoicedataextraction.com/docs
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.28.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Dynamic: license-file

# invoicedataextraction-sdk

Official Python SDK for [Invoice Data Extraction](https://invoicedataextraction.com). Handles file upload, extraction submission, polling, and result download so you can go from local files to structured output in a few lines of code.

Requires Python 3.9 or later.

## Install

```bash
pip install invoicedataextraction-sdk
```

## Quick Start

```python
import os
from invoicedataextraction import InvoiceDataExtraction

client = InvoiceDataExtraction(
    api_key=os.environ["INVOICE_DATA_EXTRACTION_API_KEY"],
)

result = client.extract(
    folder_path="./invoices",
    prompt="Extract invoice number and total",
    output_structure="per_invoice",
    download={
        "formats": ["xlsx", "json"],
        "output_path": "./output",
    },
    console_output=True,  # remove to disable console logging
)
```

`extract(...)` uploads your files (pass a `folder_path` or a list of `files`), submits the extraction, polls until it finishes, and downloads the results. The returned `result` is the final polling response from the API. Check `result["pages"]["failed_count"]` to verify that all uploaded pages were processed successfully.
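
The `failed_count` check mentioned above can be wrapped in a small helper. This is a minimal sketch that assumes only the `result["pages"]["failed_count"]` shape described here; `count_failed_pages` is a hypothetical name, not part of the SDK, and the example dict stands in for a real response.

```python
def count_failed_pages(result: dict) -> int:
    """Return the number of pages the API reported as failed."""
    # Assumes the shape described above: result["pages"]["failed_count"].
    return int(result["pages"]["failed_count"])


# Stand-in for the dict returned by client.extract(...); not real API output.
example_result = {"pages": {"failed_count": 2}}

failed = count_failed_pages(example_result)
if failed:
    print(f"{failed} page(s) failed to process")
```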

Generate an API key from your [dashboard](https://invoicedataextraction.com/dashboard?view=API). Every account includes 50 free pages per month. Additional credits can be purchased on a pay-as-you-go basis with no subscription needed.

## Staged Workflow

If you need control over individual steps — for example, uploading files in one part of your system and extracting in another — use the lower-level methods:

```python
upload = client.upload_files(
    files=["./invoice1.pdf", "./invoice2.pdf"],
)

submitted = client.submit_extraction(
    upload_session_id=upload["upload_session_id"],
    file_ids=upload["file_ids"],
    prompt="Extract invoice number and total",
    output_structure="per_invoice",
)

result = client.wait_for_extraction_to_finish(
    extraction_id=submitted["extraction_id"],
)

client.download_output(
    extraction_id=submitted["extraction_id"],
    format="xlsx",
    file_path="./output/invoices.xlsx",
)
```
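
If you need more than one output format from the staged workflow, one option is to call `download_output` once per format. The sketch below assumes that `download_output` accepts the same format names used in the Quick Start (`"xlsx"`, `"json"`) and that the format name works as a file extension; `download_formats` and its `stem` parameter are hypothetical helpers, not part of the SDK.

```python
from pathlib import Path


def download_formats(client, extraction_id, formats, output_dir, stem="invoices"):
    """Download one output file per format and return the local paths."""
    output_dir = Path(output_dir)
    # Create the target directory up front so each download has a place to land.
    output_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for fmt in formats:
        # Assumption: the format name doubles as the file extension.
        file_path = output_dir / f"{stem}.{fmt}"
        client.download_output(
            extraction_id=extraction_id,
            format=fmt,
            file_path=str(file_path),
        )
        paths.append(file_path)
    return paths
```

With the objects from the example above, this could be called as `download_formats(client, submitted["extraction_id"], ["xlsx", "json"], "./output")`.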

## Error Handling

SDK methods raise exceptions on failure. The structured error body is available as `error.body`:

```python
from invoicedataextraction import InvoiceDataExtraction
from invoicedataextraction.errors import SdkError, ApiResponseError

try:
    client.extract(...)
except (SdkError, ApiResponseError) as error:
    print(error.body["error"]["code"])       # e.g. "INVALID_INPUT"
    print(error.body["error"]["message"])    # Human-readable message
    print(error.body["error"]["retryable"])  # Whether the error is marked retryable
```
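
The `retryable` flag lends itself to a simple retry loop. The following is a sketch rather than SDK functionality: `call_with_retry` is a hypothetical helper, the exponential backoff schedule is an arbitrary choice, and it catches `Exception` only so that the example runs without the SDK installed. In real code you would likely catch `(SdkError, ApiResponseError)` instead.

```python
import time


def call_with_retry(operation, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation(), retrying only errors whose body marks them retryable."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Assumes the error carries the body shape shown above:
            # error.body["error"]["retryable"].
            body = getattr(error, "body", None) or {}
            retryable = bool((body.get("error") or {}).get("retryable"))
            if not retryable or attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Pass the SDK call as a zero-argument callable, for example a `lambda` wrapping `client.extract` with your usual arguments.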

When an extraction task itself fails (e.g. insufficient credits), `extract(...)` returns the failed response rather than raising. Check whether `result["status"]` is `"completed"` or `"failed"`.
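
Because a failed task does not raise, it can help to turn the status check into an explicit guard. This is a minimal sketch assuming only the `result["status"]` values named above; `ensure_completed` is a hypothetical helper, not part of the SDK.

```python
def ensure_completed(result: dict) -> dict:
    """Return result unchanged if the extraction completed, otherwise raise."""
    status = result.get("status")
    if status != "completed":
        # Covers "failed" as well as any unexpected or missing status.
        raise RuntimeError(f"Extraction did not complete (status={status!r})")
    return result
```

Wrapping the return value of `extract(...)` in this guard makes a failed task surface as an exception, like other SDK errors.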

## Documentation

- [Python SDK docs](https://invoicedataextraction.com/sdk/python) — full method reference, parameters, return shapes, and examples
- [REST API docs](https://invoicedataextraction.com/api) — endpoint-level documentation for direct HTTP integration
- [Dashboard](https://invoicedataextraction.com/dashboard?view=API) — manage API keys and view extraction results

## License

MIT
