Metadata-Version: 2.4
Name: dartlab
Version: 0.6.0
Summary: 공시 문서에서 하나의 회사 맵을 만든다 — DART + EDGAR
Project-URL: Homepage, https://github.com/eddmpython/dartlab
Project-URL: Repository, https://github.com/eddmpython/dartlab
Project-URL: Issues, https://github.com/eddmpython/dartlab/issues
Author: eddmpython
License: MIT License
        
        Copyright (c) 2026 eddmpython
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Keywords: accounting,dart,disclosure,edgar,financial-statements,korea,polars,sec,sections
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: alive-progress>=3.3.0
Requires-Dist: beautifulsoup4>=4.14.3
Requires-Dist: lxml>=6.0.2
Requires-Dist: marimo>=0.20.4
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: polars>=1.0.0
Requires-Dist: requests>=2.32.5
Requires-Dist: rich>=14.3.3
Provides-Extra: ai
Requires-Dist: fastapi>=0.135.1; extra == 'ai'
Requires-Dist: httpx>=0.28.1; extra == 'ai'
Requires-Dist: sse-starlette>=2.0.0; extra == 'ai'
Requires-Dist: uvicorn[standard]>=0.30.0; extra == 'ai'
Provides-Extra: all
Requires-Dist: anthropic>=0.30.0; extra == 'all'
Requires-Dist: fastapi>=0.135.1; extra == 'all'
Requires-Dist: httpx>=0.28.1; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: plotly>=5.0.0; extra == 'all'
Requires-Dist: sse-starlette>=2.0.0; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.30.0; extra == 'all'
Provides-Extra: charts
Requires-Dist: plotly>=5.0.0; extra == 'charts'
Provides-Extra: llm
Requires-Dist: openai>=1.0.0; extra == 'llm'
Provides-Extra: llm-anthropic
Requires-Dist: anthropic>=0.30.0; extra == 'llm-anthropic'
Requires-Dist: openai>=1.0.0; extra == 'llm-anthropic'
Provides-Extra: ui
Requires-Dist: fastapi>=0.135.1; extra == 'ui'
Requires-Dist: httpx>=0.28.1; extra == 'ui'
Requires-Dist: sse-starlette>=2.0.0; extra == 'ui'
Requires-Dist: uvicorn[standard]>=0.30.0; extra == 'ui'
Description-Content-Type: text/markdown

<div align="center">

<br>

<img alt="DartLab" src=".github/assets/logo.png" width="180">

<h3>DartLab</h3>

<p><b>One company map from disclosure filings — DART + EDGAR</b></p>

<p>
<a href="https://pypi.org/project/dartlab/"><img src="https://img.shields.io/pypi/v/dartlab?style=for-the-badge&color=ea4647&labelColor=050811&logo=pypi&logoColor=white" alt="PyPI"></a>
<a href="https://pypi.org/project/dartlab/"><img src="https://img.shields.io/pypi/pyversions/dartlab?style=for-the-badge&color=c83232&labelColor=050811&logo=python&logoColor=white" alt="Python"></a>
<a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-94a3b8?style=for-the-badge&labelColor=050811" alt="License"></a>
<a href="https://github.com/eddmpython/dartlab/actions/workflows/ci.yml"><img src="https://img.shields.io/github/actions/workflow/status/eddmpython/dartlab/ci.yml?branch=master&style=for-the-badge&labelColor=050811&logo=github&logoColor=white&label=CI" alt="CI"></a>
</p>

<p>
<a href="https://eddmpython.github.io/dartlab/">Docs</a> · <a href="README_KR.md">한국어</a> · <a href="https://buymeacoffee.com/eddmpython">Sponsor</a>
</p>

<p>
<a href="https://github.com/eddmpython/dartlab/releases/tag/data-docs"><img src="https://img.shields.io/badge/Docs-260%2B_Companies-f87171?style=for-the-badge&labelColor=050811&logo=databricks&logoColor=white" alt="Docs Data"></a>
<a href="https://github.com/eddmpython/dartlab/releases/tag/data-finance-1"><img src="https://img.shields.io/badge/Finance-2,700%2B_Companies-818cf8?style=for-the-badge&labelColor=050811&logo=databricks&logoColor=white" alt="Finance Data"></a>
<a href="https://github.com/eddmpython/dartlab/releases/tag/data-report-1"><img src="https://img.shields.io/badge/Report-2,700%2B_Companies-34d399?style=for-the-badge&labelColor=050811&logo=databricks&logoColor=white" alt="Report Data"></a>
</p>

</div>

## What DartLab Is

DartLab turns corporate filings into a single company map — for both Korean DART and US EDGAR.

The center of that map is `sections`: a horizontalized matrix built from disclosure sections across periods. Instead of treating a filing as a pile of unrelated parsers, DartLab aligns the document structure first, then lets stronger sources fill in what they own:

- **`docs`** — section structure, narrative text with heading/body separation, tables, and evidence
- **`finance`** — authoritative numeric statements (BS, IS, CF) and financial ratios
- **`report`** — authoritative structured disclosure APIs (DART only)

```python
import dartlab

c = dartlab.Company("005930")   # Samsung Electronics (DART)
c.sections                      # full company map (topic × period)
c.topics                        # topic list with source, blocks, periods
c.show("companyOverview")       # open one topic
c.show("IS", period=["2024Q4", "2023Q4"])  # vertical view (period × item)
c.BS                            # balance sheet
c.ratios                        # ratio time series (항목 × period)
c.insights                      # 7-area grades (A~F)

us = dartlab.Company("AAPL")    # Apple (EDGAR)
us.sections
us.show("10-K::item1Business")
us.BS
us.ratios
```

## Install

```bash
uv add dartlab
```

AI interface:

```bash
uv add "dartlab[ai]"
uv run dartlab ai
```

## Quick Start

### Sections — The Company Map

`sections` is a Polars DataFrame where each row is a disclosure block and each period column holds the raw payload. Periods are sorted newest-first, and annual reports appear as Q4:

```
chapter │ topic            │ blockType │ textNodeType │ 2025Q4 │ 2024Q4 │ 2024Q3 │ …
I       │ companyOverview  │ text      │ heading      │ "…"    │ "…"    │ "…"    │
I       │ companyOverview  │ text      │ body         │ "…"    │ "…"    │ "…"    │
I       │ companyOverview  │ table     │ null         │ "…"    │ "…"    │ null   │
II      │ businessOverview │ text      │ heading      │ "…"    │ "…"    │ "…"    │
III     │ BS               │ table     │ null         │ —      │ —      │ —      │ (finance)
VII     │ dividend         │ table     │ null         │ —      │ —      │ —      │ (report)
```

Text blocks carry structural metadata — `textNodeType` (heading/body), `textLevel`, and `textPath` — so you can distinguish section headers from narrative content.

### Show, Trace, Diff

```python
c = dartlab.Company("005930")

# show — open any topic with source-aware priority
c.show("BS")                # → finance DataFrame
c.show("companyOverview")   # → sections-based text + tables
c.show("dividend")          # → report DataFrame (all quarters)

# vertical view — compare specific periods side by side
c.show("IS", period=["2024Q4", "2023Q4"])  # period × item

# trace — why a topic came from docs, finance, or report
c.trace("BS")               # → {"primarySource": "finance", ...}

# diff — text change detection (3 modes)
c.diff()                                    # full summary
c.diff("businessOverview")                  # topic history
c.diff("businessOverview", "2024", "2025")  # line-by-line diff
```

### Finance

```python
c.BS                    # balance sheet (account × period, newest first)
c.IS                    # income statement
c.CF                    # cash flow
c.ratios                # ratio time series DataFrame (6 categories × period)
c.finance.ratios        # latest single-point RatioResult
c.finance.ratioSeries   # ratio time series across years
c.finance.timeseries    # raw account time series
```

Financial ratios cover 6 categories: profitability, stability, growth, efficiency, cashflow, and valuation.

### Insights

```python
c.insights                      # 7-area analysis
c.insights.grades()             # → {"performance": "A", "profitability": "B", …}
c.insights.performance.grade    # → "A"
c.insights.performance.details  # → ["Revenue growth +8.3%", …]
c.insights.anomalies            # → outliers and red flags
```

7 analysis areas: performance, profitability, health, cashflow, governance, risk, opportunity.

### EDGAR (US)

Same `Company` interface, different data source:

```python
us = dartlab.Company("AAPL")

us.sections                         # 10-K/10-Q sections with heading/body
us.show("10-K::item1Business")      # business description
us.show("10-K::item1ARiskFactors")  # risk factors
us.BS                               # SEC XBRL balance sheet
us.ratios                           # same 47 ratios
us.diff("10-K::item7Mdna")          # MD&A text changes
```

EDGAR sections include the same text structure metadata (heading/body separation, textLevel, textPath) as DART.

## OpenAPI — Raw Public APIs

Use source-native wrappers when you want raw disclosure APIs directly.

### OpenDart (Korea)

```python
from dartlab import OpenDart

d = OpenDart()                                  # auto-detect API key
d = OpenDart(["key1", "key2"])                  # multi-key rotation

d.search("카카오", listed=True)                  # company search
d.filings("삼성전자", "2024")                    # filing list
d.company("삼성전자")                            # corporate profile
d.finstate("삼성전자", 2024)                     # financial statements
d.report("삼성전자", "배당", 2024)                # 56 report categories

# convenience proxy
s = d("삼성전자")
s.finance(2024)
s.report("배당", 2024)
s.filings("2024")
```

### OpenEdgar (US)

```python
from dartlab import OpenEdgar

e = OpenEdgar()

e.search("Apple")                               # ticker search
e.company("AAPL")                               # company info
e.filings("AAPL", forms=["10-K", "10-Q"])       # filing list
e.companyFactsJson("AAPL")                      # XBRL facts
e.companyConceptJson("AAPL", "us-gaap", "Revenue")  # single tag series
```

These wrappers keep the original source surface intact, while saved parquet stays compatible with DartLab's `Company` engine.

## Core Ideas

### 1. Sections First

`sections` is the backbone. A company is described as one horizontalized map of disclosure units across periods — not a loose set of parser outputs.

### 2. Source-Aware Company

`Company` is a merged company object. When `finance` or `report` is more authoritative than docs for a given topic, it overrides automatically. `trace()` tells you which source was chosen and why.

### 3. Text Structure

Narrative text is not a flat string. DartLab splits it into heading/body rows with level and path metadata, enabling structural comparison across periods. This works for both Korean DART and English EDGAR filings.

### 4. Raw Access

You can always go deeper:

```python
c.docs.sections          # pure docs horizontalization
c.finance.BS             # finance engine directly
c.report.extract("배당")  # report engine directly
```

## Stability

| Tier | Scope |
|------|-------|
| **Stable** | DART Company (sections, show, trace, diff, BS/IS/CF, ratios, insights) |
| **Beta** | EDGAR Company, OpenDart, OpenEdgar, Server API |
| **Experimental** | AI tools, export |

See [docs/stability.md](docs/stability.md).

## Documentation

- Docs: https://eddmpython.github.io/dartlab/
- Sections guide: https://eddmpython.github.io/dartlab/docs/getting-started/sections
- Quick start: https://eddmpython.github.io/dartlab/docs/getting-started/quickstart
- API overview: https://eddmpython.github.io/dartlab/docs/api/overview
- Blog: https://eddmpython.github.io/dartlab/blog/

## Data

DartLab ships with pre-built datasets via GitHub Releases:

| Dataset | Coverage | Source |
|---------|----------|--------|
| DART docs | 260+ companies | Korean disclosure text + tables |
| DART finance | 2,700+ companies | XBRL financial statements |
| DART report | 2,700+ companies | Structured disclosure APIs |
| EDGAR docs | 970+ companies | 10-K/10-Q sections |
| EDGAR finance | 970+ companies | SEC XBRL facts |

## Contributing

The project prefers experiments before engine changes. If you want to propose a parser or mapping change, validate it first and then bring the result back into the engine.

## License

MIT
