Metadata-Version: 2.4
Name: societal-costs-cli-140326-2305
Version: 0.1.0
Classifier: Programming Language :: Python
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Environment :: Console
Summary: Unified orchestrator CLI for societal-costs prepare and results stages.
Keywords: societal-costs,epidemiology,pipeline,cli,rust
License: MIT OR Apache-2.0
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM

# societal_costs_cli

Single-crate study workspace for societal-costs orchestration, ingest, analysis, and reporting.
The crate keeps the CLI entrypoint and study logic under the same package (`src/`).

Separate study-agnostic pipeline utilities remain in shared workspace crates.

## Shared orchestration crate

- `orchestrator_core` (`../../../orchestrator_core`): study-agnostic orchestration primitives (manifest/status/resume/artifact contracts, stage adapters, typed stage failure contracts)

## Platform crates (`../../../generic_cohort_builder/crates/`)

- `cohort_builder_common`: shared errors, tracing helpers, and path utilities
- `cohort_builder_domain`: study-agnostic cohort/register-domain logic
- `cohort_builder_domain_scd`: SCD-specific domain contracts/classification logic
- `cohort_builder_ingest`: study-agnostic register IO modules
- `cohort_builder_io`: study-agnostic Arrow wide writer utilities

## Usage

```bash
cargo run --release -- prepare /path/to/pipeline.toml
cargo run --release -- results /path/to/pipeline.toml
cargo run --release -- study /path/to/pipeline.toml
cargo run --release -- config /path/to/pipeline.toml
# or use canonical default config:
cargo run --release -- study
```

The CLI enforces prepare steps and writes outputs to `report.output_dir`.

Useful flags:

```bash
# Dry-run stage plan only
cargo run --release -- study /path/to/pipeline.toml --print-plan

# Dry-run using canonical default config
cargo run --release -- study --print-plan

# Fail fast when existing manifests drift from current config/runtime identity
cargo run --release -- study /path/to/pipeline.toml --resume-policy strict

# Analysis always runs in-process in unified orchestrator mode.
cargo run --release -- study /path/to/pipeline.toml
```

Migration notes for legacy CLI naming/entrypoints:

- `cli_migration.md`

Optional config controls:

```toml
[orchestrator]
resume_prepare_stage = true
resume_analysis_stage = true
```

Prepare source mode:

```toml
[pipeline]
# build (default): run raw-register prepare build
# files: reuse existing core handoff artifacts in report.output_dir
cohort_source = "build"
```

Config template:

- `pipeline.template.toml`
- `pipeline.default.toml`

No-config resolution order for `study`:

1. canonical checked-in config (`config/pipeline.default.toml`)
2. generated fallback preset (only when canonical default is unavailable)

Optional override:

- `SOCIETAL_COSTS_RUN_ALL_CONFIG` for explicit default config path.
- Shared runtime env override reference:
  - `/Users/tobiaskragholm/dev/phd-rust-projects/crates/studies/shared/study_cli_common/README.md`

## Effective Config + Path Resolution

Use `config` to inspect the final resolved config after runtime env overrides and path normalization:

```bash
cargo run --release -- config /path/to/pipeline.toml --format json
cargo run --release -- config /path/to/pipeline.toml --format toml --output /tmp/effective.toml
```

Runtime now emits concise config-resolution lines for prepare/results:

```text
[config] prepare config=... strategy=base_paths(2) catalog_extra=1 report_output_dir=... analysis_output_dir=... overrides=SOCIETAL_COSTS_ANALYSIS_OUTPUT_DIR
```

Field meanings:

- `strategy`: whether path resolution uses `base_paths`, `base_path`, or no base root
- `catalog_extra`: number of additional catalog files from `SOCIETAL_COSTS_CATALOG_FILES`
- `overrides`: active runtime override env vars (`SOCIETAL_COSTS_*` / `STUDY_*`)

## Checks

Study workspace:

```bash
cargo check --workspace
cargo test --workspace
scripts/check_workspace_boundaries.sh
```

Platform workspace:

```bash
cd ../../../generic_cohort_builder
cargo check --workspace
cargo test --workspace
```

## Output contract

Report outputs are Arrow/CSV/Markdown; JSON report writers are disabled (except `run_manifest.json` for resume state).

Stage manifests:

- `prepare_manifest.json`
- `analysis_manifest.json`

Both manifests are written at stage start (`status=running`) and finalized to `completed` or `failed`.
Resume requires strict manifest compatibility (schema/version + config digest).
Manifests also include runtime diagnostics (`execution.*`, `host.*`) for remote troubleshooting.

Schema compatibility/migration policy:

- `manifest_schema_policy.md`

## Crate Roles

- `societal_costs_cli`:
  study-specific workflow orchestrator and implementation crate (prepare/results/study, manifests, resume, logging, policy/protocol modules).
- `orchestrator_core`:
  study-agnostic orchestration crate shared across studies.

## LPR definitions

Contact vs episode semantics and stitching/classification rules:

- `lpr_contacts_and_episodes.md`

