Metadata-Version: 2.4
Name: khmer-keyboard
Version: 0.1.2
Summary: Khmer Keyboard Prediction Engine
Author: Mr. Nop Phearum
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: msgpack
Requires-Dist: python-Levenshtein
Dynamic: license-file

# Khmer Keyboard Prediction Engine

A Khmer keyboard prediction engine designed for offline and mobile use.

It supports:
- Next-word prediction (trigram with bigram/unigram fallback)
- Prefix suggestion
- Typo correction
- Smart auto mode (`smart`) that chooses the response type based on the user's input
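
The trigram-with-fallback idea in the first bullet can be illustrated with plain count tables. The following is only a minimal sketch under stated assumptions, not the package's implementation: `build_counts` and `predict_next_sketch` are hypothetical names, the corpus is toy data, and the real engine may use smoothing or weighted scores instead of raw counts.

```python
from collections import Counter, defaultdict


def build_counts(sentences):
    """Count unigrams, bigrams and trigrams from pre-tokenized sentences."""
    uni = Counter()
    bi = defaultdict(Counter)
    tri = defaultdict(Counter)
    for words in sentences:
        uni.update(words)
        for a, b in zip(words, words[1:]):
            bi[a][b] += 1
        for a, b, c in zip(words, words[1:], words[2:]):
            tri[(a, b)][c] += 1
    return uni, bi, tri


def predict_next_sketch(uni, bi, tri, w1, w2, top_n=5):
    """Take candidates from trigram counts first, then bigram, then unigram."""
    ranked = []
    # .get() avoids creating empty entries in the defaultdicts.
    for table in (tri.get((w1, w2)), bi.get(w2), uni):
        if not table:
            continue  # this context was never seen, fall back to the next table
        for word, _count in table.most_common():
            if len(ranked) >= top_n:
                return ranked
            if word not in ranked:
                ranked.append(word)
    return ranked


# Toy corpus, already split into words (hypothetical data).
corpus = [
    ["ខ្ញុំ", "ចង់", "ទៅ", "សាលា"],
    ["ខ្ញុំ", "ចង់", "ទៅ"],
    ["ខ្ញុំ", "ចង់", "រៀន"],
    ["សូម", "ទៅ"],
]
uni, bi, tri = build_counts(corpus)
print(predict_next_sketch(uni, bi, tri, "ខ្ញុំ", "ចង់", top_n=2))
# ['ទៅ', 'រៀន']
```

When the two-word context has not been seen, the sketch fills the list from the bigram table for the last word and then from overall word frequency, so a non-empty model always yields some suggestion.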

## Install

From PyPI:

```bash
pip install khmer-keyboard
```

From local source:

```bash
pip install -U .
```

## Package Structure

```text
khmer_keyboard/
  train/
    build_lm.py
    build_chargram.py
    export_model.py
  runtime/
    language_model.py
    chargram_embedder.py
    scorer.py
    personalization.py
    engine.py
  model.msgpack
```

## Quick Start

```python
from khmer_keyboard import KhmerKeyboardEngine

# Uses bundled khmer_keyboard/model.msgpack
engine = KhmerKeyboardEngine()
```

## Core APIs

```python
# Prefix suggestion
engine.suggest_prefix("ស", top_n=5)
# ["សូម", "សម្រាប់", "សួស្តី", "សកម្ម", "សាលា"]

# Typo correction
engine.correct_word("កមពជ", top_n=5)
# ["កម្ពុជ", "កម្ពុជា", "កុម្ពុជ", "កម្ពជ", "កពជ"]

# Next-word prediction from context
engine.predict_next("ខ្ញុំ", "ចង់", top_n=5)
# ["ទៅ", "ធ្វើ", "រៀន", "បាន", "សួរ"]
```
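
The package depends on `python-Levenshtein`, which suggests that typo correction involves edit distance. The sketch below shows one plausible way to rank candidates, written in pure Python so it runs without extra dependencies. It is an assumption-laden illustration, not the engine's actual code: `edit_distance`, `correct_word_sketch`, the `max_distance` parameter, and the frequency table are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance using the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def correct_word_sketch(word, vocab_counts, top_n=5, max_distance=3):
    """Rank vocabulary words by edit distance, breaking ties by frequency."""
    scored = []
    for candidate, freq in vocab_counts.items():
        distance = edit_distance(word, candidate)
        if distance <= max_distance:
            scored.append((distance, -freq, candidate))
    scored.sort()
    return [candidate for _d, _f, candidate in scored[:top_n]]


# Hypothetical word frequencies, for illustration only.
vocab_counts = {"កម្ពុជា": 120, "កម្ពុជ": 15, "សាលា": 80}
print(correct_word_sketch("កមពជ", vocab_counts, top_n=2))
```

Note that this distance is computed over Unicode code points, so a missing subscript sign or vowel each counts as one edit. A production engine may prefer to compare Khmer grapheme clusters instead.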

## Smart API (Recommended)

`smart(...)` is the main user-facing API.  
It selects a behavior automatically from the input:

- If the input ends with a space: next-word prediction
- If the input ends with a partial token: prefix suggestion
- If the token looks misspelled: typo correction
- If a merged phrase (words typed without spaces) is detected: attempts to split the phrase, then runs next-word prediction
- If there is no strong match: safe fallback (never returns an empty list)

Examples:

```python
engine.smart("ខ្ញុំ ចង់ ")   # next-word mode
engine.smart("ស")            # prefix mode
engine.smart("សរឡញ")         # correction/suggestion mode
engine.smart("ខញសរឡញ")       # merged phrase recovery mode
```
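
The selection rules listed above can be pictured as a small dispatcher. This is a rough sketch under assumptions, not how `smart` is actually implemented: `choose_mode` and `split_merged` are hypothetical helpers, the order of the checks is a guess, and the splitter only handles exact vocabulary matches, whereas the real engine evidently also recovers misspelled merged phrases.

```python
def split_merged(token, vocab):
    """Split token into vocabulary words (word-break DP), or return None."""
    best = [None] * (len(token) + 1)
    best[0] = []
    for end in range(1, len(token) + 1):
        for start in range(end):
            piece = token[start:end]
            if best[start] is not None and piece in vocab:
                best[end] = best[start] + [piece]
                break
    return best[-1]


def choose_mode(text, vocab):
    """Return the name of the behavior a smart-style dispatcher might pick."""
    if not text.strip():
        return "fallback"        # nothing to work with
    if text[-1].isspace():
        return "next_word"       # the last word is complete
    token = text.split()[-1]
    if any(word.startswith(token) for word in vocab):
        return "prefix"          # token begins (or is) a known word
    parts = split_merged(token, vocab)
    if parts and len(parts) > 1:
        return "merged_phrase"   # several known words typed without spaces
    return "correction"          # unknown token, treat it as a typo


vocab = {"ខ្ញុំ", "ចង់", "ទៅ", "សូម"}  # hypothetical vocabulary
print(choose_mode("ខ្ញុំ ចង់ ", vocab))       # next_word
print(choose_mode("ស", vocab))              # prefix
print(choose_mode("ខ្ញុំ" + "ចង់", vocab))    # merged_phrase
```

A real dispatcher would then call the matching API and, if that returns nothing, substitute a fallback such as the most frequent words so the result is never empty.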

## Example Output (Format)

The output is always a `list[str]`:

```text
['ទៅ', 'ធ្វើ', 'ញុំា', 'បាន', 'រៀន']
['សួស្តី', 'សម្រាប់', 'សូម', 'សកម្ម', 'សាលា']
```

Exact words depend on your trained model and corpus quality.

## Use an External Model

```python
engine = KhmerKeyboardEngine(model_path="path/to/model.msgpack")
```
