
Train Custom Model

🎯 What is this?
Train a custom embedding model from your domain-specific text. The model will learn terminology and synonyms specific to your field (medical, legal, finance, etc.).
📋 Requirements:
⚠️ Important: Training takes 5-60 seconds depending on corpus size. The page will redirect when complete. Larger corpora (5000+ words) produce better synonym learning.

Upload Training Corpus

Name the model using only alphanumeric characters and underscores (e.g., medical_uti, finance_basics)
Upload a .txt or .md file with your domain text (1000+ words recommended)
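The naming and file rules above can be checked before uploading. The following is a minimal Python sketch, not this tool's actual validation code: the helper names `is_valid_model_name` and `check_corpus` are hypothetical, and it assumes "alphanumeric" means ASCII letters and digits and that words are counted by splitting on whitespace.

```python
import re
from pathlib import Path
from typing import List

# Assumption: "alphanumeric characters and underscores only" means
# ASCII letters, digits, and "_", with at least one character.
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")

ALLOWED_SUFFIXES = {".txt", ".md"}   # file types named on this page
RECOMMENDED_MIN_WORDS = 1000         # "1000+ words recommended"


def is_valid_model_name(name: str) -> bool:
    """Return True if the whole name matches the allowed character set."""
    return _NAME_RE.fullmatch(name) is not None


def check_corpus(filename: str, text: str) -> List[str]:
    """Return a list of problems found; an empty list means none."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("file must be .txt or .md")
    # Assumption: a "word" is any whitespace-separated token.
    word_count = len(text.split())
    if word_count < RECOMMENDED_MIN_WORDS:
        problems.append(f"only {word_count} words; 1000+ recommended")
    return problems
```

For example, `is_valid_model_name("medical_uti")` passes, while a name containing a space or hyphen does not. The word-count check is treated as a warning here because the page recommends, rather than requires, 1000+ words.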
⚙️ Advanced Settings (Optional)
Output vector size (64-256). Default: 128. Higher = more capacity but larger model.
TF-IDF vocabulary size (1000-10000). Default: 5000. Higher = more terms but slower.
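The two advanced settings and their documented ranges can be expressed as a small settings object. This is an illustrative sketch only: the class name `TrainingSettings` and the field names `vector_size` and `vocab_size` are assumptions, not identifiers used by this tool. Only the ranges and defaults come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSettings:
    """Hypothetical container for the advanced settings described above."""
    vector_size: int = 128   # output vector size, allowed range 64-256
    vocab_size: int = 5000   # TF-IDF vocabulary size, allowed range 1000-10000

    def __post_init__(self):
        # Reject values outside the ranges stated on this page.
        if not 64 <= self.vector_size <= 256:
            raise ValueError("vector_size must be between 64 and 256")
        if not 1000 <= self.vocab_size <= 10000:
            raise ValueError("vocab_size must be between 1000 and 10000")
```

Calling `TrainingSettings()` yields the defaults (128 and 5000). A value such as `TrainingSettings(vector_size=512)` raises `ValueError` instead of being silently clamped, which makes an out-of-range setting visible before training starts.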

💡 Tips for Best Results

📚 Example Use Cases