
ai-trainer

AI model training and validation for Kodachi OS command intelligence

Version: 9.0.1 | Size: 2.8MB | Author: Warith Al Maawali <warith@digi77.com>

License: LicenseRef-Kodachi-SAN-1.0 | Website: https://www.digi77.com


File Information

| Property | Value |
|---|---|
| Binary Name | ai-trainer |
| Version | 9.0.1 |
| Build Date | 2026-02-26T08:01:51.744791725Z |
| Rust Version | 1.82.0 |
| File Size | 2.8MB |

SHA256 Checksum

d121c6cd455ab7b107dc14abbe93a8d4901202c0a22beef60dfc3627e32552a4

Features

- TF-IDF based command embeddings
- Incremental model updates
- Model validation and accuracy testing

Security Features

| Feature | Description |
|---|---|
| Input validation | All inputs are validated and sanitized |
| Rate limiting | Built-in rate limiting for network operations |
| Authentication | Secure authentication with certificate pinning |
| Encryption | TLS 1.3 for all network communications |

System Requirements

| Requirement | Value |
|---|---|
| OS | Linux (Debian-based) |
| Privileges | root/sudo for system operations |
| Dependencies | OpenSSL, libcurl |

Global Options

| Flag | Description |
|---|---|
| -h, --help | Print help information |
| -v, --version | Print version information |
| -n, --info | Display detailed information |
| -e, --examples | Show usage examples |
| --json | Output in JSON format |
| --json-pretty | Pretty-print JSON output with indentation |
| --json-human | Enhanced JSON output with improved formatting (like jq) |
| --verbose | Enable verbose output |
| --quiet | Suppress non-essential output |
| --no-color | Disable colored output |
| --config <FILE> | Use custom configuration file |
| --timeout <SECS> | Set timeout in seconds (default: 30) |
| --retry <COUNT> | Retry attempts (default: 3) |
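
The --retry handling is internal to ai-trainer, but the semantics can be sketched externally. This illustrative POSIX shell wrapper (an assumption about the behaviour, not the tool's implementation) re-runs a command up to a count, mirroring the default of 3 attempts:

```shell
# Illustrative retry loop matching the documented --retry semantics.
# "$@" stands in for any command; ai-trainer performs this internally.
retry() {
  count="$1"; shift
  i=1
  while [ "$i" -le "$count" ]; do
    "$@" && return 0
    echo "attempt $i failed" >&2
    i=$((i + 1))
  done
  return 1
}
retry 3 true && echo "succeeded"
retry 2 false || echo "gave up after retries"
```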

Commands

Model Management

export

Export model embeddings and metadata to JSON file

Usage:

ai-trainer export --output <FILE> [--format <FORMAT>]

Examples:

ai-trainer export --output model_export.json

snapshot

Save current model as versioned snapshot

Usage:

ai-trainer snapshot --snapshot-version <VERSION>

Examples:

ai-trainer snapshot --snapshot-version 1.0.0
ai-trainer snapshot -s 1.1.0-beta

list-snapshots

List all saved model snapshots

Usage:

ai-trainer list-snapshots

Examples:

ai-trainer list-snapshots

status

Display current model status and statistics

Usage:

ai-trainer status

Examples:

ai-trainer status

download-model

Download ONNX model, tokenizer, or GGUF model for AI engine tiers

Usage:

ai-trainer download-model [--llm [default|small|large]] [--show-models] [--all] [--output-dir <DIR>] [--force]

Examples:

ai-trainer download-model
ai-trainer download-model --llm
ai-trainer download-model --llm small
ai-trainer download-model --all
ai-trainer download-model --show-models
ai-trainer download-model --force

Model Training

train

Train AI model from command metadata (full retraining)

Usage:

ai-trainer train --data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer train --data commands.json
ai-trainer train --data commands.json --json

incremental

Update model incrementally with new command data

Usage:

ai-trainer incremental --new-data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer incremental --new-data new_commands.json

Validation & Testing

validate

Validate model accuracy against test dataset

Usage:

ai-trainer validate --test-data <FILE> [--threshold <THRESHOLD>]

Examples:

ai-trainer validate --test-data test_cases.json

Operational Scenarios

Scenario-oriented workflows generated from the binary's built-in examples (as printed by -e with --json).

Scenario 1: Model Training

Full model training operations

Step 1: Train model with command data

sudo ai-trainer train --data commands.json
Expected Output: Training statistics and embeddings metrics

Note

Creates new model from scratch

Step 2: Train with custom database

sudo ai-trainer train --data commands.json --database custom.db
Expected Output: Training results with custom DB location

Note

Allows custom database path specification

Step 3: Train and output results as JSON

sudo ai-trainer train --data commands.json --json
Expected Output: JSON-formatted training metrics

Note

Structured output for automation

Scenario 2: Incremental Training

Update existing models with new data

Step 1: Incrementally train with new data

sudo ai-trainer incremental --new-data updates.json
Expected Output: New embeddings added to existing model

Note

Requires existing trained model

Step 2: Incremental training with custom DB and JSON output

sudo ai-trainer incremental --new-data updates.json --database custom.db --json
Expected Output: JSON-formatted incremental training results

Note

Combines custom DB path with structured output

Scenario 3: Validation

Model accuracy testing and validation

Step 1: Validate model with test data

sudo ai-trainer validate --test-data test_commands.json
Expected Output: Validation results with accuracy metrics

Note

Tests model against known test cases

Step 2: Validate with custom accuracy threshold

sudo ai-trainer validate --test-data test_commands.json --threshold 0.90
Expected Output: Pass/fail validation with 90% threshold

Note

Default threshold is 0.85
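
The pass/fail decision reduces to comparing measured accuracy against the threshold. A minimal sketch of that rule (the exact comparison and any rounding inside ai-trainer are assumptions):

```shell
# Pass/fail rule applied by `validate`: overall accuracy must meet the threshold.
passes() {
  awk -v acc="$1" -v thr="$2" 'BEGIN { exit !(acc >= thr) }'
}
passes 0.92 0.85 && echo "0.92 passes the default threshold 0.85"
passes 0.92 0.95 || echo "0.92 fails a 0.95 threshold"
```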

Step 3: Validate with custom DB and JSON output

sudo ai-trainer validate --test-data test_commands.json --database custom.db --json
Expected Output: JSON-formatted validation metrics

Note

Structured validation results

Step 4: Validate with all parameters combined

sudo ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json
Expected Output: JSON validation with custom test data, 90% threshold, and custom DB

Note

Full parameter example for CI/CD pipelines
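
In a CI/CD pipeline, the validate command's exit status is the gate. A hedged sketch of the pattern, with the ai-trainer invocation stubbed behind "$@" so the logic is visible without the binary (in a real job the stub would be `ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json`):

```shell
# CI gate pattern around `ai-trainer validate`: block the deploy on failure.
gate() {
  if "$@"; then
    echo "model validation passed"
  else
    echo "model validation failed; blocking deploy"
    return 1
  fi
}
gate true          # stand-in for a passing validation run
gate false || true # stand-in for a failing one
```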

Scenario 4: Model Export

Export trained models and statistics

Step 1: Export trained model

sudo ai-trainer export --output model_export.json
Expected Output: Complete model export with embeddings

Note

Default format includes all embeddings

Step 2: Export in compact format

sudo ai-trainer export --output model_compact.json --format compact
Expected Output: Compact model export without full embeddings

Note

Reduces export file size

Step 3: Export statistics as JSON

sudo ai-trainer export --output model_stats.json --format stats --json
Expected Output: Model statistics without embeddings

Note

Lightweight statistics export

Step 4: Full export with JSON envelope output

sudo ai-trainer export --output model.json --format full --json
Expected Output: Complete model export with JSON status envelope

Note

Combines full embeddings export with structured output
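
The three formats trade completeness for file size. A small helper for picking one (the mapping of use case to format is an interpretation of the notes above, not tool behaviour):

```shell
# Choose an export --format by intended use.
# full    - complete model with all embeddings (backup/restore)
# compact - smaller file without full embeddings (transfer)
# stats   - statistics only, no embeddings (reporting)
format_for() {
  case "$1" in
    backup)   echo full ;;
    transfer) echo compact ;;
    report)   echo stats ;;
    *)        echo full ;;  # safest default: lose nothing
  esac
}
format_for report   # prints stats
```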

Scenario 5: Snapshots

Model versioning and snapshot management

Step 1: Create model snapshot with version

sudo ai-trainer snapshot --snapshot-version 1.0.0
Expected Output: Versioned snapshot created successfully

Note

Preserves model state at specific version

Step 2: List all model snapshots

sudo ai-trainer list-snapshots
Expected Output: List of saved model versions

Note

Shows snapshot metadata and versions

Step 3: List snapshots as JSON

sudo ai-trainer list-snapshots --json
Expected Output: JSON-formatted snapshot listing

Note

Structured snapshot information

Step 4: Create snapshot with JSON output

sudo ai-trainer snapshot --snapshot-version 1.0.0 --json
Expected Output: JSON with snapshot name, version, and embedding count

Note

Structured output for automation
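
The snapshot versions in the examples follow semantic versioning (1.0.0, 1.1.0-beta). A pre-flight check before calling snapshot could look like this; the semver convention is inferred from the examples, and the tool may accept other strings:

```shell
# Validate a version string against the semver-style pattern seen in the examples
# before passing it to `ai-trainer snapshot --snapshot-version <VERSION>`.
is_semver() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-[0-9A-Za-z.-]+)?$'
}
is_semver "1.1.0-beta" && echo "ok: 1.1.0-beta"
is_semver "v1" || echo "rejected: v1"
```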

Scenario 6: Model Download

Download ONNX and GGUF model files for AI engine tiers

Step 1: Download ONNX embeddings model to default models/ directory

sudo ai-trainer download-model
Expected Output: Model files downloaded successfully

Note

Downloads all-MiniLM-L6-v2 ONNX model and tokenizer

Step 2: Download default GGUF model (Qwen2.5-3B-Instruct Q4_K_M, ~1.8GB)

sudo ai-trainer download-model --llm
Expected Output: GGUF model downloaded to models/ directory

Note

Best balance of quality, speed, and size for CPU inference

Step 3: Download small GGUF model (Qwen2.5-1.5B, ~0.9GB)

sudo ai-trainer download-model --llm small
Expected Output: Small GGUF model downloaded

Note

For systems with <4GB available RAM

Step 4: Download large GGUF model (Phi-3.5-mini, ~2.3GB)

sudo ai-trainer download-model --llm large
Expected Output: Large GGUF model downloaded

Note

Better reasoning, 128K trained context
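
Choosing a --llm tier is mostly a RAM question. A hedged selection helper: the under-4GB cut-off for small comes from the note above, while the 8GB cut-off for large is an assumption.

```shell
# Pick a GGUF tier from available RAM in MB.
# small  : Qwen2.5-1.5B (~0.9GB) - under 4GB RAM (per the note above)
# default: Qwen2.5-3B   (~1.8GB) - 4-8GB RAM (assumed)
# large  : Phi-3.5-mini (~2.3GB) - 8GB+ RAM (assumed)
pick_tier() {
  if [ "$1" -lt 4096 ]; then echo small
  elif [ "$1" -lt 8192 ]; then echo default
  else echo large
  fi
}
pick_tier 2048   # prints small
```

On Debian-based systems the input could come from MemAvailable in /proc/meminfo.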

Step 5: Download both ONNX embeddings and default GGUF model

sudo ai-trainer download-model --all
Expected Output: All model files downloaded

Note

Complete setup for all AI tiers

Step 6: List downloaded and available models

sudo ai-trainer download-model --show-models
Expected Output: Model inventory with sizes and status

Note

Shows what's installed and what can be downloaded

Step 7: Model inventory as JSON

sudo ai-trainer download-model --show-models --json
Expected Output: JSON with downloaded and available model details

Step 8: Force re-download of ONNX model

sudo ai-trainer download-model --force
Expected Output: Model files re-downloaded

Note

Overwrites existing files

Scenario 7: Status

Model status and health checks

Step 1: Show training status

sudo ai-trainer status
Expected Output: Current model status and statistics

Note

Displays model readiness and metrics

Step 2: Show training status as JSON

sudo ai-trainer status --json
Expected Output: JSON-formatted status information

Note

Structured status output for automation

Scenario 8: AI Tier Integration

Training operations related to the 6-tier AI engine (TF-IDF, ONNX, Mistral.rs, GenAI/Ollama, Legacy LLM, Claude)

Step 1: Validate model against all tier responses

sudo ai-trainer validate --test-data tests.json --json
Expected Output: Validation results covering all active AI tiers

Note

Tests model accuracy across available tiers

Step 2: Train model with feedback from all tiers

sudo ai-trainer train --data commands.json --json
Expected Output: Training metrics including multi-tier feedback data

Note

Includes feedback from mistral.rs and GenAI tier executions

Scenario 9: ONNX Intent Classifier

Evaluate the ONNX intent classifier used for fast-path routing (12 categories, <5ms inference)

Step 1: Evaluate ONNX intent classifier accuracy

sudo ai-trainer validate --test-data intent_tests.json --json
Expected Output: JSON with per-intent precision, recall, and F1-score

Note

Target: 95%+ accuracy on held-out test set
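
The per-intent F1-score in the report is the standard harmonic mean of precision and recall; the metric definition is standard, while the exact report fields are assumptions:

```shell
# F1 = 2*P*R / (P+R), the harmonic mean of precision and recall.
f1() {
  awk -v p="$1" -v r="$2" 'BEGIN { printf "%.3f\n", 2 * p * r / (p + r) }'
}
f1 0.96 0.94   # prints 0.950
```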

Step 2: Check if intent classifier model is downloaded

sudo ai-trainer download-model --show-models --json
Expected Output: JSON showing classifier model status

Note

Model: kodachi-intent-classifier.onnx (~65MB)


Environment Variables

| Variable | Description | Default | Values |
|---|---|---|---|
| RUST_LOG | Set logging level | info | error |
| NO_COLOR | Disable all colored output when set | unset | 1 |
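
NO_COLOR follows the common Unix convention: color is suppressed whenever the variable is set to any value. A sketch of how a wrapper script can honour the same convention (this illustrates the convention, not ai-trainer's internals):

```shell
# Emit ANSI green only when NO_COLOR is unset/empty, mirroring the tool's behaviour.
say() {
  if [ -n "${NO_COLOR:-}" ]; then
    printf '%s\n' "$1"
  else
    printf '\033[32m%s\033[0m\n' "$1"
  fi
}
NO_COLOR=1 say "model ready"   # plain text, no escape codes
```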

Exit Codes

| Code | Description |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Invalid arguments |
| 3 | Permission denied |
| 4 | Network error |
| 5 | File not found |
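
For scripting, the codes above can be translated into log messages. A minimal dispatcher; the codes come from the table, the wrapper itself is illustrative:

```shell
# Translate ai-trainer exit codes into messages for CI logs.
describe_exit() {
  case "$1" in
    0) echo "Success" ;;
    1) echo "General error" ;;
    2) echo "Invalid arguments" ;;
    3) echo "Permission denied" ;;
    4) echo "Network error" ;;
    5) echo "File not found" ;;
    *) echo "Unknown exit code: $1" ;;
  esac
}
describe_exit 4   # prints Network error
```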