ai-trainer

AI model training and validation for Kodachi OS command intelligence

Version: 9.8.4 | Size: 4.2MB | Author: Warith Al Maawali <warith@digi77.com>

License: LicenseRef-Kodachi-SAN-1.0 | Website: https://www.digi77.com

File Information

Property	Value
Binary Name	ai-trainer
Version	9.8.4
Build Date	REDACTED-BUILD-TIME
Rust Version	1.82.0
File Size	4.2MB
Author	Warith Al Maawali <warith@digi77.com>
License	LicenseRef-Kodachi-SAN-1.0
Category	Kodachi Binary
Description	AI model training and validation for Kodachi OS command intelligence
Git Commit	unknown
Metadata Generated	2026-06-28T11:16:33Z
Binary Timestamp	Unknown

SHA256 Checksum

323087352162952478afdc6110033596ff5f9a4f626064c66cbb8299ca0c344c
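
The checksum above can be compared against a local copy of the binary before it is run. The following is a minimal sketch, assuming `sha256sum` from GNU coreutils is installed; the helper name `verify_sha256` and the binary path in the usage line are illustrative and not part of ai-trainer.

```shell
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when FILE's SHA-256 digest equals EXPECTED_HEX, non-zero otherwise.
# (Hypothetical helper; not shipped with ai-trainer.)
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Usage (the path is an assumption; adjust it to where the binary is installed):
#   verify_sha256 ./ai-trainer \
#     323087352162952478afdc6110033596ff5f9a4f626064c66cbb8299ca0c344c \
#     && echo "checksum OK"
```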

Features

#	Feature
1	TF-IDF based command embeddings
2	Incremental model updates
3	Model validation and accuracy testing

Security Features

Feature	Description
Input Validation	Argument parsing via clap; per-command validation is the consumer's responsibility
Rate Limiting	Not provided by cli-core
Authentication	Not provided by cli-core (see online-auth)
Encryption	Not provided by cli-core

System Requirements

Requirement	Value
OS	Linux (Debian-based)
Privileges	root/sudo for system operations
Dependencies	OpenSSL, libcurl

Global Options

Flag	Description
`-h, --help`	Print help information
`-v, --version`	Print version information
`-n, --info`	Display detailed information
`-e, --examples`	Show usage examples
`--json`	Output in JSON format
`-o, --output-format <FORMAT>`	Force output format (text|json)
`--json-pretty`	Pretty-print JSON output with indentation
`--json-human`	Enhanced JSON output with improved formatting (like jq)
`--fields <FIELD_LIST>`	Select specific fields to include in output (comma-separated)
`--limit <NUMBER>`	Limit number of results returned
`--offset <NUMBER>`	Skip first N results (for pagination)
`-d, --work-dir <PATH>`	Working directory (defaults to auto-detected base directory)
`--port <PORT>`	Set custom port number (1024-65535)
`--log-level <LEVEL>`	Set log level (error|warn|info|debug)
`--verbose`	Enable verbose output
`--quiet`	Suppress non-essential output
`--no-color`	Disable colored output
`--config <FILE>`	Use custom configuration file
`--timeout <SECS>`	Set operation timeout in seconds (optional; no default applied)
`--retry <COUNT>`	Retry attempts (optional; no default applied)
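
`--limit` and `--offset` combine into simple pagination: page k of size s starts at offset (k - 1) * s. The sketch below only builds the flags; the helper name `page_flags` is hypothetical, and whether a given subcommand honors these global options should be checked against the binary.

```shell
# page_flags PAGE SIZE
# Prints the --limit/--offset flags for a 1-based PAGE of SIZE results.
page_flags() {
    page=$1
    size=$2
    offset=$(( (page - 1) * size ))
    printf '%s %d %s %d\n' --limit "$size" --offset "$offset"
}

# Usage (unquoted on purpose so the flags split into separate arguments):
#   ai-trainer list-snapshots --json $(page_flags 3 20)
```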

Commands

Model Management

`export`

Export model embeddings and metadata to JSON file

Usage:

ai-trainer export --output <FILE> [--format <FORMAT>]

Examples:

ai-trainer export --output model_export.json

`snapshot`

Save current model as versioned snapshot

Usage:

ai-trainer snapshot --snapshot-version <VERSION>

Examples:

ai-trainer snapshot --snapshot-version 1.0.0

ai-trainer snapshot -s 1.1.0-beta

`list-snapshots`

List all saved model snapshots

Usage:

ai-trainer list-snapshots

Examples:

ai-trainer list-snapshots

`status`

Display current model status and statistics

Usage:

ai-trainer status

Examples:

ai-trainer status

`download-model`

Download ONNX model, tokenizer, or GGUF model for AI engine tiers

Usage:

ai-trainer download-model [--llm [default|small|large|xlarge|xlarge-hq]] [--show-models] [--all] [--output-dir <DIR>] [--force] [--allow-unverified-model]

Examples:

ai-trainer download-model --allow-unverified-model

ai-trainer download-model --llm --allow-unverified-model

ai-trainer download-model --llm small --allow-unverified-model

ai-trainer download-model --llm large --allow-unverified-model

ai-trainer download-model --llm xlarge --allow-unverified-model # Qwen3-8B Q4_K_M, SPEED tuned

ai-trainer download-model --llm xlarge-hq --allow-unverified-model # Qwen3-8B Q5_K_M, QUALITY tuned

ai-trainer download-model --all --allow-unverified-model

ai-trainer download-model --show-models

ai-trainer download-model --force --allow-unverified-model
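
Model downloads are the operations most likely to hit exit code 4 (Network error, see Exit Codes). The sketch below retries only on that status. It is an illustration, not the tool's own retry logic: the global `--retry` option may already cover this case, and the names `AI_TRAINER`, `RETRY_DELAY`, and `download_with_retry` are hypothetical.

```shell
# Path to the binary; overridable so the wrapper can be exercised with a stub.
AI_TRAINER=${AI_TRAINER:-ai-trainer}

# download_with_retry MAX_ATTEMPTS [download-model flags...]
# Retries only when the exit status is 4 (documented as "Network error").
download_with_retry() {
    max_attempts=$1
    shift
    attempt=1
    while :; do
        status=0
        "$AI_TRAINER" download-model "$@" || status=$?
        # Stop on success or on any failure that is not a network error.
        [ "$status" -eq 4 ] || return "$status"
        # Give up once the attempt budget is used, returning the last status.
        [ "$attempt" -lt "$max_attempts" ] || return "$status"
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Usage:
#   download_with_retry 3 --llm small --allow-unverified-model
```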

Model Training

`train`

Train AI model from command metadata (full retraining)

Usage:

ai-trainer train --data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer train --data commands.json

ai-trainer train --data commands.json --json

`incremental`

Update model incrementally with new command data

Usage:

ai-trainer incremental --new-data <FILE> [--database <DB_PATH>]

Examples:

ai-trainer incremental --new-data new_commands.json

Validation & Testing

`validate`

Validate model accuracy against test dataset

Usage:

ai-trainer validate --test-data <FILE> [--threshold <THRESHOLD>] [--database <DB_PATH>]

Examples:

ai-trainer validate --test-data test_cases.json

Operational Scenarios

Scenario-oriented workflows generated from the binary's built-in `-e --json` examples.

Scenario 1: Model Training

Full model training operations

Step 1: Train model with command data

sudo ai-trainer train --data commands.json

Expected Output: Training statistics and embeddings metrics

Note

Creates new model from scratch

Step 2: Train with custom database

sudo ai-trainer train --data commands.json --database custom.db

Expected Output: Training results with custom DB location

Note

Allows custom database path specification

Step 3: Train and output results as JSON

sudo ai-trainer train --data commands.json --json

Expected Output: JSON-formatted training metrics

Note

Structured output for automation

Scenario 2: Incremental Training

Update existing models with new data

Step 1: Incrementally train with new data

sudo ai-trainer incremental --new-data updates.json

Expected Output: New embeddings added to existing model

Note

Requires existing trained model

Step 2: Incremental training with custom DB and JSON output

sudo ai-trainer incremental --new-data updates.json --database custom.db --json

Expected Output: JSON-formatted incremental training results

Note

Combines custom DB path with structured output

Scenario 3: Validation

Model accuracy testing and validation

Step 1: Validate model with test data

sudo ai-trainer validate --test-data test_commands.json

Expected Output: Validation results with accuracy metrics

Note

Tests model against known test cases

Step 2: Validate with custom accuracy threshold

sudo ai-trainer validate --test-data test_commands.json --threshold 0.90

Expected Output: Pass/fail validation with 90% threshold

Note

Default threshold is 0.85

Step 3: Validate with custom DB and JSON output

sudo ai-trainer validate --test-data test_commands.json --database custom.db --json

Expected Output: JSON-formatted validation metrics

Note

Structured validation results

Step 4: Validate with all parameters combined

sudo ai-trainer validate --test-data tests.json --threshold 0.90 --database custom.db --json

Expected Output: JSON validation with custom test data, 90% threshold, and custom DB

Note

Full parameter example for CI/CD pipelines
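
For the CI/CD case in Step 4, the validate call can be wrapped so a pipeline stops on failure. This sketch assumes that `validate` exits non-zero when accuracy falls below `--threshold`; the documentation above does not state this, so confirm it against the real binary. `AI_TRAINER` and `validation_gate` are hypothetical names.

```shell
# Path to the binary; overridable so the wrapper can be exercised with a stub.
AI_TRAINER=${AI_TRAINER:-ai-trainer}

# validation_gate TEST_DATA [THRESHOLD]
# Runs validation and reports pass/fail based on the exit status.
# ASSUMPTION: a below-threshold result yields a non-zero exit status.
validation_gate() {
    test_data=$1
    threshold=${2:-0.85}   # 0.85 is the documented default threshold
    if "$AI_TRAINER" validate --test-data "$test_data" --threshold "$threshold" --json; then
        echo "validation passed (threshold $threshold)" >&2
        return 0
    else
        status=$?
        echo "validation failed with exit status $status" >&2
        return "$status"
    fi
}

# Usage in a pipeline step:
#   validation_gate tests.json 0.90 || exit 1
```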

Scenario 4: Model Export

Export trained models and statistics

Step 1: Export trained model

sudo ai-trainer export --output model_export.json

Expected Output: Complete model export with embeddings

Note

Default format includes all embeddings

Step 2: Export in compact format

sudo ai-trainer export --output model_compact.json --format compact

Expected Output: Compact model export without full embeddings

Note

Reduces export file size

Step 3: Export statistics as JSON

sudo ai-trainer export --output model_stats.json --format stats --json

Expected Output: Model statistics without embeddings

Note

Lightweight statistics export

Step 4: Full export with JSON envelope output

sudo ai-trainer export --output model.json --format full --json

Expected Output: Complete model export with JSON status envelope

Note

Combines full embeddings export with structured output
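
The steps above show three `--format` values: full, compact, and stats. A small wrapper can reject anything else before the binary is called. This is a sketch under assumptions: treating `full` as the default is inferred from the Step 1 note, other formats may exist, and `AI_TRAINER` and `export_model` are hypothetical names.

```shell
# Path to the binary; overridable so the wrapper can be exercised with a stub.
AI_TRAINER=${AI_TRAINER:-ai-trainer}

# export_model OUTPUT_FILE [FORMAT]
# Accepts only the formats shown in this scenario, then runs the export.
export_model() {
    output=$1
    format=${2:-full}   # ASSUMPTION: "full" matches the tool's default format
    case $format in
        full|compact|stats) ;;
        *)
            echo "unknown export format: $format (expected full, compact, or stats)" >&2
            return 2   # mirrors the documented "Invalid arguments" exit code
            ;;
    esac
    "$AI_TRAINER" export --output "$output" --format "$format"
}

# Usage:
#   export_model model_compact.json compact
```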

Scenario 5: Snapshots

Model versioning and snapshot management

Step 1: Create model snapshot with version

sudo ai-trainer snapshot --snapshot-version 1.0.0

Expected Output: Versioned snapshot created successfully

Note

Preserves model state at specific version

Step 2: List all model snapshots

sudo ai-trainer list-snapshots

Expected Output: List of saved model versions

Note

Shows snapshot metadata and versions

Step 3: List snapshots as JSON

sudo ai-trainer list-snapshots --json

Expected Output: JSON-formatted snapshot listing

Note

Structured snapshot information

Step 4: Create snapshot with JSON output

sudo ai-trainer snapshot --snapshot-version 1.0.0 --json

Expected Output: JSON with snapshot name, version, and embedding count

Note

Structured output for automation
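
Because a snapshot preserves the model state at a given version, one natural use is to take one immediately before an incremental update. The ordering below is a suggested workflow rather than something the tool prescribes; `AI_TRAINER` and `snapshot_then_update` are hypothetical names.

```shell
# Path to the binary; overridable so the wrapper can be exercised with a stub.
AI_TRAINER=${AI_TRAINER:-ai-trainer}

# snapshot_then_update VERSION NEW_DATA
# Saves a snapshot first and runs the incremental update only if that succeeded.
snapshot_then_update() {
    version=$1
    new_data=$2
    "$AI_TRAINER" snapshot --snapshot-version "$version" || {
        echo "snapshot $version failed; skipping incremental update" >&2
        return 1
    }
    "$AI_TRAINER" incremental --new-data "$new_data"
}

# Usage:
#   snapshot_then_update 1.0.0 updates.json
```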

Scenario 6: Model Download

Download ONNX and GGUF model files for AI engine tiers

Step 1: Download ONNX embeddings model to default models/ directory

sudo ai-trainer download-model

Expected Output: Model files downloaded successfully

Note

Downloads all-MiniLM-L6-v2 ONNX model and tokenizer

Step 2: Download default GGUF model (Qwen3-1.7B Q4_K_M, ~1.1GB)

sudo ai-trainer download-model --llm

Expected Output: GGUF model downloaded to models/ directory

Note

Best balance of quality, speed, and size for CPU inference

Step 3: Download small GGUF model (Qwen3-1.7B Q4_K_S, ~1.0GB)

sudo ai-trainer download-model --llm small

Expected Output: Small GGUF model downloaded

Note

For systems with <4GB available RAM

Step 4: Download large GGUF model (Phi-3.5-mini, ~2.3GB)

sudo ai-trainer download-model --llm large

Expected Output: Large GGUF model downloaded

Note

Better reasoning, 128K trained context

Step 5: Download 8B GGUF model tuned for SPEED (Qwen3-8B Q4_K_M, ~4.8GB)

sudo ai-trainer download-model --llm xlarge

Expected Output: Qwen3-8B Q4_K_M downloaded

Note

8-billion-parameter Qwen3 at 4-bit quantization. Use on systems with 8+ GB RAM for faster tokens per second. Quality is lower than xlarge-hq but higher than default.

Step 6: Download 8B GGUF model tuned for QUALITY (Qwen3-8B Q5_K_M, ~5.6GB)

sudo ai-trainer download-model --llm xlarge-hq

Expected Output: Qwen3-8B Q5_K_M downloaded

Note

8-billion-parameter Qwen3 at 5-bit quantization. Recommended on 16+ GB RAM systems. Best local-LLM quality available in the catalog. About 15 percent slower than xlarge.

Step 7: Download both ONNX embeddings and default GGUF model

sudo ai-trainer download-model --all

Expected Output: All model files downloaded

Note

Complete setup for all AI tiers

Step 8: List downloaded and available models

sudo ai-trainer download-model --show-models

Expected Output: Model inventory with sizes and status

Note

Shows what's installed and what can be downloaded

Step 9: Model inventory as JSON

sudo ai-trainer download-model --show-models --json

Expected Output: JSON with downloaded and available model details

Step 10: Force re-download of ONNX model

sudo ai-trainer download-model --force

Expected Output: Model files re-downloaded

Note

Overwrites existing files
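
The RAM guidance in the notes for Steps 3, 5, and 6 can be turned into a tier choice. The sketch below follows those thresholds; falling back to `default` between 4 and 7 GB is an assumption, since the notes give no RAM figure for the `default` or `large` tiers. `pick_llm_tier` is a hypothetical helper.

```shell
# pick_llm_tier RAM_GB
# Prints a --llm tier name for the given amount of available RAM (whole GB).
# ASSUMPTION: 4-7 GB maps to "default"; the notes do not cover that range.
pick_llm_tier() {
    ram_gb=$1
    if [ "$ram_gb" -ge 16 ]; then
        echo "xlarge-hq"
    elif [ "$ram_gb" -ge 8 ]; then
        echo "xlarge"
    elif [ "$ram_gb" -ge 4 ]; then
        echo "default"
    else
        echo "small"
    fi
}

# Usage on Linux (MemAvailable is reported in kB):
#   ram_gb=$(awk '/^MemAvailable:/ { print int($2 / 1048576) }' /proc/meminfo)
#   sudo ai-trainer download-model --llm "$(pick_llm_tier "$ram_gb")"
```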

Scenario 7: Status

Model status and health checks

Step 1: Show training status

sudo ai-trainer status

Expected Output: Current model status and statistics

Note

Displays model readiness and metrics

Step 2: Show training status as JSON

sudo ai-trainer status --json

Expected Output: JSON-formatted status information

Note

Structured status output for automation

Scenario 8: AI Tier Integration

Training operations related to the 6-tier AI engine (TF-IDF, ONNX, mistral.rs, GenAI/Ollama, Legacy LLM, Claude)

Step 1: Validate model against all tier responses

sudo ai-trainer validate --test-data tests.json --json

Expected Output: Validation results covering all active AI tiers

Note

Tests model accuracy across available tiers

Step 2: Train model with feedback from all tiers

sudo ai-trainer train --data commands.json --json

Expected Output: Training metrics including multi-tier feedback data

Note

Includes feedback from mistral.rs and GenAI tier executions

Scenario 9: ONNX Intent Classifier

Evaluate the ONNX intent classifier used for fast-path routing (12 categories, <5ms inference)

Step 1: Evaluate ONNX intent classifier accuracy

sudo ai-trainer validate --test-data intent_tests.json --json

Expected Output: JSON with per-intent precision, recall, and F1-score

Note

Target: 95%+ accuracy on held-out test set

Step 2: Check if intent classifier model is downloaded

sudo ai-trainer download-model --show-models --json

Expected Output: JSON showing classifier model status

Note

Model: kodachi-intent-classifier.onnx (~65MB)

Environment Variables

Variable	Description	Default	Values
`RUST_LOG`	Set logging level	info	error|warn|info|debug|trace
`NO_COLOR`	Disable all colored output when set	unset	1|true|yes (any value disables color)
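
Both variables can be set for a single invocation without changing the surrounding shell, which is convenient when capturing logs. A minimal sketch; `AI_TRAINER` and `run_for_logs` are hypothetical names.

```shell
# Path to the binary; overridable so the wrapper can be exercised with a stub.
AI_TRAINER=${AI_TRAINER:-ai-trainer}

# run_for_logs ARGS...
# Runs the binary with RUST_LOG=debug and NO_COLOR=1 set for this call only.
run_for_logs() {
    RUST_LOG=debug NO_COLOR=1 "$AI_TRAINER" "$@"
}

# Usage:
#   run_for_logs status 2> status-debug.log
```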

Exit Codes

Code	Description
0	Success
1	General error
2	Invalid arguments
3	Permission denied
4	Network error
5	File not found
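
In scripts, the table above can be mapped back to readable messages. A minimal sketch; `describe_exit_code` is a hypothetical helper, and any status outside the documented 0-5 range is reported as undocumented.

```shell
# describe_exit_code CODE
# Prints the documented meaning of an ai-trainer exit code.
describe_exit_code() {
    case $1 in
        0) echo "Success" ;;
        1) echo "General error" ;;
        2) echo "Invalid arguments" ;;
        3) echo "Permission denied" ;;
        4) echo "Network error" ;;
        5) echo "File not found" ;;
        *) echo "Undocumented exit code: $1" ;;
    esac
}

# Usage:
#   ai-trainer status --json; describe_exit_code $?
```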