CompleteTinyModelRaven Exclusive
`./raven_cli --model_path ./models/raven_exclusive --prompt "You are a helpful assistant" --low_memory_mode`

The Exclusive version includes a lightweight JSON schema parser, which lets the tiny model emit structured commands for IoT devices. For example, the prompt "Turn on the living room light and set thermostat to 72" yields structured output rather than free-form text.
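The source does not show the structured output itself, so the following is only a minimal sketch of how such output might be validated and dispatched. The JSON shape (`actions`, `device`, `command`, `value`), the command names, and the helpers `parse_actions` and `dispatch` are all hypothetical; none of them are confirmed parts of the Raven tooling.

```python
import json

# Hypothetical schema: every action names a device and a command;
# "set_temperature" additionally needs a numeric "value".
REQUIRED_KEYS = {"device", "command"}
ALLOWED_COMMANDS = {"turn_on", "turn_off", "set_temperature"}


def parse_actions(raw):
    """Parse model output and keep only actions that match the assumed schema."""
    data = json.loads(raw)
    actions = data.get("actions", []) if isinstance(data, dict) else []
    valid = []
    for action in actions:
        if not isinstance(action, dict) or not REQUIRED_KEYS.issubset(action):
            continue
        if action["command"] not in ALLOWED_COMMANDS:
            continue
        if action["command"] == "set_temperature" and not isinstance(
            action.get("value"), (int, float)
        ):
            continue
        valid.append(action)
    return valid


def dispatch(actions):
    """Stand-in for real device calls: returns one log line per action."""
    log = []
    for action in actions:
        if action["command"] == "set_temperature":
            log.append(f'{action["device"]}: set_temperature -> {action["value"]}')
        else:
            log.append(f'{action["device"]}: {action["command"]}')
    return log


# Illustrative output only; the model's real format is not given in the source.
sample_output = """
{"actions": [
  {"device": "living_room_light", "command": "turn_on"},
  {"device": "thermostat", "command": "set_temperature", "value": 72}
]}
"""

for line in dispatch(parse_actions(sample_output)):
    print(line)
```

Validating before dispatching means a malformed or unexpected command from the model is dropped instead of reaching a device.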
| Model | Size (GB) | Tokens/Sec | HellaSwag (0-shot) | GSM8K (Math) | Raven-Specific Score |
| :--- | :--- | :--- | :--- | :--- | :--- |
| TinyLlama 1.1B | 1.1 | 22 | 59.3 | 12.4 | 44.1 |
| Phi-3 Mini (4k) | 1.8 | 18 | 68.2 | 65.9 | 61.2 |
| Qwen-1.8B | 1.9 | 15 | 61.5 | 42.8 | 53.7 |
| CompleteTinyModelRaven Exclusive | 0.52 | 48 | 67.1 | 63.4 | 78.5 |
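To make the size and throughput columns easier to compare, here is a small sketch that derives two figures from the table: seconds needed for one reply and tokens per second per gigabyte. The 256-token reply length is an arbitrary assumption, the helper names are hypothetical, and the last row is taken to be the CompleteTinyModelRaven Exclusive.

```python
# Rows copied from the table: (model, size in GB, tokens per second).
ROWS = [
    ("TinyLlama 1.1B", 1.1, 22),
    ("Phi-3 Mini (4k)", 1.8, 18),
    ("Qwen-1.8B", 1.9, 15),
    ("CompleteTinyModelRaven Exclusive", 0.52, 48),
]


def seconds_for(tokens, tokens_per_sec):
    """Time to generate `tokens` tokens at a steady rate."""
    return tokens / tokens_per_sec


def summarize(rows, reply_tokens=256):
    """Derive per-reply latency and throughput per GB for each row."""
    summary = []
    for name, size_gb, tps in rows:
        summary.append({
            "model": name,
            "seconds_per_reply": round(seconds_for(reply_tokens, tps), 1),
            "tokens_per_sec_per_gb": round(tps / size_gb, 1),
        })
    return summary


for row in summarize(ROWS):
    print(row)
```

These derived numbers only restate the table; they say nothing about how the benchmarks were measured.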
While the open-source community is flooded with generic distilled models, this specific iteration stands apart. It promises not only the efficiency of a "tiny" architecture but also the specialized fine-tuning and closed-set optimization that the "Raven" tag implies.
Unlock the full potential of edge AI today. Download the CompleteTinyModelRaven Exclusive from the official Raven Vault, and run state-of-the-art language models entirely offline, at close to 50 tokens per second, on hardware you already own.

Have you integrated the CompleteTinyModelRaven Exclusive into your stack? Join the Raven Discord community to share benchmarks and custom fine-tunes.
