Arcee AI has launched Arcee-Spark, its first custom model, built on Qwen2 7B. The new model outperforms Meta’s Llama 3 8B on AGIEval and OpenAI’s GPT-3.5 on MT-Bench, a strong result for a small language model.
Key Features of Arcee-Spark:
- Fine-Tuning: Qwen2 Base fine-tuned on 1.8 million samples
- Model Merging: Combined with Qwen2-7B-Instruct using MergeKit
- Post-Training: Further enhanced through Direct Preference Optimization (DPO)
- Performance Metrics:
  - AGIEval: 51.11
  - MT-Bench: 8.46
  - BigBenchHard: 45.78
  - EQ-Bench: 71.4
- Licensing: Released under the Apache 2.0 license
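To give a feel for the DPO post-training step listed above, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not Arcee's training code: the function name `dpo_loss`, the `beta` value of 0.1, and the example log-probabilities are all assumptions made for illustration. The inputs are the summed log-probabilities that the model being trained (the policy) and a frozen reference model assign to the preferred ("chosen") and dispreferred ("rejected") responses.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    beta=0.1 is an assumed value, not one reported for Arcee-Spark.
    """
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)

    # loss = -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities, chosen only to exercise the function.
untrained = dpo_loss(-5.0, -7.0, -5.0, -7.0)  # policy identical to reference
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)   # policy now favors the chosen response
print(untrained, improved)
```

When the policy is identical to the reference, the margin is zero and the loss equals log 2. The loss falls as the policy shifts probability toward the chosen response and away from the rejected one, which is the behavior DPO optimizes for without training a separate reward model.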
The Evolution of Small LLMs
It’s astounding how far small language models have progressed. A model built on Qwen2 7B can now match GPT-3.5, the model behind ChatGPT that amazed us less than two years ago.