DeepSeek Releases Distilled R1 AI Model for Single GPU
DeepSeek has launched a smaller, distilled version of its R1 reasoning AI model, DeepSeek-R1-0528-Qwen3-8B. The new model delivers performance comparable to much larger reasoning models while requiring significantly less computing power.
Impressive Performance on a Single GPU
Built on Alibaba's Qwen3-8B model, DeepSeek-R1-0528-Qwen3-8B outperforms Google's Gemini 2.5 Flash on the AIME 2025 math benchmark. It also achieves near parity with Microsoft's Phi 4 reasoning plus model on the HMMT math test.
While distilled models are generally less capable than their full-sized counterparts, they offer a crucial advantage: reduced computational demands. The full R1 model requires around a dozen 80GB GPUs. In contrast, DeepSeek-R1-0528-Qwen3-8B runs on a single GPU with 40GB-80GB of GPU memory, similar to the hardware requirements of Qwen3-8B itself.
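The single-GPU claim is easy to sanity-check with back-of-envelope arithmetic: an 8-billion-parameter model's weights alone fit comfortably in that memory budget. The sketch below is illustrative only; real usage also depends on the KV cache, activations, and framework overhead, which is why 40GB-80GB is cited rather than the bare weight size.

```python
# Rough memory estimate for just the weights of an 8B-parameter model.
# Actual serving needs more (KV cache, activations, runtime overhead).

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Gigabytes needed to hold the model weights alone."""
    return num_params * bytes_per_param / 1e9

params = 8e9  # 8 billion parameters (Qwen3-8B class)

print(f"FP16 weights: {weight_memory_gb(params, 2.0):.0f} GB")  # 2 bytes/param
print(f"FP8  weights: {weight_memory_gb(params, 1.0):.0f} GB")  # 1 byte/param
print(f"INT4 weights: {weight_memory_gb(params, 0.5):.0f} GB")  # 4 bits/param
```

At FP16 the weights occupy about 16GB, leaving the rest of a 40GB-80GB card for the KV cache and long reasoning traces; quantized variants shrink the footprint further.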
Training and Availability
DeepSeek trained the new model by fine-tuning Qwen3-8B using text generated by the full R1 model. The company positions DeepSeek-R1-0528-Qwen3-8B as a tool for both academic research on reasoning models and industrial development focused on smaller-scale models.
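The training recipe described above follows the standard distillation pattern: a large "teacher" model answers prompts, and the (prompt, completion) pairs become supervised fine-tuning data for the smaller "student". The sketch below shows only the data-preparation step; the function and field names are illustrative, not DeepSeek's actual pipeline, and the teacher call is stubbed out.

```python
# Minimal sketch of distillation data prep: package teacher generations
# as chat-style supervised fine-tuning records. Names are hypothetical.

def build_sft_record(prompt: str, teacher_output: str) -> dict:
    """Wrap one teacher generation as a chat-format training example."""
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": teacher_output},
        ]
    }

def teacher_generate(prompt: str) -> str:
    # In practice this would call the full R1 model; stubbed here.
    return "<think>Reasoning trace...</think> Final answer."

prompts = ["Solve: what is 6 * 7?"]
dataset = [build_sft_record(p, teacher_generate(p)) for p in prompts]
print(len(dataset), "records")
```

The student (Qwen3-8B here) is then fine-tuned on such records with an ordinary SFT objective, inheriting the teacher's reasoning style at a fraction of the size.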
Released under a permissive MIT license, DeepSeek-R1-0528-Qwen3-8B is available for commercial use without restriction. Several platforms, including LM Studio, already offer API access to the model.
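Platforms like LM Studio typically expose such models through an OpenAI-compatible local endpoint, so a chat request is just a small JSON payload. The snippet below only constructs that payload; the model identifier is illustrative (use whatever name your local install reports), and you would POST the body to the server with any HTTP client.

```python
import json

# Sketch of an OpenAI-compatible chat-completions request body, as used
# by local servers such as LM Studio's. Model name is an assumption.

def chat_request(model: str, user_message: str, temperature: float = 0.6) -> str:
    """Build a chat-completions request body as a JSON string."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }
    return json.dumps(payload)

body = chat_request("deepseek-r1-0528-qwen3-8b",
                    "Prove that the square root of 2 is irrational.")
print(body)
```

Pointing an existing OpenAI client library at the local server's base URL works the same way, which is what makes these distilled releases easy to drop into existing tooling.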
Further Reading
- Learn more about the updated R1 model: TechCrunch Article
- Explore Qwen3-8B: TechCrunch Article on Qwen3
- Read about Gemini 2.5 Flash: TechCrunch Article on Gemini
- Learn about Microsoft's Phi 4: TechCrunch Article on Phi 4
- Qwen3-8B Installation: NodeShift Blog
- DeepSeek R1 Hardware Requirements: Dev.to Article
- LM Studio API Access: LM Studio Announcement