DeepSeek Releases Distilled R1 AI Model for Single GPU

DeepSeek has launched a smaller, distilled version of its R1 reasoning AI model, DeepSeek-R1-0528-Qwen3-8B. The new model approaches the performance of much larger models on certain math benchmarks while requiring significantly less computing power.

Impressive Performance on a Single GPU

Built upon Alibaba's Qwen3-8B model, DeepSeek-R1-0528-Qwen3-8B outperforms Google's Gemini 2.5 Flash on the AIME 2025 math benchmark. It also achieves near parity with Microsoft's Phi 4 reasoning plus model on HMMT, the Harvard-MIT Mathematics Tournament math skills test.

While distilled models are generally less capable than their full-sized counterparts, they offer a crucial advantage: reduced computational demands. The full R1 model requires around twelve 80GB GPUs. In contrast, DeepSeek-R1-0528-Qwen3-8B runs on a single GPU with 40GB-80GB of GPU memory, similar to the hardware requirements of Qwen3-8B.
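
That footprint is consistent with the model's size: in 16-bit precision, 8 billion parameters take roughly 16GB for the weights alone, leaving headroom on a 40GB-80GB card for activations and the KV cache. As a minimal sketch of what single-GPU use might look like, here is how such a checkpoint would typically be loaded with Hugging Face Transformers (the Hub repository name is an assumption; check DeepSeek's release notes for the official path):

```python
# Minimal single-GPU inference sketch using Hugging Face Transformers.
# The repository name below is an assumption, not confirmed by the article.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"  # assumed Hub path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~16GB of weights for an 8B model
    device_map="auto",           # place the model on the available GPU
)

prompt = "Prove that the sum of two even integers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```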

Training and Availability

DeepSeek trained the new model by fine-tuning Qwen3-8B on text generated by the full R1 model, a standard knowledge-distillation approach in which a large teacher model produces the training data for a smaller student. The company positions DeepSeek-R1-0528-Qwen3-8B as a tool for both academic research on reasoning models and industrial development focused on smaller-scale models.
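
DeepSeek has not detailed its exact recipe, but the general pattern looks like the schematic below: collect reasoning traces from the teacher, then fine-tune the student on them with an ordinary next-token language-modeling loss. All model names, prompts, and hyperparameters here are illustrative assumptions, not DeepSeek's published method.

```python
# Schematic distillation loop, not DeepSeek's published recipe.
# Step 1 generates training text with the large teacher model;
# step 2 fine-tunes the small student on that text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_traces(teacher, tokenizer, prompts, max_new_tokens=1024):
    """Step 1: collect teacher completions to use as supervised targets."""
    traces = []
    for prompt in prompts:
        inputs = tokenizer(prompt, return_tensors="pt").to(teacher.device)
        output = teacher.generate(**inputs, max_new_tokens=max_new_tokens)
        traces.append(tokenizer.decode(output[0], skip_special_tokens=True))
    return traces

def distill_step(student, tokenizer, trace, optimizer):
    """Step 2: one fine-tuning step, next-token cross-entropy on teacher text."""
    batch = tokenizer(trace, return_tensors="pt", truncation=True).to(student.device)
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Usage outline (checkpoints and prompts are placeholders):
#   teacher = AutoModelForCausalLM.from_pretrained("teacher-checkpoint")
#   student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B")
#   traces = generate_traces(teacher, tokenizer, prompts)
#   for trace in traces:
#       distill_step(student, tokenizer, trace, optimizer)
```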

Released under a permissive MIT license, DeepSeek-R1-0528-Qwen3-8B is available for commercial use without restriction. Several platforms, including LM Studio, already offer API access to the model.
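
For developers, LM Studio in particular serves downloaded models through a local OpenAI-compatible HTTP endpoint. A minimal sketch of querying the model that way, assuming LM Studio's default server address and an illustrative model identifier:

```python
# Querying the model via an OpenAI-compatible endpoint such as the local
# server LM Studio exposes. The base URL is LM Studio's default; the model
# identifier is an assumption and may differ in your installation.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="deepseek-r1-0528-qwen3-8b",  # assumed local model name
    messages=[{"role": "user", "content": "What is 17 * 24? Reason it out."}],
)
print(response.choices[0].message.content)
```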