LoRA Fine-tuning of DeepSeek-R1

Mathematical reasoning on a 1.5B model — 98.8% parameter reduction, 2× throughput with Unsloth

Fine-tuned DeepSeek-R1-Distill-Qwen-1.5B with LoRA (rank 16, alpha 32) on GSM8K for mathematical reasoning.
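The setup below is a minimal sketch of this kind of run, assuming Unsloth's FastLanguageModel API together with TRL's SFTTrainer. Only the base model, dataset, and LoRA rank/alpha come from the description above; the target modules, sequence length, 4-bit loading, and training hyperparameters are illustrative assumptions, not the project's exact configuration.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the distilled base model through Unsloth's patched loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    max_seq_length=2048,   # assumed context length
    load_in_4bit=True,     # assumed quantized loading
)

# Attach LoRA adapters: rank 16 and alpha 32 are as reported;
# the target modules are an assumed (common) choice covering
# all attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Format GSM8K question/answer pairs as plain instruction-style text.
train = load_dataset("openai/gsm8k", "main", split="train")
train = train.map(lambda ex: {
    "text": f"Question: {ex['question']}\nAnswer: {ex['answer']}"
})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train,
    dataset_text_field="text",
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=4,   # assumed
        gradient_accumulation_steps=4,   # assumed
        num_train_epochs=1,              # assumed
        learning_rate=2e-4,              # assumed
        fp16=True,
    ),
)
trainer.train()
```

With this configuration, only the adapter weights train while the 1.5B base model stays frozen, which is what keeps the trainable-parameter count in the tens of millions.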

Results

| Metric | Value |
|---|---|
| GSM8K accuracy | 58.2% |
| Trainable parameters | 18M of 1.5B (98.8% reduction) |
| Inference throughput | 30 tok/s (2× over the 15 tok/s baseline, via Unsloth) |
| Hardware | Single consumer GPU |

The fine-tuned 1.5B model runs on a single consumer GPU while staying competitive on GSM8K with models 100× its size, enabling deployment on resource-constrained hardware.
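A hedged sketch of how the faster generation path might be invoked: FastLanguageModel.for_inference is Unsloth's switch to its optimized decoding kernels, the kind of change behind the reported 2× throughput. The prompt and generation settings here are placeholders, not the project's evaluation harness.

```python
from unsloth import FastLanguageModel

# Reuses `model` and `tokenizer` from the training sketch above.
# Swap the model into Unsloth's optimized generation mode.
FastLanguageModel.for_inference(model)

prompt = (
    "Question: A robe takes 2 bolts of blue fiber and half that much "
    "white fiber. How many bolts in total does it take?\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```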

Tech Stack

PyTorch · LoRA · PEFT · Unsloth · HuggingFace Transformers · GSM8K
