LoRA Fine-tuning of DeepSeek-R1

Mathematical reasoning on a 1.5B model — 98.8% parameter reduction, 2× throughput with Unsloth

Fine-tuned DeepSeek-R1-Distill-Qwen-1.5B with LoRA (rank 16, alpha 32) on GSM8K for mathematical reasoning.
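The setup below is a minimal sketch of this kind of run, assuming Unsloth's FastLanguageModel API together with TRL's SFTTrainer. Only the base model, dataset, and LoRA rank/alpha come from the description above; the target modules, sequence length, 4-bit loading, and training hyperparameters are illustrative assumptions, not the project's exact configuration.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the distilled base model through Unsloth's patched loader.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    max_seq_length=2048,   # assumed context length
    load_in_4bit=True,     # assumed quantized loading
)

# Attach LoRA adapters: rank 16 and alpha 32 are as reported;
# the target modules are an assumed (common) choice covering
# all attention and MLP projections.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)

# Format GSM8K question/answer pairs as plain instruction-style text.
train = load_dataset("openai/gsm8k", "main", split="train")
train = train.map(lambda ex: {
    "text": f"Question: {ex['question']}\nAnswer: {ex['answer']}"
})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train,
    dataset_text_field="text",
    args=TrainingArguments(
        output_dir="outputs",
        per_device_train_batch_size=4,   # assumed
        gradient_accumulation_steps=4,   # assumed
        num_train_epochs=1,              # assumed
        learning_rate=2e-4,              # assumed
        fp16=True,
    ),
)
trainer.train()
```

With this configuration, only the adapter weights train while the 1.5B base model stays frozen, which is what keeps the trainable-parameter count in the tens of millions.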

Results

| Metric | Value |
|---|---|
| GSM8K accuracy | 58.2% |
| Trainable parameters | 18M of 1.5B (98.8% reduction) |
| Inference throughput | 30 tok/s (2× over the 15 tok/s baseline, via Unsloth) |
| Hardware | Single consumer GPU |

The fine-tuned 1.5B model runs on a single consumer GPU while staying competitive on GSM8K with models 100× its size, enabling deployment on resource-constrained hardware.
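A hedged sketch of how the faster generation path might be invoked: FastLanguageModel.for_inference is Unsloth's switch to its optimized decoding kernels, the kind of change behind the reported 2× throughput. The prompt and generation settings here are placeholders, not the project's evaluation harness.

```python
from unsloth import FastLanguageModel

# Reuses `model` and `tokenizer` from the training sketch above.
# Swap the model into Unsloth's optimized generation mode.
FastLanguageModel.for_inference(model)

prompt = (
    "Question: A robe takes 2 bolts of blue fiber and half that much "
    "white fiber. How many bolts in total does it take?\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```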

Tech Stack

PyTorch · LoRA · PEFT · Unsloth · HuggingFace Transformers · GSM8K
