My experience running DeepSeek R1 locally via llama.cpp, and R1 distilled variants in vLLM. The findings suggest that llama.cpp is viable for lightweight, interactive use, but it lacks the batch-inference performance needed for large-scale tasks, where vLLM performs significantly better.
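
To make the batch-inference comparison concrete, here is a minimal sketch of running a distilled R1 variant through vLLM's offline Python API, where a list of prompts is handed off and scheduled via continuous batching. The specific model name and sampling parameters are assumptions for illustration, not the exact setup used in these experiments.

```python
# Minimal sketch: offline batch generation with a distilled R1 variant in vLLM.
# The model checkpoint and sampling parameters below are illustrative assumptions.
from vllm import LLM, SamplingParams

prompts = [
    "Explain the difference between throughput and latency.",
    "Summarize the idea behind speculative decoding.",
]

# vLLM schedules these prompts together using continuous batching,
# which is where it pulls ahead of llama.cpp for large workloads.
sampling_params = SamplingParams(temperature=0.6, max_tokens=512)
llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B")  # assumed distilled variant

outputs = llm.generate(prompts, sampling_params)
for out in outputs:
    print(out.outputs[0].text)
```

The key design difference this highlights: vLLM is built around serving many requests at once (paged KV-cache memory plus continuous batching), while llama.cpp is optimized for running a single quantized model efficiently on local hardware.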