
Y Droid
-
Founded Date June 9, 1945
-
Sectors Public catering and catering establishments
Company Description
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results show that the distilled smaller dense models perform exceptionally well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
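As a rough illustration of the distillation recipe described above (a teacher model generates reasoning data, and a smaller dense model is then fine-tuned on it), the data-collection step might be sketched as follows. This is a minimal sketch, not DeepSeek's actual pipeline: `toy_teacher`, `non_empty`, `SFTExample`, and `build_distillation_set` are hypothetical names introduced for illustration, the teacher is a stand-in function rather than a real model call, and the filtering step is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SFTExample:
    """One supervised fine-tuning record: a prompt and the teacher's response."""
    prompt: str
    completion: str


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
    keep: Callable[[str, str], bool],
) -> List[SFTExample]:
    """Collect teacher responses as fine-tuning data for a smaller student model."""
    examples = []
    for prompt in prompts:
        trace = teacher(prompt)  # the teacher writes out its reasoning and answer
        if keep(prompt, trace):  # hypothetical quality filter (an assumption)
            examples.append(SFTExample(prompt=prompt, completion=trace))
    return examples


def toy_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a large reasoning model such as DeepSeek-R1."""
    return f"Reasoning about: {prompt}\nAnswer: (placeholder)"


def non_empty(prompt: str, trace: str) -> bool:
    """Hypothetical filter: keep only responses that contain some text."""
    return bool(trace.strip())


dataset = build_distillation_set(
    ["What is 2 + 2?", "Sort the list [3, 1, 2]."],
    toy_teacher,
    non_empty,
)
```

The resulting `dataset` would then be passed to an ordinary supervised fine-tuning run on the smaller dense model. That step is omitted here because it depends on the training framework in use.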
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.