
DeepSeek’s First-generation Reasoning Models
DeepSeek’s first-generation reasoning models achieve performance comparable to OpenAI-o1 across math, code, and reasoning tasks.
Models
DeepSeek-R1
Distilled models
The DeepSeek team has shown that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.
Below are the models created by fine-tuning several dense models widely used in the research community on reasoning data generated by DeepSeek-R1. The evaluation results demonstrate that the distilled smaller dense models perform well on benchmarks.
DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen-7B
DeepSeek-R1-Distill-Llama-8B
DeepSeek-R1-Distill-Qwen-14B
DeepSeek-R1-Distill-Qwen-32B
DeepSeek-R1-Distill-Llama-70B
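As a rough illustration of what fine-tuning on reasoning data generated by a larger model can involve, the sketch below turns teacher outputs into prompt/completion pairs for supervised fine-tuning. It is a minimal sketch under stated assumptions, not DeepSeek's actual pipeline: the `TeacherSample` record, the helper names, the filtering rule, and the `<think>` wrapper used as the target format are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TeacherSample:
    """One hypothetical record of output from a teacher model such as DeepSeek-R1."""
    prompt: str
    reasoning: str
    answer: str


def to_sft_example(sample: TeacherSample) -> dict:
    """Turn a teacher sample into a prompt/completion pair for supervised fine-tuning.

    The <think> wrapper is an assumed target format, not one stated in the text.
    """
    completion = (
        f"<think>\n{sample.reasoning.strip()}\n</think>\n{sample.answer.strip()}"
    )
    return {"prompt": sample.prompt.strip(), "completion": completion}


def build_sft_dataset(samples):
    """Keep only samples with a non-empty reasoning trace and answer, then format them."""
    return [
        to_sft_example(s)
        for s in samples
        if s.reasoning.strip() and s.answer.strip()
    ]


# Toy data standing in for teacher generations (made up for this example).
samples = [
    TeacherSample("What is 2 + 3?", "2 plus 3 equals 5.", "5"),
    TeacherSample("What is 7 * 6?", "", "42"),  # dropped: no reasoning trace
]
dataset = build_sft_dataset(samples)
print(len(dataset))
print(dataset[0]["completion"])
```

The resulting pairs would then be fed to an ordinary supervised fine-tuning loop on the smaller dense model; that training step is omitted here because the text does not describe it.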
License
The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs.