The long-CoT version is competitive with o1 on key evaluation benchmarks, and on Code and Math it even surpasses o1. Note that this is the full o1, not o1-mini. The short-CoT version, by comparison, is benchmarked against GPT-4o ...
Carnegie Mellon University researchers propose a new LLM training technique that gives developers more control over chain-of-thought length.
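The general recipe behind this kind of length control is to state a token budget in the prompt and shape the RL reward so that correct answers close to the requested length score highest. Below is a minimal sketch under that assumption; the reward form, the penalty weight, and all names are illustrative rather than the CMU authors' actual implementation.

```python
# Minimal sketch (not the authors' code): a reward that trades off answer
# correctness against deviation from a requested chain-of-thought length.
# The penalty weight `alpha` and all names here are illustrative assumptions.

def length_controlled_reward(
    is_correct: bool,
    generated_tokens: int,
    target_tokens: int,
    alpha: float = 0.001,
) -> float:
    """Score one sampled response for RL fine-tuning.

    is_correct:        whether the final answer matched the reference
    generated_tokens:  length of the model's chain of thought
    target_tokens:     length budget stated in the prompt
    alpha:             strength of the length penalty
    """
    correctness = 1.0 if is_correct else 0.0
    length_penalty = alpha * abs(generated_tokens - target_tokens)
    return correctness - length_penalty


# Example: a correct 900-token answer against a 1000-token budget
print(length_controlled_reward(True, 900, 1000))   # 0.9
print(length_controlled_reward(False, 900, 1000))  # -0.1
```

Trained against a reward like this, the model learns to hit whatever budget the prompt asks for, which is what gives developers direct control over chain-of-thought length at inference time.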
Companies can freely deploy Light-R1-32B in commercial products while retaining full control over their own innovations.