Muon - Search News

22 days ago

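The Moonshot AI team recently announced that its open-source, improved version of the Muon optimizer cuts compute requirements by 48% relative to the conventional AdamW optimizer. The work builds on Muon, a training optimization algorithm proposed by OpenAI technical staff, which the team studied and refined further, with encouraging results. In the team's experiments on Llama-architecture models with up to 1.5B parameters, Muon performed well while needing only 52% of the compute AdamW requires. The result validates Muon's scalability and lays the groundwork for larger-scale training.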

Tencent News 21 days ago

The open-source race is getting crowded! Moonshot AI open-sources a new version of the Muon optimizer

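Reported by 机器之心 (Synced). Editors: 陈陈, 佳琪. Half the compute for twice the result: Moonshot AI open-sources the Muon optimizer, which leads across the board at the same budget. Moonshot AI and DeepSeek have "collided" once again. Last time it was papers: the two released improved attention mechanisms almost back to back; see "Colliding with DeepSeek NSA: MoBA, a new attention architecture co-authored by Kimi's Yang Zhilin, is released with code" and "Just now! DeepSeek ...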

1 day ago

[Muon Space] FireSat prototype satellite successfully deployed, a major step forward for global wildfire monitoring

March 15, 2025: Muon ...

Sina 22 days ago

Moonshot AI open-sources an improved Muon optimizer: compute requirements cut by 48% versus AdamW, and it works for DeepSeek too

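Compute requirements drop 48% versus AdamW: Muon, the training optimization algorithm proposed by OpenAI technical staff, has been pushed a step further by the Moonshot AI team! The team discovered the scaling law of the Muon method, made improvements, and demonstrated ...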

Tencent News 21 days ago

Moonshot AI's Kimi launches Moonlight: a 3B/16B-parameter Mixture-of-Experts model

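IT Home, February 24: Moonshot AI's Kimi yesterday released a new technical report, "Muon is Scalable for LLM Training", and announced "Moonlight": a 3B/16B-parameter Mixture-of-Experts (MoE) model trained with Muon. Trained on 5.7 trillion tokens, it achieves better performance with fewer floating-point operations (FLOPs), advancing the Pareto frontier. According to Moonshot AI, the team found ...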

2 days ago

Muon Space Deploys FireSat Protoflight, Marking a Major Milestone in Global Wildfire Monitoring

Muon Space, an end-to-end space systems provider, has successfully launched the FireSat Protoflight satellite, marking a ...

via MSN 22 days ago

Moonshot AI open-sources an improved Muon optimizer: compute requirements cut by 48% versus AdamW, and it works for DeepSeek too

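By 克雷西, from 凹非寺. 量子位 | WeChat official account QbitAI. Compute requirements drop 48% versus AdamW: Muon, the training optimization algorithm proposed by OpenAI technical staff, has been pushed a step further by the Moonshot AI team! The team discovered ...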

4 hours ago

FireSat Protoflight Commences, Ushering in a New Era of Wildfire Monitoring & Detection

Earth Fire Alliance, the global nonprofit coalition committed to delivering transformative data and insights from all ...

Physics World 27 days ago

The muon’s magnetic moment exposes a huge hole in the Standard Model – unless it doesn’t

A tense particle-physics showdown will reach new heights in 2025. Over the past 25 years researchers have seen a persistent and growing discrepancy between the theoretical predictions and experimental ...

Hackaday 18 days ago

Building A DIY Muon Tomography Device For About $100

Muon tomography, or muography, is the practice of using muons generated by cosmic rays interacting with Earth’s atmosphere to image structures on Earth’s surface, akin to producing an X-ray.

PingWest on MSN 21 days ago

Moonshot AI's Kimi open-sources an MoE model

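PingWest, February 24: Kimi released a technical report last weekend announcing the open-source MoE model Moonlight-16B-A3B. According to the report, Kimi substantially reworked the Muon optimizer and applied it in real training, demonstrating that Muon is effective in larger-scale training: it is twice as training-efficient as AdamW, with comparable model performance. The model used in the paper is reportedly Moonlight-16B-A3B, with 15.29B total parameters and 2.24B activated parameters; it ...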
