FrontierMath, a test with expert-level problems designed to measure an AI’s mathematical skills, was one of the benchmarks OpenAI used to demo its upcoming flagship AI, o3. In a post on the ...