
New Method DeepConf Revolutionizes Language Models' Math Reasoning

DeepConf boosts accuracy and cuts token consumption. It's a game-changer for language models' mathematical reasoning.



A new method, DeepConf, is set to change how language models tackle mathematical reasoning. Developed by researchers at Meta AI and UC San Diego, the technique improves accuracy while reducing computational costs. The code is openly available on GitHub.

DeepConf operates in two modes: offline and online. In both, it analyzes the model's confidence in its own predictions to filter out low-quality solution paths. The online mode is the more efficient of the two, because it stops low-confidence reasoning paths early during generation rather than after the fact. The result is improved accuracy at lower token consumption.
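To make the online mode concrete, here is a minimal sketch of confidence-gated decoding. It assumes a hypothetical `sample_next` callable that returns one token and its log-probability, and it uses a sliding-window mean log-probability as the confidence signal, a simplification of the method's group-confidence measure; the window size and threshold values are illustrative, not taken from the paper.

```python
from collections import deque

def generate_with_early_stop(sample_next, prompt, stop_threshold,
                             window=512, max_tokens=8192):
    """Online-mode sketch: abort a reasoning trace as soon as its
    sliding-window confidence falls below a calibrated threshold.
    `sample_next` is a hypothetical (prompt, tokens) -> (token, logprob)
    callable standing in for a real decoding loop."""
    recent = deque(maxlen=window)  # log-probabilities of recent tokens
    tokens = []
    for _ in range(max_tokens):
        token, logprob = sample_next(prompt, tokens)
        tokens.append(token)
        recent.append(logprob)
        # Mean log-probability over the window stands in for the paper's
        # group confidence; low values mean the model is unsure.
        if len(recent) == window and sum(recent) / window < stop_threshold:
            return tokens, False  # low-confidence trace, discarded early
        if token == "<eos>":
            break
    return tokens, True  # trace survives and is kept for answer voting
```

In practice the stopping threshold would be calibrated on a small batch of complete warm-up traces before early termination kicks in.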

The team has released two variants of DeepConf: an aggressive one and a conservative one. The aggressive variant cuts token consumption by up to 84.7%, while the conservative variant reduces it by up to 59%; the team recommends the conservative variant when stable results matter more than maximum savings. In offline mode, DeepConf achieved 99.9% accuracy with gpt-oss-120B on AIME 2025; in online mode it reached 97.9% accuracy while cutting token consumption by 84.7%. The method can struggle, however, when a model is overly confident in wrong answers, a failure mode that hits the aggressive variant hardest.
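The offline mode and the two variants can be pictured with an equally small sketch: rank the finished traces by a per-trace confidence score, keep only the most confident fraction, and take a majority vote over their final answers. Everything here is illustrative, including the data layout and the keep fractions; reading the aggressive variant as keeping a small top slice and the conservative one as keeping most traces is an assumption based on the description above.

```python
from collections import Counter

def offline_confidence_vote(traces, keep_fraction):
    """Offline-mode sketch: filter completed traces by confidence,
    then majority-vote over the surviving answers. `traces` is a list
    of (final_answer, confidence) pairs; the layout is an assumption."""
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[:max(1, int(len(ranked) * keep_fraction))]
    votes = Counter(answer for answer, _ in kept)
    return votes.most_common(1)[0][0]

traces = [("42", 0.95), ("42", 0.91), ("7", 0.60), ("7", 0.55), ("7", 0.50)]
print(offline_confidence_vote(traces, keep_fraction=0.4))  # "42": only confident traces vote
print(offline_confidence_vote(traces, keep_fraction=1.0))  # "7": plain unfiltered majority
```

The example shows why filtering matters: an unfiltered majority vote follows the three low-confidence traces, while keeping only the most confident ones flips the answer.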

In short: DeepConf improves mathematical reasoning in language models while cutting computational costs, and it can be integrated into existing systems with minimal code changes. With the code openly available and strong results on gpt-oss-120B, the method is ready to try today.
