OpenAI's GDPval: AI Models Match Humans in High-Earning Tasks

AI models are catching up to humans on professional tasks, and the future of work is likely to place a higher value on uniquely human skills.


OpenAI's new benchmark, GDPval, finds that frontier AI models such as OpenAI's GPT-5 and Anthropic's Claude Opus 4.1 match or even surpass human professionals in various economically valuable tasks. GDPval is designed to measure how well AI performs professional work and consists of 1,320 real-world tasks drawn from 44 high-earning occupations. The benchmark uses blind comparison: AI models and human experts complete the same tasks, and expert graders judge the quality of the outputs without knowing which source produced them.

In the initial test runs, GPT-5 and Claude Opus 4.1 emerged as the top performers. GPT-5 led on tasks that demand high accuracy and adherence to complex, multi-step instructions, while Claude Opus 4.1 excelled on tasks requiring a strong sense of aesthetics. Overall, AI models are approaching or matching the quality of experienced human professionals on many tasks. The most common failure, however, was not following instructions precisely, which underscores the need for human oversight.

GDPval reframes the "AI and jobs" debate: the results suggest that jobs will change rather than disappear as AI automates more routine tasks. For businesses, the benchmark offers a practical roadmap for identifying workflows that AI can augment, freeing employees to focus on high-level, creative, and strategic work. The future of professional work will place more value on uniquely human skills such as strategic thinking, complex problem-solving, client relationships, and creative judgment. As models like GPT-5 and Claude Opus 4.1 continue to improve, they will likely augment and transform various sectors of the U.S. economy, changing the nature of work rather than replacing it.
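For readers curious how a blind comparison of this kind might be scored, here is a minimal Python sketch. It is an illustration only, not OpenAI's actual grading code: the function names, the toy grader, and the sample outputs are all hypothetical. It assumes a grader that sees two outputs in random order and picks the better one (or declares a tie), then reports how often the AI output won or tied.

```python
import random
from collections import Counter


def grade_blind(ai_output, human_output, grader, rng):
    """Show both outputs in random order and map the grader's pick back to its source.

    `grader` is a hypothetical callable that returns 0 (first is better),
    1 (second is better), or None (tie). It never sees the source labels.
    """
    entries = [("ai", ai_output), ("human", human_output)]
    rng.shuffle(entries)  # hide which output came from which source
    choice = grader(entries[0][1], entries[1][1])
    return "tie" if choice is None else entries[choice][0]


def win_or_tie_rate(verdicts):
    """Fraction of tasks where the AI output was judged as good as or better than the human one."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no verdicts to score")
    return (counts["ai"] + counts["tie"]) / total


def toy_grader(first, second):
    """Stand-in for an expert grader (assumption): prefers the longer text."""
    if len(first) == len(second):
        return None
    return 0 if len(first) > len(second) else 1


rng = random.Random(0)
# Made-up (ai_output, human_output) pairs, purely for illustration.
tasks = [
    ("long detailed memo", "short memo"),
    ("brief", "a fuller answer"),
    ("same", "same"),
]
verdicts = [grade_blind(ai, human, toy_grader, rng) for ai, human in tasks]
rate = win_or_tie_rate(verdicts)
print(verdicts, round(rate, 2))
```

Shuffling the order before grading is what makes the comparison blind in this sketch: the grader's verdict can depend only on the content of the two outputs, not on where they came from.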
