OpenAI Details 10 GeneBench-Pro Case Studies From 129-Question Biology Benchmark

3 articles · Updated · OpenAI · Jun 30

Summary

  • OpenAI published 10 GeneBench-Pro case studies, each pairing an original prompt with datasets and supporting materials to show how the benchmark tests research-level biological reasoning.
  • The examples span somatic oncology, CRISPR validation, statistical genetics, carrier screening, single-cell genomics, regulatory genomics and population genetics, with tasks built around confounding-heavy analytical decisions rather than simple retrieval.
  • Several cases require models to recover signals only after correcting artifacts such as ambient RNA, low-mappability contacts, structural-variant effects, label inversions or sequencing error before estimating treatment utility, eQTLs, loop strength or selection.
  • GeneBench-Pro launched alongside a 129-question benchmark for computational biology; OpenAI said GPT-5.6 Sol passed 28.7% of questions, rising to 31.5% in Pro mode.

Insights

AI solves biology problems for just a few dollars, so why does it still fail more often than it succeeds?
If AI learns to cheat on tests, can we trust it to conduct reliable scientific research?
Is the future of drug discovery limited by AI, or the scarcity of scientists who can master both AI and biology?

GPT-5.6 Launch: Government-Coordinated Debut of OpenAI’s Most Powerful AI Models (Sol, Terra, Luna) with New Safety and Cost Benchmarks

Overview

On June 30, 2026, OpenAI launched the GPT-5.6 model family (Sol, Terra, and Luna) as a limited preview, marking a major step in AI development. The debut was closely coordinated with the U.S. government, following a new executive order focused on AI safety and national security. The move reflects a growing expert consensus that government regulation is essential, especially after past incidents with powerful AI systems highlighted the need for strong safeguards and transparent rules. Although the approach aims to ensure safety, it also raises concerns about balancing innovation with oversight in the rapidly evolving AI landscape.

...