Updated · TechCrunch · May 14
Forum AI Reaches 90% Expert Consensus on Model Reviews, Targeting High-Stakes Topics

1 article
  • Founded 17 months ago, Forum AI says its AI judges now match human experts about 90% of the time when evaluating foundation models on geopolitics, mental health, finance and hiring.
  • Campbell Brown built the New York startup after seeing ChatGPT emerge as a likely information gateway while still producing weak answers, and argues major model makers prioritize coding and math over accuracy in nuanced subjects.
  • Forum AI’s early reviews found problems including Gemini citing Chinese Communist Party websites for unrelated stories, broad left-leaning political bias, and subtler failures such as missing context and straw-manning arguments.
  • Brown is pitching that scrutiny to enterprises in lending, insurance and hiring, where liability exposure can create demand for better evaluation, arguing that today’s compliance audits and standardized benchmarks often miss serious violations.
  • Backed by a $3 million round led by Lerer Hippeau last fall, the company is betting distrust of consumer AI and pressure for safer business use will create a market for expert-built model testing.
Will corporate liability force AI to be truthful in ways social media never was?
Can AI ever be truly accurate when it learns from a biased and imperfect human world?
When experts define AI's 'truth,' whose version of reality becomes the global standard?

Forum AI’s 90% Consensus Model: Bridging the Gap Between AI Hype, Trust, and Regulation

Overview

Forum AI has become a notable player in the AI evaluation sector after securing $3 million in funding. The company was founded to close the gap between tech leaders’ bold promises about AI and the unreliable chatbot responses most users still encounter, with a focus on reliability and user trust. That mission fits a broader industry push toward AI trust and safety, one that acknowledges both the technology’s transformative benefits and its significant risks. Forum AI’s work underscores the case for collaborative, expert-driven evaluation to make AI systems both effective and trustworthy.

...