90% of AI chatbot answers about midterm elections are flawed, stunning analysis shows
Forum AI analyzed 12,542 responses from ChatGPT, Claude, Gemini and Grok to 3,136 questions on US politics and foreign policy. It found ~30% of answers had factual errors and ~25% failed a neutrality check. On foreign-policy prompts, state-run outlets were cited 35% of the time overall (51% for ChatGPT, 44% for Grok). Error rates ranged from 9% (ChatGPT) to 43% (Grok).

Primarily reputational/positioning risk for Meta’s AI/news surfaces rather than direct financial impact.
Forum AI is led by Campbell Brown, a former head of news partnerships at Meta, linking the study’s credibility to Meta’s ecosystem.
Low likelihood of near-term price impact; any effect would be indirect via sentiment around AI/news reliability.
Background
Forum AI audited four popular chatbots (ChatGPT, Claude, Gemini, Grok) using 3,136 questions and judged 12,542 responses for accuracy and neutrality.
Why it matters
Quantified findings (e.g., state-run outlet citations and political directional bias) could affect perceived trust in AI news/current-events outputs, influencing adoption and regulatory/PR risk narratives for major AI providers.
Market relevance
This is a trust/accuracy risk narrative for AI news assistants, with the most direct read-across to Google’s Gemini performance perception.
Market effects
Highlights model reliability and political-bias risks for consumer AI assistants and could increase scrutiny of AI-generated news content.
Primarily US-focused election/foreign-policy prompts; broader global trust implications for AI vendors.
Cites state-run media misattribution (China/Russia/Iran), reinforcing geopolitical information-integrity concerns worldwide.
Alternative perspectives
The audit covers election/foreign-policy prompts only; performance on other tasks or with improved retrieval/guardrails may be materially better.
The study is a startup’s audit with its own methodology; without replication across model versions and over time, the market may overreact to a snapshot.
Key entities
- Forum AI (startup)
Conducted the audit of chatbot responses for factual accuracy and political neutrality.
- Gemini (AI chatbot)
Google’s chatbot evaluated; reported higher error rate and neutrality failures on foreign-policy prompts.
- ChatGPT (AI chatbot)
OpenAI’s chatbot evaluated; reported 9% error rate and directional bias patterns on election prompts.
- Claude (AI chatbot)
Anthropic’s chatbot evaluated; reported 41% error rate and left-leaning directional failures.
- Grok (AI chatbot)
xAI’s chatbot evaluated; reported 43% error rate and right-leaning directional failures.




