LMArena Blog

News

LMArena's Ranking Method

Since launching the platform, developing a rigorous and scientifically grounded evaluation methodology has been central to our mission. A key component of this effort is providing proper statistical uncertainty quantification for model scores and rankings. To that end, we have always reported confidence intervals alongside Arena scores and surfaced any…
LMArena Team 14 Nov 2025
The Next Stage of AI Coding Evaluation Is Here

Introducing Code Arena: live evals for agentic coding in the real world. AI coding models have evolved fast. Today’s systems don’t just output static code in one shot. They build. They scaffold full web apps and sites, refactor complex systems, and debug themselves in real time. Many now…
Aryan Vichare 12 Nov 2025
New Product: AI Evaluations

Today, we’re introducing a commercial product: AI Evaluations. This service offers enterprises, model labs, and developers comprehensive evaluations grounded in real-world human feedback, showing how models actually perform in practice.
LMArena Team 16 Sep 2025