Search Arena & What We’re Learning About Human Preference Search Arena on LMArena goes live today. Read more about what we've learned so far about human preference from the search-augmented data.
Hello from LMArena: The Community Platform for Exploring Frontier AI At LMArena, everything starts with the community. Many new members have joined us in the past few months, so we thought it would be a good time to reintroduce ourselves! Created by researchers from UC Berkeley’s SkyLab, LMArena is an open platform where everyone can
LMArena and The Future of AI Reliability About a month ago, we announced that LMArena was becoming a company to better support our growing community platform. As we take this next step, we're staying true to our original mission of rigorous, neutral, and community-driven evaluations. Today, we’re excited to share that we’ve raised
Celebrating Community Impact: 3M+ votes, 400+ models, and 300+ pre-release tests To date, the community has evaluated 400+ public models on LMArena, along with 300+ pre-release tests. Tens of millions of battle pairings have been served to users across the world, and each vote has shaped real-world AI performance and development. Around this time two years ago, the community
Does Sentiment Matter Too? Introducing Sentiment Control: Disentangling Sentiment and Substance Contributors: Connor Chen, Wei-Lin Chiang, Tianle Li, Anastasios Angelopoulos. You may have noticed that recent models on Chatbot Arena appear more emotionally expressive than their predecessors. But does this added sentiment actually improve their rankings on the leaderboard? Our previous exploration revealed
How Many User Prompts are New? We investigate 355,575 LLM battles from May 2024 to Dec 2024 to answer the following questions: 1. What proportion of prompts have never been seen before (aka “fresh”)? 2. What are common duplicate prompts? 3. How many prompts appear in widely used benchmarks?
LMArena is Growing to Support our Community Platform LMArena started as a scrappy academic project from UC Berkeley: just a handful of PhD students and undergrads working day and night on a research prototype. Today, we have two announcements: 1. We are starting a company to support LMArena! LMArena will stay neutral, open, and accessible to everyone. We