{"ID":2824212,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.23213","arxiv_id":"2512.23213","title":"Scoring, Reasoning, and Selecting the Best! Ensembling Large Language Models via a Peer-Review Process","abstract":"We propose LLM-PeerReview, an unsupervised LLM Ensemble method that selects the most ideal response from multiple LLM-generated candidates for each query, harnessing the collective wisdom of multiple models with diverse strengths. LLM-PeerReview is built on a novel, peer-review-inspired framework that offers a transparent and interpretable mechanism, while remaining fully unsupervised for flexible adaptability and generalization. Specifically, it operates in three stages: For scoring, we use the emerging LLM-as-a-Judge technique to evaluate each response by reusing multiple LLMs at hand; For reasoning, we can apply a straightforward averaging strategy or a principled graphical model-based truth inference algorithm to aggregate multiple scores to produce a final score for each response; Finally, the highest-scoring response is selected as the best ensemble output. LLM-PeerReview is conceptually simple and empirically powerful. Our results across four datasets show that the two variants of the proposed approach outperform the advanced model Smoothie-Global by 6.9% and 7.3% points, cross diverse task types including factual recall QA, math reasoning, and instruction following.","short_abstract":"We propose LLM-PeerReview, an unsupervised LLM Ensemble method that selects the most ideal response from multiple LLM-generated candidates for each query, harnessing the collective wisdom of multiple models with diverse strengths. LLM-PeerReview is built on a novel, peer-review-inspired framework that offers a transpar...","url_abs":"https://arxiv.org/abs/2512.23213","url_pdf":"https://arxiv.org/pdf/2512.23213v3","authors":"[\"Zhijun Chen\",\"Zeyu Ji\",\"Qianren Mao\",\"Hao Wu\",\"Jinhuan Song\",\"Junhang Cheng\",\"Bangjie Qin\",\"Zhuoran Li\",\"Jingzheng Li\",\"Kai Sun\",\"Zizhe Wang\",\"Yikun Ban\",\"Zhu Sun\",\"Xiangyang Ji\",\"Hailong Sun\"]","published":"2025-12-29T05:25:49Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false}
