{"ID":2832095,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.09946","arxiv_id":"2512.09946","title":"ELANA: A Simple Energy and Latency Analyzer for LLMs","abstract":"The latency and power consumption of large language models (LLMs) are major constraints when serving them across a wide spectrum of hardware platforms, from mobile edge devices to cloud GPU clusters. Benchmarking is crucial for optimizing efficiency in both model deployment and next-generation model development. To address this need, we open-source a simple profiling tool, ELANA, for evaluating LLMs. ELANA is designed as a lightweight, academic-friendly profiler for analyzing model size, key-value (KV) cache size, prefilling latency (Time-to-first-token, TTFT), generation latency (Time-per-output-token, TPOT), and end-to-end latency (Time-to-last-token, TTLT) of LLMs on both multi-GPU and edge GPU platforms. It supports all publicly available models on Hugging Face and offers a simple command-line interface, along with optional energy consumption logging. Moreover, ELANA is fully compatible with popular Hugging Face APIs and can be easily customized or adapted to compressed or low bit-width models, making it ideal for research on efficient LLMs or for small-scale proof-of-concept studies. We release the ELANA profiling tool at: https://github.com/enyac-group/Elana.","short_abstract":"The latency and power consumption of large language models (LLMs) are major constraints when serving them across a wide spectrum of hardware platforms, from mobile edge devices to cloud GPU clusters. Benchmarking is crucial for optimizing efficiency in both model deployment and next-generation model development. \nTo add...","url_abs":"https://arxiv.org/abs/2512.09946","url_pdf":"https://arxiv.org/pdf/2512.09946v1","authors":"[\"Hung-Yueh Chiang\",\"Bokun Wang\",\"Diana Marculescu\"]","published":"2025-12-07T18:43:47Z","proceeding":"cs.DC","tasks":"[\"cs.DC\",\"cs.AI\"]","methods":"[\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":606199,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2832095,"paper_url":"https://arxiv.org/abs/2512.09946","paper_title":"ELANA: A Simple Energy and Latency Analyzer for LLMs","repo_url":"https://github.com/enyac-group/Elana","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
