{"ID":2856823,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.10676","arxiv_id":"2510.10676","title":"Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation","abstract":"This paper introduces Bhasha-Rupantarika, a light and efficient multilingual translation system tailored through algorithm-hardware codesign for resource-limited settings. The method investigates model deployment at sub-octet precision levels (FP8, INT8, INT4, and FP4), with experimental results indicating a 4.1x reduction in model size (FP4) and a 4.2x speedup in inference speed, which correlates with an increased throughput of 66 tokens/s (improvement by 4.8x). This underscores the importance of ultra-low precision quantization for real-time deployment in IoT devices using FPGA accelerators, achieving performance on par with expectations. Our evaluation covers bidirectional translation between Indian and international languages, showcasing its adaptability in low-resource linguistic contexts. The FPGA deployment demonstrated a 1.96x reduction in LUTs and a 1.65x decrease in FFs, resulting in a 2.2x enhancement in throughput compared to OPU and a 4.6x enhancement compared to HPTA. Overall, the evaluation provides a viable solution based on quantisation-aware translation along with hardware efficiency suitable for deployable multilingual AI systems. The entire codes [https://github.com/mukullokhande99/Bhasha-Rupantarika/] and dataset for reproducibility are publicly available, facilitating rapid integration and further development by researchers.","short_abstract":"This paper introduces Bhasha-Rupantarika, a light and efficient multilingual translation system tailored through algorithm-hardware codesign for resource-limited settings. \nThe method investigates model deployment at sub-octet precision levels (FP8, INT8, INT4, and FP4), with experimental results indicating a 4.1x reduc...","url_abs":"https://arxiv.org/abs/2510.10676","url_pdf":"https://arxiv.org/pdf/2510.10676v1","authors":"[\"Mukul Lokhande\",\"Tanushree Dewangan\",\"Mohd Sharik Mansoori\",\"Tejas Chaudhari\",\"Akarsh J.\",\"Damayanti Lokhande\",\"Adam Teman\",\"Santosh Kumar Vishvakarma\"]","published":"2025-10-12T16:04:11Z","proceeding":"cs.AR","tasks":"[\"cs.AR\",\"cs.CL\",\"cs.RO\",\"eess.AS\"]","methods":"[]","has_code":false,"code_links":[{"ID":608387,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2856823,"paper_url":"https://arxiv.org/abs/2510.10676","paper_title":"Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation","repo_url":"https://github.com/mukullokhande99/Bhasha-Rupantarika","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
