{"ID":2845315,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.04321","arxiv_id":"2511.04321","title":"AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM","abstract":"SRAM Processing-in-Memory (PIM) has emerged as the most promising implementation for high-performance PIM, delivering superior computing density, energy efficiency, and computational precision. However, the pursuit of higher performance necessitates more complex circuit designs and increased operating frequencies, which exacerbate IR-drop issues. Severe IR-drop can significantly degrade chip performance and even threaten reliability. Conventional circuit-level IR-drop mitigation methods, such as back-end optimizations, are resource-intensive and often compromise power, performance, and area (PPA). To address these challenges, we propose AIM, comprehensive software and hardware co-design for architecture-level IR-drop mitigation in high-performance PIM. Initially, leveraging the bit-serial and in-situ dataflow processing properties of PIM, we introduce Rtog and HR, which establish a direct correlation between PIM workloads and IR-drop. Building on this foundation, we propose LHR and WDS, enabling extensive exploration of architecture-level IR-drop mitigation while maintaining computational accuracy through software optimization. Subsequently, we develop IR-Booster, a dynamic adjustment mechanism that integrates software-level HR information with hardware-based IR-drop monitoring to adapt the V-f pairs of the PIM macro, achieving enhanced energy efficiency and performance. Finally, we propose the HR-aware task mapping method, bridging software and hardware designs to achieve optimal improvement. \nPost-layout simulation results on a 7nm 256-TOPS PIM chip demonstrate that AIM achieves up to 69.2% IR-drop mitigation, resulting in 2.29x energy efficiency improvement and 1.152x speedup.","short_abstract":"SRAM Processing-in-Memory (PIM) has emerged as the most promising implementation for high-performance PIM, delivering superior computing density, energy efficiency, and computational precision. However, the pursuit of higher performance necessitates more complex circuit designs and increased operating frequencies, whic...","url_abs":"https://arxiv.org/abs/2511.04321","url_pdf":"https://arxiv.org/pdf/2511.04321v1","authors":"[\"Yuanpeng Zhang\",\"Xing Hu\",\"Xi Chen\",\"Zhihang Yuan\",\"Cong Li\",\"Jingchen Zhu\",\"Zhao Wang\",\"Chenguang Zhang\",\"Xin Si\",\"Wei Gao\",\"Qiang Wu\",\"Runsheng Wang\",\"Guangyu Sun\"]","published":"2025-11-06T12:49:46Z","proceeding":"cs.AR","tasks":"[\"cs.AR\",\"cs.AI\",\"cs.LG\"]","methods":"[\"LoRA\"]","has_code":false}
