When Routers, Switches and Interconnects Compute: A processing-in-interconnect Paradigm for Scalable Neuromorphic AI

cs.NE arXiv:2508.19548
View PDF arXiv JSON

Abstract

Routing, switching, and the interconnect fabric are essential components in implementing large-scale neuromorphic computing architectures. While this fabric plays only a supporting role in the process of computing, for large AI workloads, this fabric ultimately determines the overall system's performance, such as energy consumption and speed. In this paper, we offer a potential solution to address this bottleneck by addressing two fundamental questions: (a) What computing paradigms are inherent in existing routing, switching, and interconnect systems, and how can they be used to implement a Processing-in-Interconnect ($π^2$) computing paradigm? and (b) How to train $π^2$ network on standard AI benchmarks? To address the first question, we demonstrate that all operations required for typical AI workloads can be mapped onto delays, causality, time-outs, packet drops, and broadcast operations, all of which are already implemented in current packet-switching and packet-routing hardware. {We then show that existing buffering and traffic-shaping embedded algorithms can be minimally modified to implement $π^2$ neuron models and synaptic operations. To address the second question, we show how a knowledge distillation framework can be used to train and cross-map well-established neural network topologies onto $π^2$ architectures without any degradation in the generalization performance. Our analysis show that the effective energy utilization of a $π^2$ network is significantly higher than that of other neuromorphic computing platforms; as a result, we believe that the $π^2$ paradigm offers a more scalable architectural path toward achieving brain-scale AI inference.

PDF Viewer