{"ID":3050143,"CreatedAt":"2026-06-04T02:13:16.786527022Z","UpdatedAt":"2026-06-06T08:58:50.400332682Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.04652","arxiv_id":"2606.04652","title":"Rectangular Matrix Multiplication in the Low-Bandwidth Model","abstract":"We study rectangular matrix multiplication in the low-bandwidth model of distributed computing. There are $n$ computers; initially the input matrices are distributed evenly between computers, and in each communication round every computer can send and receive an $O(\\log n)$-bit message. Eventually each computer must output its designated part of the product matrix. While prior work has focused primarily on square $n \\times n$ multiplication under various sparsity assumptions, we study rectangular instances with no sparsity assumption. We denote by $\\langle a,b,c\\rangle$ the task of multiplying an $a\\times b$ matrix by a $b\\times c$ matrix in this model. We concentrate on two natural aspect ratios, $\\langle n,d,n\\rangle$ and $\\langle d,n,d\\rangle$, for $d \\le n$, and we study how the round complexity depends on $n$ and $d$. When $d \\to n$, both $\\langle n,d,n\\rangle$ and $\\langle d,n,d\\rangle$ approach $\\langle n,n,n\\rangle$, which is the usual task of multiplying square matrices. If we consider multiplication over semirings, the current best upper bound in that case is $O(n^{4/3})$ rounds, and there is a trivial unconditional lower bound of $Ω(n)$. We show that for $\\langle d,n,d\\rangle$, we can achieve the complexity of $\\tilde O(d^{4/3})$, which seems like a natural generalization of the upper bound $\\tilde O(n^{4/3})$ when $d=n$. However, the case of $\\langle n,d,n\\rangle$ is fundamentally different, and also exhibits a phase transition. We show that for $d \\le \\sqrt{n}$, the complexity of $\\langle n,d,n\\rangle$ is $Θ(d \\sqrt{n})$; we have matching upper and lower bounds. However, the behavior is genuinely different in the region $d \\ge \\sqrt{n}$, where the upper bound is $O(d^{2/3} n^{2/3})$.","short_abstract":"We study rectangular matrix multiplication in the low-bandwidth model of distributed computing. There are $n$ computers; initially the input matrices are distributed evenly between computers, and in each communication round every computer can send and receive an $O(\\log n)$-bit message. Eventually each computer must ou...","url_abs":"https://arxiv.org/abs/2606.04652","url_pdf":"https://arxiv.org/pdf/2606.04652v1","authors":"[\"Chetan Gupta\",\"Jukka Suomela\",\"Hossein Vahidi\"]","published":"2026-06-03T09:22:49Z","proceeding":"cs.DC","tasks":"[\"cs.DC\"]","methods":"[]","has_code":false}
