{"ID":2859730,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.04548","arxiv_id":"2510.04548","title":"Learning Linear Regression with Low-Rank Tasks in-Context","abstract":"In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common structure. In this work, we address this problem by analyzing a linear attention model trained on low-rank regression tasks. Within this setting, we precisely characterize the distribution of predictions and the generalization error in the high-dimensional limit. Moreover, we find that statistical fluctuations in finite pre-training data induce an implicit regularization. Finally, we identify a sharp phase transition of the generalization error governed by task structure. These results provide a framework for understanding how transformers learn to learn the task structure.","short_abstract":"In-context learning (ICL) is a key building block of modern large language models, yet its theoretical mechanisms remain poorly understood. It is particularly mysterious how ICL operates in real-world applications where tasks have a common structure. In this work, we address this problem by analyzing a linear attention...","url_abs":"https://arxiv.org/abs/2510.04548","url_pdf":"https://arxiv.org/pdf/2510.04548v2","authors":"[\"Kaito Takanami\",\"Takashi Takahashi\",\"Yoshiyuki Kabashima\"]","published":"2025-10-06T07:27:49Z","proceeding":"cond-mat.dis-nn","tasks":"[\"cond-mat.dis-nn\",\"cs.LG\",\"stat.ML\"]","methods":"[\"Transformer\",\"Language Model\"]","has_code":false}
