{"ID":2860074,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.05309","arxiv_id":"2510.05309","title":"Gamma Mixture Modeling for Cosine Similarity in Small Language Models","abstract":"We study the cosine similarity of sentence transformer embeddings and observe that they are well modeled by gamma mixtures. From a fixed corpus, we measure similarities between all document embeddings and a reference query embedding. Empirically we find that these distributions are often well captured by a gamma distribution shifted and truncated to [-1,1], and in many cases, by a gamma mixture. We propose a heuristic model in which a hierarchical clustering of topics naturally leads to a gamma-mixture structure in the similarity scores. Finally, we outline an expectation-maximization algorithm for fitting shifted gamma mixtures, which provides a practical tool for modeling similarity distributions.","short_abstract":"We study the cosine similarity of sentence transformer embeddings and observe that they are well modeled by gamma mixtures. From a fixed corpus, we measure similarities between all document embeddings and a reference query embedding. Empirically we find that these distributions are often well captured by a gamma distri...","url_abs":"https://arxiv.org/abs/2510.05309","url_pdf":"https://arxiv.org/pdf/2510.05309v1","authors":"[\"Kevin Player\"]","published":"2025-10-06T19:20:28Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Transformer\",\"Language Model\"]","has_code":false}
