{"ID":2872953,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.07531","arxiv_id":"2509.07531","title":"FLeW: Facet-Level and Adaptive Weighted Representation Learning of Scientific Documents","abstract":"Scientific document representation learning provides powerful embeddings for various tasks, while current methods face challenges across three approaches. 1) Contrastive training with citation-structural signals underutilizes citation information and still generates single-vector representations. 2) Fine-grained representation learning, which generates multiple vectors at the sentence or aspect level, requires costly integration and lacks domain generalization. 3) Task-aware learning depends on manually predefined task categorization, overlooking nuanced task distinctions and requiring extra training data for task-specific modules. To address these problems, we propose a new method that unifies the three approaches for better representations, namely FLeW. Specifically, we introduce a novel triplet sampling method that leverages citation intent and frequency to enhance citation-structural signals for training. Citation intents (background, method, result), aligned with the general structure of scientific writing, facilitate a domain-generalized facet partition for fine-grained representation learning. Then, we adopt a simple weight search to adaptively integrate three facet-level embeddings into a task-specific document embedding without task-aware fine-tuning. Experiments show the applicability and robustness of FLeW across multiple scientific tasks and fields, compared to prior models.","short_abstract":"Scientific document representation learning provides powerful embeddings for various tasks, while current methods face challenges across three approaches. 1) Contrastive training with citation-structural signals underutilizes citation information and still generates single-vector representations. 2) Fine-grained repres...","url_abs":"https://arxiv.org/abs/2509.07531","url_pdf":"https://arxiv.org/pdf/2509.07531v1","authors":"[\"Zheng Dou\",\"Deqing Wang\",\"Fuzhen Zhuang\",\"Jian Ren\",\"Yanlin Hu\"]","published":"2025-09-09T09:08:44Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.AI\"]","methods":"[]","has_code":false}
