{"ID":2874531,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2509.04011","arxiv_id":"2509.04011","title":"NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings","abstract":"We present NER Retriever, a zero-shot retrieval framework for ad-hoc Named Entity Retrieval, a variant of Named Entity Recognition (NER), where the types of interest are not provided in advance, and a user-defined type description is used to retrieve documents mentioning entities of that type. Instead of relying on fixed schemas or fine-tuned models, our method builds on internal representations of large language models (LLMs) to embed both entity mentions and user-provided open-ended type descriptions into a shared semantic space. We show that internal representations, specifically the value vectors from mid-layer transformer blocks, encode fine-grained type information more effectively than commonly used top-layer embeddings. To refine these representations, we train a lightweight contrastive projection network that aligns type-compatible entities while separating unrelated types. The resulting entity embeddings are compact, type-aware, and well-suited for nearest-neighbor search. Evaluated on three benchmarks, NER Retriever significantly outperforms both lexical and dense sentence-level retrieval baselines. Our findings provide empirical support for representation selection within LLMs and demonstrate a practical solution for scalable, schema-free entity retrieval. The NER Retriever Codebase is publicly available at https://github.com/ShacharOr100/ner_retriever","short_abstract":"We present NER Retriever, a zero-shot retrieval framework for ad-hoc Named Entity Retrieval, a variant of Named Entity Recognition (NER), where the types of interest are not provided in advance, and a user-defined type description is used to retrieve documents mentioning entities of that type. Instead of relying on fix...","url_abs":"https://arxiv.org/abs/2509.04011","url_pdf":"https://arxiv.org/pdf/2509.04011v1","authors":"[\"Or Shachar\",\"Uri Katz\",\"Yoav Goldberg\",\"Oren Glickman\"]","published":"2025-09-04T08:42:23Z","proceeding":"cs.IR","tasks":"[\"cs.IR\",\"cs.AI\",\"cs.CL\"]","methods":"[\"Transformer\",\"Large Language Model\",\"Language Model\"]","has_code":false,"code_links":[{"ID":610152,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_id":2874531,"paper_url":"https://arxiv.org/abs/2509.04011","paper_title":"NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings","repo_url":"https://github.com/ShacharOr100/ner_retriever","is_official":false,"mentioned_in_paper":false,"mentioned_in_github":true,"github_stars":0}]}
