{"ID":2842953,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2511.09529","arxiv_id":"2511.09529","title":"SiDGen: Structure-informed Diffusion for Generative modeling of Ligands for Proteins","abstract":"Structure-based drug design (SBDD) faces a fundamental scaling fidelity dilemma: rich pocket-aware conditioning captures interaction geometry but can be costly, often scales quadratically ($O(L^2)$) or worse with protein length ($L$), while efficient sequence-only conditioning can miss key interaction structure. We propose SiDGen, a structure-informed discrete diffusion framework that resolves this trade-off through a Topological Information Bottleneck (TIB). SiDGen leverages a learned, soft assignment mechanism to compress residue-level protein representations into a compact bottleneck enabling downstream pairwise computations on the coarse grid ($O(L^2/s^2)$). This design reduces memory and computational cost without compromising generative accuracy. Our approach achieves state-of-the-art performance on CrossDocked2020 and DUD-E benchmarks while significantly reducing pairwise-tensor memory. SiDGen bridges the gap between sequence-based efficiency and pocket-aware conditioning, offering a scalable path for high-throughput structure-based discovery.","short_abstract":"Structure-based drug design (SBDD) faces a fundamental scaling fidelity dilemma: rich pocket-aware conditioning captures interaction geometry but can be costly, often scales quadratically ($O(L^2)$) or worse with protein length ($L$), while efficient sequence-only conditioning can miss key interaction structure. \nWe pro...","url_abs":"https://arxiv.org/abs/2511.09529","url_pdf":"https://arxiv.org/pdf/2511.09529v3","authors":"[\"Samyak Sanghvi\",\"Nishant Ranjan\",\"Tarak Karmakar\"]","published":"2025-11-12T18:25:51Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Diffusion Model\",\"Generative Adversarial Network\"]","has_code":false}
