{"ID":2849818,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.23786","arxiv_id":"2510.23786","title":"Relaxed Sequence Sampling for Diverse Protein Design","abstract":"Protein design using structure prediction models such as AlphaFold2 has shown remarkable success, but existing approaches like relaxed sequence optimization (RSO) rely on single-path gradient descent and ignore sequence-space constraints, limiting diversity and designability. We introduce Relaxed Sequence Sampling (RSS), a Markov chain Monte Carlo (MCMC) framework that integrates structural and evolutionary information for protein design. RSS operates in continuous logit space, combining gradient-guided exploration with protein language model-informed jumps. Its energy function couples AlphaFold2-derived structural objectives with ESM2-derived sequence priors, balancing accuracy and biological plausibility. In an in silico protein binder design task, RSS produces 5$\\times$ more designable structures and 2-3$\\times$ greater structural diversity than RSO baselines, at equal computational cost. These results highlight RSS as a principled approach for efficiently exploring the protein design landscape.","short_abstract":"Protein design using structure prediction models such as AlphaFold2 has shown remarkable success, but existing approaches like relaxed sequence optimization (RSO) rely on single-path gradient descent and ignore sequence-space constraints, limiting diversity and designability. We introduce Relaxed Sequence Sampling (RSS...","url_abs":"https://arxiv.org/abs/2510.23786","url_pdf":"https://arxiv.org/pdf/2510.23786v1","authors":"[\"Joohwan Ko\",\"Aristofanis Rontogiannis\",\"Yih-En Andrew Ban\",\"Axel Elaldi\",\"Nicholas Franklin\"]","published":"2025-10-27T19:18:36Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[\"Language Model\",\"LoRA\"]","has_code":false}
