A probabilistic framework for crystal structure denoising, phase classification, and order parameters
Abstract
Atomistic simulations generate large volumes of noisy structural data, yet extracting phase labels and continuous order parameters (OPs) robustly and generally remains challenging. Existing tools are often specialized to a limited set of prototypes and split thermal-noise removal, phase classification, and OP construction into separate steps. Here we present a unified probabilistic framework for analyzing noisy atomic configurations with respect to known crystal prototypes. The model predicts per-atom, per-prototype logits and aggregates them into a scalar log-probability (logP) landscape over atomic coordinates. The gradient of logP defines a conservative denoising field, while the logits provide local phase labels, prototype-resolved OPs, and ambiguity measures through logit margins. We train on AFLOW-mapped crystalline structures from the Materials Project with synthetic positional and elastic perturbations, then test extrapolation to stronger noise, finite-temperature disorder, point defects, water–ice coexistence, binary polymorphs, and shock-compressed Ti. A single differentiable scalar model recovers prototype identity after denoising, tracks smooth transformations such as Bain and Burgers paths, and exposes low-confidence regions near defects and phase boundaries. The framework thus provides an integrated and extensible tool for analyzing complex atomistic simulations.
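The relationship between logits, logP, the denoising field, and logit margins can be illustrated with a toy sketch. This is not the trained model described above: the learned network is replaced here by hand-written Gaussian logits around hypothetical per-prototype template positions, the atom-to-site correspondence is assumed known, and the log-sum-exp aggregation over prototypes is an assumption, since the abstract does not specify how the logits are combined. All function names are illustrative. Because the update direction is the gradient of a single scalar, the resulting field is conservative by construction.

```python
import numpy as np

def logits_fn(x, templates, sigma=0.1):
    """Toy per-atom, per-prototype logits (assumed Gaussian form).
    x: (N, 3) positions; templates: (K, N, 3) ideal sites per prototype."""
    d2 = np.sum((x[None, :, :] - templates) ** 2, axis=-1)  # (K, N)
    return (-d2 / (2.0 * sigma**2)).T                       # (N, K)

def log_p(x, templates, sigma=0.1):
    """Scalar logP: sum over atoms of log-sum-exp over prototypes (assumed)."""
    z = logits_fn(x, templates, sigma)
    m = z.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(z - m).sum(axis=1))))

def grad_log_p(x, templates, sigma=0.1):
    """Analytic gradient of log_p with respect to x: the denoising field."""
    z = logits_fn(x, templates, sigma)
    w = np.exp(z - z.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # softmax weights, (N, K)
    pull = -(x[None, :, :] - templates) / sigma**2   # d logit / d x, (K, N, 3)
    return np.einsum("nk,knd->nd", w, pull)

def denoise(x, templates, sigma=0.1, n_steps=50):
    """Gradient ascent on logP with a fixed step of 0.5 * sigma**2."""
    x = x.copy()
    for _ in range(n_steps):
        x += 0.5 * sigma**2 * grad_log_p(x, templates, sigma)
    return x

def margins(x, templates, sigma=0.1):
    """Per-atom gap between the two largest logits (ambiguity measure)."""
    z = np.sort(logits_fn(x, templates, sigma), axis=1)
    return z[:, -1] - z[:, -2]

# Toy data: 8 sites on a cubic grid (prototype 0) and a shifted copy (prototype 1).
grid = 2.0 * np.array([[i, j, k] for i in (0, 1) for j in (0, 1) for k in (0, 1)],
                      dtype=float)
templates = np.stack([grid, grid + 1.0])             # (2, 8, 3)
rng = np.random.default_rng(0)
x_noisy = grid + 0.05 * rng.standard_normal(grid.shape)

x_clean = denoise(x_noisy, templates)
labels = logits_fn(x_clean, templates).argmax(axis=1)  # local phase labels
gaps = margins(x_clean, templates)                     # large gap = confident
```

In this toy setting, atoms that start near prototype 0 relax onto its sites, their labels come out as 0, and their margins are large. In a real configuration, small margins would flag atoms whose local environment is ambiguous, for example near a defect or a phase boundary.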