{"ID":2921118,"CreatedAt":"2026-06-02T02:42:49.606572591Z","UpdatedAt":"2026-06-04T06:21:04.369492701Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2606.01811","arxiv_id":"2606.01811","title":"\"I've Seen How This Goes\": Characterizing Diversity via Progressive Conditional Surprise","abstract":"Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quantifying creative behavior in both AI and human writing. We propose a new approach to measuring diversity using in-context learning, of which the ``Decan'' metric, $D_{Ca_n} = C \\times a_n$, is the working instance we evaluate: a per-byte score read off the per-token log-probabilities of a base model $θ$ in a \\emph{single forward pass} per permutation, with no embedding model, no reference corpus, and no human labels. This approach is grounded in information theory, makes use of language model in-context learning to detect a wide range of similarities between any number of inputs, and obviates the need to train a special-purpose model. The same pipeline scores AI samples and human-written response sets, with diversity treated as a property of (responses, prompt, scoring model). On Tevet and Berant's human-grounded McDiv benchmark, $D_{Ca_n}$ reaches OCA 0.846 on the McDiv prompt\\_gen set where it performs best, behind the strongest neural baseline reported in Tevet and Berant (SentBERT, 0.897). On the OLMo-2-7B post-training pipeline, $D_{Ca_n}$ drops monotonically across the base $\\to$ SFT $\\to$ DPO $\\to$ RLVR stages, detecting the type of diversity loss that creative-writing applications care about.","short_abstract":"Measuring the diversity of creative outputs is central to evaluating post-training mode collapse, comparing decoding strategies, and quantifying creative behavior in both AI and human writing. We propose a new approach to measuring diversity using in-context learning, of which the ``Decan'' metric, $D_{Ca_n} = C \\times...","url_abs":"https://arxiv.org/abs/2606.01811","url_pdf":"https://arxiv.org/pdf/2606.01811v1","authors":"[\"Matthew Khoriaty\",\"David Williams-King\",\"Shi Feng\"]","published":"2026-06-01T07:27:43Z","proceeding":"cs.CL","tasks":"[\"cs.CL\",\"cs.AI\",\"cs.LG\"]","methods":"[\"Language Model\"]","has_code":false}
