The Persistence of Retracted Papers on Wikipedia
Abstract
Wikipedia serves as key infrastructure for public access to scientific knowledge, but it faces challenges in maintaining the credibility of cited sources, especially when scientific papers are retracted. This paper investigates how citations to retracted research are handled on English Wikipedia. We construct a novel dataset that integrates Wikipedia revision histories with metadata from Retraction Watch, Crossref, Altmetric, and OpenAlex, identifying 1,181 citations of retracted papers. We find that 71.6% of these citations were initially problematic and in need of reader-facing repair: they were either added before the paper's retraction (51.5%) or introduced afterwards without a proper warning (20.1%). Although many are eventually corrected, our analysis reveals that these citations persist for a median of 3.68 years (1,344 days). Through survival analysis, we find that bot-mediated flagging (RetractionBot), open access availability, pre-existing online visibility (e.g., Twitter/X mention counts), and page-level organization (e.g., number of categories on a Wikipedia page) are associated with a higher hazard of correction. Conversely, a paper's established scholarly authority (a higher academic citation count) is associated with slower correction. Our findings highlight how the Wikipedia community supports collaborative maintenance but leaves gaps in citation-level repair. We contribute to CSCW research by advancing our understanding of this sociotechnical vulnerability, which takes the form of a community coordination challenge, and by offering design directions to support citation credibility at scale.