{"ID":2890258,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2507.20066","arxiv_id":"2507.20066","title":"Studying Disinformation Narratives on Social Media with LLMs and Semantic Similarity","abstract":"This thesis develops a continuous scale measurement of similarity to disinformation narratives that can serve to detect disinformation and capture the nuanced, partial truths that are characteristic of it. To do so, two tools are developed and their methodologies are documented. The tracing tool takes tweets and a target narrative, rates the similarities of each to the target narrative, and graphs it as a timeline. The second narrative synthesis tool clusters tweets above a similarity threshold and generates the dominant narratives within each cluster. These tools are combined into a Tweet Narrative Analysis Dashboard. The tracing tool is validated on the GLUE STS-B benchmark, and then the two tools are used to analyze two case studies for further empirical validation. The first case study uses the target narrative \"The 2020 election was stolen\" and analyzes a dataset of Donald Trump's tweets during 2020. The second case study uses the target narrative, \"Transgender people are harmful to society\" and analyzes tens of thousands of tweets from the media outlets The New York Times, The Guardian, The Gateway Pundit, and Fox News. Together, the empirical findings from these case studies demonstrate semantic similarity for nuanced disinformation detection, tracing, and characterization. The tools developed in this thesis are hosted and can be accessed through the permission of the author. Please explain your use case in your request. \nThe HTML friendly version of this paper is at https://chaytanc.github.io/projects/disinfo-research (Inman, 2025).","short_abstract":"This thesis develops a continuous scale measurement of similarity to disinformation narratives that can serve to detect disinformation and capture the nuanced, partial truths that are characteristic of it. To do so, two tools are developed and their methodologies are documented. The tracing tool takes tweets and a targ...","url_abs":"https://arxiv.org/abs/2507.20066","url_pdf":"https://arxiv.org/pdf/2507.20066v1","authors":"[\"Chaytan Inman\"]","published":"2025-07-26T21:43:18Z","proceeding":"cs.SI","tasks":"[\"cs.SI\",\"cs.CY\",\"cs.ET\"]","methods":"[\"Large Language Model\"]","has_code":false}
