{"ID":2833317,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03465","arxiv_id":"2512.03465","title":"Tuning for TraceTarnish: Techniques, Trends, and Testing Tangible Traits","abstract":"In this study, we more rigorously evaluated our attack script $\\textit{TraceTarnish}$, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments -- comments that were later alchemized into $\\textit{TraceTarnish}$ data -- to gain valuable insights. The transformed $\\textit{TraceTarnish}$ data was then further augmented by $\\textit{StyloMetrix}$ to manufacture stylometric features -- features that were culled using the Information Gain criterion, leaving only the most informative, predictive, and discriminative ones. Our results found that function words and function word types ($L\\_FUNC\\_A$ $\\\u0026$ $L\\_FUNC\\_T$); content words and content word types ($L\\_CONT\\_A$ $\\\u0026$ $L\\_CONT\\_T$); and the Type-Token Ratio ($ST\\_TYPE\\_TOKEN\\_RATIO\\_LEMMAS$) yielded significant Information-Gain readings. The identified stylometric cues -- function-word frequencies, content-word distributions, and the Type-Token Ratio -- serve as reliable indicators of compromise (IoCs), revealing when a text has been deliberately altered to mask its true author. Similarly, these features could function as forensic beacons, alerting defenders to the presence of an adversarial stylometry attack; granted, in the absence of the original message, this signal may go largely unnoticed, as it appears to depend on a pre- and post-transformation comparison. \"In trying to erase a trace, you often imprint a larger one.\" Armed with this understanding, we framed $\\textit{TraceTarnish}$'s operations and outputs around these five isolated features, using them to conceptualize and implement enhancements that further strengthen the attack.","short_abstract":"In this study, we more rigorously evaluated our attack script $\\textit{TraceTarnish}$, which leverages adversarial stylometry principles to anonymize the authorship of text-based messages. To ensure the efficacy and utility of our attack, we sourced, processed, and analyzed Reddit comments -- comments that were later a...","url_abs":"https://arxiv.org/abs/2512.03465","url_pdf":"https://arxiv.org/pdf/2512.03465v4","authors":"[\"Robert Dilworth\"]","published":"2025-12-03T05:39:40Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.CL\",\"cs.IR\"]","methods":"[]","has_code":false}
