{"ID":2855452,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2510.14161","arxiv_id":"2510.14161","title":"Data Understanding Survey: Pursuing Improved Dataset Characterization Via Tensor-based Methods","abstract":"In the evolving domains of Machine Learning and Data Analytics, existing dataset characterization methods such as statistical, structural, and model-based analyses often fail to deliver the deep understanding and insights essential for innovation and explainability. This work surveys the current state-of-the-art conventional data analytic techniques and examines their limitations, and discusses a variety of tensor-based methods and how these may provide a more robust alternative to traditional statistical, structural, and model-based dataset characterization techniques. Through examples, we illustrate how tensor methods unveil nuanced data characteristics, offering enhanced interpretability and actionable intelligence. We advocate for the adoption of tensor-based characterization, promising a leap forward in understanding complex datasets and paving the way for intelligent, explainable data-driven discoveries.","short_abstract":"In the evolving domains of Machine Learning and Data Analytics, existing dataset characterization methods such as statistical, structural, and model-based analyses often fail to deliver the deep understanding and insights essential for innovation and explainability. This work surveys the current state-of-the-art conven...","url_abs":"https://arxiv.org/abs/2510.14161","url_pdf":"https://arxiv.org/pdf/2510.14161v1","authors":"[\"Matthew D. Merris\",\"Tim Andersen\"]","published":"2025-10-15T23:17:54Z","proceeding":"cs.LG","tasks":"[\"cs.LG\"]","methods":"[]","has_code":false}
