{"ID":2835008,"CreatedAt":"2026-06-01T04:54:23.091178241Z","UpdatedAt":"2026-06-01T04:54:23.091178241Z","DeletedAt":null,"paper_url":"https://arxiv.org/abs/2512.03462","arxiv_id":"2512.03462","title":"A Hybrid Deep Learning and Anomaly Detection Framework for Real-Time Malicious URL Classification","abstract":"Malicious URLs remain a primary vector for phishing, malware, and cyberthreats. This study proposes a hybrid deep learning framework combining \\texttt{HashingVectorizer} n-gram analysis, SMOTE balancing, Isolation Forest anomaly filtering, and a lightweight neural network classifier for real-time URL classification. The multi-stage pipeline processes URLs from open-source repositories with statistical features (length, dot count, entropy), achieving $O(NL + EBdh)$ training complexity and a 20\\,ms prediction latency. Empirical evaluation yields 96.4\\% accuracy, 95.4\\% F1-score, and 97.3\\% ROC-AUC, outperforming CNN (94.8\\%) and SVM baselines with a $50\\!\\times$--$100\\!\\times$ speedup (Table~\\ref{tab:comp-complexity}). A multilingual Tkinter GUI (Arabic/English/French) enables real-time threat assessment with clipboard integration. The framework demonstrates superior scalability and resilience against obfuscated URL patterns.","short_abstract":"Malicious URLs remain a primary vector for phishing, malware, and cyberthreats. This study proposes a hybrid deep learning framework combining \\texttt{HashingVectorizer} n-gram analysis, SMOTE balancing, Isolation Forest anomaly filtering, and a lightweight neural network classifier for real-time URL classification. Th...","url_abs":"https://arxiv.org/abs/2512.03462","url_pdf":"https://arxiv.org/pdf/2512.03462v1","authors":"[\"Berkani Khaled\",\"Zeraoulia Rafik\"]","published":"2025-11-30T21:25:14Z","proceeding":"cs.CR","tasks":"[\"cs.CR\",\"cs.LG\"]","methods":"[\"Convolutional Neural Network\"]","has_code":false}
