Taxonomy of malicious URL detection techniques

Orozco Fonseca, Diego; Marín Raventós, Gabriela; Lara Petitdemange, Adrián

dc.creator	Orozco Fonseca, Diego
dc.creator	Marín Raventós, Gabriela
dc.creator	Lara Petitdemange, Adrián
dc.date.accessioned	2025-05-05T20:49:03Z
dc.date.issued	2024-02-27
dc.description.abstract	Malicious URLs are often used in phishing campaigns, botnets and other attacks. Moreover, DNS traffic is necessary for the Internet to function correctly, so this data flow cannot simply be blocked. For these reasons, detecting malicious URLs is important, challenging and still an open research problem. Two types of techniques are used to detect malicious URLs: rules-based and machine learning-based. The traditional, rules-based techniques rely on blacklists and heuristics. These techniques struggle to keep up with a rapidly changing array of malicious URLs. Therefore, machine learning-based techniques have emerged. These techniques rely on URL characteristics, such as length and number of vowels, to classify URLs as legitimate or malicious. The main contribution of this paper is a taxonomy of detection techniques that points out which URL characteristics each method uses. While surveys on the topic exist, a precise mapping between detection methods and characteristics is not available, and we propose one. We also compare these techniques, highlighting that machine learning-based techniques are more complex to implement but better at keeping up with rapidly arriving new malicious URLs. In contrast, rules-based techniques are simpler and easier to implement, but they struggle to update fast enough to identify new malicious URLs.
dc.description.procedence	UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ciencias de la Computación e Informática
dc.description.procedence	UCR::Vicerrectoría de Investigación::Unidades de Investigación::Ingeniería::Centro de Investigaciones en Tecnologías de Información y Comunicación (CITIC)
dc.identifier.citation	https://link.springer.com/chapter/10.1007/978-3-031-54235-0_7
dc.identifier.doi	https://doi.org/10.1007/978-3-031-54235-0_7
dc.identifier.isbn	978-3-031-54234-3
dc.identifier.isbn	978-3-031-54235-0
dc.identifier.issn	2367-3389
dc.identifier.issn	2367-3370
dc.identifier.uri	https://hdl.handle.net/10669/102006
dc.language.iso	eng
dc.rights	restricted access
dc.source	Information Technology and Systems. ICITS 2024. Lecture Notes in Networks and Systems, 932
dc.subject	malicious URLs
dc.subject	machine learning
dc.subject	blacklist-based classification
dc.subject	URL classification
dc.title	Taxonomy of malicious URL detection techniques
dc.type	conference paper
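
The abstract contrasts rules-based detection (blacklists and heuristics) with machine learning-based detection that uses URL characteristics such as length and number of vowels. The following is a minimal sketch of both ideas, not the authors' method: the blacklist contents, the function names and the exact feature set are assumptions made for illustration only.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist for illustration; a real deployment would load
# a maintained feed rather than a hard-coded set.
BLACKLIST = {"malicious.example"}


def is_blacklisted(url: str) -> bool:
    """Rules-based check: exact match of the hostname against a blacklist."""
    host = urlsplit(url).hostname or ""
    return host.lower() in BLACKLIST


def lexical_features(url: str) -> dict:
    """Lexical features of the kind the abstract names (length, vowel count).

    The digit count and host length are extra, assumed features.
    """
    host = urlsplit(url).hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "vowel_count": sum(ch in "aeiou" for ch in url.lower()),
        "digit_count": sum(ch.isdigit() for ch in url),
    }
```

A feature dictionary like this could be fed to any classifier. The blacklist check illustrates why rules-based approaches are simple to implement but miss any URL that has not yet been listed.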

Files

Original bundle

Name: icitsV2.pdf
Size: 234.91 KB
Format: Adobe Portable Document Format

Collections

Computación e informática