Deepsquatting: Learning-based Typosquatting Detection at Deeper Domain Levels

Title: Deepsquatting: Learning-based Typosquatting Detection at Deeper Domain Levels
Publication Type: Conference Paper
Year of Publication: In Press
Authors: Piredda, P, Ariu, D, Biggio, B, Corona, I, Piras, L, Giacinto, G, Roli, F
Conference Name: 16th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2017)

Typosquatting consists of registering Internet domain names that closely resemble legitimate, reputable, and well-known ones (e.g., Farebook instead of Facebook). This attack aims to distribute malware or to phish victims (i.e., steal their credentials) by mimicking the appearance of the targeted organisation's legitimate webpage.

Most detection approaches proposed so far generate possible typo-variants of a legitimate domain, thus creating blacklists that can be used to prevent users from accessing typo-squatted domains.
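
To make the blacklist-generation idea concrete, the following is a minimal sketch of how typo-variants of a domain might be enumerated. It is not taken from the paper: the function names `typo_variants` and `build_blacklist` are hypothetical, the variant model (single-character insertion, deletion, substitution, and adjacent transposition) is one common choice among several, and the code assumes a simple two-part name such as `facebook.com`.

```python
import string

# Characters allowed in a DNS label (letters, digits, hyphen).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def typo_variants(label):
    """Return labels one edit (or one adjacent swap) away from `label`."""
    variants = set()
    for i in range(len(label) + 1):
        head, tail = label[:i], label[i:]
        # Insertion of one character at position i.
        for c in ALPHABET:
            variants.add(head + c + tail)
        if tail:
            # Deletion of the character at position i.
            variants.add(head + tail[1:])
            # Substitution of the character at position i.
            for c in ALPHABET:
                variants.add(head + c + tail[1:])
        if len(tail) > 1:
            # Transposition of the characters at positions i and i + 1.
            variants.add(head + tail[1] + tail[0] + tail[2:])
    variants.discard(label)
    # A label cannot be empty, nor start or end with a hyphen.
    return {v for v in variants
            if v and not v.startswith("-") and not v.endswith("-")}


def build_blacklist(domain):
    """Illustrative blacklist: vary only the first label of `domain`."""
    label, _, suffix = domain.partition(".")
    return {f"{v}.{suffix}" for v in typo_variants(label)}
```

For example, `build_blacklist("facebook.com")` contains `farebook.com`. The set grows quickly with label length and still misses variants that are more than one edit away.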

Only a few studies have addressed typosquatting detection through passive analysis of Domain Name System (DNS) traffic. In this work, we follow this approach and additionally exploit machine learning to learn a similarity measure between domain names that can detect typo-squatted ones in the analyzed DNS traffic.
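
As a rough illustration of comparing observed DNS names against a legitimate one, the sketch below checks every label of a queried name, including subdomain labels, against a target label. It is only a stand-in: a fixed Levenshtein-distance threshold replaces the learned similarity measure, whose features and model are not described in this abstract, and the names `levenshtein`, `suspicious_labels`, and `max_distance` are hypothetical.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suspicious_labels(query_name, target_label, max_distance=1):
    """Return (level, label, distance) for labels close to `target_label`.

    Level 1 is the rightmost label (the top-level domain); higher levels
    are deeper subdomains. Exact matches are not reported.
    """
    labels = query_name.lower().rstrip(".").split(".")
    hits = []
    for level, label in enumerate(reversed(labels), start=1):
        distance = levenshtein(label, target_label)
        if 0 < distance <= max_distance:
            hits.append((level, label, distance))
    return hits
```

With these definitions, `suspicious_labels("farebook.example.com", "facebook")` reports the third-level label `farebook`, while `www.facebook.com` yields no hits. A learned measure would replace the fixed threshold with a model trained on labelled examples.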

We validate our approach on a large-scale dataset consisting of 4 months of traffic collected from a major Italian Internet Service Provider.

Citation Key: 1383
piredda17-AIIA.pdf (1.21 MB)