PhishReplicant: A Language Model-based Approach to Detect Generated Squatting Domain Names

Koide, Takashi; Fukushi, Naoki; Nakano, Hiroki; Chiba, Daiki

doi:10.1145/3627106.3627111

arXiv:2310.11763 (cs)

[Submitted on 18 Oct 2023]

Abstract: Domain squatting is a technique used by attackers to create domain names for phishing sites. In recent phishing attempts, we have observed many domain names that combine multiple techniques to evade existing methods for detecting domain squatting. These domain names, which we call generated squatting domains (GSDs), are quite different in appearance from legitimate domain names and do not contain brand names, making them difficult to associate with phishing. In this paper, we propose a system called PhishReplicant that detects GSDs by focusing on the linguistic similarity of domain names. We analyzed newly registered and observed domain names extracted from certificate transparency logs, passive DNS, and DNS zone files. In a four-week experiment, we detected 3,498 domain names acquired by attackers, of which 2,821 were used for phishing sites within a month of detection. We also confirmed that our proposed system outperformed existing systems in both detection accuracy and number of domain names detected. As an in-depth analysis, we examined 205k GSDs collected over 150 days and found that phishing using GSDs was distributed globally, although attackers intensively targeted brands in specific regions and industries. By analyzing GSDs in real time, we can block phishing sites before or immediately after they appear.
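
To make the idea of scoring domain names by linguistic similarity concrete, here is a minimal sketch. It is not the PhishReplicant method: the abstract states that the system is language model-based but gives no implementation details, so this example substitutes plain character-trigram cosine similarity as a stand-in. The function names and all domain names below are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def trigram_counts(domain: str) -> Counter:
    """Count character trigrams of a domain name, padded with boundary markers."""
    padded = f"^{domain.lower()}$"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_seed(candidate: str, seeds: list[str]) -> tuple[str, float]:
    """Return the seed domain closest to the candidate, with its similarity score."""
    cand = trigram_counts(candidate)
    scored = ((seed, cosine_similarity(cand, trigram_counts(seed))) for seed in seeds)
    return max(scored, key=lambda pair: pair[1])


# Hypothetical seed list standing in for previously confirmed phishing domains.
seeds = ["secure-login-update.example", "weather-report.example"]
best_seed, score = most_similar_seed("secure-login-verify.example", seeds)
print(best_seed, round(score, 3))
```

A real deployment would compare newly observed domain names against seeds at scale and would need a tuned similarity threshold; both are outside what the abstract specifies.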

Comments:	Accepted at ACSAC 2023
Subjects:	Cryptography and Security (cs.CR)
Cite as:	arXiv:2310.11763 [cs.CR]
	(or arXiv:2310.11763v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2310.11763
Related DOI:	https://doi.org/10.1145/3627106.3627111

Submission history

From: Takashi Koide
[v1] Wed, 18 Oct 2023 07:41:41 UTC (698 KB)
