PhishDef: URL Names Say It All

Phishing is an increasingly sophisticated method of stealing personal user information through sites that pretend to be legitimate. In this paper, we take the following steps to identify phishing URLs. First, we carefully select lexical features of the URLs that are resistant to the obfuscation techniques used by attackers. Second, we evaluate classification accuracy when using only lexical features, both automatically selected and hand-selected, versus when using additional features. We show that lexical features are sufficient for all practical purposes. Third, we thoroughly compare several classification algorithms and propose an online method (AROW) that can overcome noisy training data. Based on the insights gained from our analysis, we propose PhishDef, a phishing detection system that uses only URL names and combines the above three elements. PhishDef is highly accurate (when compared to state-of-the-art approaches over real datasets), lightweight (thus appropriate for...
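
To make the combination of lexical features and an online AROW classifier more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the feature set in `lexical_features` is hypothetical and chosen only for illustration, `DiagonalAROW` keeps a per-feature variance instead of the full covariance matrix (a simplifying assumption), and the regularization parameter `r` and the example URLs are placeholders.

```python
import re
from urllib.parse import urlsplit


def lexical_features(url):
    """Map a URL string to a sparse dict of illustrative lexical features.

    These features are assumptions for this sketch, not the paper's set.
    """
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.hostname or ""
    feats = {
        "bias": 1.0,
        "url_len": len(url) / 100.0,            # scaled URL length
        "host_dots": float(host.count(".")),     # number of dots in the host
        "host_has_digit": float(any(c.isdigit() for c in host)),
        "path_depth": float(parts.path.count("/")),
    }
    # Bag-of-words over host and path tokens.
    for tok in re.split(r"[^A-Za-z0-9]+", host + parts.path):
        if tok:
            feats["tok=" + tok.lower()] = 1.0
    return feats


class DiagonalAROW:
    """AROW-style online linear classifier with a diagonal covariance.

    Labels are +1 (phishing) and -1 (benign). Unseen features start with
    weight 0.0 and variance 1.0.
    """

    def __init__(self, r=1.0):
        self.r = r          # regularization parameter (placeholder value)
        self.w = {}         # mean weight per feature
        self.sigma = {}     # variance (confidence) per feature

    def margin(self, x):
        return sum(self.w.get(k, 0.0) * v for k, v in x.items())

    def predict(self, x):
        return 1 if self.margin(x) >= 0 else -1

    def update(self, x, y):
        m = self.margin(x)
        if m * y >= 1.0:
            return  # margin already large enough, no update
        # Confidence term: x^T Sigma x with a diagonal Sigma.
        v = sum(self.sigma.get(k, 1.0) * val * val for k, val in x.items())
        beta = 1.0 / (v + self.r)
        alpha = (1.0 - y * m) * beta
        for k, val in x.items():
            s = self.sigma.get(k, 1.0)
            self.w[k] = self.w.get(k, 0.0) + alpha * y * s * val
            self.sigma[k] = s - beta * s * s * val * val
```

In use, one would call `update(lexical_features(url), label)` once per labeled URL as it arrives and `predict(lexical_features(url))` on new URLs. Because each update shrinks the variance of the features it touches, later (possibly mislabeled) examples move well-established weights less, which is the intuition behind AROW's tolerance of noisy training data.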

Anh Le, Athina Markopoulou, Michalis Faloutsos

CORR 2010 | Education | Lexical Features | Noisy Training Data | Phishing |

Post Info

Added	22 Mar 2011
Updated	22 Mar 2011
Type	Journal
Year	2010
Where	CORR
Authors	Anh Le, Athina Markopoulou, Michalis Faloutsos
