Website categorization via design attribute learning

Doron Cohen, Or Naim, Eran Toch, Irad Ben-Gal

Research output: Contribution to journal, Article, peer-review

Abstract

Malicious software (malware) is a challenging cybersecurity threat, as it is often bundled with legitimate software and downloaded by naïve users. A significant source of malware downloads is crack websites, which are used to circumvent copyright protection mechanisms. Crack websites often change URLs and IPs to avoid automatic detection; however, in many cases, they preserve specific visual designs that signal the website's function to potential users (such as particular colors, text fonts, shapes, and sizes). Website design features are numerous and high-dimensional, with complicated interactions, which makes categorization challenging. This study shows that straightforward machine learning models for categorizing Crack and Malicious websites can benefit considerably from using design features. We report on two experiments based on unbalanced datasets and show that classification using design features can reach a categorization accuracy of over 90% with an F1-score of over 77% in some instances. Finally, we discuss the results in the context of developing intelligent security mechanisms.
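The abstract reports accuracy and F1-score side by side because the datasets are unbalanced, and on such data the two metrics can diverge sharply. The following minimal Python sketch illustrates that gap. The confusion-matrix counts and the function name `binary_metrics` are invented for illustration and are not taken from the paper's experiments.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical unbalanced sample: 1000 websites, of which 100 are crack sites.
# These counts are assumptions chosen only to show how the metrics behave.
model = binary_metrics(tp=70, fp=12, fn=30, tn=888)

# Baseline that labels every website as benign (the majority class).
majority = binary_metrics(tp=0, fp=0, fn=100, tn=900)

for name, metrics in (("model", model), ("majority baseline", majority)):
    print(f"{name}: accuracy={metrics['accuracy']:.3f}, f1={metrics['f1']:.3f}")
```

In this made-up sample the majority baseline already reaches 90% accuracy while its F1-score is zero, which is why the F1-score is the more informative figure when one class is rare.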

Original language: English
Article number: 102312
Journal: Computers and Security
Volume: 107
State: Published - Aug 2021

Keywords

  • Crack websites
  • Cyber security
  • Human computer interaction
  • Malware
  • Online learning
  • Website categorization
  • Website design elements

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • Law
