TY - GEN
T1 - A fast and scalable method for threat detection in large-scale DNS logs
AU - Begleiter, Ron
AU - Elovici, Yuval
AU - Hollander, Yona
AU - Mendelson, Ori
AU - Rokach, Lior
AU - Saltzman, Roi
PY - 2013/1/1
Y1 - 2013/1/1
N2 - This paper presents a fast and scalable method for detecting threats in large-scale DNS logs. In such logs, queries about 'abnormal' domain strings are often correlated with malicious behavior. With our method, a language model algorithm learns 'normal' domain-names from a large dataset to rate the extent of domain-name 'abnormality' within a big data stream of DNS queries in the organization. Variable-order Markov Models (VMMs) serve as our underlying algorithmic tool since their running time is linear in the input sequence while their memory requirements are constantly bounded from above, both very appealing characteristics. Our experimental study indicates that the proposed method can detect domain names generated by a genuine Domain Generation Algorithm, used in Advanced Persistent Threat attack scenarios, with less than 5% false-negative and 1% false-positive rates. This detection rate is similar to more computationally intensive methods that are not scalable for big data environments.
AB - This paper presents a fast and scalable method for detecting threats in large-scale DNS logs. In such logs, queries about 'abnormal' domain strings are often correlated with malicious behavior. With our method, a language model algorithm learns 'normal' domain-names from a large dataset to rate the extent of domain-name 'abnormality' within a big data stream of DNS queries in the organization. Variable-order Markov Models (VMMs) serve as our underlying algorithmic tool since their running time is linear in the input sequence while their memory requirements are constantly bounded from above, both very appealing characteristics. Our experimental study indicates that the proposed method can detect domain names generated by a genuine Domain Generation Algorithm, used in Advanced Persistent Threat attack scenarios, with less than 5% false-negative and 1% false-positive rates. This detection rate is similar to more computationally intensive methods that are not scalable for big data environments.
UR - http://www.scopus.com/inward/record.url?scp=84893257670&partnerID=8YFLogxK
U2 - https://doi.org/10.1109/BigData.2013.6691646
DO - https://doi.org/10.1109/BigData.2013.6691646
M3 - Conference contribution
SN - 9781479912926
T3 - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
SP - 738
EP - 741
BT - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
T2 - 2013 IEEE International Conference on Big Data, Big Data 2013
Y2 - 6 October 2013 through 9 October 2013
ER -