On extracting session data from activity logs

David Mehrzadi, Dror G. Feitelson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Activity logs from large-scale systems facilitate the study of user behavior, which can be used to improve and tune the user experience. However, the available data often lacks important elements such as the identification of user sessions. Previous work typically compensated for this by setting a threshold of around 30 minutes, and assuming that breaks in activity longer than the threshold reflect breaks between sessions. We show that using such a global threshold introduces artifacts that may affect the analysis, because there is a high probability that long sessions are not identified correctly. As an alternative, we suggest that a suitable individual threshold be found for each user, based on that user's activity pattern. Applying this approach to a large dataset from the AOL search engine leads to a distribution of session durations that is free of artifacts like those that appear when using a global threshold.
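The threshold-based splitting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the log layout (a mapping from user ID to event timestamps), and the sample data are all assumptions made for illustration. In particular, the abstract does not say how a user's individual threshold is derived from the activity pattern, so the per-user value here is simply supplied by hand.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def split_sessions(timestamps: List[datetime], threshold: timedelta) -> List[List[datetime]]:
    """Split one user's events into sessions wherever the gap exceeds the threshold."""
    sessions: List[List[datetime]] = []
    for t in sorted(timestamps):
        # Continue the current session if the gap since its last event is small enough.
        if sessions and t - sessions[-1][-1] <= threshold:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def sessionize(
    log: Dict[str, List[datetime]],
    global_threshold: timedelta = timedelta(minutes=30),
    per_user: Optional[Dict[str, timedelta]] = None,
) -> Dict[str, List[List[datetime]]]:
    """Apply a per-user threshold where one is given, else the global threshold."""
    per_user = per_user or {}
    return {
        user: split_sessions(times, per_user.get(user, global_threshold))
        for user, times in log.items()
    }


# Made-up sample data: one user with events at minutes 0, 10, 50 and 60.
base = datetime(2012, 1, 1, 9, 0)
log = {"u1": [base + timedelta(minutes=m) for m in (0, 10, 50, 60)]}

# Global 30-minute threshold: the 40-minute gap splits the activity in two.
global_result = sessionize(log)

# Hypothetical individual threshold of 45 minutes: the same gap stays inside one session.
individual_result = sessionize(log, per_user={"u1": timedelta(minutes=45)})
```

In this toy case the global threshold cuts what a more tolerant individual threshold treats as a single longer session, which is the kind of misidentification of long sessions that the abstract attributes to a global threshold.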

Original language: English
Title of host publication: Proceedings of the 5th Annual International Systems and Storage Conference, SYSTOR'12
State: Published - 2012
Event: 5th Annual International Systems and Storage Conference, SYSTOR 2012 - Haifa, Israel
Duration: 4 Jun 2012 → 6 Jun 2012

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 5th Annual International Systems and Storage Conference, SYSTOR 2012
Country/Territory: Israel
City: Haifa
Period: 4/06/12 → 6/06/12

Keywords

  • activity log
  • session
  • user behavior

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
