Abstract
Internet traffic classification has been studied intensively over the past decade because of its importance for traffic engineering and cyber security. A promising approach to several traffic classification problems is FlowPic, in which histograms of packet sizes in consecutive time slices are transformed into a picture that is fed into a Convolutional Neural Network (CNN) for classification. However, CNNs (the FlowPic approach included) require a relatively large labeled flow dataset, which is not always easy to obtain. In this paper, we show that this obstacle can be overcome by using Contrastive Representation Learning to learn, from an unlabeled flow dataset, a flow representation embedded in a latent space in which flows belonging to the same class cluster together. We then show that by using just a few labeled flows (a few shots) from each class, we can achieve high accuracy in flow classification. We show that common picture augmentation techniques can help, but accuracy improves further when introducing augmentation techniques that mimic network behavior, such as changes in the round-trip time (RTT). Finally, we show that the large FlowPics suggested in the past can be replaced with much smaller mini-FlowPics, which yields two advantages: improved model performance and easier engineering. Interestingly, this even improves accuracy in some cases.
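As a rough illustration of the FlowPic idea described in the abstract, the sketch below bins a flow's packets by arrival time and packet size into a small 2D histogram (a mini-FlowPic) and applies a simple time-scaling augmentation that loosely mimics a change in RTT. The function names, the 32x32 resolution, the 15-second window, the 1500-byte size cap, and the scaling factor are all illustrative assumptions, not values or code taken from the paper.

```python
from typing import List, Tuple

Packet = Tuple[float, int]  # (arrival time in seconds, packet size in bytes)


def flowpic(packets: List[Packet], dim: int = 32,
            window: float = 15.0, max_size: int = 1500) -> List[List[int]]:
    """Build a dim x dim histogram: rows are size bins, columns are time bins.

    All default parameter values are assumptions chosen for illustration.
    """
    pic = [[0] * dim for _ in range(dim)]
    if not packets:
        return pic
    t0 = min(t for t, _ in packets)  # time is measured from the first packet
    for t, size in packets:
        rel = t - t0
        if rel >= window:
            continue  # packet falls outside the observation window
        col = min(int(rel / window * dim), dim - 1)
        row = min(int(min(size, max_size) / max_size * dim), dim - 1)
        pic[row][col] += 1
    return pic


def scale_time(packets: List[Packet], factor: float) -> List[Packet]:
    """Hypothetical RTT-like augmentation: stretch or compress inter-arrival times."""
    if not packets:
        return []
    t0 = min(t for t, _ in packets)
    return [(t0 + (t - t0) * factor, size) for t, size in packets]


if __name__ == "__main__":
    flow = [(0.0, 100), (1.0, 1500), (14.9, 750)]
    original = flowpic(flow)
    augmented = flowpic(scale_time(flow, 2.0))
    print(sum(map(sum, original)), sum(map(sum, augmented)))
```

In a contrastive setup, two differently augmented views of the same unlabeled flow would typically be treated as a positive pair; the encoder and loss are omitted here because the abstract does not specify them.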
| Original language | English |
|---|---|
| Pages (from-to) | 3054-3067 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Network and Service Management |
| Volume | 21 |
| Issue number | 3 |
| State | Published - 2024 |
Keywords
- Internet traffic classification
- application identification
- contrastive representation learning
- few-shot learning
- security management
- self-supervised learning
- traffic
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Electrical and Electronic Engineering
Fingerprint
Dive into the research topics of 'Self-Supervised Traffic Classification: Flow Embedding and Few-Shot Solutions'. Together they form a unique fingerprint.