Bits Learning: User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification

Zhenlong Yuan, Jie Xu, Yibo Xue, Mihaela Van Der Schaar

Research output: Contribution to journal › Article

4 Scopus citations

Abstract

During the past decade, a great number of machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretized sizes of the first five packets (PS) and the flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures), which takes the first n bytes of a flow (ByteFlow) as features. The results show that BitFlow not only achieves higher classification accuracy than ACAS but is also 1-3 orders of magnitude faster in training and classifying. More importantly, this letter also proposes to treat the first n bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between protecting user privacy and maximizing classification accuracy. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP.
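The two bit-level feature sets described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the most-significant-bit-first ordering, the zero padding for short inputs, and the treatment of a flow as the concatenation of its packet payloads are all assumptions made for illustration, and the letter itself may define these details differently.

```python
from typing import List, Sequence


def bytes_to_bits(data: bytes, n_bits: int) -> List[int]:
    """Return the first n_bits of data, most significant bit first.

    Assumption: inputs shorter than n_bits are zero-padded so that every
    feature vector has the same length.
    """
    bits: List[int] = []
    for byte in data:
        for shift in range(7, -1, -1):
            if len(bits) == n_bits:
                return bits
            bits.append((byte >> shift) & 1)
    bits.extend([0] * (n_bits - len(bits)))
    return bits


def bitflow_features(payloads: Sequence[bytes], n_bits: int) -> List[int]:
    """BitFlow-style features: the first n_bits of the flow.

    Assumption: the flow is modeled as the concatenated packet payloads.
    """
    return bytes_to_bits(b"".join(payloads), n_bits)


def bitpack_features(
    payloads: Sequence[bytes], n_bits: int, n_packets: int
) -> List[int]:
    """BitPack-style features: the first n_bits of each of the first
    n_packets payloads, concatenated into one vector.

    Assumption: missing packets contribute all-zero bits.
    """
    features: List[int] = []
    for i in range(n_packets):
        payload = payloads[i] if i < len(payloads) else b""
        features.extend(bytes_to_bits(payload, n_bits))
    return features


# Hypothetical flow with two payloads.
flow = [b"\x80\x01", b"\xff"]
flow_bits = bitflow_features(flow, 12)
pack_bits = bitpack_features(flow, 4, 3)
```

In this sketch, `n_bits` is the knob a user would adjust: a smaller value exposes fewer payload bits to the classifier, presumably at some cost in accuracy, which is the tradeoff the abstract attributes to BitPack.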

Original language: English (US)
Article number: 7393470
Pages (from-to): 704-707
Number of pages: 4
Journal: IEEE Communications Letters
Volume: 20
Issue number: 4
State: Published - Apr 1 2016

Keywords

  • bits as features
  • ML
  • Traffic classification

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
  • Modeling and Simulation
