Bits Learning

User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification

Zhenlong Yuan, Jie Xu, Yibo Xue, Mihaela Van Der Schaar

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

During the past decade, many machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretizations of the first five packet sizes (PS) and flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures), which takes the first n bytes of a flow (ByteFlow) as features. The results show that BitFlow not only achieves higher classification accuracy but is also 1-3 orders of magnitude faster than ACAS in training and classifying. More importantly, this letter also proposes to treat the first n bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between user privacy protection and classification accuracy. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP.
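The two feature constructions named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the MSB-first bit order, the zero-padding of short flows or missing packets, and the treatment of a flow as the concatenation of its packet payloads are all assumptions made here for the example.

```python
from typing import List, Sequence


def bytes_to_bits(data: bytes, n_bits: int) -> List[int]:
    """Return the first n_bits of data as 0/1 ints.

    Assumptions (not from the paper): bits are read MSB first and
    inputs shorter than n_bits are zero-padded.
    """
    bits: List[int] = []
    for byte in data:
        for shift in range(7, -1, -1):
            if len(bits) == n_bits:
                return bits
            bits.append((byte >> shift) & 1)
    bits.extend([0] * (n_bits - len(bits)))
    return bits


def bitflow_features(payloads: Sequence[bytes], n_bits: int) -> List[int]:
    """BitFlow-style features: the first n_bits of the whole flow.

    The flow is modeled here as the concatenation of its payloads.
    """
    return bytes_to_bits(b"".join(payloads), n_bits)


def bitpack_features(
    payloads: Sequence[bytes], n_bits: int, n_packets: int
) -> List[int]:
    """BitPack-style features: the first n_bits of each of the first
    n_packets payloads, concatenated into one vector.

    Missing packets are represented by all-zero bits (an assumption).
    """
    features: List[int] = []
    for i in range(n_packets):
        payload = payloads[i] if i < len(payloads) else b""
        features.extend(bytes_to_bits(payload, n_bits))
    return features


# Hypothetical two-packet flow, used only to exercise the functions.
flow = [b"GET ", b"\x16\x03"]
flow_vec = bitflow_features(flow, 8)
pack_vec = bitpack_features(flow, n_bits=4, n_packets=3)
```

In this sketch, `n_bits` and `n_packets` determine how much of each payload is read, so lowering them exposes less payload content to the classifier. That is one plausible reading of the user-adjustable privacy setting described in the abstract. The resulting 0/1 vectors could be passed to any standard classifier.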

Original language: English (US)
Article number: 7393470
Pages (from-to): 704-707
Number of pages: 4
Journal: IEEE Communications Letters
Volume: 20
Issue number: 4
DOI: 10.1109/LCOMM.2016.2521837
State: Published - Apr 1 2016

Fingerprint

Internet Traffic
Privacy
Internet
Discriminators
Learning systems
Signature
Privacy Protection
Learning
Machine Learning
Trade-offs
Discretization
Experiments
Traffic

Keywords

  • bits as features
  • ML
  • Traffic classification

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
  • Modeling and Simulation

Cite this

Bits Learning: User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification. / Yuan, Zhenlong; Xu, Jie; Xue, Yibo; Van Der Schaar, Mihaela. In: IEEE Communications Letters, Vol. 20, No. 4, 7393470, 01.04.2016, p. 704-707.

Yuan, Zhenlong; Xu, Jie; Xue, Yibo; Van Der Schaar, Mihaela. / Bits Learning: User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification. In: IEEE Communications Letters. 2016; Vol. 20, No. 4. pp. 704-707.

@article{71cd5c7554a541ddb17be736234413e4,
title = "Bits Learning: User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification",
abstract = "During the past decade, a great number of machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretizations of the first five packet sizes (PS) and flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n-bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures) that takes the first n-bytes of a flow (ByteFlow) as features. The results show that BitFlow achieves not only a higher classification accuracy but also 1-3 orders of magnitude faster speed than ACAS in training and classifying. More importantly, this letter also proposes to treat the first n-bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between user privacy protection and classification accuracy maximization. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP.",
keywords = "bits as features, ML, Traffic classification",
author = "Zhenlong Yuan and Jie Xu and Yibo Xue and {Van Der Schaar}, Mihaela",
year = "2016",
month = "4",
day = "1",
doi = "10.1109/LCOMM.2016.2521837",
language = "English (US)",
volume = "20",
pages = "704--707",
journal = "IEEE Communications Letters",
issn = "1089-7798",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",

}

TY - JOUR

T1 - Bits Learning

T2 - User-Adjustable Privacy Versus Accuracy in Internet Traffic Classification

AU - Yuan, Zhenlong

AU - Xu, Jie

AU - Xue, Yibo

AU - Van Der Schaar, Mihaela

PY - 2016/4/1

Y1 - 2016/4/1

AB - During the past decade, a great number of machine learning (ML)-based methods have been studied for accurate traffic classification. Flow features such as the discretizations of the first five packet sizes (PS) and flow ports (FP) are considered the best discriminators for per-flow classification. For the first time, this letter proposes to treat the first n-bits of a flow (BitFlow) as features and compares its overall performance with the well-known ACAS (automated construction of application signatures) that takes the first n-bytes of a flow (ByteFlow) as features. The results show that BitFlow achieves not only a higher classification accuracy but also 1-3 orders of magnitude faster speed than ACAS in training and classifying. More importantly, this letter also proposes to treat the first n-bits of each of the first few packet payloads (BitPack) as features, which enables a user-adjustable tradeoff between user privacy protection and classification accuracy maximization. The experiments show that BitPack can significantly outperform BitFlow, PS, and FP.

KW - bits as features

KW - ML

KW - Traffic classification

UR - http://www.scopus.com/inward/record.url?scp=84963959776&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963959776&partnerID=8YFLogxK

U2 - 10.1109/LCOMM.2016.2521837

DO - 10.1109/LCOMM.2016.2521837

M3 - Article

VL - 20

SP - 704

EP - 707

JO - IEEE Communications Letters

JF - IEEE Communications Letters

SN - 1089-7798

IS - 4

M1 - 7393470

ER -