Handling nominal features in anomaly intrusion detection problems

Mei-Ling Shyu, Kanoksri Sarinnapakorn, Indika Kuruppu-Appuhamilage, Shu Ching Chen, LiWu Chang, Thomas Goldring

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Citations (Scopus)

Abstract

Computer network data streams used in intrusion detection usually involve many data types. A common data type is that of symbolic or nominal features. Whether or not they are coded as numerical values, nominal features need to be treated differently from numeric features. This paper studies the effectiveness of two approaches to handling nominal features: a simple coding scheme that uses indicator variables and a scaling method based on multiple correspondence analysis (MCA). In particular, we apply the techniques with two anomaly detection methods: the principal component classifier (PCC) and the Canberra metric. Experiments with the KDD 1999 data demonstrate that MCA works better than the indicator variable approach for both detection methods, with the PCC well ahead of the Canberra metric.
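As a rough illustration of the simpler of the two coding approaches named in the abstract, the sketch below codes one nominal feature as 0/1 indicator variables and scores each record with the Canberra metric, d(x, y) = sum_i |x_i - y_i| / (|x_i| + |y_i|), with 0/0 terms counted as zero. It is not the authors' implementation: the function names, the toy protocol values, and the choice to score each record by its distance to the centroid of the data are all assumptions made for this example.

```python
def indicator_encode(values, categories):
    """Code each nominal value as a 0/1 indicator vector, one slot per category."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def canberra(x, y):
    """Canberra distance; terms where both coordinates are zero contribute 0."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


# Hypothetical nominal feature (for example, a protocol type per connection).
protocols = ["tcp", "udp", "tcp", "icmp"]
categories = sorted(set(protocols))  # ['icmp', 'tcp', 'udp']
encoded = indicator_encode(protocols, categories)

# Assumption for this sketch: the anomaly score of a record is its
# Canberra distance to the centroid of all encoded records.
n = len(encoded)
centroid = [sum(col) / n for col in zip(*encoded)]
scores = [canberra(row, centroid) for row in encoded]

for proto, score in zip(protocols, scores):
    print(f"{proto}: {score:.3f}")
```

In this toy data the less frequent values (udp, icmp) receive a larger score than the more frequent one (tcp). An MCA-based scaling would replace the 0/1 vectors with numeric scores derived from the category co-occurrence structure, which this sketch does not attempt.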

Original language: English
Title of host publication: Proceedings of the IEEE International Workshop on Research Issues in Data Engineering
Editors: J. Han, H. Kawano
Pages: 55-62
Number of pages: 8
State: Published - Oct 31 2005
Event: 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, RIDE-SDMA 2005 - Tokyo, Japan
Duration: Apr 3 2005 to Apr 4 2005

Other

Other: 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, RIDE-SDMA 2005
Country: Japan
City: Tokyo
Period: 4/3/05 to 4/4/05

Fingerprint

Intrusion detection
Classifiers
Computer networks
Experiments

Keywords

  • Anomaly detection
  • Indicator variables
  • Intrusion detection
  • Multiple correspondence analysis
  • Nominal features
  • Principal component classifier

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Engineering (miscellaneous)

Cite this

Shyu, M-L., Sarinnapakorn, K., Kuruppu-Appuhamilage, I., Chen, S. C., Chang, L., & Goldring, T. (2005). Handling nominal features in anomaly intrusion detection problems. In J. Han, & H. Kawano (Eds.), Proceedings of the IEEE International Workshop on Research Issues in Data Engineering (pp. 55-62)

Handling nominal features in anomaly intrusion detection problems. / Shyu, Mei-Ling; Sarinnapakorn, Kanoksri; Kuruppu-Appuhamilage, Indika; Chen, Shu Ching; Chang, LiWu; Goldring, Thomas.

Proceedings of the IEEE International Workshop on Research Issues in Data Engineering. ed. / J. Han; H. Kawano. 2005. p. 55-62.

Shyu, M-L, Sarinnapakorn, K, Kuruppu-Appuhamilage, I, Chen, SC, Chang, L & Goldring, T 2005, Handling nominal features in anomaly intrusion detection problems. in J Han & H Kawano (eds), Proceedings of the IEEE International Workshop on Research Issues in Data Engineering. pp. 55-62, 15th International Workshop on Research Issues in Data Engineering: Stream Data Mining and Applications, RIDE-SDMA 2005, Tokyo, Japan, 4/3/05.
Shyu M-L, Sarinnapakorn K, Kuruppu-Appuhamilage I, Chen SC, Chang L, Goldring T. Handling nominal features in anomaly intrusion detection problems. In Han J, Kawano H, editors, Proceedings of the IEEE International Workshop on Research Issues in Data Engineering. 2005. p. 55-62
Shyu, Mei-Ling ; Sarinnapakorn, Kanoksri ; Kuruppu-Appuhamilage, Indika ; Chen, Shu Ching ; Chang, LiWu ; Goldring, Thomas. / Handling nominal features in anomaly intrusion detection problems. Proceedings of the IEEE International Workshop on Research Issues in Data Engineering. editor / J. Han ; H. Kawano. 2005. pp. 55-62
@inproceedings{02df5208d7df4c2dbda3d671f8c28674,
title = "Handling nominal features in anomaly intrusion detection problems",
abstract = "Computer network data streams used in intrusion detection usually involve many data types. A common data type is that of symbolic or nominal features. Whether or not they are coded as numerical values, nominal features need to be treated differently from numeric features. This paper studies the effectiveness of two approaches to handling nominal features: a simple coding scheme that uses indicator variables and a scaling method based on multiple correspondence analysis (MCA). In particular, we apply the techniques with two anomaly detection methods: the principal component classifier (PCC) and the Canberra metric. Experiments with the KDD 1999 data demonstrate that MCA works better than the indicator variable approach for both detection methods, with the PCC well ahead of the Canberra metric.",
keywords = "Anomaly detection, Indicator variables, Intrusion detection, Multiple correspondence analysis, Nominal features, Principal component classifier",
author = "Mei-Ling Shyu and Kanoksri Sarinnapakorn and Indika Kuruppu-Appuhamilage and Chen, {Shu Ching} and LiWu Chang and Thomas Goldring",
year = "2005",
month = "10",
day = "31",
language = "English",
pages = "55--62",
editor = "J. Han and H. Kawano",
booktitle = "Proceedings of the IEEE International Workshop on Research Issues in Data Engineering",

}

TY  - GEN
T1  - Handling nominal features in anomaly intrusion detection problems
AU  - Shyu, Mei-Ling
AU  - Sarinnapakorn, Kanoksri
AU  - Kuruppu-Appuhamilage, Indika
AU  - Chen, Shu Ching
AU  - Chang, LiWu
AU  - Goldring, Thomas
PY  - 2005/10/31
Y1  - 2005/10/31
AB  - Computer network data streams used in intrusion detection usually involve many data types. A common data type is that of symbolic or nominal features. Whether or not they are coded as numerical values, nominal features need to be treated differently from numeric features. This paper studies the effectiveness of two approaches to handling nominal features: a simple coding scheme that uses indicator variables and a scaling method based on multiple correspondence analysis (MCA). In particular, we apply the techniques with two anomaly detection methods: the principal component classifier (PCC) and the Canberra metric. Experiments with the KDD 1999 data demonstrate that MCA works better than the indicator variable approach for both detection methods, with the PCC well ahead of the Canberra metric.
KW  - Anomaly detection
KW  - Indicator variables
KW  - Intrusion detection
KW  - Multiple correspondence analysis
KW  - Nominal features
KW  - Principal component classifier
UR  - http://www.scopus.com/inward/record.url?scp=27144530995&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=27144530995&partnerID=8YFLogxK
M3  - Conference contribution
SP  - 55
EP  - 62
BT  - Proceedings of the IEEE International Workshop on Research Issues in Data Engineering
A2  - Han, J.
A2  - Kawano, H.
ER  -