Florida International University and University of Miami TRECVID 2009-high level feature extraction

Lin Lin, Chao Chen, Mei-Ling Shyu, Fausto Fleites, Shu Ching Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

In this paper, the details of the FIU-UM group's submission to the TRECVID 2009 high-level feature extraction task are presented. Six runs were conducted using different feature sets, data pruning approaches, classification algorithms, and ranking methods. A portion of the TRECVID 2009 development data was randomly sampled from the whole development data archive (all TRECVID 2007 development and test data); for each high-level feature, the sample includes all positive data instances (target high-level feature data) and a subset of the negative data instances (around one third of the non-target high-level feature data). Two strategies for dealing with the skipped/not-sure shots were also introduced: the first four runs treated the skipped/not-sure data instances as positive instances in the training data (ALL), and the last two runs excluded these instances from the training data (PURE).

• FIU-UM-1: KF+ALL+CB+MCA+RANK, trained on partial TRECVID 2009 development data with the full positive set (ALL), using key-frame-based low-level features (KF), correlation-based pruning (CB), the MCA-based classifier (MCA), and the ranking method (RANK). The RANK method uses the Euclidean distances, over two selected features, between each testing data instance and the positive training set as additional scores, which are integrated with the scores from the MCA-based classifier to obtain the final ranking scores.
• FIU-UM-2: KF+ALL+CB+MCA, trained on partial TRECVID 2009 development data with the full positive set (ALL), using key-frame-based low-level features (KF), correlation-based pruning (CB), the MCA-based classifier (MCA), and a ranking process that uses the MCA-based scores from the classifier.
• FIU-UM-3: SF+ALL+DB+SB, trained on partial TRECVID 2009 development data with the full positive set (ALL), using shot-based low-level features (SF), distance-based pruning (DB), the subspace-based classifier (SB), and a ranking process that uses the subspace-based scores from the classifier.
• FIU-UM-4: SF+ALL+DB+SB+SVMC, trained on partial TRECVID 2009 development data with the full positive set (ALL), using shot-based low-level features (SF), distance-based pruning (DB), the subspace-based classifier (SB), and the SVMC ranking method. The SVMC method takes the retrieval results from an SVM with a chi-square kernel (SVMC) as additional scores, which are then combined with the subspace-based scores to form the final ranking scores.
• FIU-UM-5: KF+PURE+CB+MCA+RANK, trained on partial TRECVID 2009 development data with the pure positive set (PURE), using key-frame-based low-level features (KF), correlation-based pruning (CB), the MCA-based classifier (MCA), and the ranking method (RANK).
• FIU-UM-6: SF+PURE+DB+SB, trained on partial TRECVID 2009 development data with the pure positive set (PURE), using shot-based low-level features (SF), distance-based pruning (DB), the subspace-based classifier (SB), and a ranking process that uses the subspace-based scores from the classifier.

For the TRECVID 2009 high-level feature extraction task submission, we improved the framework in several ways. First, 513 key-frame-based visual features were extracted in addition to the 28 previous shot-based features, and different normalization methods were applied. Second, all development data (219 videos) and testing data (619 videos) were processed. Third, a key-frame detection algorithm was implemented to extract the key-frames from the testing videos, which are not provided by TRECVID. Fourth, different data pruning methods were proposed to address the data imbalance issue; our other experimental results show that the proposed methods perform well at removing noisy data and selecting representative positive and negative data instances. Fifth, two new classifiers were proposed in our framework rather than relying on existing classifiers such as Support Vector Machines and Decision Trees. Finally, in addition to concept detection, we were able to extend our framework to the area of video retrieval; that is, we proposed several scoring methods to rank the retrieved results.

However, several challenges remain. First, as can be seen from the run descriptions, the three runs using the CB+MCA model were trained only on key-frame-based low/mid-level visual features. Adding low-level audio features could improve the extraction performance for some high-level features, such as person-playing-a-musical-instrument, people-dancing, and singing. Similarly, more visual features could help the runs trained only on shot-based feature data. Therefore, integrating audio features with the key-frame-based features and adding more visual features to the shot-based features remain to be done. Second, to address the data imbalance problem, the negative data instances were randomly sampled. This is risky because it can enlarge the difference between the distributions of the training and testing sets; even when the training performance is good, as in our experiments, the testing results may not be as good as expected. Therefore, data sampling and data pruning deserve further investigation. Third, the results show that the ranking methods are not yet good enough, and more research on ranking the retrieved results is needed.
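The negative-undersampling step described in the abstract (keep every positive instance per concept, randomly sample roughly one third of the negatives) can be sketched as follows. This is a minimal illustration, not the authors' code; the function name, the record layout, and the fixed seed are assumptions.

```python
import random

def sample_training_set(instances, neg_fraction=1/3, seed=42):
    """Keep all positive instances and a random fraction of the negatives.

    `instances` is a list of (features, label) pairs, with label 1 for the
    target high-level feature and 0 otherwise.  The 1/3 default mirrors the
    ratio reported in the abstract; everything else is illustrative.
    """
    positives = [x for x in instances if x[1] == 1]
    negatives = [x for x in instances if x[1] == 0]
    rng = random.Random(seed)
    n_keep = max(1, int(len(negatives) * neg_fraction))
    kept = rng.sample(negatives, n_keep)
    return positives + kept
```

As the abstract notes, this kind of random undersampling is risky: it can shift the training distribution away from the test distribution, which is one of the open issues the authors list.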
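The RANK fusion step used in FIU-UM-1 and FIU-UM-5 could look roughly like the sketch below: turn each test instance's Euclidean distance (over the two selected feature dimensions) to the positive training set into a similarity, then blend it with the classifier's score. The equal weighting, the min-max normalization, and the use of the mean distance are assumptions; the abstract only states that these distances are "integrated" with the MCA-based classifier scores.

```python
import math

def rank_scores(classifier_scores, test_feats, positive_feats, weight=0.5):
    """Fuse classifier scores with distance-based scores (RANK sketch).

    `test_feats` and `positive_feats` are sequences of 2-D points (the two
    selected features); `classifier_scores` has one score per test instance.
    """
    def normalize(vals):
        lo, hi = min(vals), max(vals)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in vals]

    # Mean Euclidean distance from each test instance to the positive set.
    dists = [sum(math.dist(t, p) for p in positive_feats) / len(positive_feats)
             for t in test_feats]
    # Smaller distance to the positives => larger similarity score.
    sims = [1.0 - d for d in normalize(dists)]
    clf = normalize(classifier_scores)
    return [weight * c + (1 - weight) * s for c, s in zip(clf, sims)]
```

A test instance that both scores highly under the classifier and sits close to the positive training set ends up at the top of the final ranking.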
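The SVMC step of FIU-UM-4 relies on an SVM with a chi-square kernel. Below is a minimal NumPy implementation of the exponential chi-square kernel commonly paired with histogram-style visual features; the exponential form and the gamma default are assumptions, since the abstract does not specify the exact variant, and the inputs are assumed non-negative (e.g. normalized histograms).

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-square kernel:
    k(x, y) = exp(-gamma * sum_i (x_i - y_i)**2 / (x_i + y_i)).

    Returns the (n, m) kernel matrix for row-wise feature vectors X and Y.
    """
    X = np.asarray(X, dtype=float)[:, None, :]   # shape (n, 1, d)
    Y = np.asarray(Y, dtype=float)[None, :, :]   # shape (1, m, d)
    num = (X - Y) ** 2
    den = X + Y
    # Bins that are empty in both histograms (0/0) contribute nothing.
    safe_den = np.where(den > 0, den, 1.0)
    chi2 = np.where(den > 0, num / safe_den, 0.0).sum(axis=-1)
    return np.exp(-gamma * chi2)
```

A precomputed kernel matrix like this can be fed to any SVM solver that accepts custom Gram matrices (e.g. scikit-learn's `SVC(kernel="precomputed")`), and the resulting decision values would serve as the additional SVMC scores.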

Original language: English
Title of host publication: 2009 TREC Video Retrieval Evaluation Notebook Papers
Publisher: National Institute of Standards and Technology
State: Published - Jan 1 2009
Event: TREC Video Retrieval Evaluation, TRECVID 2009 - Gaithersburg, MD, United States
Duration: Nov 16 2009 – Nov 17 2009

Other

Other: TREC Video Retrieval Evaluation, TRECVID 2009
Country: United States
City: Gaithersburg, MD
Period: 11/16/09 – 11/17/09

Fingerprint

Feature extraction
Classifiers
Testing
Musical instruments
Decision trees
Support vector machines
Sampling

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Software

Cite this

Lin, L., Chen, C., Shyu, M.-L., Fleites, F., & Chen, S. C. (2009). Florida International University and University of Miami TRECVID 2009-high level feature extraction. In 2009 TREC Video Retrieval Evaluation Notebook Papers. National Institute of Standards and Technology.
