Event detection is essential for high-level semantic indexing and selective browsing of video clips. However, low-level audio-visual feature descriptors alone generally fail to identify events satisfactorily because of the semantic gap between such features and high-level concepts. In this paper, we propose an approach for detecting exciting events in soccer video using multi-level descriptors and a classification algorithm. Specifically, a set of algorithms is developed to efficiently extract meaningful mid-level descriptors that bridge the semantic gap and facilitate comprehensive video content analysis. A data classification algorithm is then applied to the combination of multimodal mid-level descriptors and low-level feature descriptors to detect events. The effectiveness and efficiency of the proposed framework are demonstrated on a large collection of soccer video data with different styles produced by different broadcasters.