Social media websites such as Instagram and YouTube have witnessed exponential growth in multimedia data, including images and videos. As the volume of multimedia data grows, efficient processing of such big data becomes increasingly important. Meanwhile, many classifiers have been proposed for a variety of data types. However, how to combine these classifiers efficiently remains a challenging research issue. In this paper, we propose a novel scalable framework for classifier ensemble that uses a set of judgers generated from the training and validation results. These judgers are ranked and assembled into a hierarchically structured decision model. The proposed ensemble framework is deployed on an Apache Spark cluster for efficient data processing. Our experimental results on multimedia datasets containing different actions show that the proposed ensemble framework performs better than several state-of-the-art model fusion approaches.