Cluster validation for unsupervised stochastic model-based image segmentation

David A. Langan, James W. Modestino, Jun Zhang

Research output: Contribution to journal › Article

55 Citations (Scopus)

Abstract

Image segmentation is an important early processing stage in many image analysis problems. Often, it must be done in an unsupervised fashion: training data is not available, and the class-conditioned feature vectors must be estimated directly from the data. A major problem in such applications is determining the number of classes actually present in an image. This problem, called the cluster validation problem, remains essentially unsolved. In this paper, we investigate the cluster validation problem associated with the use of a previously developed unsupervised segmentation algorithm based upon the expectation-maximization (EM) algorithm. More specifically, we consider several well-known information-theoretic criteria (ICs) as candidate solutions to the validation problem when used in conjunction with this EM-based segmentation scheme. We show that these criteria generally provide inappropriate solutions due to the domination of the penalty term by the associated log-likelihood function. As an alternative, we propose a model-fitting technique in which the complete-data log-likelihood functional is modeled as an exponential function of the number of classes. The estimated number of classes is then determined in a manner similar to finding the rise time of the exponential function. This new validation technique is shown to be robust and to outperform the ICs in our experiments. Experimental results for both synthetic and real-world imagery are detailed.
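
The model-fitting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the saturating form a - b * exp(-K / tau), the grid search over tau, the 90% rise-time threshold, and the function names `fit_saturating_exponential` and `rise_time_classes` are all assumptions made here for illustration. The sketch takes log-likelihood values already computed for each candidate number of classes K, fits the exponential, and reports the smallest K at which the fitted curve has covered most of its rise.

```python
import numpy as np


def fit_saturating_exponential(ks, loglik, taus=None):
    """Fit loglik(K) ~ a - b * exp(-K / tau).

    Assumed model form (not taken from the paper). For a fixed tau the
    model is linear in (a, b), so tau is chosen by grid search and
    (a, b) by ordinary least squares.
    """
    ks = np.asarray(ks, dtype=float)
    loglik = np.asarray(loglik, dtype=float)
    if taus is None:
        taus = np.linspace(0.1, 10.0, 1000)  # hypothetical search range
    best = None
    for tau in taus:
        design = np.column_stack([np.ones_like(ks), -np.exp(-ks / tau)])
        coef, *_ = np.linalg.lstsq(design, loglik, rcond=None)
        sse = float(np.sum((design @ coef - loglik) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(coef[0]), float(coef[1]), float(tau))
    _, a, b, tau = best
    return a, b, tau


def rise_time_classes(ks, loglik, fraction=0.9):
    """Smallest K at which the fitted curve has covered `fraction` of the
    rise from its value at min(ks) to its asymptote a.

    The 0.9 default mirrors the usual 90% rise-time convention; it is an
    assumption here, not a value from the paper.
    """
    ks = np.asarray(ks, dtype=float)
    a, b, tau = fit_saturating_exponential(ks, loglik)
    fitted = a - b * np.exp(-ks / tau)
    start = a - b * np.exp(-ks.min() / tau)
    progress = (fitted - start) / (a - start)
    reached = ks[progress >= fraction]
    # Fall back to the largest candidate if the curve never saturates.
    return int(reached.min()) if reached.size else int(ks.max())


if __name__ == "__main__":
    # Synthetic log-likelihood values, generated only for this example.
    ks = np.arange(1, 9)
    loglik = -1000.0 - 5000.0 * np.exp(-ks / 1.0)
    print(rise_time_classes(ks, loglik))
```

The log-likelihood of a mixture fit cannot decrease as classes are added, which is why a penalty term that is small relative to it tends to favor too many classes. Reading the estimate off the point where the fitted curve flattens avoids comparing the two terms directly.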

Original language: English
Pages (from-to): 180-195
Number of pages: 16
Journal: IEEE Transactions on Image Processing
Volume: 7
Issue number: 2
DOI: 10.1109/83.660995
State: Published - Dec 1 1998

Keywords

  • Clustering methods
  • Image analysis
  • Image classification
  • Image processing
  • Image segmentation
  • Markov processes
  • Maximum-likelihood estimation
  • Stochastic fields

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Graphics and Computer-Aided Design
  • Software
  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition

Cite this

Cluster validation for unsupervised stochastic model-based image segmentation. / Langan, David A.; Modestino, James W.; Zhang, Jun.

In: IEEE Transactions on Image Processing, Vol. 7, No. 2, 01.12.1998, p. 180-195.

@article{30f95f9b85a9493a88006215ecd5710a,
title = "Cluster validation for unsupervised stochastic model-based image segmentation",
abstract = "Image segmentation is an important early processing stage in many image analysis problems. Often, it must be done in an unsupervised fashion: training data is not available, and the class-conditioned feature vectors must be estimated directly from the data. A major problem in such applications is determining the number of classes actually present in an image. This problem, called the cluster validation problem, remains essentially unsolved. In this paper, we investigate the cluster validation problem associated with the use of a previously developed unsupervised segmentation algorithm based upon the expectation-maximization (EM) algorithm. More specifically, we consider several well-known information-theoretic criteria (ICs) as candidate solutions to the validation problem when used in conjunction with this EM-based segmentation scheme. We show that these criteria generally provide inappropriate solutions due to the domination of the penalty term by the associated log-likelihood function. As an alternative, we propose a model-fitting technique in which the complete-data log-likelihood functional is modeled as an exponential function of the number of classes. The estimated number of classes is then determined in a manner similar to finding the rise time of the exponential function. This new validation technique is shown to be robust and to outperform the ICs in our experiments. Experimental results for both synthetic and real-world imagery are detailed.",
keywords = "Clustering methods, Image analysis, Image classification, Image processing, Image segmentation, Markov processes, Maximum-likelihood estimation, Stochastic fields",
author = "Langan, {David A.} and Modestino, {James W.} and Zhang, Jun",
year = "1998",
month = "12",
day = "1",
doi = "10.1109/83.660995",
language = "English",
volume = "7",
pages = "180--195",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "2",

}

TY - JOUR

T1 - Cluster validation for unsupervised stochastic model-based image segmentation

AU - Langan, David A.

AU - Modestino, James W.

AU - Zhang, Jun

PY - 1998/12/1

Y1 - 1998/12/1

AB - Image segmentation is an important early processing stage in many image analysis problems. Often, it must be done in an unsupervised fashion: training data is not available, and the class-conditioned feature vectors must be estimated directly from the data. A major problem in such applications is determining the number of classes actually present in an image. This problem, called the cluster validation problem, remains essentially unsolved. In this paper, we investigate the cluster validation problem associated with the use of a previously developed unsupervised segmentation algorithm based upon the expectation-maximization (EM) algorithm. More specifically, we consider several well-known information-theoretic criteria (ICs) as candidate solutions to the validation problem when used in conjunction with this EM-based segmentation scheme. We show that these criteria generally provide inappropriate solutions due to the domination of the penalty term by the associated log-likelihood function. As an alternative, we propose a model-fitting technique in which the complete-data log-likelihood functional is modeled as an exponential function of the number of classes. The estimated number of classes is then determined in a manner similar to finding the rise time of the exponential function. This new validation technique is shown to be robust and to outperform the ICs in our experiments. Experimental results for both synthetic and real-world imagery are detailed.

KW - Clustering methods

KW - Image analysis

KW - Image classification

KW - Image processing

KW - Image segmentation

KW - Markov processes

KW - Maximum-likelihood estimation

KW - Stochastic fields

UR - http://www.scopus.com/inward/record.url?scp=0031996422&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0031996422&partnerID=8YFLogxK

U2 - 10.1109/83.660995

DO - 10.1109/83.660995

M3 - Article

C2 - 18267392

AN - SCOPUS:0031996422

VL - 7

SP - 180

EP - 195

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

SN - 1057-7149

IS - 2

ER -