Benchmarking of five commercial deformable image registration algorithms for head and neck patients

Jason Pukala, Perry Johnson, Amish P. Shah, Katja M. Langen, Frank J. Bova, Robert J. Staton, Rafael R. Mañon, Patrick Kelly, Sanford L. Meeks

Research output: Contribution to journal › Article

25 Citations (Scopus)

Abstract

Benchmarking is a process in which standardized tests are used to assess system performance. The data produced in the process are important for comparative purposes, particularly when considering the implementation and quality assurance of deformable image registration (DIR) algorithms. In this work, five commercial DIR algorithms (MIM, Velocity, RayStation, Pinnacle, and Eclipse) were benchmarked using a set of ten virtual phantoms. The phantoms were previously developed from CT data collected from real head and neck patients. Each phantom includes a start-of-treatment CT dataset, an end-of-treatment CT dataset, and the ground-truth deformation vector field (DVF) that links them together. These virtual phantoms were imported into the commercial systems and registered deformably. The resulting DVFs were compared with the ground-truth DVF to determine the target registration error (TRE) at every voxel within the image set. Real treatment plans were also recalculated on each end-of-treatment CT dataset, and the dose was transferred according to both the ground-truth and test DVFs. Dosimetric changes were assessed, and TRE was correlated with changes in the dose-volume histogram (DVH) of individual structures. In the first part of the study, the results show mean TREs on the order of 0.5 mm to 3 mm across all phantoms and regions of interest (ROIs). In certain instances, however, misregistrations occurred that produced mean and maximum errors of up to 6.8 mm and 22 mm, respectively. In the second part of the study, dosimetric error was strongly correlated with TRE in the brainstem but weakly correlated with TRE in the spinal cord. Several instructive cases were assessed that highlight the interplay between the direction and magnitude of the TRE and the dose distribution, including the slope of dosimetric gradients and the distance to critical structures. This information can help clinicians implement and test their algorithms, and understand the strengths and weaknesses of a dose-adaptive approach.
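The per-voxel TRE described in the abstract is simply the Euclidean norm of the difference between the test and ground-truth displacement vectors at each voxel. A minimal sketch in Python/NumPy (hypothetical array names and shapes, not the authors' actual analysis code):

```python
import numpy as np

def voxelwise_tre(dvf_test, dvf_truth):
    """Target registration error at every voxel.

    Both DVFs are assumed to be (Z, Y, X, 3) arrays of displacement
    components in mm; the TRE is the Euclidean norm of their
    per-voxel difference.
    """
    return np.linalg.norm(dvf_test - dvf_truth, axis=-1)

# Toy example: a two-voxel field where the test DVF is off by 3 mm in x.
truth = np.zeros((1, 1, 2, 3))
test = truth.copy()
test[..., 0] += 3.0

tre = voxelwise_tre(test, truth)
print(tre.mean(), tre.max())  # mean and max TRE in mm -> 3.0 3.0
```

In practice the mean and maximum would be taken over the voxels inside each ROI contour rather than the whole image, matching the per-structure statistics reported in the study.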

Original language: English (US)
Pages (from-to): 25-40
Number of pages: 16
Journal: Journal of Applied Clinical Medical Physics
Volume: 17
Issue number: 3
ISSN: 1526-9914
State: Published - 2016


Keywords

  • Adaptive radiotherapy
  • Deformable image registration
  • Head and neck cancer
  • Quality assurance
  • Virtual phantoms

ASJC Scopus subject areas

  • Radiology, Nuclear Medicine and Imaging
  • Radiation
  • Instrumentation

Cite this

Pukala, J., Johnson, P., Shah, A. P., Langen, K. M., Bova, F. J., Staton, R. J., ... Meeks, S. L. (2016). Benchmarking of five commercial deformable image registration algorithms for head and neck patients. Journal of Applied Clinical Medical Physics, 17(3), 25-40.
