Generation and Evaluation of a Generalized Classifier for Prostate Cancer

Data and Method from
Shukla-Dave A, Hricak H, Kattan MW, Pucar D, Kuroiwa K, Chen HN, Spector J, Koutcher JA, Zakian KL, Scardino PT
The utility of magnetic resonance imaging and spectroscopy for predicting insignificant prostate cancer: an initial analysis.
BJU Int. 99(4):786-793 (April 2007)

Since single parameters ("variables"), such as PSA concentration, prostate volume, the percentage of positive biopsies, or the overall MRI/MRSI score, do not provide sufficient sensitivity and specificity on their own (e.g. in predicting the insignificance of prostate cancers), models have been constructed that combine such single variables. The model parameters are
  1. the weights with which each single variable is chosen to contribute to the lumped parameter (henceforth called "generalized classifier") and
  2. the distribution of the lumped parameter over the investigated population.
The parameters are fitted such that the clinical findings are best reproduced.

A particularly transparent model representation has been published by the above-mentioned authors in the form of a nomogram. Here a mathematically identical alternative is presented: instead of a nomogram, curves specify how much each single variable contributes to the generalized classifier.

Construction of the Model

In the first step, the weights with which the single variables enter the generalized classifier are defined (upper part of the figure below). The weight curves are Taylor expansions whose numerical coefficients are fitted in the calibration process described below.

MRI/MRSI model: Generation and evaluation of a classifier unified from specific investigations
Figure: The MRI/MRSI model, combining the single parameters PSA concentration, prostate volume, percentage of positive biopsies, and overall MRI/MRSI score. The two probability functions at the bottom of the figure are the ones defined earlier: probability density function f = f[x, mu, sig] and the cumulative probability function Cdf = Cdf[x, mu, sig] with mu = 125 and sig = 30.

Example of how, with the help of this figure, a generalized classifier point value x is calculated from a set of single variables:

  1. A (total) PSA concentration of 7.5 ng/mL contributes 23 points to the generalized classifier,
  2. a prostate volume of 40 cm3 contributes 15 points, and
  3. when no biopsy specimen shows cancer, this amounts to 35 points,

    which adds up to 73 points.

  4. When the MRI/MRSI result is "probably insignificant" (overall MRI/MRSI score = 1), this contributes an additional 88 points to the generalized classifier.
The total classifier point number in the given example is 73 + 88 = 161 points. As one can see, the MRI/MRSI overall score is given a larger weight (88 points) than any of the other diagnostic variables.
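The arithmetic of this example can be retraced in a few lines of Python. This is only a sketch: the point values are the ones quoted above for this one example patient, and the dictionary keys and the function name are illustrative, not part of the published model.

```python
# Point contributions read off the figure for the example patient in the text.
# The key names are illustrative labels, not identifiers from the source.
example_points = {
    "psa_7.5_ng_per_ml": 23,
    "prostate_volume_40_cm3": 15,
    "no_positive_biopsy": 35,
    "mri_mrsi_score_1": 88,
}


def classifier_points(contributions):
    """Sum the single-variable contributions into the classifier value x."""
    return sum(contributions.values())


# Subtotal of the three variables other than the MRI/MRSI score.
clinical_only = sum(
    points for name, points in example_points.items()
    if name != "mri_mrsi_score_1"
)
total = classifier_points(example_points)

print(clinical_only, total)  # 73 161
```

For other patients the individual point values would have to be read from the weight curves in the upper part of the figure; they are not reproduced here.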

The second step of the assessment (lower part of the figure) is as empirical as the first one: it is assumed that the classifier is normally distributed about a maximum at mu with a width sig. The probability density f[x, mu, sig] is shown in the lower left-hand part of the figure, the cumulative probability Cdf[x, mu, sig] in the lower right-hand part.
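The two functions of the normal distribution can be written down directly. The following is a minimal sketch using only the Python standard library; the function names f and Cdf follow the notation of the figure caption, mu = 125 and sig = 30 are the values given there, and x = 161 is the point total of the example above.

```python
import math


def f(x, mu, sig):
    """Probability density of a normal distribution centred at mu, width sig."""
    z = (x - mu) / sig
    return math.exp(-0.5 * z * z) / (sig * math.sqrt(2.0 * math.pi))


def Cdf(x, mu, sig):
    """Cumulative probability of the same normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sig * math.sqrt(2.0))))


mu, sig = 125.0, 30.0   # values quoted in the figure caption
x = 161.0               # point total of the worked example

print(f(x, mu, sig))
print(round(Cdf(x, mu, sig), 3))  # 0.885
```

At x = mu the cumulative probability is 0.5 by symmetry, which is a quick check that the two parameters have been passed in the right order.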

Calibration of the Model

The model is calibrated by adjusting the coefficients of the weight functions together with mu and sig. They are fitted such that, for all point values of the generalized classifier, the predicted probabilities come as close as possible to the actual probabilities.
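How such a fit might look can be sketched under strong simplifying assumptions. The text does not state which closeness measure or optimizer was used, so the squared-error criterion and the grid search below are assumptions, the data are hypothetical toy values (not study data), and only mu and sig are fitted while the weight coefficients are treated as already fixed.

```python
import math


def Cdf(x, mu, sig):
    """Cumulative normal probability, used here as the predicted probability."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sig * math.sqrt(2.0))))


# Hypothetical toy data, NOT from the study:
# (classifier points, observed outcome coded as 0 or 1).
toy_data = [(60, 0), (90, 0), (110, 0), (120, 1),
            (135, 0), (150, 1), (165, 1), (190, 1)]


def squared_error(mu, sig, data):
    """Assumed closeness measure: sum of squared prediction errors."""
    return sum((Cdf(x, mu, sig) - y) ** 2 for x, y in data)


def calibrate(data, mu_grid, sig_grid):
    """Grid search for the (mu, sig) pair with the smallest squared error."""
    candidates = ((mu, sig) for mu in mu_grid for sig in sig_grid)
    return min(candidates, key=lambda p: squared_error(p[0], p[1], data))


mu_fit, sig_fit = calibrate(toy_data, range(80, 181, 5), range(5, 61, 5))
print(mu_fit, sig_fit, squared_error(mu_fit, sig_fit, toy_data))
```

A full calibration would vary the coefficients of the weight curves as well, which changes the point value x of every patient at each step; that part is omitted here.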

Validation of the Model

The model is validated with "leave-one-out" (also called "jack-knife") cross-validation: a single patient's data from the data set of all patients are used as the validation data, and the remaining data are used for calibration, i.e. as the data with which the adjustable coefficients, mu, and sig are fitted. This is repeated such that each patient's data are used exactly once as the validation data.
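The leave-one-out procedure itself is independent of the model being validated and can be sketched as follows. The records are hypothetical placeholders (not study data), and fit is a deliberately trivial stand-in for the real calibration step, chosen only to keep the example short.

```python
def leave_one_out(records):
    """Yield (calibration_set, validation_record) once for every record."""
    for i, held_out in enumerate(records):
        yield records[:i] + records[i + 1:], held_out


# Hypothetical placeholder records, NOT study data:
# (classifier points, observed outcome coded as 0 or 1).
patients = [(70, 0), (100, 0), (130, 1), (160, 1)]


def fit(calibration_set):
    """Stand-in 'model': fraction of positive outcomes in the calibration set."""
    return sum(y for _, y in calibration_set) / len(calibration_set)


errors = []
for calibration_set, (x, y) in leave_one_out(patients):
    predicted = fit(calibration_set)      # refit without the held-out patient
    errors.append((predicted - y) ** 2)   # score on the held-out patient only

print(len(errors), sum(errors) / len(errors))
```

The essential point is that the held-out patient never enters the fit used to predict that patient's outcome, so the averaged error estimates performance on unseen data.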



version: 4.5.2007
Joachim Gruber