RadioGraphics
(Radiographics. 1999;19:765-782.)
© RSNA, 1999


IMAGING & THERAPEUTIC TECHNOLOGY

The AAPM/RSNA Physics Tutorial for Residents¹

Counting Statistics

Mark S. Rzeszotarski, PhD

1 From the Departments of Radiology, Case Western Reserve University and MetroHealth Medical Center, 2500 MetroHealth Dr, Cleveland, OH 44109-1998. From the AAPM/RSNA Physics Tutorial at the 1997 RSNA scientific assembly. Received January 13, 1999; revision requested February 22 and received March 2; accepted March 5. Address reprint requests to the author.


    Abstract
The low radiation dose rates used in nuclear medicine necessitate image formation and measurements that are severely count limited. This limitation may mask our ability to perceive contrast in an image or may affect our confidence in quantitative functional measurements. The randomness of the signal can be described by using the Poisson probability distribution with its associated mean and variance. The validity of a measurement and uncertainties in a result can be determined by examining the count statistics. If multiple measurements are used to derive a result, confidence levels can be determined by examination of the propagation of errors. The statistical properties of the detected signal can also be evaluated to determine if the equipment is functioning properly. For example, the χ² test can be used to determine if there is too much or too little variability in count samples. Finally, image formation with limited numbers of photons results in noisy images that may be difficult to interpret. An understanding of the trade-offs between contrast, noise, and object size is required to set proper image acquisition parameters and thereby ensure that the information required to make a diagnosis is contained in the final image.

Index Terms: Physics • Radionuclide imaging, quality assurance • Statistical analysis


    INTRODUCTION
Counting statistics are an important component of nuclear medicine. Poor count statistics can cause increased uncertainties in measurements or obscure the conspicuity of abnormal findings in an image. The objective of this article is to provide an overview of count statistics with a little mathematics included to drive home the importance of the concepts. Topics include errors in measurements, types of probability distributions, frequency distributions, statistical parameters, properties of probability distributions, percent uncertainty, confidence intervals, hypothesis testing, propagation of errors, and image statistics and detection. The latter section examines how visual perception is affected by the limited count statistics in nuclear medicine images.


    ERRORS IN MEASUREMENTS
An error in a measurement occurs if the measured value deviates from the true value. Although the true value may not be known with absolute certainty in the case of radioactive decay, one can get a good estimate of the truth if precautions are taken to prevent errors from occurring in the measurement. There are three types of errors commonly associated with taking measurements: random errors, systematic errors, and blunders.

Random errors occur because of the natural statistical variability of radioactive decay. This fact is particularly true in nuclear medicine because radioactive decay is a random event. For example, we expect that half of a collection of technetium-99m atoms will undergo radioactive decay every 6 hours, but we never know exactly when any single atom will decay. In fact, the probability of a single atom decaying is very low, on the order of 0.000032 per second for Tc-99m. As a result, when a sample in a well counter is counted or the total counts in a region of interest in an image are calculated, the natural fluctuations in radioactive decay will yield a different measurement each time. This randomness in the data can be controlled to some extent by understanding the relationship between the total number of counts and the deviation from this total, a topic discussed later in this article.
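The per-second decay probability quoted above follows from the half-life through the decay constant λ = ln 2 / T½. The sketch below is illustrative only: it assumes a half-life of exactly 6 hours for Tc-99m, as in the text, and the function name is hypothetical rather than taken from the article.

```python
import math

def decay_probability_per_second(half_life_hours):
    """Approximate probability that a single atom decays in any 1-second interval.

    Uses the decay constant lambda = ln(2) / half-life. For lambda much
    smaller than 1 per second, this is a good approximation to the
    per-second decay probability.
    """
    half_life_seconds = half_life_hours * 3600.0
    return math.log(2) / half_life_seconds

# Assumed half-life of 6 hours for Tc-99m, as stated in the text.
tc99m = decay_probability_per_second(6.0)
print(f"{tc99m:.6f} per second")  # prints 0.000032 per second
```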

Systematic errors occur when there is some bias in a measurement. This bias introduces an underestimate or overestimate of the true measurement. The classic example of a systematic error involves a warped ruler that has been stretched so that its markings are farther apart than they should be. This ruler provides consistent measurements, but they are always too small. In digital imaging, the electronic calipers used to make measurements may be improperly calibrated; the result is a consistent underestimate or overestimate of the true size. This error is manifested in nuclear medicine images when the pixel or voxel size is in error. Another example of a systematic error occurs if the region of interest is consistently drawn too large, in which case one may overestimate the true size of an organ or make errors in assessing the uptake or subtracting a background count. In addition, an improperly calibrated well counter or gamma camera may acquire too many or too few counts if the windows are inappropriately chosen, dead time is affecting the measurement, or electronic noise is affecting the instrument. Dead time represents the time it takes a detector to process an incoming gamma ray. During this time, other incident gamma rays will not be counted. If the count rate is high, fewer events are counted than are actually incident on the detector; the result is a systematic underestimate of the true count rate. Electronic noise can cause false triggering of the counting electronics and result in an overestimate of the true number of gamma ray events. When systematic errors are present, the measurement is consistently biased with either too many or too few counts produced.

The third type of measurement error is called a blunder. This error can take the form of misreading a number from a multichannel analyzer or an image calculation result, recording the information from the wrong patient study, or using the wrong region of interest in an imaging measurement. It is surprising how often this type of error occurs. To detect these otherwise hidden errors, one should be watchful for results that are not consistent with what is expected on the basis of the patient's clinical information and history.

Figure 1 illustrates the types of measurement errors that can occur. The randomness or dispersion of the data can be reduced by counting the sample for a longer time, which results in a greater number of counts. Bias is introduced when a well counter does not faithfully measure the counts incident upon its detector. The bias may be due to dead time losses, which are common in well counters.



Figure 1.  Measurement errors. Graphs show the results of taking 20 hypothetical measurements from the same sample in several different well counters. Four scenarios are shown. Each square represents one measurement. Two properties are demonstrated: randomness and bias. In the upper left graph, the data are random (spread out), and on average the measurements underestimate the true value. The upper right graph shows an equally random set of measurements, but now, on average, the true value is represented, and the bias has been eliminated. The lower left graph demonstrates low variability (good precision), but again a systematic error is introduced, which causes all measurements to be higher than expected, perhaps due to excessive background counts. The lower right graph shows a set of measurements with good precision and without bias, qualities that are our goal. Not shown is a blunder, which was outside the range of the graphs, whereupon it was recognized as an incorrect recording and the measurement was repeated.

 
The concept of accuracy is used to reflect the degree of agreement of a measurement with its true value. Accuracy is a measure of the degree of unbiasedness; thus, an accurate measurement has no systematic error. Precision refers to the degree of randomness or variability of a measurement. A precise measurement is one that has a low degree of randomness. Random errors are caused by the limited count statistics of nuclear medicine images. Precision can be improved by increasing the number of counts in an image, regions of interest, or total sample counts. Inaccuracies due to systematic errors can be minimized by proper calibration and routine quality control of instruments and by drawing regions of interest accurately.


    TYPES OF PROBABILITY DISTRIBUTIONS
Probability is the ratio of the number of times a certain result occurs in repeated trials to the total number of such trials in an experiment. The probability of a certain event represents the likelihood that the event will occur. Statistics provide the mathematical framework for interpreting statements of probability. For example, an individual radioactive atom can either decay away or not if observed for a specific time. Thus, two possible states exist. The same is true for flipping a coin; either heads or tails will result. Each of these examples can be studied by using probabilities. In the case of radioactive decay, the probability of decay is normally quite small, on the order of one in a hundred thousand to one in a million or so in any 1 second for typical clinical radioisotopes. For the coin, the two possible outcomes are equally likely. The mathematics used to describe both of these processes make use of a statistical property called the binomial distribution. It applies whenever each of a fixed number of independent trials has only two possible outcomes. In medicine, the binomial distribution applies especially well to those situations in which two different outcomes are possible: normal or abnormal, malignant or benign, survival or death, positive or negative, and many other applications.

A special form of the binomial distribution, the Poisson distribution, exists for situations in which the chances of one specific outcome occurring are very small. Specifically, this distribution applies to situations in which there is a large number of "participants" but the probability of an event occurring to an individual participant is very low. This is the situation for radioactive decay because the likelihood of any one atom decaying per unit time is typically very small. In this case, the mathematics describing the binomial distribution can be simplified to a form known as the Poisson distribution. Each of these distributions is discrete, which means that they take on only integer values. For example, there can be only zero, one, two, three, or more atoms decaying in a specific interval. It is not possible to have 3.7 decays because there are only two possible states: decay or not decay. This property is characteristic of discrete distribution functions like the binomial and Poisson distributions.
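The claim that the binomial distribution reduces to the Poisson distribution for many participants and a small individual probability can be checked numerically. The following is a minimal sketch; the number of atoms and the per-atom decay probability are hypothetical values chosen only for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent two-outcome trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, mu):
    """Probability of exactly k events when the expected number of events is mu."""
    return mu**k * math.exp(-mu) / math.factorial(k)

# Hypothetical sample: 100,000 atoms, each with a 0.00003 chance of decaying
# during the observation interval, so about 3 decays are expected.
n_atoms = 100_000
p_decay = 0.00003
mu = n_atoms * p_decay

for k in range(7):
    b = binomial_pmf(k, n_atoms, p_decay)
    q = poisson_pmf(k, mu)
    print(f"k={k}: binomial={b:.5f}  Poisson={q:.5f}")
```

Under these assumptions the two columns agree closely, and only integer values of k are meaningful, in keeping with the discrete nature of both distributions.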

A third distribution, called the Gaussian or normal distribution, is also commonly used in medicine and is also applicable to many phenomena in nature. Examples include height, weight, and age distributions and distributions of total serum cholesterol levels. The Gaussian distribution is continuous rather than discrete, taking on any value rather than only integer values. As a result, the distribution can take on values less than zero, which cannot occur with the binomial or Poisson distribution. The Gaussian distribution is also used to describe the statistical theory of errors, which is explored later in this article.


    FREQUENCY DISTRIBUTIONS
A gamma camera uniformity flood image may appear noisy due to the natural fluctuation in counts during radioactive decay (Fig 2a). A graph of the measured pixel values across a single row near the center of the image can represent this fluctuation (Fig 2b). One can construct a more informative graph of the fluctuations in the entire image by plotting how often certain measurements are observed. This graph is called a histogram or frequency distribution graph (Fig 2c). A histogram is a pictorial description of a series of measurements. It is a useful tool because it provides a visual interpretation of how often certain ranges of measurements are observed. Histograms provide a quick visual summary of the distribution of measurement values and can be used to look for outliers like blunders or to check for systematic errors like instrument miscalibrations.



Figure 2.  Frequency distribution. (a) Gamma camera uniformity flood image shows noise due to the natural fluctuations caused by radioactive decay. The mean pixel count for the 128 × 128 matrix image is 100 (1.6 million total counts). (b) Graph shows the pixel values observed along a single row of the image near the center. The x axis represents the position across the row from left to right. The y axis represents the recorded pixel values across the image. On average, the data are centered around the mean value of 100 but fluctuate in a random manner above and below this value. (c) Graph shows the distribution of pixel values for the entire image. The x axis represents the range of possible pixel values. The y axis represents the frequencies with which particular pixel values occur. Each black bar represents a grouping of data for several possible pixel values. Although the commonest pixel values are near the mean of 100, many pixels have values substantially higher or lower than the mean, a distribution characteristic of radioactive decay.

 



 
The frequency distribution can be normalized so that the y axis represents the probability of occurrence of a particular measurement. In this case, the graph is known as a probability density function. These functions can be discrete or continuous and can be used to graphically or mathematically describe the binomial, Poisson, and Gaussian distributions.
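Normalizing a frequency distribution amounts to dividing each frequency by the total number of measurements so that the values sum to 1. A minimal sketch follows; the pixel values are hypothetical and the helper name is an assumption made for this example.

```python
from collections import Counter

def normalized_histogram(values):
    """Map each observed value to its relative frequency (fractions sum to 1)."""
    counts = Counter(values)
    total = len(values)
    return {value: counts[value] / total for value in sorted(counts)}

# Hypothetical pixel counts from a small patch of a flood image.
pixels = [98, 101, 100, 95, 100, 104, 100, 98, 107, 101]

for value, probability in normalized_histogram(pixels).items():
    print(f"pixel value {value}: probability {probability:.1f}")
```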


    STATISTICAL PARAMETERS
It is important to recognize the difference between a parameter and a statistic. A parameter is some characteristic or property intrinsic to a population. A statistic is a characteristic of a sample drawn from the population. Statistics are used to draw inferences about the true population parameters, which might not be known exactly. Such is the case for radioactive decay. The total counts in a region of interest represent a statistical estimate of the true activity of an organ, which cannot be determined exactly. As long as a statistic is representative of the parameter that is being estimated, the measurement is valid. However, random errors introduce sampling variability, and systematic errors may cause overestimation or underestimation of the true parameter.

There are a number of statistics that are commonly used to describe properties of the distributions. The first of these statistics are measures of central tendency. The mean is the arithmetic average of a set of measurements. It is computed by summing up all measurement values and then dividing by the number of measurements. The median is the middle value if there is an odd number of measurements. It is the average of the two middle values when an even number of measurements has been taken. The median separates the data so that half of the measurements are greater than this value and half are less than this value. The mode is the most commonly observed value. Figure 3 provides examples of calculations for the mean, median, and mode.



Figure 3.  Measures of central tendency. The mean is the average of all measurements. The median is the middle value when data are sorted in ascending numeric order. If there is an even number of measurements, the median is the average of the two middle values. The mode is the most frequent value.
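The three measures of central tendency can be sketched with Python's standard statistics module. The list of counts is hypothetical and is not the data shown in Figure 3.

```python
import statistics

# Hypothetical counts from repeated measurements of one sample.
counts = [102, 97, 100, 104, 97, 99, 101]

mean = statistics.mean(counts)      # sum of the values divided by their number
median = statistics.median(counts)  # middle value of the sorted data
mode = statistics.mode(counts)      # most frequently observed value

print(mean, median, mode)  # prints 100 100 97
```

With an even number of measurements, statistics.median returns the average of the two middle values, which matches the definition given above.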

 
Measures of variability are also important in describing the distribution of a set of measurements. Measures of variability indicate the dispersion of data values; the hope is that the values will be centered about the true value (the mean if no systematic errors are present). Because the true value is never known exactly, the mean is our best estimate of the true value. The range, standard deviation, and variance are used to statistically describe the dispersion of the data. Examples of calculations for these statistics are provided in Figure 4. The range is the difference between the highest and lowest measures. The deviation is calculated by examining the difference between a measurement and the mean of a set of measurements. It provides an indication of the dispersion of the data from the mean. The variance is the sum of the squared deviations of all measurements divided by the number of measurements minus one. We use one less than the total number of measurements to compensate for the fact that one must estimate both the mean and the average deviations about the mean. Thus, there is one less degree of freedom because at least one of the measurements must be used to estimate the mean value. The standard deviation is defined as the square root of the variance. Conceptually, the standard deviation is the root mean square deviation of the data from the mean. Large standard deviations imply that the data are widely dispersed (ie, random). A small standard deviation indicates good precision (reproducibility) in the measurements. In the case of radioactive decay, the randomness or dispersion is governed by the Poisson distribution, which is described next.



Figure 4.  Measures of variability. The range is the maximum value minus the minimum value. The variance is the sum of the squared deviations from the mean, divided by one less than the number of measurements used in the calculation. The standard deviation is the square root of the variance.
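The measures of variability can be written out directly from their definitions. This is a minimal sketch using hypothetical counts (not the data of Figure 4); the function name is illustrative.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

# Hypothetical counts from repeated measurements of one sample.
counts = [102, 97, 100, 104, 97, 99, 101]

data_range = max(counts) - min(counts)  # highest minus lowest measurement
variance = sample_variance(counts)      # one degree of freedom is used by the mean
std_dev = math.sqrt(variance)           # square root of the variance

print(data_range, round(variance, 2), round(std_dev, 2))
```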

 

    PROPERTIES OF PROBABILITY DISTRIBUTIONS
The Poisson distribution function is characterized by a single parameter µ, which is the mean or average value. In the case of radioactive decay, µ is the expected number of radioactive disintegrations recorded during some specified time t. The possible number of decay events recorded is 0, 1, 2, 3, and so on. Given that the mean value is µ, one can compute the probability of recording exactly k events using the formula attributed to Poisson:

$$P(X = k) = \frac{\mu^{k} e^{-\mu}}{k!}$$

where X is the number of events recorded and k! = k × (k − 1) × (k − 2) × . . . × 3 × 2 × 1. The denominator is called a factorial and is the product of all the integers from 1 through k. For example, if k = 5, then k! = 5 × 4 × 3 × 2 × 1 = 120. A graph of probability versus k for three different distributions is shown in Figure 5. The dispersion of values about µ is such that about 68.3% of the time the measurements will be within plus or minus the square root of µ for the Poisson distribution. The standard deviation is given the symbol σ and represents the degree of randomness or variability about the expected value µ. The Poisson distribution is unique because once µ is specified, σ is also implicitly specified because it is always equal to the square root of µ.
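The Poisson probability of recording exactly k events for a given mean µ can be evaluated numerically. The sketch below is one possible implementation, not the author's; it works in logarithms (an implementation choice made here) so that large values of k and µ do not overflow.

```python
import math

def poisson_pmf(k, mu):
    """Probability of recording exactly k events when the mean is mu.

    Computes mu**k * exp(-mu) / k! in log form; lgamma(k + 1) equals log(k!).
    """
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

# For a mean of 2, the probabilities at k = 1 and k = 2 are equal.
for k in range(7):
    print(f"k={k}: {poisson_pmf(k, 2.0):.4f}")
```

For µ = 2 the values at k = 1 and k = 2 are both about 0.27, and the probabilities over all k sum to 1.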



Figure 5.  Probability density function. Graph shows Poisson probability density functions for three values of µ (mu). The y axis reflects the probability of obtaining a measurement that has the specific value listed on the x axis. For µ = 2, the probability distribution is asymmetric (skewed) with maxima at k = 1 and k = 2 yielding the same probability of 27%. As µ increases, the distributions become broader, more symmetric, and bell shaped and the probability of observing one specific number of events is reduced because the data are more widely dispersed.

 
The Gaussian distribution is characterized by two parameters: µ, the mean value, and σ, the standard deviation. In the case of the Gaussian distribution, σ is unconstrained and can take on any value independent of µ. The formula for computing the probability of observing the measurement x for a specified µ and σ is as follows:

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right]$$

The Gaussian distribution is always symmetric and bell shaped about the mean value, whereas the Poisson distribution is asymmetric if the mean value is close to zero. The Gaussian distribution requires both the mean and standard deviation to describe it. While there is one and only one Poisson distribution at each mean value, there are an infinite number of possible Gaussian distributions because σ can take on any value.

Figure 6 illustrates the difference between the Gaussian and Poisson distributions for two mean values. The Poisson distribution takes on values for 0, 1, 2, 3, and so on because of its discrete nature, whereas the Gaussian function is continuously varying over all possible values, including values less than zero if the mean is small (eg, µ = 4).



Figure 6.  Difference between Gaussian and Poisson distributions. Graph shows two Poisson and two Gaussian probability density functions for µ = 4 and µ = 36. The Poisson function is defined only for a discrete number of events, and there is zero probability for observing less than zero events. The Gaussian function is continuous and thus takes on all values, including values less than zero as shown for the µ = 4 case. In this example, the standard deviations for the Gaussian functions have been set equal to the square root of µ so the widths of the curves are similar. Once µ exceeds 20 or so, the Gaussian and Poisson curves are nearly identical, as demonstrated for the µ = 36 example.

 
Once the mean value µ exceeds 20 or so, the Gaussian and Poisson distributions look very similar if the standard deviation for the Gaussian distribution is set equal to the square root of the mean. This fact is illustrated in Figure 6 for a mean value of 36 counts. The total counts in a region of interest of a nuclear medicine image are normally greater than 36, so the Gaussian distribution can be used as long as the standard deviation is appropriately defined and recognition of the discrete nature of the Poisson distribution is used in the analysis.
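The convergence of the two distributions can be checked by comparing them at each integer count. The sketch below is illustrative: the helper names and the choice to scan out to six standard deviations above the mean are assumptions made for this example.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability of exactly k events for mean mu (log form)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density at x for mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))

def max_difference(mu):
    """Largest gap between the two curves at integer k, with sigma = sqrt(mu)."""
    sigma = math.sqrt(mu)
    upper = int(mu + 6 * sigma)
    return max(abs(poisson_pmf(k, mu) - gaussian_pdf(k, mu, sigma))
               for k in range(upper + 1))

for mu in (4, 36):
    print(f"mu={mu}: largest difference = {max_difference(mu):.4f}")
```

The largest difference is considerably smaller for µ = 36 than for µ = 4, consistent with the curves in Figure 6.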


    PERCENT UNCERTAINTY
For a Poisson process like radioactive decay, as the total number of counts N increases, the standard deviation also increases but does so more slowly because the standard deviation grows with the square root of N. One can investigate this effect using the percent uncertainty, also known as the standard error, which is a measure that compares the standard deviation (a measure of the randomness or noise) with the mean (a measure of the signal strength). The percent uncertainty is calculated by dividing the standard deviation by N and then multiplying by 100%. For small N, the randomness of radioactive decay produces deviations that are large relative to the mean value. As N increases, the standard deviation grows more slowly than the mean and this effect is lessened. These deviations from the mean are perceived as noise, and the effect of this noise depends on how it affects the result. In nuclear medicine, typical image count densities are on the order of 100–1,000 counts per square centimeter. As a result, the noise is perceived to be great. In conventional radiographs, 10,000–100,000 or more photons are used to create the image in each square millimeter of film (10⁸ photons per square centimeter). In a photograph, there may be 1 million to 10 million photons per square millimeter; thus, the noise due to photon statistics is not perceived, even though a finite number of photons is used to construct the image (but film grain noise may be apparent).

This concept is illustrated in Figure 7. A low total count has wide dispersion relative to a high count on a percent standard deviation basis. The dispersion of data about the mean depends on the standard deviation σ, the square root of N counts, while the mean is N. Because the square root of N grows more slowly than N, the percent deviation gets smaller for large N. As a result, the visual appearance and effect of noise (dispersion of data values) decrease as the number of counts increases. The Table further demonstrates the concept by using typical counts in a region of interest.
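The relationship between total counts and percent uncertainty can be tabulated in a few lines. This is a minimal sketch; the count totals are illustrative values chosen here and are not taken from the article's Table.

```python
import math

def percent_uncertainty(n_counts):
    """Standard deviation (square root of N) expressed as a percentage of the mean (N)."""
    return 100.0 * math.sqrt(n_counts) / n_counts

# Illustrative count totals for a region of interest.
for n in (100, 1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,d} counts: {percent_uncertainty(n):.2f}%")
```

Each 100-fold increase in counts reduces the percent uncertainty by a factor of 10, for example from 10% at 100 counts to 1% at 10,000 counts.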



Figure 7.  Percent uncertainty. Graph shows three Gaussian probability density functions with mean values of 20, 100, and 1,000. The standard deviations have been set equal to the square root of the mean to match the Poisson distribution. The x axis has been normalized to represent the percent deviation from the mean value. As the mean increases, the relative dispersion of values about the mean decreases on a percentage basis.

 

 
Effect of Number of Counts on Percent Uncertainty
 
The perceived noise is less apparent for the case of large N than for small N. This concept is illustrated in Figure 8. The perception of noise in an image is largely determined by the number of counts because the noise is constrained to be the square root of the number of counts for radioactive decay. In conclusion, to reduce the appearance of noise in an image, a high total count is necessary.



Figure 8.  Effect of total counts on perceived noise. Gamma camera images (128 x 128 matrix, 2-mm section thickness) of a brain phantom show that as the total number of counts increases, the effect of noise is less obvious. The perceived variability in the image decreases as the total counts increase because, on a percentage basis, the noise is less apparent.

 

    CONFIDENCE INTERVALS
When measurements are taken from the Poisson distribution with a mean of 100, 68.3% of the time one will experimentally obtain a measurement that is within plus or minus 10 of the mean value (Fig 9). Because the standard deviation is 10 (the square root of 100), one observes that 68.3% of the time the measurement is within plus or minus one standard deviation of the mean. This property is fundamental to the Poisson distribution and to a Poisson process like radioactive decay. If the curve is extended to plus or minus two standard deviations, 95.5% of the time a measurement will fall within this interval. Owing to the natural fluctuations in the data in this Poisson process, one expects to record a value within plus or minus three standard deviations of the mean more than 99.7% of the time. This relationship between the range of values and the probability of obtaining a measurement within this range permits one to establish confidence intervals of one, two, or three standard deviations about the mean. This confidence is contingent on accepting the assumption that a Poisson process is being evaluated and that no systematic errors have been introduced into the measurement system.
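The percentages associated with one, two, and three standard deviations can be reproduced from the Gaussian approximation to the Poisson distribution. The sketch below assumes that approximation (reasonable for a mean of 100); an exact sum over the discrete Poisson probabilities would give slightly different figures.

```python
import math

def coverage(n_sigma):
    """Fraction of a Gaussian distribution within +/- n_sigma of the mean."""
    return math.erf(n_sigma / math.sqrt(2.0))

mean_counts = 100
sigma = math.sqrt(mean_counts)  # Poisson: standard deviation is sqrt(mean)

for n_sigma in (1, 2, 3):
    low = mean_counts - n_sigma * sigma
    high = mean_counts + n_sigma * sigma
    print(f"{low:.0f} to {high:.0f} counts: {100 * coverage(n_sigma):.2f}%")
```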



 
Figure 9.  Confidence intervals. Graph shows the probability density function for the Poisson distribution with a mean of 100. The Poisson distribution has a bell shape for µ > 20 or so and a dispersion or width that is constrained by the mean value. A measurement between the mean plus or minus one standard deviation is observed 68.3% of the time. Similar confidence intervals for plus or minus two or three standard deviations encompass 95.5% and more than 99.7% of the measurements, respectively.

 
Confidence intervals can be used to determine if enough counts have been obtained in a measurement for the result to have validity. For example, a confidence interval can be used to determine how many counts are needed to ensure that two measurements are within a certain percentage of each other on the basis of the randomness of radioactive decay. This concept is helpful in determining how precise a result is. For example, if the ejection fraction is 50%, it is important to know whether the result is ±5% or ±20% on the basis of the count statistics. Figure 10 provides a sample calculation for determining how many counts are necessary to ensure the validity of a calculation. One must also be watchful for systematic errors, which can introduce bias in the result as well.



 
Figure 10.  Application of the confidence interval concept. This formula would be used if we want to be 95.5% sure that two measurements will be within 1% of each other on the basis of the count statistics of our data. The percent uncertainty we expect to observe is 1% with two standard deviations used as the measurement window for 95.5% confidence. The square root of N over the square root of N is 1; this factor is used in the third step to simplify the equation. In addition, the square root of N times the square root of N is N, which cancels with N in the denominator in the fourth step. The result is rearranged to obtain the final answer, 40,000 counts.
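The calculation described in the Figure 10 caption can be written as a short function. This is a sketch under the caption's stated assumptions (Poisson counts, a window of a chosen number of standard deviations); the name `counts_needed` is illustrative. Setting the window, expressed as a percentage of N, equal to the target percentage and solving for N gives N = (100 x n_sigma / percent)^2.

```python
def counts_needed(percent_uncertainty: float, n_sigma: float = 2.0) -> float:
    """Counts N for which n_sigma standard deviations equal the given percent of N.

    Derivation: n_sigma * sqrt(N) / N * 100 = percent
                -> sqrt(N) = 100 * n_sigma / percent
                -> N = (100 * n_sigma / percent) ** 2
    """
    return (100.0 * n_sigma / percent_uncertainty) ** 2


if __name__ == "__main__":
    # 1% at two standard deviations (95.5% confidence), as in Figure 10.
    print(counts_needed(1.0, n_sigma=2.0))  # 40000.0
```

Relaxing the target to 5% at the same confidence level reduces the requirement to 1,600 counts, which shows how quickly the required counts grow as the tolerance tightens.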

 

    HYPOTHESIS TESTING
 
Confidence intervals are useful for evaluating a set of measurements. However, if a systematic error crops up, the data may be erroneous, and it is also useful to know whether the instrumentation is yielding valid results. The χ² test is one of many statistical tests for evaluating the consistency of data. It is particularly relevant to nuclear medicine because it can be used to determine if the equipment is obeying Poisson statistics. Basically, one expects the standard deviation of the measurement to be equal to the square root of the mean. If a number of measurements are examined, one can determine if the standard deviation is within the expected limits for Poisson statistics. It may be too large if there is additional noise getting into the detection system; it may be too small if the detection system is continuously picking up an extraneous periodic signal like line noise or radio-frequency interference. The formula and methods for performing the χ² test are available in many statistics texts and are not given herein. Nevertheless, the χ² test is a useful tool for evaluating the proper performance of radiation counting equipment like gamma cameras, well counters, and uptake probes.
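Although the tutorial does not give the formula, one form commonly found in counting-statistics texts is χ² = Σ(Nᵢ − mean)² / mean, evaluated over n repeated counts and compared against a χ² table with n − 1 degrees of freedom. The sketch below assumes that form; the function name `chi_square_poisson` and the sample data are illustrative only. For a detector that obeys Poisson statistics the statistic is expected to be close to n − 1; values far above suggest excess variability, and values far below suggest too little.

```python
def chi_square_poisson(counts):
    """Return (chi-square statistic, degrees of freedom) for repeated counts.

    Assumed form: chi2 = sum((N_i - mean)^2) / mean, with n - 1 degrees of
    freedom. For Poisson data the variance equals the mean, so chi2 is
    expected to be near n - 1.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("at least two measurements are required")
    mean = sum(counts) / n
    chi2 = sum((c - mean) ** 2 for c in counts) / mean
    return chi2, n - 1


if __name__ == "__main__":
    # Hypothetical repeated counts from the same source and geometry.
    sample = [90, 110, 100, 100]
    chi2, dof = chi_square_poisson(sample)
    print(f"chi-square = {chi2:.2f} with {dof} degrees of freedom")
```

The statistic would then be looked up in a χ² table (or evaluated with a statistics library) to decide whether it falls inside the acceptable probability range chosen for the quality control procedure.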

Hypothesis testing also includes presentation of a hypothesis such as "the two measurements are the same" or "the two measurements are not the same." Statistical tests are then applied to evaluate whether the measurements support the hypothesis. The Student t statistic is one of the most commonly used statistical tests for this purpose. This test compares the means of two sets of measurements.


    PROPAGATION OF ERRORS
 
Frequently in nuclear medicine, more than one measurement is taken and the results are combined in some mathematical way. A common example is to collect sample and background counts and then subtract the background value. Each of these measurements has some variability; when two noisy measurements are combined, the resulting calculation will have even more variability. This point is illustrated graphically in Figure 11 and numerically in Figure 12. If by chance we happen to obtain a measurement of exactly 100 for the sample and 25 for the background, we correctly calculate the true net counts of 75. More typically, one will obtain values higher or lower than the mean value, which when subtracted yield a value that has a wider range than either of the individual Poisson distributions. The distribution for the resulting net counts is centered about a mean of 75 as expected, but the width is greater to take into account the increased uncertainty in the result brought on by subtracting two noisy measurements (Fig 11b). This distribution is no longer Poisson because the standard deviation is now greater than the square root of the mean. The important concept to remember is that whenever two noisy measurements are combined in some way, the resulting value has greater variability than the individual values.



 
Figure 11.  Process of subtracting background from a sample. In this case, the true background value is 25 and the true sample activity is 75. Both the background measurement and the sample measurement represent Poisson processes, and each has an associated dispersion about the true value. (a) Graph shows three typical sample and background calculations. In these examples, the net counts vary from 42 to 97. (b) Graph shows the distribution of the net counts as a dotted line. This line represents the distribution for the true sample activity. The width of the net counts curve is greater than that of the sample or background curve because the variability of each curve must be taken into account during the subtraction process. As a result, the net counts probability density function is no longer Poisson because the standard deviation for this distribution is greater than the square root of the mean.

 


 
Figure 12.  Example of propagation of errors. One can compute the background-corrected sample count by subtracting the background from the sample. The standard deviation for the net count is computed by taking the square root of the sum of the squares of the standard deviations of the sample count and the background count. The standard deviation for the net count is higher than either individual standard deviation. Thus, the noise (standard deviation) that results from combining two noisy measurements is greater than the noise in either individual measurement.

 
Whenever multiple measurements are used in a calculation, there is no way of knowing for sure the degree of error in any one measurement. Consequently, we must examine the range of possible errors due to the uncertainties in each of the individual measures. It does not matter whether the two measurements are added or subtracted; the net standard deviation always increases to take into account the uncertainties in each of the measurements.
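The rule for addition and subtraction can be sketched in a few lines, using the sample and background values from the example above (100 and 25 counts). The function name `net_counts` is illustrative, and both measurements are assumed to be independent Poisson counts, so each variance equals the count itself.

```python
import math


def net_counts(sample: float, background: float):
    """Background-corrected count and its standard deviation.

    Assumes independent Poisson measurements: variance of each equals the
    count, and variances add whether the counts are added or subtracted.
    """
    net = sample - background
    sigma_net = math.sqrt(sample + background)
    return net, sigma_net


if __name__ == "__main__":
    net, sigma = net_counts(100, 25)
    print(f"net = {net}, SD = {sigma:.1f}")
    print(f"SD of sample alone = {math.sqrt(100):.1f}, of background alone = {math.sqrt(25):.1f}")
```

The net standard deviation, the square root of 125, exceeds both 10 and 5, and it also exceeds the square root of the net count (the square root of 75), which is why the net count no longer follows a Poisson distribution.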

Next, consider the effect of scaling a measurement by a constant C. The original measurement becomes C times greater, and the standard deviation also scales accordingly so that the new standard deviation is C times the original standard deviation. The constant C can be greater or less than 1 as appropriate. The percent uncertainty remains the same and is independent of scaling in this case. As a result, the image and noise appearance are unchanged.

Scaling becomes important especially when dealing with count rates; one scales the original total count by the factor 1/t, where t is the number of minutes during which the sample was counted. The result is the count rate in units of counts per minute. Conceptually, the effect of converting the result from total counts to a count rate is the same as that of averaging multiple 1-minute measurements. In this case, the standard deviation of the total count is likewise scaled by the factor 1/t. For example, if 10,000 counts are collected in 10 minutes, the count rate is 1,000 counts per minute. The standard deviation of the count rate is 1/10 times the square root of 10,000, or 10 counts per minute. This value is smaller than the square root of 1,000 (31.6), the standard deviation expected for a single 1-minute measurement, owing to the effect of averaging over the ten 1-minute intervals.
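The count rate example can be reproduced with a short sketch. The function name `count_rate` is illustrative; the counting time is assumed to be known exactly, so only the total count contributes uncertainty.

```python
import math


def count_rate(total_counts: float, minutes: float):
    """Count rate (counts per minute) and its standard deviation.

    Scaling by the constant 1/t scales the standard deviation by 1/t as well.
    Assumes the counting time t has no uncertainty.
    """
    rate = total_counts / minutes
    sigma_rate = math.sqrt(total_counts) / minutes
    return rate, sigma_rate


if __name__ == "__main__":
    rate, sigma = count_rate(10_000, 10)
    print(f"rate = {rate:.0f} cpm, SD = {sigma:.0f} cpm")
    # A single 1-minute count of about 1,000 would have SD = sqrt(1000).
    print(f"single 1-minute SD = {math.sqrt(1_000):.1f} cpm")
```

The percent uncertainty is the same before and after scaling (100 out of 10,000 counts, or 10 out of 1,000 counts per minute, both 1%), which is consistent with the statement that scaling by a constant leaves the percent uncertainty unchanged.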

When two measurements are multiplied or divided by each other, the uncertainty in the resulting computation again increases. In these cases, the formulas are not particularly intuitive, but the net effect is an increase in the overall uncertainty in all cases. The formulas for various arithmetic operations are described in some detail in Medical Imaging Physics by Hendee and Ritenour and in NCRP Report no. 58 (see "Suggested Readings").

An even more complex expression results if combinations of arithmetic operations are performed. A common example in nuclear medicine is the calculation of ejection fraction, which relies on counts from multiple regions of interest on diastolic and systolic left ventricular images as well as from a background region of interest. Each of these measurements has inherent uncertainty, which causes the resulting ejection fraction to have greater uncertainty than the individual measurements. The standard deviation expression becomes quite complex because there are two subtractions and a division in the calculation. One can calculate the ejection fraction and its standard deviation using the principles of propagation of errors (Fig 13). A high background reading results in greatly increased uncertainty in the result. If better confidence is needed, a lower background value is required or additional counts are needed in the regions of interest in the ventricles to reduce the uncertainty of the measurement. This is a useful exercise, especially when setting up a protocol for collecting ejection fraction data with a new camera. One must ensure that the data collected are capable of providing meaningful, statistically valid results. If the error bounds are too large, the calculation becomes meaningless.



 
Figure 13.  Calculation of ejection fraction by using propagation of errors concepts. The ejection fraction and its standard deviation can be computed by using the appropriate application of propagation of errors, where N is the number of counts in a region of interest. Let Ndiastole = 2,000, Nsystole = 1,200, and Nbackground = 400 counts. The ejection fraction can be calculated by the reader and found to be 0.50 (50%) with a standard deviation of 0.04 (4%). With 95.5% confidence, one can conclude that the true ejection fraction is between 42% and 58% (EF ± 2σEF) on the basis of the count statistics in this example. The high background counts in this example have increased the standard deviation significantly.
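The exercise left to the reader in the Figure 13 caption can be sketched as follows. The exact expression shown in the figure is not reproduced in the text, so this sketch makes two assumptions that should be read as such: the ejection fraction is taken as (Ndiastole − Nsystole) / (Ndiastole − Nbackground), and the standard rules for differences (variances add) and quotients (squared fractional uncertainties add) are chained while treating the numerator and denominator as independent. That simplification ignores the diastolic count shared by both terms, so it may not match the figure's formula term by term. All function names are illustrative.

```python
import math


def difference(a, sigma_a, b, sigma_b):
    """a - b and its standard deviation (variances add)."""
    return a - b, math.hypot(sigma_a, sigma_b)


def quotient(a, sigma_a, b, sigma_b):
    """a / b and its standard deviation (squared fractional uncertainties add)."""
    q = a / b
    return q, abs(q) * math.hypot(sigma_a / a, sigma_b / b)


def ejection_fraction(n_diastole, n_systole, n_background):
    """EF and its standard deviation under the assumptions stated in the lead-in."""
    # Poisson counts: the standard deviation of each count is its square root.
    s_d, s_s, s_b = (math.sqrt(n) for n in (n_diastole, n_systole, n_background))
    numerator, s_num = difference(n_diastole, s_d, n_systole, s_s)
    denominator, s_den = difference(n_diastole, s_d, n_background, s_b)
    return quotient(numerator, s_num, denominator, s_den)


if __name__ == "__main__":
    ef, sigma_ef = ejection_fraction(2_000, 1_200, 400)
    print(f"EF = {ef:.2f}, SD = {sigma_ef:.2f}")
    print(f"95.5% interval: {ef - 2 * sigma_ef:.2f} to {ef + 2 * sigma_ef:.2f}")
```

With the counts from the caption, this sketch gives an ejection fraction of 0.50 and a standard deviation that rounds to 0.04, in line with the values quoted in Figure 13. Rerunning it with a smaller background count or with more counts in the ventricular regions shows how the uncertainty can be reduced when a protocol is being set up.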

 
Figure 14 illustrates the resulting visual appearances when two images with the same signal content are added or subtracted. Adding two such images produces a reduction in the apparent noise. This reduction is due to the fact that the total counts are approximately doubled, while the noise increases by only a factor of the square root of 2. When two such images are subtracted, the noise (variability) in the final image is clearly greater than in either of the initial images.



 
Figure 14.  Addition and subtraction of images. Gamma camera images (128 x 128 matrix, 2-mm section thickness) of a brain phantom show the results of adding (top row) and subtracting (bottom row) two noisy images. The left and middle images each contain approximately 16,000 counts; the final images are on the right. In the subtraction image on the lower right, a medium gray value was used to represent a difference of zero so that negative deviations appear dark gray and positive deviations appear white.

 
Figure 15 demonstrates the effect of multiplying or dividing two images with the same signal content. The resulting images have increased noise compared with the two initial images. Although multiplication and division of images are not as commonly performed in nuclear medicine as addition or subtraction, the former operations are used with regions of interest, such as in ejection fraction calculations. The visual appearance serves as a reminder that any mathematical operation of combinations of noisy images results in an even noisier final image.



 
Figure 15.  Multiplication and division of images. Gamma camera images (128 x 128 matrix, 2-mm section thickness) of a brain phantom show the results of multiplying (top row) and dividing (bottom row) two noisy images. The left and middle images each contain approximately 16,000 counts; the final images are on the right. The final images were rescaled so that the minimum and maximum pixel values use the entire gray scale for display. This rescaling yielded a darker overall image for the multiplication result.

 

    IMAGE STATISTICS AND DETECTION
 
Humans are able to perceive objects on the basis of their size, shape, and contrast relative to their surroundings. If noise is introduced, it further obscures our ability to perceive detail in an image. Hence, there is a trade-off between object size, contrast, and noise. If the image is very noisy, one is unable to perceive small objects unless they have very high contrast. For this reason, an information density box may be used in a nuclear medicine study to ensure that the noise fluctuations in an important part of an image are not excessive. The information density box fixes the number of counts per unit area in a specific portion of the image. Noise also masks our ability to perceive contrast in an image. This fact is particularly evident in nuclear medicine images, in which the number of gamma ray photons used to construct the image is very limited (to minimize radiation dose to the patient). Noise is also quite evident on fluoroscopic images, in which many more photons are used than in nuclear medicine images. In conventional x-ray images, noise is less evident but is still present.

H. R. Blackwell studied human perception in the 1940s to evaluate the ability to see small, low-contrast objects. He was trying to determine if a small, bright light or a large, dim light is better for locating a ship at sea from a distance. Some years later, Albert Rose reviewed Blackwell's extensive experimental results and formulated a model to describe the relationship between noise, contrast, and resolution. These results have implications in all aspects of imaging in radiology but apply especially to nuclear medicine images, in which the count statistics are so limiting.

The Rose model can be understood by examining the parameters used to define it. X-ray and nuclear medicine images are limited in the number of photons used to create the image and generally follow Poisson statistics. With these facts in mind, let us first assume that there are B photons in a background or surrounding region of interest next to the object under scrutiny. Let us also assume that there are T photons in the target region of interest. We expect T and B to be quite similar in numeric value because we are interested in low-contrast detectability. However, we assume that T is not equal to B so that there is some contrast difference between the two regions of interest. The signal S is equal to the difference between the counts in the target area and those in the background area:

$$S = T - B$$

The contrast C is defined by normalizing the signal to the background counts with the following formula:

$$C = \frac{S}{B} = \frac{T - B}{B}$$

The signal could alternatively be defined in terms of contrast:

$$S = C \, B$$

The noise N in the image follows Poisson statistics:

$$N = \sqrt{B}$$

The square root of B is approximately the same as the square root of T because T and B are assumed to be similar if the contrast between them is low. One can define a constant K, which will represent the signal-to-noise ratio for our imaging system:

$$K = \frac{S}{N} = \frac{C \, B}{\sqrt{B}} = C \sqrt{B}$$

The fact that B is equivalent to the square root of B times the square root of B is used to simplify the expression. If we think back to how we are measuring the signal, it is really a certain number of counts in a specified region of interest with some area A. As a result, we can equivalently define the signal-to-noise ratio with the following formula:

$$K = C \sqrt{\Phi A}$$

where Φ is the photon fluence (the number of gamma ray photons per unit area) and A is the area of the region of interest, or more specifically, the size of the smallest object we wish to detect. This final equation is the Rose model. Note that Φ is the definition of an information density. It represents the number of counts per unit area, just like the information density box discussed earlier. This equation states that the signal-to-noise ratio depends on image contrast, object size, and information density. The information density is dependent on the matrix size, and a doubling of matrix size reduces the information density by a factor of four if other image acquisition parameters are held constant. A single pixel becomes four pixels when the matrix size is doubled, and the same total counts are now distributed among four pixels instead of just one. Noise increases in this case, as does resolution.
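The dependence of the signal-to-noise ratio on contrast, information density, and object area can be explored with a short sketch. It assumes the Rose model in the form K = C·sqrt(Φ·A), with the background counts in the region of interest given by B = Φ·A as described above. The function name `rose_snr` and the numeric values are illustrative and are not taken from the tutorial.

```python
import math


def rose_snr(contrast: float, fluence: float, area: float) -> float:
    """Signal-to-noise ratio K = C * sqrt(fluence * area).

    contrast: fractional contrast C (e.g. 0.10 for 10%)
    fluence:  information density, counts per unit area
    area:     object area, in the same area units as the fluence
    """
    background_counts = fluence * area  # B = fluence * area
    return contrast * math.sqrt(background_counts)


if __name__ == "__main__":
    # Hypothetical values: 10% contrast, 100 counts per pixel, 100-pixel object.
    print(f"K = {rose_snr(0.10, 100.0, 100.0):.1f}")
    # Quadrupling the information density doubles K for the same object.
    print(f"K = {rose_snr(0.10, 400.0, 100.0):.1f}")
```

Because the counts enter under a square root, four times as many counts per unit area are needed to double the signal-to-noise ratio for an object of fixed size and contrast.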

The Rose model describes the relationship between noise, object size, and contrast. It does not, however, make any statements regarding the limiting resolution of an imaging system. In gamma cameras, the collimator determines the smallest detectable object size independently of whether the object can be perceived or not due to noise in the image. If the object cannot be resolved due to imaging equipment limitations, the contrast is zero and the object cannot be perceived no matter what the noise level is. A high-resolution collimator provides improved small object contrast at the expense of fewer detected counts (increased noise). A high-sensitivity collimator increases the counts relative to a high-resolution collimator at the expense of reduced contrast in small objects. However, if only medium-size objects are important in the image, the high-sensitivity collimator affords a less noisy image and may be applicable.

Rose observed that humans cannot perceive an object unless the signal-to-noise ratio K is at least 5–7. This range represents some minimum signal-to-noise ratio that is required to distinguish between a low-contrast disk and a noisy background. If K is fixed to set some minimum detectable object contrast, the factors C, Φ, and A are all responsible for an object being perceived. If contrast C is low, the object must be large (a large A) or the number of photons used to evaluate the object must be large (a large Φ) for the object to be perceived. Hence, there is a trade-off between contrast (C), noise (governed by the information density Φ), and resolution (object size A). To improve any one of these factors, other factors must be negatively affected for minimum object detectability to be retained.

Figure 16 illustrates the effect of information density on perception. Although each image of a study has the same resolution, noise masks our ability to perceive details if insufficient counts are acquired in the study. Even large structures such as the ventricular blood pool may be difficult to identify if too few counts are obtained in the image. Figure 17 illustrates the effect of image subtraction on perception. The increased noise due to the subtraction of two noisy images greatly affects the perception of the resulting image, especially if the total counts are insufficient. Clearly, there is some minimum count level below which noise begins to obscure our ability to perceive detail in an image.

Figure 16.  Effect of information density on perception. Four pairs of 128 x 128 images from an 18-frame multiple gated blood pool study show the effect of total counts per frame on the visual appearance of the blood pool at end diastole (top row) and end systole (bottom row). The total counts in the study varied from 9,000 counts to 9 million counts from left to right when the total counts from each of the 18 frames were taken into account.

Figure 17.  Effect of image subtraction on perception. End-systolic images (middle) were subtracted from end-diastolic images (left) to obtain stroke volume functional images (right) from a 9 million-count study (500,000 total counts per frame) (top row) and a 900,000-count study (50,000 total counts per frame) (bottom row). The images demonstrate the principle that subtracting two noisy images results in an image that is even noisier. This principle is especially evident in the low-total-count images (bottom row). The final images were adjusted so that a medium gray value represents a difference of zero, negative deviations appear dark gray, and positive deviations appear white; the acquisition parameters were the same as in Figure 16.

Figure 18 illustrates the concept behind a contrast detail diagram. In nuclear medicine, the activity in abnormal tissue may take the form of either increased or decreased counts compared with the activity in normal tissues; both situations are illustrated in Figure 18. In images with fixed noise, the minimum object that is detectable changes in size as the contrast is varied. Large or high-contrast objects are easily recognized, but only large objects can be distinguished if the contrast is low.



 
Figure 18.  Concept behind a contrast detail diagram. Four 256 x 256 simulated contrast detail images show disks with diameters of 2, 4, 8, 16, and 32 pixels. These disks represent objects superimposed on normal tissue. The contrast varies from 5% in the top row of each image to 75% in the bottom row. The left images are noise free, and the right images follow Poisson statistics with a mean information density of 100 counts per pixel (6.5 million total counts). In the top images, the disks have higher activity (ie, are "hotter") than their surroundings; in the bottom images, the disks have lower activity (ie, are "colder") than their surroundings. The trade-off between object size (disk area) and required contrast is clearly seen: If the size is large or the contrast is high, the object is more easily recognized. There is a diagonal line of minimum object detectability that follows the Rose model.
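The diagonal line of minimum detectability mentioned in the Figure 18 caption can be estimated from the Rose model. The sketch below assumes the form K = C·sqrt(Φ·A), a threshold of K = 5 (the low end of the 5–7 range Rose reported, chosen here only for illustration), and circular disks whose area is π·d²/4 pixels. The disk diameters and the information density of 100 counts per pixel come from the caption; the function name `min_contrast` is illustrative.

```python
import math

K_THRESHOLD = 5.0         # assumed detection threshold (low end of 5-7)
COUNTS_PER_PIXEL = 100.0  # information density quoted in the Figure 18 caption


def min_contrast(diameter_px: float,
                 counts_per_pixel: float = COUNTS_PER_PIXEL,
                 k: float = K_THRESHOLD) -> float:
    """Smallest fractional contrast giving a signal-to-noise ratio of k for a disk."""
    area_px = math.pi * diameter_px ** 2 / 4.0
    # Rearranged Rose model: C = K / sqrt(fluence * area)
    return k / math.sqrt(counts_per_pixel * area_px)


if __name__ == "__main__":
    for d in (2, 4, 8, 16, 32):
        print(f"diameter {d:2d} px: minimum contrast about {100 * min_contrast(d):.1f}%")
```

Each doubling of the disk diameter halves the contrast required, which corresponds to the diagonal boundary between visible and invisible disks in a contrast detail image. Raising the assumed threshold to 7 shifts the whole boundary toward higher contrast without changing its slope.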

 
Figure 19 illustrates the effect of total counts on perception. The perceived noise in an image is strongly dependent on total counts. The contrast in a nuclear medicine image is determined by the differential uptake of t