Perceptual uniformity of commonly used color spaces

Ali Avanaki a, Kathryn Espig a, Tom Kimpe b, Albert Xthona a, Cédric Marchessoux b, Johan Rostang b, Bastian Piepers b
a Barco Healthcare, Beaverton, OR; b Barco Healthcare, Kortrijk, Belgium

ABSTRACT

The use of color images in medical imaging has increased significantly in the last few years. Color information is essential for applications such as ophthalmology, dermatology, and clinical photography, and it is at least beneficial for other applications such as endoscopy, laparoscopy, and digital pathology. Remarkably, as of today there is no agreed standard on how color information should be visualized for medical applications. This lack of standardization results in large variability in how color images are visualized and makes quality assurance a challenge. For this reason, the FDA and ICC recently organized a joint summit on color in medical imaging (CMI). At this summit, one of the suggestions was that modalities such as digital pathology could benefit from using a perceptually uniform color space (T. Kimpe, Color Behavior of Medical Displays, CMI presentation, May 2013). Perceptually uniform spaces have already been used for many years in the radiology community, where the DICOM GSDF standard provides perceptual linearity in luminance but not in color behavior.

In this paper we use CIE's ΔE2000 as a color distance metric to quantify the perceptual uniformity of several color spaces that are typically used for medical applications. We applied our method to the theoretical color spaces Gamma 1.8, 2.0, and 2.2, standard sRGB, and DICOM (the correction LUT for gray applied to all primaries). In addition, we measured the color spaces (i.e., native behavior) of a high-end medical display (Barco Coronis Fusion 6MP DL, MDCC-6130) and a consumer display (Dell 1907FP). Our results indicate that sRGB and the native color space of the Barco Coronis Fusion exhibit the least non-uniformity within their respective groups (theoretical and measured).
However, the remaining degree of perceptual non-uniformity is still significant and there is certainly room for improvement.

Keywords: Perceptually uniform color space

1. PURPOSE

It has been suggested that some applications of color in medical imaging could benefit from using a perceptually uniform color space and display [6]. Perceptual uniformity for luminance has been used for many years in radiology applications and has become the calibration standard for radiology displays. In a DICOM-calibrated grayscale display, the perceived luminance is proportional to the display's input (i.e., digital drive levels; we use this term interchangeably with RGB). Our ultimate goal is to draw a parallel between the DICOM GSDF used with grayscale medical images and a perceptually uniform color space that could potentially be beneficial for color medical images. In such a perceptually uniform color space, perceived color differences are evenly distributed across the gamut, so that a given step in drive level looks equally large regardless of where in the gamut it occurs. A perceptually uniform color space will also reduce quantization errors and maximize the visibility of subtle color differences.

As a first step toward our goal, we want to quantify the perceptual uniformity of several color spaces that are typically used for medical applications. For this purpose we use ΔE2000, a widely accepted color difference metric [3, 4] (color here includes luminance as well as hue and saturation) that correlates well with human perception [7, 8].

2. METHODS

For each color space, ΔE2000 is calculated between every two consecutive points on each primary and on gray. This quantifies the perceptual uniformity of the neutral gray values from dark to white, as well as of the individual primary colors when driven from minimum to maximum intensity.
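The per-channel computation described above can be sketched for the theoretical sRGB space as follows. This is a minimal illustration under stated assumptions, not the authors' code (the paper cites a Matlab implementation of ΔE2000 [4]): it assumes the commonly quoted rounded sRGB-to-XYZ matrix, takes the white point as XYZ at maximum drive of all primaries, and uses population standard deviation. The function names (`srgb_to_linear`, `rgb8_to_lab`, `delta_e_2000`, `channel_steps`) are hypothetical.

```python
import math
import statistics

# Assumption: rounded linear-sRGB -> XYZ matrix (D65), white luminance Y = 1.
M = ((0.4124, 0.3576, 0.1805),
     (0.2126, 0.7152, 0.0722),
     (0.0193, 0.1192, 0.9505))
WHITE = tuple(sum(row) for row in M)  # XYZ at maximum drive of all primaries


def srgb_to_linear(c):
    """Standard sRGB non-linearity for a drive level c scaled to [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb8_to_lab(r, g, b):
    """8-bit drive levels -> CIE LAB, relative to the white point above."""
    lin = [srgb_to_linear(v / 255.0) for v in (r, g, b)]
    xyz = [sum(m * c for m, c in zip(row, lin)) for row in M]

    def f(t):
        d = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > d ** 3 else t / (3 * d * d) + 4.0 / 29.0

    fx, fy, fz = (f(v / w) for v, w in zip(xyz, WHITE))
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def delta_e_2000(lab1, lab2):
    """CIEDE2000 color difference with unit weights (kL = kC = kH = 1)."""
    (L1, a1, b1), (L2, a2, b2) = lab1, lab2
    c_bar = (math.hypot(a1, b1) + math.hypot(a2, b2)) / 2
    g = 0.5 * (1 - math.sqrt(c_bar ** 7 / (c_bar ** 7 + 25.0 ** 7)))
    a1p, a2p = (1 + g) * a1, (1 + g) * a2
    c1p, c2p = math.hypot(a1p, b1), math.hypot(a2p, b2)
    h1p = math.degrees(math.atan2(b1, a1p)) % 360
    h2p = math.degrees(math.atan2(b2, a2p)) % 360
    achromatic = c1p * c2p == 0
    # Hue difference, wrapped into [-180, 180] degrees.
    dhp = 0.0 if achromatic else h2p - h1p
    if dhp > 180:
        dhp -= 360
    elif dhp < -180:
        dhp += 360
    dLp, dCp = L2 - L1, c2p - c1p
    dHp = 2 * math.sqrt(c1p * c2p) * math.sin(math.radians(dhp / 2))
    # Mean lightness, chroma, and hue of the pair.
    lp_bar, cp_bar = (L1 + L2) / 2, (c1p + c2p) / 2
    if achromatic:
        hp_bar = h1p + h2p
    elif abs(h1p - h2p) <= 180:
        hp_bar = (h1p + h2p) / 2
    elif h1p + h2p < 360:
        hp_bar = (h1p + h2p + 360) / 2
    else:
        hp_bar = (h1p + h2p - 360) / 2

    def cosd(x):
        return math.cos(math.radians(x))

    t = (1 - 0.17 * cosd(hp_bar - 30) + 0.24 * cosd(2 * hp_bar)
         + 0.32 * cosd(3 * hp_bar + 6) - 0.20 * cosd(4 * hp_bar - 63))
    d_theta = 30 * math.exp(-(((hp_bar - 275) / 25) ** 2))
    r_c = 2 * math.sqrt(cp_bar ** 7 / (cp_bar ** 7 + 25.0 ** 7))
    s_l = 1 + 0.015 * (lp_bar - 50) ** 2 / math.sqrt(20 + (lp_bar - 50) ** 2)
    s_c = 1 + 0.045 * cp_bar
    s_h = 1 + 0.015 * cp_bar * t
    r_t = -math.sin(math.radians(2 * d_theta)) * r_c
    return math.sqrt((dLp / s_l) ** 2 + (dCp / s_c) ** 2 + (dHp / s_h) ** 2
                     + r_t * (dCp / s_c) * (dHp / s_h))


def channel_steps(channel):
    """ΔE2000 between consecutive 8-bit levels along one primary or gray."""
    axes = {"red": (1, 0, 0), "green": (0, 1, 0),
            "blue": (0, 0, 1), "gray": (1, 1, 1)}
    k = axes[channel]
    labs = [rgb8_to_lab(*(v * m for m in k)) for v in range(256)]
    return [delta_e_2000(p, q) for p, q in zip(labs, labs[1:])]


if __name__ == "__main__":
    for name in ("red", "green", "blue", "gray"):
        s = channel_steps(name)
        print(name, statistics.mean(s), statistics.pstdev(s), min(s), max(s))
```

Each call to `channel_steps` yields 255 step sizes for one channel; their mean, standard deviation, minimum, and maximum are the kind of statistics the paper tabulates. Under these assumptions the gray-line average should land close to the value the paper reports for sRGB, but exact agreement for all channels depends on details (matrix precision, white point, standard deviation convention) that the paper does not spell out.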
2.1 Uniformity across the whole gamut

A second way of quantifying perceptual uniformity, this time within the entire color gamut, is to calculate ΔE2000 between (r, g, b) and (r-1, g, b), between (r, g, b) and (r, g-1, b), and between (r, g, b) and (r, g, b-1), for all points on a 256³ color cube (depth of 8 bits/channel). We take the maximum of these values as the worst-case granularity (WCG) of the space and use it to compare spaces. Granularity is inversely related to bit depth. Using the WCG for a given bit depth, one can determine whether that bit depth keeps the ΔE2000 distance between adjacent color points below a certain threshold. To show the (non-)uniformity of a color space, we illustrate ΔE2000 granularity across the gamut using slice diagrams that meet at the point yielding the WCG value for the space. We also calculated the minimum, maximum, average, and standard deviation over the entire color gamut.

Table 1. Statistics of ΔE2000 between consecutive points (8 bits/channel) for primaries and gray in some common color spaces.

Color space        Channel   Average   Std. dev.   Min      Max
DICOM              Red       0.2661    0.283       0.1459   1.4758
                   Green     0.3486    0.3729      0.1699   2.4155
                   Blue      0.2083    0.2866      0.0957   1.5739
                   Gray      0.2947    0.1563      0.1635   0.8724
sRGB               Red       0.2707    0.0982      0.2035   0.5485
                   Green     0.3452    0.1156      0.2053   0.693
                   Blue      0.1957    0.108       0.1185   0.5147
                   Gray      0.2947    0.0647      0.1511   0.3986
γ1.8               Red       0.2661    0.1613      0.062    0.7643
                   Green     0.3487    0.2032      0.1034   1.046
                   Blue      0.2083    0.1703      0.0662   0.7868
                   Gray      0.2947    0.0993      0.0241   0.4762
γ2.0               Red       0.2661    0.1358      0.0205   0.6679
                   Green     0.3487    0.1676      0.0342   0.8939
                   Blue      0.2083    0.1461      0.0219   0.6762
                   Gray      0.2947    0.09        0.008    0.4084
γ2.2               Red       0.2661    0.1186      0.0068   0.607
                   Green     0.3487    0.1433      0.0113   0.7989
                   Blue      0.2083    0.1288      0.0072   0.6035
                   Gray      0.2947    0.0892      0.0026   0.4104
Dell 1907FP        Red       0.6034    0.4918      0.0022   2.207
                   Green     0.7184    0.8637      0        5.3558
                   Blue      0.4579    0.2907      0        2.538
                   Gray      0.7801    0.5193      0.007    2.9623
Barco MDCC-6130    Red       0.681     0.6708      0.2133   4.4679
                   Green     0.7291    0.7625      0.2196   4.6445
                   Blue      0.4969    0.258       0.2219   1.8581
                   Gray      0.6696    0.5556      0.2119   4.2618

CIE LAB values are necessary to calculate ΔE2000. They are calculated from CIE XYZ values and the white point (i.e., XYZ at maximum drive of all primaries) [1].

For measured spaces (the displays' native behaviors), XYZ values are measured using a Minolta CA-210 colorimeter on a uniform (with respect to drive level) 52³ grid and on a 15³ grid with sides on 15 levels (out of 256) of each primary. The XYZ values on the 256³ grid are interpolated from the 52³ grid, except for those that fall within the 15³ grid, which are interpolated from the 15³ grid. Note that a direct 256³ measurement is not feasible (the 52³ measurement takes 3 days), and we found that models of LCD behavior (e.g., [2]) are not sufficiently accurate for our purposes.

For theoretical spaces, XYZ values relative to the white point luminance can be readily calculated from RGB values. Such formulas for standard sRGB are listed in [5]. The non-linearity therein, which is applied to RGB values (scaled to the 0 to 1 range), is given by
$$C_{\mathrm{lin}} = \begin{cases} C / 12.92, & C \le 0.04045 \\ \left( \dfrac{C + a}{1 + a} \right)^{2.4}, & C > 0.04045 \end{cases} \qquad \text{with } a = 0.055,$$

where $C$ is a drive level of R, G, or B scaled to the 0 to 1 range and $C_{\mathrm{lin}}$ is the corresponding linear value. For Gamma spaces, this non-linearity is replaced by $C_{\mathrm{lin}} = C^{\gamma}$, with γ = 1.8, 2.0, or 2.2, and the rest of the XYZ calculation is the same as for sRGB. For DICOM, we first calculate the correction look-up table (LUT) for the luminance of the gray line following standard sRGB, with L_min = 0.4 and L_max = 400 cd/m². The LUT calculated in this way serves as the non-linearity for the DICOM space, and the rest of the XYZ calculation is the same as for sRGB.

3. RESULTS

Table 1 compares the color spaces we studied in terms of ΔE2000 between two consecutive points on primaries and gray, for 8 bits/channel. Lower standard deviation means better uniformity.

Statistics that are calculated similarly (cf. 2.1) for the whole gamut are reported in Table 2.

Table 2. Statistics of ΔE2000 between adjacent red, green, and blue points in some common color spaces. WCGs are marked with an asterisk (*). See 2.1 for details.

Color space        Channel   Average   Std. dev.   Min      Max
DICOM              Red       0.1771    0.0808      0.0243   2.103
                   Green     0.3169    0.136       0.0754   3.4403 *
                   Blue      0.2147    0.1106      0.0501   2.1436
sRGB               Red       0.1945    0.108       0.0064   0.8247
                   Green     0.3207    0.1548      0.0192   1.1857 *
                   Blue      0.2134    0.11        0.0121   0.7563
γ1.8               Red       0.1781    0.0955      0.001    1.1866
                   Green     0.3202    0.1579      0.0031   1.8907 *
                   Blue      0.2138    0.114       0.002    1.1931
γ2.0               Red       0.1809    0.102       0.0003   1.0374
                   Green     0.3234    0.1654      0.001    1.616 *
                   Blue      0.2147    0.1174      0.0007   1.0272
γ2.2               Red       0.1836    0.1087      0.0001   0.9361
                   Green     0.3262    0.1737      0.0003   1.4361 *
                   Blue      0.2155    0.1217      0.0002   0.916
Dell 1907FP        Red       0.373     0.3524      0        7.1293
                   Green     0.5491    0.4957      0        13.0058 *
                   Blue      0.3855    0.3549      0        5.394
Barco MDCC-6130    Red       0.364     0.3293      0.0032   5.7052
                   Green     0.6318    0.5481      0.003    9.2338 *
                   Blue      0.447     0.3938      0.0026   5.9314

A question that may arise here is why the maximum ΔE2000 step on gray is smaller than the space's WCG (e.g., for Gamma 1.8, the WCG is 1.8907, while the maximum gray step is 0.4762; cf. Table 1). After all, a step in gray is equivalent to three steps, one along each primary.
The answer stems from the fact that for all points on the gray line a* = b* = 0 (no chromaticity), whereas all other points in the gamut have some chromaticity that contributes to their ΔE2000 distance from their neighbors. From another perspective, the edges of the gamut corresponding to the primaries are all longer than the gray line in LAB space. Note that the lengths of the primary edges differ in LAB space (green, red, blue, in decreasing order of length). This means, for example, that there are many more perceptually different shades of green than there are of blue. Hence, the average ΔE2000 between green values is systematically higher than that of red or blue (see the Average column of
Table 2).

In terms of WCG (see the last column of Table 2), while sRGB offers the most uniform color, there is still a lot of room for improvement: the in-between ΔE2000 values are far from equal, as their standard deviation is about 50% of their average. Figure 1 compares Table 2 data for the theoretical color spaces in terms of WCG and standard deviation for green. If we use standard deviation, which also measures the dispersion of the in-between ΔE2000 values, DICOM seems to be slightly more uniform than sRGB. This also holds for all primaries when the standard deviations are normalized by the corresponding values in the Average column.

Figure 1. Comparison of WCG and standard deviation (for green), both normalized to those of sRGB, for theoretical color spaces (DICOM, sRGB, γ1.8, γ2.0, γ2.2). Data is from Table 2. Standard deviation also gives an indication of non-uniformity (higher std. dev. means less uniformity).

Figure 2. Visualization of perceptual uniformity for the sRGB color space. See text for more details.

ΔE2000 between adjacent points on primaries and gray, as well as between adjacent red, green, and blue points across the whole gamut (slice diagrams described in 2.1), are given for sRGB (Figure 2) and DICOM (Figure 3). It may be
observed that, while in terms of WCG sRGB offers more overall perceptual uniformity (no red in the Figure 2 slice diagrams; the color scale is the same as in Figure 3), DICOM has the advantage that its non-uniformity is localized to the low-luminance part of the gamut (the curves in the top-left of Figure 3 flatten after a drive level of 50).

Figure 3. Visualization of perceptual uniformity for the DICOM color space. See text for more details.

In Figure 4, ΔE2000 between adjacent points on the most saturated edges of the color cube is plotted for sRGB at a depth of 8 bits/channel. The path includes the edges (1,0,0) to (1,1,0) to (0,1,0) to (0,1,1) to (0,0,1) to (1,0,1), and back to (1,0,0), in that order. Note that even ideally one cannot expect a flat line in place of the curve in Figure 4. That is because the green edge of the gamut is perceptually the longest and, when divided into 256 levels, yields a larger in-between ΔE2000 distance than the other primaries or gray. That said, from Figure 4 one can observe that the way sRGB distributes the in-between levels seems arbitrary from a perceptual uniformity viewpoint, and far from ideal in that sense.

4. CONCLUSION

We have quantified the perceptual uniformity of several color spaces that are typically used for medical applications. For this purpose we used ΔE2000, a widely accepted color difference metric.
Figure 4. ΔE2000 between adjacent points on the most saturated edges of the 256³ sRGB color cube. Corresponding colors are shown in the horizontal bar.

None of the color spaces under consideration is fully uniform in terms of ΔE2000 granularity. In terms of WCG (lower is better), sRGB leads the theoretical spaces with 1.1857, vs. 1.4361, 1.6160, and 1.8907 for Gamma 2.2, 2.0, and 1.8, and 3.4403 for DICOM. The WCGs for the Barco MDCC-6130 and Dell 1907FP are 9.2338 and 13.0058, respectively. The remaining degree of perceptual non-uniformity is still significant and there is certainly room for improvement.

REFERENCES

[1] http://en.wikipedia.org/wiki/lab_color_space#cielab-ciexyz_conversions, accessed Aug 2013.
[2] N. Tamura, N. Tsumura, and Y. Miyake, "Masking model for accurate colorimetric characterization of LCD," JSID 2003.
[3] http://en.wikipedia.org/wiki/color_difference#ciede2000, accessed Aug 2013.
[4] http://www.ece.rochester.edu/~gsharma/ciede2000/, Matlab implementation, accessed Aug 2013.
[5] http://en.wikipedia.org/wiki/srgb#the_reverse_transformation, accessed Aug 2013.
[6] T. Kimpe, et al., "Does the choice of display system influence perception and visibility of clinically relevant features in digital pathology images?", submitted to SPIE MI 2014.
[7] R. Ramanath and M.S. Drew, "Color: Color Models," Wiley Encyclopedia of Computer Science and Engineering, vol. 1, 2009.
[8] R. Ramanath and M.S. Drew, "Color: Color Perception," Wiley Encyclopedia of Computer Science and Engineering, vol. 1, 2009.