Flat panel display signal processing



Flat Panel Display Signal Processing

Analysis and Algorithms for Improved Static and Dynamic Resolution

Michiel A. Klompenhouwer

The work described in this thesis was carried out at the Philips Research Laboratories Eindhoven, the Netherlands, as part of the Philips Research Programme.

Printed by: Eindhoven University Press, the Netherlands
Cover design: Henny Herps, Floris van der Haar and Michiel Klompenhouwer. The image is taken from a video sequence that is used to investigate motion artifacts. The screens depict the static and dynamic resolution improvements described in this thesis.

CIP-DATA LIBRARY TECHNISCHE UNIVERSITEIT EINDHOVEN

Klompenhouwer, Michiel A.
Flat panel display signal processing : analysis and algorithms for improved static and dynamic resolution / by Michiel Adriaanszoon Klompenhouwer. - Eindhoven : Technische Universiteit Eindhoven, 2006. - Proefschrift. - ISBN-10: ISBN-13: NUR 959
Trefw.: digitale beeldverwerking / display / elektronische beeldtechniek ; beeldkwaliteit.
Subject headings: video signal processing / flat panel displays / image resolution.

© 2006 Royal Philips N.V. All rights are reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission from the copyright owner.

Flat Panel Display Signal Processing
Analysis and Algorithms for Improved Static and Dynamic Resolution

PROEFSCHRIFT

ter verkrijging van de graad van doctor aan de Technische Universiteit Eindhoven, op gezag van de Rector Magnificus, prof.dr.ir. C.J. van Duijn, voor een commissie aangewezen door het College voor Promoties in het openbaar te verdedigen op woensdag 20 december 2006

door

Michiel Adriaanszoon Klompenhouwer

geboren te Purmerend

Dit proefschrift is goedgekeurd door de promotoren:
prof.dr.ir. G. de Haan
en
prof.dr.ir. R.H.J.M. Otten

Copromotor:
dr.ir. G.J. Hekstra

voor Muriëlle


Contents

1 Introduction
    Basic display functions
        The TV chain
        The opto-electronic effect
        Scanning and multiplexing
        Display properties
    Addressing
        Spatial addressing
        Temporal addressing
        Spatio-temporal addressing
    Image reconstruction
        Spatial reconstruction
        Temporal reconstruction
    Color
        Color capture
        Color reproduction
        Color spaces and transformations
        Color transmission
        Color synthesis
    Goal of this thesis
    Thesis outline

2 Display types and properties
    The CRT display
        The opto-electronic effect
        Addressing
        Image reconstruction
        Color reproduction
    Flat panel displays: LCD and PDP
        Matrix addressing
        Opto-electronic effect of LCDs
        Opto-electronic effect of PDPs
        Spatial reconstruction
        LCD temporal reconstruction
        PDP temporal reconstruction
        Interlace in FPDs
        Color reproduction
    Impact of different display properties
        Image quality
        System aspects
    Display- and video processing
        Video format conversion
        Display system image quality
    Video processing for flat panel displays
    Conclusions

3 Static display resolution
    Resolution
    FPD spatial display signal chain
        Sampling
        Addressing
        Reconstruction
    The resolution of a matrix display
        Frequency spectrum of images on a matrix display
        The Kell factor for matrix displays
        Spatial quality of matrix displays vs. CRTs
        Perception aspects
    The resolution of a color matrix display
        Color matrix display addressing
        Analysis of images displayed on a color matrix display
        Color image perception: luminance and chrominance
        Analysis of the displayed color image frequency spectrum
    The subpixel resolution of a color matrix display
        Subpixel sampling
        Displaced sampling
        Subpixel addressing in the display model
        Spectrum analysis in YUV space
        Analysis of the subpixel sampling and display spectrum
        Increased Kell factor with subpixel sampling
        Signal processing
    Subpixel arrangements
        Vertical stripe subpixel arrangement
        Delta-Nabla subpixel arrangement
        PenTile subpixel arrangements
        Parameters for comparison of subpixel arrangements
    Resolution of 2D subpixel arrangements
        2D subpixel arrangements with pixel sampling
        Subpixel sampling on 2D subpixel arrangements
        Resolution comparison of different SPAs
    Conclusions

4 Subpixel image scaling
    The spatial addressing format of FPDs
    Spatial format conversion
        Polyphase scaling filters
        Video scaling
    Subpixel rendering
        Subpixel filtering
        Subpixel filtered displayed image spectrum
        Subpixel filter design
    Subpixel image scaling
    2D subpixel rendering and scaling
        Higher order 2D filtering: sharpening
    Results and discussion
        Natural images
        Text and graphics
        Subpixel scaling vs. CRT addressing
    Two-dimensional subpixel arrangement comparison
        Subjective comparison
        Simple 2D subpixel scaling
        Flat response 2D subpixel scaling
        Dependence on image content
        Virtual pixel comparison
    Conclusions

5 Dynamic display resolution
    CRT response and flicker
    Eye tracking
    Temporal display model and motion artifacts
        Motion artifacts due to temporal addressing characteristics
    LCD temporal characteristics
        LC response time improvement
        LCD temporal response in relation to motion blur
    The temporal aperture
        The motion aperture
        The temporal MTF and dynamic resolution
    Temporal frequency analysis of the display chain
        Frequency spectrum of displayed images
        Frequency spectrum of displayed moving images
        Motion blur as a perceived spatial filter
    Motion blur metrics
        Motion picture response time (MPRT)
        MPRT directly from impulse response
        Temporal bandwidth
        Discussion
    Motion blur reduction methods
        Higher frame rates
        Black and gray frame insertion
        Scanning backlight
        MPRT sensitivity
        Motion blur reduction method comparison
    PDP motion artifacts
        PDP motion artifact reduction by alternative drive schemes
    Conclusions

6 Video processing for motion artifact reduction
    Video processing for temporal delay compensation
        Color sequential display compensation
        Delay compensation for higher frame rates
    PDP motion artifact reduction
        Subfield delay compensation
        Preventing rounding errors
        Accumulate and switch subfields
        Results
    LCD temporal response compensation
        Inverse filtering
        Motion compensated inverse filtering
        Basic implementation
        Temporal MTF compensation
        Results
    Robust MCIF
        Noise suppression
        Adaptive MCIF
        Results
        MCIF and MPRT
    Conclusions

7 Discussion, conclusions and further work
    Discussion
    Conclusions
        Static display resolution
        Dynamic display resolution
    Further work

References

A Scanning
B Displaced sampling versus sampling displaced signals
C Display simulation
D General subpixel sampling and display spectrum

List of symbols and abbreviations
Summary
Samenvatting
Acknowledgments
Biography
Stellingen

CHAPTER 1

Introduction

The display: from video signal to light

Televisions (TVs) have been around for decades. The screens have become such common items in our living rooms that many of us do not even remember a life without them. Since the introduction of monochrome TVs in 1950 and the addition of color in the 1960s, there has been a continuous system evolution. The number of available programs and channels has increased steadily, and we have witnessed many developments in the functionality of the TV, such as stereo sound, remote control, teletext and automatic install functions. The images on the screen have improved considerably in brightness, contrast, sharpness, etc. Also, the design of the set has followed both fashion and technology. Where the first TVs resembled small, obscure fishbowls in wooden cabinets, their present-day counterparts are large, bright, colorful, square windows in plastic frames (Figure 1.1).

However, one property of the TV that hardly changed until recently is its bulky appearance. A 90 cm diagonal TV easily measures 50 cm in depth and can weigh up to 80 kg. The weight of the screens not only puts a practical limit on their size; their volume also caused the dream of having a large, flat TV that you can hang on the wall like a painting to remain elusive. To make the dream come true, the steady evolution of TV technology had to be broken by a revolution in the display industry: the replacement of the Cathode Ray Tube (CRT) with other types of displays.

Since the introduction of TV, the CRT has turned out to be the only commercially viable way to electronically generate moving images, i.e. video, on a screen. It uses an accelerated electron beam to excite a layer of phosphor particles on the screen, from which visible light is emitted. To deflect the fast moving electrons to the edges of the screen, a certain distance is required, which accounts for the depth of the system. The weight is caused by the thick glass needed to safely support the vacuum that is required to let the electron beam pass.

Figure 1.1  Television display past and present: a CRT around 1950 and 2000, and an LCD.

Until recently, alternative display principles have not been able to match the performance of the CRT in a number of areas, such as brightness, contrast, efficiency, and color reproduction. And since the CRT was able to offer this at much lower cost, it has dominated the TV market. However, despite the huge technological advances of the CRT, it is not able to provide the picture on the wall. At the turn of the century, the CRT finally had to face serious challenge from display principles such as Plasma Display Panels (PDP) and Liquid Crystal Displays (LCD). By substituting the electron beam with different methods of creating light, these displays are basically able to generate images on a thin sheet of glass. For this reason, these displays are generally called Flat Panel Displays (FPDs). With their price-performance ratio quickly approaching that of the CRT, the advantages of such perfectly flat, thin, square and light-weight screens will eventually imply the end of the product life-cycle of the CRT. At the moment, this life-cycle is only being prolonged by a cost advantage.

However, besides the unmistakable advantage of their size and shape, FPDs turn out to have problems that the CRT never had. Since the TV system and transmission formats have always been strongly coupled to the CRT principle, a number of additional components in the system are required. Moreover, these emerging displays also caused new image quality artifacts, requiring further measures to be taken. (An image quality artifact is, in the context of this thesis, defined as: any property of the displayed (motion) image that inhibits the viewer from appreciating the quality of the image, either in relation to another displayed (motion) image, or in relation to (a notion of) the originating real life scene.) Some of these measures implied improvements in the display panel itself, such as in the materials, production methods and the driving system. However, the signal processing chain that delivers the video signal to the display can also be adapted to improve these new panels.

This thesis presents an analysis of the properties of flat panel displays, their relation to image quality, and video signal processing algorithms to improve the displayed image quality. In order to discuss the properties of different types of displays (in Chapter 2), the properties are first divided into a number of categories that follow from the basic functionality of a display: creating visible images.

The next sections provide this introduction, which can be read as a basic introduction to display principles. This chapter ends with the goal of this thesis in Section 1.5, followed by the thesis outline in Section 1.6.

1.1 Basic display functions

In order to describe the impact of flat panel displays on image quality and on the video processing chain, we first describe the functions that a display must have by considering the TV signal chain [80]. These are the basic TV and display functions, because we can describe them without a particular display technology as an example. The word television literally means "to see from far away". A TV allows us to look at things that happened elsewhere. This basically means that the image, i.e. the visible light, of a real-life scene is captured, converted into an electronic signal, transmitted, and converted back to visible light at the display (Figure 1.2).

Figure 1.2  Simple picture of the TV chain: camera, transmission, reception, and display. The chain basically converts light into an electronic signal and back to light.

1.1.1 The TV chain

Figure 1.2 illustrates the signal flow in a TV system, which consists of three main parts. (This thesis deals with image quality. Other functions of a TV system, such as audio, user interface and (interactive/broadcast) services, are considered outside the scope of this thesis.)

Registration or creation: Converting the visible light from a scene into an electronic signal, the video signal. The type of scene and method of conversion can be diverse. For example, real scenes can be registered by an imaging device, as in a TV camera. On the other hand, virtual scenes can be created by computer, and rendered as a video signal.

Transmission and storage: Sending this video signal to the display. This may involve different media (cable, air, satellite, disk, tape, etc.) and different protocols (video formats and standards). Often the transmission is not direct, and the signal is stored for some time.

Reception and display: Receiving the video signal, and converting it back to visible light, allowing the viewer to see the images of the original scene on the display.

These basic functions have not changed over the decades, but the implementation technologies have, and will probably continue to change, as they are still not perfect. The finally displayed images are only an approximation of the original scene. Recent technological advances in display, transmission and recording may improve the quality, but as long as we can distinguish the approximation from the original, there is room for improvement. Figure 1.2 also shows that the chain is based on the following principles that are fundamental in electronic image reproduction:

Opto-electronic effect: This relates to the transformation from light to electronic signals.

Scanning: This relates to the transformation from multidimensional (space-time) image intensity to one-dimensional video signals.

Color reproduction: This relates to the transformation from continuous spectral distributions of light to color signals.

The opto-electronic effect and scanning will be described in Sections 1.1.2 and 1.1.3, leading to a number of properties that each display must have, which are further described in Sections 1.2 and 1.3. Color reproduction is described in Section 1.4.

1.1.2 The opto-electronic effect

The conversion from visible light to an electronic signal and vice versa requires an opto-electronic effect. At the camera, this describes the conversion from light to an electronic signal such as voltage or current. The display opto-electronic effect concerns the conversion from the electronic signal back to light. The light intensity on the screen, I, is driven by a signal V, which results in a certain opto-electronic transfer Γ:

    I = Γ(V)                                                    (1.1)

The Γ-transfer can vary between different display types, and is generally non-linear. It is an important display characteristic, because it directly relates to the intensity of images on the display.
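As a rough numerical illustration of Eq. (1.1), the sketch below models Γ with the power-law transfer that Section 2.1.1 derives for the CRT. The value γ = 2.4 and the normalization of signal and intensity to [0, 1] are assumptions made for this example only.

```python
import numpy as np

def opto_electronic_transfer(V, gamma=2.4):
    """Display opto-electronic effect I = Gamma(V) of Eq. (1.1), here
    modeled as a CRT-like power law (see Section 2.1.1)."""
    V = np.clip(V, 0.0, 1.0)   # drive signal, normalized to [0, 1]
    return V ** gamma          # normalized screen intensity

# The non-linearity is substantial: a mid-gray drive signal yields
# far less than half of the maximum intensity.
print(opto_electronic_transfer(0.5))   # ~0.19 for gamma = 2.4
```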

Figure 1.3  The conversion between space-time dependent image intensity and video signal. a) Analog view: scanning. b) Digital view: multiplexing.

1.1.3 Scanning and multiplexing

Only converting between light and voltage is not enough to transmit and display a scene. The light of the original scene varies as a function of space and time, and it also has a certain spectral distribution, related to its color. This multi-dimensional signal has to be reduced to a signal that can be transmitted electronically. The first reduction is the projection of the three-dimensional scene to an image in the camera. (This already presents one of the biggest limitations of current displays: the original scene is not reproduced, but only its image.) Nevertheless, an image still represents intensity in a four-dimensional space: I(x, t, λ), where x = [x, y] are the two spatial dimensions, t is time, and λ is the wavelength. The time dimension is quite important in TV, since we deal with moving images. (Sometimes the distinction between the 2D signal (I(x), or "still images") and the 3D signal (I(x, t), or "moving images") is denoted by "image" and "video", respectively. In this thesis, we choose to use "image" for the visible light signal, either moving or still, and "video" for the (electronic) signal that conveys the moving image.) The subject of color capture, transmission and reproduction is discussed in Section 1.4. In the following we describe monochrome images I(x, t).

The image must be transformed into a 1D video signal, I_v(t). This inherently involves a discretization of the (x, t) volume, to allow a folding into one dimension, as shown in Figure 1.3. In continuous space, this is achieved by scanning [130]:

    I(x, t) → I_v(t) = I(x_s(t), t)                             (1.2)

where the function x_s(t) traces out a path in space and time: the scanning raster (see Appendix A for a mathematical description). A scanned video signal represents a discretization in two dimensions, i.e. into separate lines and fields, as illustrated in Figure 1.3, but is still continuous in t, i.e. it is still analog video.

The video signal can be further discretized into I_v(n), i.e. into separate picture elements, better known as pixels. If we regard the continuous, original image, I_c(x, t), as consisting of discrete samples, I_s(n_x, n_y, n_t), at coordinates [x(n_x, n_y), t(n_t)], the process corresponds to a multiplexing of image samples into a discrete ("digital") video signal I_v(n) = I(n_x(n), n_y(n), n_t(n)) [119]. (In digital video transmission, pixels are not directly sent in this order, but reordered (in blocks), compressed and encoded [119]. Since this process is reversed at reception, we consider it outside the scope of this thesis.) The multiplex order [n_x(n), n_y(n), n_t(n)] corresponds to the scanning raster. Since multiplexing can also be seen as a scanning of the image samples, we will refer to the process, both digital and analog, as scanning.

The important aspect, both for scanning and multiplexing, is that the video signal is discrete in at least two dimensions, and that each point in the video signal has a corresponding place and time [x(n), t(n)] associated with it, corresponding to the raster or multiplex order. This encoding scheme forms an important part of the video format, since the video signal cannot be interpreted by the display without it (see Figure 1.3). Therefore, important characteristics of the video format are the field period T, the number of lines per image, N_y, and the number of pixels per line, N_x. Note also that we deal with video and displays that can only handle images that are bounded in space, typically by a rectangular border (the "frame"). The image size is denoted by its width W and height H, which define the image aspect ratio, AR = W/H. The aspect ratio is also an important characteristic of a display or a video signal, but it is not extensively discussed in this thesis. Usually, spatial image coordinates are normalized to the image width and height to be independent of image size.

The display converts the signal back to a light intensity on the screen, i.e. it uses the video signal to control the intensity of each point on the screen. This corresponds to a de-multiplexing of the video signal, translating I_v(n) back to I_d(x, t), using the multiplex order [n_x(n), n_y(n), n_t(n)] and the coordinates of each pixel, (x(n_x, n_y), t(n_t)). In other words, scanning describes how to read the image in space and time at the camera, and therefore also how to write the image when converting from signal to light at the display, as the sketch below illustrates.
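The following minimal sketch (not from the thesis) makes the read/write symmetry concrete: discrete image samples I_s(n_x, n_y, n_t) are multiplexed into a 1D signal I_v(n) in plain line-by-line, field-by-field raster order and de-multiplexed again at the display. The format numbers are arbitrary illustrative values.

```python
import numpy as np

N_x, N_y, N_t = 4, 3, 2                  # pixels per line, lines, fields
I_s = np.random.rand(N_t, N_y, N_x)      # discrete image samples

I_v = I_s.reshape(-1)                    # multiplex: I_v(n) in raster order

# The multiplex order maps every index n back to a field, line and pixel:
n_t, n_y, n_x = np.unravel_index(7, I_s.shape)

I_d = I_v.reshape(N_t, N_y, N_x)         # de-multiplex at the display
assert np.array_equal(I_s, I_d)          # read and write use the same raster
```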

1.1.4 Display properties

The fundamental display principles of opto-electronic effect and scanning lead to a number of properties that each display must have. Table 1.1 shows these properties, which are further described in Sections 1.2 to 1.4. This list of display properties will be used throughout this thesis to describe and compare a number of display technologies, and to develop video processing algorithms.

Table 1.1  Basic display properties

    Opto-electronic effect: Convert between electronic signal and light.
    Scanning: Convert the multi-dimensional space-time (image) signal to a one-dimensional (video) signal.
    Spatial addressing: Distribute the video signal over the entire screen.
    Spatial reconstruction: Convert the space-discrete video signal to a space-continuous image. Generating light takes a certain area.
    Temporal addressing: Perform the spatial addressing repeatedly over time, creating a sequence of fields to reproduce moving images.
    Temporal reconstruction: Convert the time-discrete video signal to a time-continuous image. Generating light takes a certain time.
    Color synthesis: Produce three different colors of light, each controlled by a separate signal, at any position on the screen, to reproduce color images.

1.2 Addressing

At the display, and in particular for new display types, scanning and de-multiplexing are often called addressing. This reflects the process that takes place in the display: each position on the screen is addressed, after which the opto-electronic effect at that position is driven by the video signal. To describe display addressing in the remainder of this thesis, we separate addressing into spatial and temporal addressing.

1.2.1 Spatial addressing

The display screen can only be addressed at certain positions. These positions can be discrete in one dimension (scanned lines) or in both (pixels). Spatial addressing is characterized by the number of lines on the screen and pixels per line. Spatial addressing is an important display characteristic, because it determines the spatial resolution of the display (see also Chapter 3). More detailed images can be reproduced, i.e. the resolution is higher, if more lines and pixels can be addressed. An extreme example is illustrated by Figure 1.4, which shows images on a 30-line display and a standard definition (SD) display with 576 (active) lines. (The first TV transmissions [147, 40] in the 1930s used the 30-line format. These TVs used mechanical scanning and were still monochrome. Figure 1.4 simply illustrates the effect of spatial resolution, not what these TVs actually looked like.) The recently introduced high definition (HD) standards provide even more lines, thereby further increasing spatial resolution.

Figure 1.4  The spatial addressing format: more resolution with more lines. a) A display with 30 lines (as were the first TVs around 1930). b) A display with 576 lines (today's standard definition TV).

1.2.2 Temporal addressing

To reproduce moving images, the light intensity on the screen must be changed over time. However, by scanning, the temporal dimension is sampled to a sequence of separate images, or fields (see also Appendix A). The number of fields per second characterizes the temporal addressing format. Each position on the screen can only be addressed, i.e. updated, at discrete points in time. Therefore, the illusion of continuous motion is generated from a series of separate images. This is already possible from around 12 images per second [4, 161], but works better if the field rate is higher [151]. Moreover, some displays suffer from large area flicker if the field rate is lower than typically 75 Hz [5, 10]. Temporal addressing is therefore another important display characteristic.

1.2.3 Spatio-temporal addressing

Spatial and temporal addressing are not always separable. The main reason for this is interlace. Interlace was introduced to reduce the video signal bandwidth [47]. With interlaced scanning, the lines in subsequent fields are not at the same vertical position. With an interlace factor of two, the spatial addressing format has twice the number of lines of each individual field. An interlaced display doubles the perceived number of lines compared to when it would use a non-interlaced format, because the viewer cannot distinguish the individual fields at a high enough field rate. Although the interlace factor is also an important characteristic of displays and video formats, we do not discuss this topic in detail in this thesis (only in Section 2.2.7). For an extensive overview, see [9].

1.3 Image reconstruction

As introduced in Section 1.1.3, a basic characteristic of the video signal is that it is a discrete version of a spatially and temporally continuous image. A discrete signal in itself has no spatial extent; it just represents an intensity at a single position in space.

The display converts this discrete signal to light in the real, continuous world. The display therefore performs a reconstruction of the originally continuous image from the discrete signal. The purpose of the reconstruction process is to approximate the original signal as closely as possible, without visible artifacts caused by the discrete signal representation along the chain. The reconstruction has a spatial and a temporal component.

1.3.1 Spatial reconstruction

A typical artifact of imperfect spatial reconstruction is line or pixel structure. To reconstruct a uniform image area, the display must fill in between the addressed positions. This requires that light is produced over a certain area surrounding an addressed position. Spatial reconstruction is characterized by the spatial light emission profile of each addressed position. When the profile is too narrow, line structure can appear. However, when the profile is too wide, image details are lost (see Figure 1.5). Spatial reconstruction is therefore also an important display characteristic.

Figure 1.5  Spatial reconstruction: from left to right, the light emission profile becomes wider. This reduces line structure, but it also reduces resolution.

Besides the requirements of the reconstruction process, it is also inevitable for physical reasons that the opto-electronic effect produces light over a certain area after it is addressed. Producing the same amount of light from a smaller area will often decrease the efficiency of light generation.

1.3.2 Temporal reconstruction

Just as in the spatial dimension, light generation has a finite temporal extent. It simply takes a certain time to generate an amount of light. And just as in the spatial dimension, imperfect temporal reconstruction can result in artifacts. When the temporal reconstruction profile is too short, the time between fields is not filled with light, which can result in visible image flicker if the field rate is too low (or the intensity is too high). On the other hand, the temporal reconstruction can reduce temporal resolution if the light emission takes too long. Reduced temporal resolution will affect moving images, for example resulting in motion blur. Temporal resolution is very different from spatial resolution, and is a complicated matter that will be discussed extensively in Chapter 5.
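As a rough illustration of this trade-off (a sketch under assumed numbers, not a model from the thesis, which treats the topic in Chapter 5): the light output of one pixel can be modeled as the field sequence convolved with a temporal emission profile. A short, impulse-like profile leaves dark gaps between fields that may be seen as flicker, while a profile that holds the light for a full field period fills the gaps but smears intensity over time, which shows up on moving objects as motion blur.

```python
import numpy as np

fps, dt = 50, 0.001                        # field rate (Hz), time step (s)
t = np.arange(0.0, 0.2, dt)
period = int(1 / (fps * dt))               # samples per field period (20)

fields = np.zeros_like(t)
fields[::period] = 1.0                     # one addressing event per field

impulse_profile = np.ones(1)               # ~1 ms flash (phosphor-like)
hold_profile = np.ones(period)             # light held for the whole field

crt_like = np.convolve(fields, impulse_profile)[:t.size]  # gaps -> flicker
lcd_like = np.convolve(fields, hold_profile)[:t.size]     # hold -> motion blur
```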

1.4 Color

Light is not only characterized by its intensity, but also by its color, i.e. its wavelength distribution (spectrum) I(λ). Besides the opto-electronic effect and scanning, color is another main principle in the electronic image chain. TVs were first introduced as monochrome ("black-and-white") displays, and the addition of color was a major improvement. Color reproduction was introduced in the 1950s [129] and became widely accepted in the 1960s and 70s. Although setting a standard for color transmission took some time, the basic principle behind color reproduction and transmission was long available. Color vision and color image reproduction have been studied extensively [161, 59, 60, 133], and it is impossible to cover all aspects in a single section. We only give a short overview for completeness of this chapter.

The reproduction of color is based on the tri-stimulus theory [96, 163, 59], which was founded by Maxwell and Grassmann around 1860, and standardized by the CIE committee during the 1930s [36]. This theory is based on the fact that the human eye contains three different types of color sensitive cells ("cones"). These cells are sensitive to short, middle and long wavelengths, corresponding to the blue, green and red parts of the spectrum, respectively. This means that the human eye converts the spectral distribution of the light intensity I(λ) to three separate visual stimuli I = [R, G, B]^T, as illustrated in Figure 1.6. (The cone responses in the figure are only illustrative; real responses are broader [160]. Also, the RGB signal from the cones is not identical to typical RGB signals used in displays.) These three stimuli together create the sensation of color [161].

The tri-stimulus theory allows an efficient transmission of color information in electronic image reproduction. Figure 1.7 shows that color capture and reproduction both represent transformations between the wavelength-continuous domain and a transmittable three-dimensional color signal [119]. (The figure is simplified for illustrative purposes. In reality, the camera and display color spaces are not identical, so colors must be converted from camera to display color space. Or cameras need partly negative sensitivities to generate colors in the display color space.) The color chain is similar to the TV chain that was introduced earlier, and it also leads to a number of properties, related to color, that characterize a display. Despite the apparent symmetry in the chain, each part possesses some unique properties, which are discussed next.

1.4.1 Color capture

As a result of the tri-stimulus theory, the perceived color of each physical light source can be represented with only three values. All spectra with the same values appear identical to the human observer, i.e. they are metameric. A color imaging device converts the continuous wavelength distribution to a three-valued color signal, i.e. a wavelength distribution I_c(λ) maps to a point I_c = [R_c, G_c, B_c]^T in a 3D color space. This is achieved by filtering the incoming light with three different color filters before capturing its intensity.
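A minimal numerical sketch of this spectral filtering (the Gaussian filter shapes and all numbers below are invented placeholders, not the thesis' data): the three captured values are the integrals of the incoming spectrum weighted by the three filter sensitivities.

```python
import numpy as np

lam = np.linspace(380, 780, 401)            # wavelength axis in nm

def band(center, width):
    """Toy Gaussian spectral sensitivity (placeholder shape)."""
    return np.exp(-0.5 * ((lam - center) / width) ** 2)

filters = np.stack([band(600, 40),          # "R" filter sensitivity
                    band(550, 40),          # "G" filter sensitivity
                    band(450, 30)])         # "B" filter sensitivity

spectrum = band(520, 60)                    # some incoming light I_c(lambda)

# I_c = [R_c, G_c, B_c]^T: integrate spectrum x sensitivity over wavelength
R_c, G_c, B_c = filters @ spectrum * (lam[1] - lam[0])
```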

The filters should be designed such that distributions that appear identical to the human observer also map to identical points in the color space, i.e. the device space is a Human Visual Subspace [133].

Figure 1.6  Color perception: three types of receptive cells in the human visual system convert a continuous spectral light distribution into a discrete (3-valued) color signal, which results in the sensation of a color.

1.4.2 Color reproduction

Also as a result of the tri-stimulus theory, colors can be reproduced by mixing, with different intensities, light from three different color light sources, i.e. the color primaries. With the primaries P(λ) = [R(λ), G(λ), B(λ)] and primary intensities I_d = [R_d, G_d, B_d]^T, the displayed color becomes I_d(λ) = P(λ) · I_d. Although the displayed wavelength distribution I_d(λ) can be very different from the original I_c(λ), the resulting color sensations (I) can be the same if the intensities of the primaries are set correctly. The range of colors that can be reproduced with a certain set of primaries, i.e. the color gamut, is limited by the constraints that physical light intensities must be positive (I_d ≥ 0) and that the primaries have a certain maximum intensity (I_d ≤ 1 when normalized to unity). For a display with a large color gamut, the primary colors are in the red, green and blue parts of the spectrum, and have narrow distributions (saturated colors).

Figure 1.7  The color chain: continuous wavelength distributions are converted to three-dimensional color signals at the camera, transmitted, and converted back at the display.

Note that in general the captured color signal I_c is not equal to the display color signal I_d, since both can correspond to different color spaces, as defined by the device primaries.

1.4.3 Color spaces and transformations

A color signal I can only be interpreted correctly if the color space is known. Colors can be transformed between different spaces, for example from the capture device space to the reproduction device space. We can distinguish between linear and non-linear color spaces. The color space corresponding to a display is a linear space, where the coordinate axes of the space are formed by the primaries. Colors can be transformed from one linear space to another by a simple 3x3 matrix multiplication. In principle, any linear 3D to 3D transform defines a color space, and a corresponding set of primaries. The constraint that the primaries must be physically realizable only holds for a display color space. A linear color space without physical primaries is, for example, the CIE standard XYZ color space [163] that is used as a device-independent space. More generally, color spaces can also be defined using a variety of non-linear transforms (from linear spaces), which is for example required to make perceptually uniform color spaces [163]. Non-linear spaces are not characterized by a set of primaries alone, but need the complete transform, typically from a standard space. Note that color gamuts and color spaces are different entities. Only a display color space, i.e. with physical primaries, directly corresponds to a gamut.
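As a small sketch of such a linear transformation (illustrative only; the matrix used is the PAL RGB-to-YUV transform given in Eq. (1.3) of the next subsection):

```python
import numpy as np

# 3x3 matrix of the PAL RGB-to-YUV transform, Eq. (1.3)
M = np.array([[ 0.299,  0.587,  0.114],
              [-0.147, -0.289,  0.436],
              [ 0.615, -0.515, -0.100]])

rgb = np.array([1.0, 0.5, 0.25])      # a color in the display RGB space
yuv = M @ rgb                         # the same color in YUV coordinates
rgb_back = np.linalg.solve(M, yuv)    # the inverse transform recovers RGB
assert np.allclose(rgb, rgb_back)
```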

1.4.4 Color transmission

Color image transmission is enabled by the reduction of the infinite-dimensional spectral intensity distribution to a color signal with three components. There is a distinct difference, however, between transmission and reproduction of a color image. The color space of the video signal is not restricted by the physical limit 0 ≤ I ≤ 1, so any color space can be used to describe the colors in the video signal. For compatibility with monochrome systems, but also for transmission efficiency, a luminance-chrominance (Y/C) space, like the YUV space of the PAL standard, is commonly used. The luminance of the image is represented in the Y component. The two chrominance, or color difference, components C = [U, V] describe the color of the image. For example, for the PAL system, the RGB to YUV transformation is given by [119]:

    [Y]   [ 0.299   0.587   0.114] [R]
    [U] = [-0.147  -0.289   0.436] [G]                          (1.3)
    [V]   [ 0.615  -0.515  -0.100] [B]

(For further efficiency and compatibility, the transform is applied to gamma-corrected RGB signals (Section 2.1.1), making the PAL YUV space non-linear, although the transformation is linear.) Figure 1.8 shows the display gamut (0 ≤ RGB ≤ 1) in YUV space, to illustrate the relation between the RGB and YUV spaces.

Figure 1.8  The display gamut, which is a cube in RGB space (0 ≤ RGB ≤ 1), plotted in YUV space.

1.4.5 Color synthesis

To reproduce color images, the light from the three primary colors must be mixed at a certain position on the screen. An ideal display would produce all colors at the same position, but this is not always possible. There are a number of methods to synthesize color on a display: spatial, projective, temporal and layered synthesis (Figure 1.9).

Figure 1.9  Color synthesis methods: a) spatial, b) temporal, c) projective, d) layered.

With spatial color synthesis, the primary colors are distributed over the screen in separate, closely spaced dots or stripes. The mixing occurs in the eye of the viewer: when the viewing distance is large enough with respect to the color dot pitch, adjacent dots are no longer visible, and the three primary colors blend together into the color corresponding to the mixture. It is also possible to produce colors at different positions in time. Temporal color synthesis requires that the display can switch to a different color in each field, which is for example possible by adjusting the backlight color for transmissive displays. Projection displays, which we will not extensively cover in this thesis, use projective color synthesis, which enables reproduction of the three primary colors at the same position. This is possible because the light of each primary color is projected through an optical system that allows combination at the same location on the screen surface.

In theory, it would also be possible to generate color by layered color synthesis, using different layers on the screen, i.e. spatially separated in a direction perpendicular to the screen surface. This requires that each color layer is transparent to the other two colors. Although examples exist [144, 60], up to today no successful displays have been developed based on this principle.

1.5 Goal of this thesis

The previous sections provided an introduction to basic display principles, based on the display signal chain. We are interested in finding how the specific properties of flat panel displays influence image quality, and how image quality can be improved. The display itself is not the only part of a display system. Incoming video must usually be processed before it can be input to the display (Figure 1.10). The properties of both the display and the video signal determine what type of video processing is needed.

Figure 1.10  Besides the display itself, video processing is also part of the display system. The required processing depends on the properties of the display and the video signal.

We are also interested in finding the impact of the specific properties of flat panel displays on the video signal processing chain. Therefore, we investigate how video processing can be adapted to improve image quality, given the specific properties of a flat panel display. Moreover, we are interested in developing new video processing algorithms for image quality improvement, taking into account flat panel display properties. Therefore, the goal of this thesis is twofold:

- To describe flat panel display properties, based on video signal processing principles, and relate these to image quality with regard to static and dynamic resolution.
- To develop video signal processing algorithms that improve the static and dynamic resolution of flat panel displays, by taking into account their specific properties.

1.6 Thesis outline

In this chapter, we introduced a number of basic properties, found from a high-level analysis of the display signal chain, that characterize a display.

The remainder of this thesis is structured as follows (see Figure 1.11):

In Chapter 2, three display types are described in terms of these properties: the Cathode Ray Tube (CRT), which has been the most successful display over the past decades, and the currently most successful flat panel display (FPD) types: Liquid Crystal Displays (LCDs) and Plasma Display Panels (PDPs). Some general aspects of the differences between these display types are discussed, while a more detailed analysis of the impact of different display characteristics in the spatial and temporal dimensions is the topic of Chapters 3 and 5, respectively. Chapter 2 also presents a general introduction to video signal processing for FPDs. Chapters 4 and 6 deal with video processing algorithms for improving, respectively, the static and dynamic resolution of FPDs.

Chapter 3 deals with the spatial properties of FPDs in relation to static display resolution. A model of the display signal chain in the spatial dimension is developed to analyze the static resolution of FPDs. In particular, it is investigated how the static resolution of FPDs can be improved when the color synthesis method is taken into account.

Chapter 4 deals with signal processing related to the spatial characteristics of FPDs. An algorithm is developed that combines image scaling and the spatial color synthesis method of FPDs into subpixel scaling.

Chapter 5 focuses on the relation between the temporal properties of FPDs and the quality of moving images, i.e. dynamic display resolution. A model of the display signal chain in the temporal dimension is developed to analyze the dynamic resolution of FPDs. In particular, it is investigated how temporal display properties relate to motion artifacts, and how these can be reduced by modifying the display.

In Chapter 6, we investigate video processing algorithms for reducing motion artifacts on FPDs. Specifically, we develop an algorithm to reduce dynamic false contouring on PDPs, and an algorithm to reduce motion blur on LCDs.

Finally, in Chapter 7 we discuss some of the results from this thesis, summarize the main conclusions, and discuss interesting further work on these topics.

Figure 1.11  Structure of this thesis.


CHAPTER 2

Display types and properties

Displays from a video signal processing perspective

In the previous chapter, we introduced a number of basic properties that characterize a display, such as the opto-electronic effect, spatial and temporal addressing and reconstruction, etc. (Table 1.1). In this chapter, three display types are described in terms of these properties. The Cathode Ray Tube (CRT), which has been the most successful display over the past decades, is discussed first in Section 2.1. In Section 2.2, the currently most successful flat panel display types are discussed: Liquid Crystal Displays (LCDs) and Plasma Display Panels (PDPs).

Developments on these display types have been ongoing for decades [72, 112, 61]. Therefore, these displays are well known, and there is no need to describe them in detail. Nevertheless, displays are mostly described from a display technology perspective [152, 17, 18]. In this thesis we present a description based on signal processing principles. Similar descriptions exist [130, 7, 138, 118], but in this chapter we introduce a unified method of describing displays, and apply it to flat panel displays. It will become clear that FPDs and CRTs perform the same display functions, but in quite different ways. This description will form the basis of the rest of this thesis, which works out these principles for flat panel displays with regard to static and dynamic resolution.

The consequences of the differences in characteristics between these display types have two aspects. First, the characteristics clearly have an impact on the resulting image quality. Some general aspects of this are discussed in Section 2.3. A detailed analysis in the areas of static and dynamic resolution is a major part of this thesis (Chapters 3 and 5). Second, video signal processing is also part of the display signal chain, and it is even more important for FPDs. Because display characteristics will in general no longer match the source, video format conversion is required. A general introduction to this topic is given in Section 2.4.

Nevertheless, traditional video format conversion is not all that is needed to reach the best image quality on FPDs. Specific display properties of FPDs, especially related to spatial and temporal addressing and reconstruction, can result in modifications of the video processing functions in the chain, or even require the addition of new video processing functions. This topic forms the other major part of this thesis (Chapters 4 and 6), for which Section 2.5 provides an introduction. Finally, this chapter is concluded in Section 2.6.

2.1 The CRT display

Figure 2.1  Basic principle of the CRT display.

Figure 2.1 shows the basic construction of a CRT display. In a CRT, the generation of an electrically controlled amount of light at each part of the screen is realized with an electron gun ("cathode") at the back of the tube, and a screen coated with phosphorescent material ("phosphors") in the front. The gun produces electrons that are accelerated toward the screen by a strong electric field (~27 kV). The phosphors are excited by the impinging electrons and emit light with an intensity that is proportional to the number of electrons per unit time. The electron beam is focused on the screen, and a magnetic field generated by coils around the neck of the tube deflects it to the desired position on the screen.

In the following subsections, we will describe the CRT in further detail. The description is categorized according to the functions of a display: spatial addressing, spatial reconstruction, temporal addressing, temporal reconstruction, spatio-temporal addressing, and color reproduction. For a more extensive overview of CRT technology, we refer to [152].

2.1.1 The opto-electronic effect

As governed by the laws of electron optics and phosphor opto-electronics, the light intensity I generated by the CRT is proportional to the number of electrons per unit time, which is approximately a power-law function of the driving voltage at the gun, V. The total opto-electronic transfer from V to I becomes [117]:

    I = Γ(V) ∝ V^γ                                              (2.1)

where γ is typically in the range of 2.2 to 2.8. Figure 2.2 shows this transfer, which is called the gamma characteristic, or simply the gamma, of the display. Although the opto-electronic transfer of other display types can be very different, it is usually also referred to as the display gamma.

Figure 2.2  CRT opto-electronic transfer: the gamma-curve.

Since an imaging device generally does not have the same opto-electronic transfer as the CRT, the gamma law must be corrected somewhere in the chain to ensure that all light levels from black to white, i.e. the gray levels, are rendered with correct intensity. To keep the TV sets simple, it was standardized to pre-correct the video signal before transmission. This required only a few correction circuits at the studio side, instead of one in each TV receiver. The gamma correction, or source gamma, produces a video signal whose amplitude is non-linearly related to the light intensity that should appear on the screen.

Although this non-linearity is dictated by the physics of the CRT display, the source gamma is also beneficial for transmission efficiency. As it turns out, the light sensitivity of the human visual system follows a power law (the Weber-Fechner law [161]) that is approximately the inverse of the CRT gamma. This makes a gamma-corrected video signal approximately perceptually linear. This, for example, makes the transmission more robust to noise, because noise will have approximately the same perceived amplitude in dark and light image parts. A video signal that is linear with light intensity will be vulnerable to noise in dark image parts, because of the non-linear human visual sensitivity [53].
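A minimal sketch of this standardized pre-correction (γ = 2.4 is an assumed example value within the stated 2.2 to 2.8 range): the source applies the approximate inverse of Eq. (2.1), so the chain from scene intensity to screen intensity is roughly linear, while the transmitted signal is roughly perceptually uniform.

```python
import numpy as np

gamma = 2.4   # assumed example value; the text gives 2.2 to 2.8

def gamma_correct(I_scene):              # source gamma, at the studio side
    return np.clip(I_scene, 0, 1) ** (1 / gamma)

def crt_display(V):                      # display gamma, Eq. (2.1)
    return np.clip(V, 0, 1) ** gamma

I = np.linspace(0, 1, 11)
assert np.allclose(crt_display(gamma_correct(I)), I)   # end-to-end linear
```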

2.1.2 Addressing

In a CRT, addressing is achieved by scanning the electron beam over the screen, in horizontal lines from left to right and top to bottom (see Figure 1.3a). The electron beam can be deflected to a certain position on the screen by a magnetic field that is generated by coils around the neck of the tube. At the end of each scanned line, and at the end of each scanned field, the electron beam flies back to the opposite border of the screen (see also Appendix A).

The scanning principle used to convert images to video (Section 1.1.3) matches very well with the scanning electron beam display principle of the CRT. In fact, display and imaging technology were both originally based on scanning electron beam technology. Therefore, during early TV developments, the chain could be kept simple by operating the camera and the display in a synchronized manner. The display is addressed using exactly the same scanning format as the camera. The scanning parameters, i.e. the number of lines per image and the number of images per second, are standardized in the video format.

In a CRT, the spatial (vertical) and temporal addressing formats are not fixed. The scan parameters can be changed to give various numbers of lines per field and fields per second. This results in a trade-off between the spatial and temporal addressing format, which also makes the CRT ideally suited for interlaced formats [10]. In CRTs, the line frequency, f_l, is usually the limiting factor, so the choice of field frequency, f_f, determines the number of lines that are written: N_l = f_l / f_f. The horizontal dimension in the CRT is continuous, because the electron gun addresses every point on the scan line. The CRT spatial addressing format is therefore not characterized by a number of pixels.
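As a worked example of this trade-off, using the well-known 625-line/50 Hz (PAL) numbers, which are not quoted in the text itself:

```python
f_l = 15625.0        # line frequency in Hz (625-line/50 Hz system)
f_f = 50.0           # field frequency in Hz
N_l = f_l / f_f      # lines written per field
print(N_l)           # 312.5: with 2:1 interlace, 625 lines per frame
```

Halving the field rate would double the number of lines per field, which is exactly the spatial-temporal trade-off described above.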

2.1.3 Image reconstruction

The electron beam is focused to a small spot on the phosphor screen, which approximately has a Gaussian intensity distribution [152, 72]:

    I(x) = (1 / (σ√(2π))) e^(-(1/2)(x/σ)²)                      (2.2)

where σ is a measure of the spot size. The spot size is limited by the laws of electron optics: electrons are mutually repelling, so an electron beam tends to diverge. Moreover, in practical CRT systems, the spot size will increase toward the edges of the screen due to imperfections in the deflection and focusing systems. The spot size will also increase when the brightness of the display is increased.

Figure 2.3  Line structure and resolution. Images on a CRT with, from left to right, an increasing spot size (copy of Figure 1.5).

For good image reconstruction (Section 1.3), there is an optimal spot size [51], as illustrated by Figure 2.3. When the electron spot is too small in the vertical dimension, the individual scan lines will become visible. Good reconstruction therefore requires some overlap between the spots of adjacent lines. On the other hand, a spot that is too large will blur small details in the image, i.e. it will decrease the resolution of the display (see Section 3.1). The optimal spot size will also depend on the number of lines: more lines in the same area require/allow a smaller spot. In practice, CRTs never reach this optimum for TVs, because the brightness needed for this application has a negative effect on spot size. The horizontal reconstruction profile is also determined by the bandwidth limitations of the video signal and electron gun. Because the horizontal dimension in the CRT is continuous, the number of pixels of the spatial addressing format is replaced by the video bandwidth of the spatial reconstruction.

While the electron beam in the CRT causes a relatively large spatial light profile, the temporal light profile is very short. After being excited by the electron beam, the phosphor emission decays to 1% intensity in approximately 1 ms, as shown in Figure 2.4.

Figure 2.4  Temporal response of a CRT: phosphor decay.

2.1.4 Color reproduction

In a CRT, the three different light sources needed to reproduce color are made with three different phosphor materials. Figure 2.5 shows that the spectral distributions of the light emitted by the phosphors after excitation by the electron beam are in the red, green and blue parts of the spectrum. To reproduce color images, the intensities of the light from each phosphor type must be individually controllable at each point on the screen. Although there are plenty of alternatives [146, 129], the shadow-mask is the most successful method to achieve this, based on spatial color synthesis.

The shadow-mask (Figure 2.5) is a metal sheet with a grid of small holes, at a short distance from the phosphor screen. The tube contains three separate electron guns, one for each primary color. The guns are placed under slightly different angles, such that each gun sees a different part of the phosphor screen through the shadow-mask. Because this part is only coated with the corresponding phosphor color, electrons from that gun only excite this phosphor type.

Figure 2.5  a) The spectral distribution of the three CRT phosphor types. b) The shadow-mask prevents the electron beam from hitting phosphors of the wrong color.

The three guns are aimed at the same point on the screen, and their scanning is identical (the electrons have no color, and are deflected identically by the magnetic field). As a result, the reproduced color at a certain position on the screen is determined by the signals applied to the three guns at the corresponding time during the scan.

2.2 Flat panel displays: LCD and PDP

There are almost as many display types as there are methods to make or manipulate light with electronic signals [41]. However, despite many attempts to make a thin, flat display based on the CRT opto-electronic principle of electrons and phosphors [131], flat panel displays use totally different display principles. The currently most successful FPDs, Liquid Crystal Displays (LCDs) and Plasma Display Panels (PDPs), are introduced, according to their opto-electronic effect, in Sections 2.2.2 and 2.2.3, respectively. (A third principle, Organic Light Emitting Diodes (OLED), is emerging [122, 157], but we shall not discuss this display technology. Many of its aspects, however, are easily described using the methods from this thesis.) Sections 2.2.4 to 2.2.8 describe LCDs and PDPs further, using the spatio-temporal addressing-reconstruction-color formalism. A detailed description of these display technologies is considered outside the scope of this thesis, but can for example be found in [17, 95, 155, 18].

Contrary to their opto-electronic effects, flat panel displays show much less variation with respect to addressing. They almost always use a matrix of row and column electrodes to address each part of the screen. This is why these displays are also called matrix displays. First, the matrix addressing principle will be introduced in the next section, followed by a description of the other properties of LCDs and PDPs.

2.2.1 Matrix addressing

Figure 2.6  Matrix addressing: a pixel on the display is addressed via a row and column electrode.

In a matrix display, the opto-electronic effect is sandwiched in between two substrates (glass plates), and the addressing is performed with a matrix of row and column electrodes deposited on one or both of these substrates. At each position on the screen, corresponding to the crossing point of a row and column electrode, the opto-electronic effect can be controlled by applying driving signals to the corresponding electrodes (Figure 2.6). This creates a light intensity in a small part of the picture, called a picture element, or pixel.

The matrix principle is crucial to the behavior of flat panel displays, and very different from the addressing in CRTs. The intensity of each pixel on a CRT is driven from a single point, namely the electron gun. The addressing, i.e. the (electric) connection between drive signal and screen, is formed by the electron beam itself, which can be directed in a very flexible way. No matter how many pixels are written on the display, a CRT has only one connection to the outside world. In matrix displays, however, the connection to each pixel is hard-wired on the display. The number of connections depends on the number of pixels per line (N_x) and the number of lines per picture (N_y). If each pixel were connected individually, the number of connections would equal N_x · N_y, which grows very fast with increasing resolution. Matrix addressing requires only one connection per row and one per column, so the number of connections equals N_x + N_y. This makes it possible to address a high-resolution panel without covering the whole panel in connections, as the example below shows.
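A worked example with an assumed 1920x1080 panel (the resolution is an illustrative choice, not from the text):

```python
N_x, N_y = 1920, 1080    # pixels per line, lines per picture (assumed)
print(N_x * N_y)         # 2073600 connections if each pixel is wired individually
print(N_x + N_y)         # 3000 connections with row/column matrix addressing
```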

There are two types of matrix addressing: passive and active. With passive addressing, the driving signal is only present when the row is selected. With active addressing, each pixel incorporates some form of memory that allows the driving signal to be applied to a pixel even when the row is not being addressed. Passive addressing cannot achieve high contrast and high resolution [2, 86]. Typical resolutions for a TV panel, i.e. at least several hundreds of pixels horizontally and vertically, require an active matrix.

Temporal addressing in a matrix display is simply the repetition of the addressing sequence for each frame. The temporal format is determined by the times at which each pixel is addressed, i.e. how many times each pixel is addressed per second and in which order the pixels are addressed. In matrix displays, the temporal format is variable at least to some extent (see also Section 2.2.7). With matrix addressing, the spatial and temporal addressing formats are separated. The spatial addressing format is defined by the number of rows and columns, i.e. the number of pixels, in the matrix. The spatial addressing format is therefore fixed on the display, and it cannot be changed by choosing a different trade-off between spatial and temporal addressing 3, as in the CRT.

Figure 2.7 The Liquid Crystal opto-electronic effect (a) and transfer (b).

2.2.2 Opto-electronic effect of LCDs

Liquid Crystal (LC) materials can change the polarization state of passing light, depending on the orientation of the molecules [95]. The opto-electronic effect is based on the tendency of the (rod-shaped) LC molecules to align with an applied electric field. An LC cell (Figure 2.7a), which consists of a layer of LC material in between polarizers and electrodes, can therefore vary its transmittance depending on the cell voltage (and depending on the orientation of the two polarizing layers). To make a display from this opto-electronic effect, an LCD consists of the stack shown in Figure 2.8: a backlight, a polarizing layer, an electrode patterned with address electronics on a glass substrate, an LC layer, another electrode and glass substrate patterned with color filters, and another polarizer plus some optical foils. The LC molecules are aligned to the substrates.

3 A matrix display with fewer lines can usually be addressed at more frames per second, but this is a trade-off in display design and manufacturing, not in display addressing.

Figure 2.8 Construction of a Liquid Crystal Display.

The molecules respond to the applied electric field by balancing two forces: a force proportional to the electric field strength, which drives the molecule orientation in the direction of the field, and an elastic force that drives the molecules to their initial position. Various alignment methods and electric field configurations exist, resulting in different LCD types such as Twisted Nematic (TN), In Plane Switching (IPS) or Vertically Aligned (VA) types.

The LC opto-electronic effect does not produce light, but modulates the transmittance of light from a separate light source. LCDs are therefore called transmissive displays 4. The resulting opto-electronic transfer is determined by the relation between the applied voltage and the transmission of the LC cell, multiplied by the backlight intensity. A typical opto-electronic transfer, which takes the form of an "S"-curve 5, is shown in Figure 2.7b. LCDs can be normally white or normally black, depending on the relative orientations of the polarizers and LC at zero voltage. The display in Figure 2.7b is normally white, because it has a transmissive state at low voltage.

4 The light source does not have to be at the back of the panel; it can also be in front, if the LC display modulates the light reflectance. These reflective displays are able to work without an internal light source by reflecting ambient light. Typically, these displays are better suited for outdoor use. Low brightness in indoor conditions makes them less suitable for TV applications.
5 The transfer is symmetric around zero voltage. LC molecules respond to the magnitude of the electric field; the sign is not important.
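The thesis shows this transfer only graphically (Figure 2.7b). As a rough numeric stand-in, a logistic curve can mimic an S-shaped, normally-white transmission-versus-voltage characteristic; the midpoint and slope values below are invented for illustration, not measured LC parameters:

```python
import numpy as np

def lc_transmission(v, v_mid=2.5, slope=2.0, normally_white=True):
    """Illustrative S-curve for LC cell transmission vs. drive voltage.
    The logistic shape and parameter values are placeholders; real
    curves depend on the LC mode (TN, IPS, VA) and cell design."""
    t = 1.0 / (1.0 + np.exp(slope * (v - v_mid)))   # falls with voltage
    return t if normally_white else 1.0 - t

volts = np.linspace(0.0, 5.0, 11)
print(np.round(lc_transmission(volts), 3))  # high transmission at low voltage
```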

2.2.3 Opto-electronic effect of PDPs

Plasma Display Panels (PDPs) are emissive display types. Unlike LCDs, but like CRTs, the opto-electronic effect directly generates light. The plasma display effect is based on gas discharge and phosphors, much like the fluorescent tubes used for lighting purposes. A PDP is built from two glass substrates, with a gas mixture between them (Figure 2.9). A barrier structure between the glass plates divides the panel into separate cells or channels that are coated with phosphorescent material.

Figure 2.9 Construction of a Plasma Display.

In the gas, usually based on Neon and Xenon, an ion discharge can be induced by applying a voltage above the ignition threshold (typically around 100 V). This discharge emits UV light, which is converted to visible light by the phosphorescent materials. The basic opto-electronic effect of PDPs can only produce two light levels ("on" or "off"), because a discharge produces a fixed amount of light. For grayscale reproduction, the number of discharges in a field period is varied: pulse number modulation [154]. This results in a linear opto-electronic transfer for PDPs, i.e. the light intensity is proportional to the number of discharges per field period.

Although the use of simple pulse number modulation provides basic grayscale capability, it did not result in high-quality PDPs. The major breakthroughs that brought PDPs to the TV market were the address-display separation driving scheme (ADS) [154] and weighted subfield addressing (Figure 2.10). These enable efficient use of pulse number modulation for a large number of gray levels. With subfield addressing, each field period is divided into a number of subfields, each with a different number of light pulses. In the ingenious ADS scheme, each subfield is driven in three phases: erase, address and sustain. First, during the erase phase, all charge is removed from the pixel.

Figure 2.10 Subfield addressing in PDPs. Each subfield has its own erase, address and sustain phase. Each subfield can have a different weight, by choosing the number of pulses in the sustain phase.

Then, during the address phase, the rows of the display are addressed one at a time. A discharge is induced by applying a voltage to the column electrode only for those pixels that have to emit light. This leaves behind a priming charge that will lower the ignition threshold for later discharges, which accounts for an active matrix memory effect. In the sustain phase, all pixels in the panel are simultaneously driven with an alternating voltage. Only the primed pixels discharge, because the applied voltage is above the ignition threshold of the primed pixels, but below that of the unprimed ones. The number of discharges in the sustain phase determines the light intensity, i.e. the weight, of the subfield. This leaves only two choices for the intensity of each pixel in each subfield: no light, or a fixed amount of light. After the sustain phase, the erase-address-sustain sequence is repeated for the next subfield.

The human viewer cannot distinguish individual subfields in a field, because these occur too fast for the eye to follow. Therefore, the viewer only perceives the total light intensity that is produced in the field:

I = I_p \sum_{n=1}^{N_s} N_p(n) S(n) = \sum_{n=1}^{N_s} W(n) S(n)   (2.3)

where I is the total intensity, I_p is the intensity of a single pulse, N_p(n) is the number of pulses for subfield n, S(n) is the state (0 or 1) of subfield n, N_s is the number of subfields in a field, and W(n) is the weight of subfield n.

Typically, the sustain phases have a different number of pulses for each subfield in a field. For a maximum number of gray levels, the weights of the subfields are given a binary distribution. This corresponds to weight ratios of 1, 2, 4, 8, ..., 2^{N_s - 1}, as shown in Figure 2.10. N binary subfields can produce 2^N different gray levels, from 0 (all subfields off: black) to 2^N - 1 (all subfields on: white). Intermediate gray levels are produced by switching on the appropriate subfields. For example, for gray level 26, the subfields with weights 2, 8 and 16 are switched on (see the sketch below). In Chapter 5 it is explained that the rather awkward generation of gray levels in PDPs has a negative effect on the quality of moving images.
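The gray level example above can be made concrete with a small sketch of Equation 2.3 for binary-weighted subfields; the function name and the normalization of I_p to 1 are choices for illustration, not from the thesis:

```python
def subfield_states(level, n_subfields=6):
    """Binary subfield decomposition (Equation 2.3 with I_p = 1):
    the weight is W(n) = 2^n and the state S(n) is bit n of the level."""
    weights = [2 ** n for n in range(n_subfields)]        # 1, 2, 4, ..., 32
    states = [(level >> n) & 1 for n in range(n_subfields)]
    intensity = sum(w * s for w, s in zip(weights, states))
    return states, intensity

states, intensity = subfield_states(26)
print(states, intensity)   # [0, 1, 0, 1, 1, 0] 26: weights 2, 8 and 16 are on
```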

2.2.4 Spatial reconstruction

A matrix display reconstructs the original space-continuous image using a 2D mosaic of pixels. The term pixel can have a number of meanings [16]. First, the value of a pixel corresponds to the intensity at a single point in the image. Also, a pixel corresponds to a point on the screen where there is a connection between a row and column electrode, i.e. where the display can be addressed with a driving signal. Furthermore, a pixel corresponds to the area of the screen that is associated with an addressed point, i.e. to the spatial reconstruction profile. In matrix displays, the spatial reconstruction profile is usually called the (pixel) aperture. We will use the term aperture from here on, instead of reconstruction profile, also for non-matrix displays (the CRT has a Gaussian aperture), and also for the temporal reconstruction (Chapter 5).

Figure 2.11 Spatial apertures for LCD (a,b) and PDP (c,d). Single pixels (a,c) and as part of the pixel matrix (b,d).

Good reconstruction and high light output require that each pixel on the display is not a point source. Moreover, it is physically impossible to create true point sources, so the pixel intensity profile extends over a certain area. This is similar to the spatial reconstruction in CRTs (Section 2.1.3), but the spatial reconstruction of matrix displays differs in two main aspects. Firstly, the image on a matrix display is discrete in the vertical and horizontal dimension, whereas the scan lines on the CRT represent a continuous horizontal dimension. Secondly, pixels do not overlap in matrix displays 6. Therefore, a constant intensity area in the image is only represented correctly if the pixels exactly touch each other, without dark area in between. In practice, this is not the case, since some area is needed for address lines, pixel separation, logic, etc. These two properties are common for all matrix displays, but the details of the pixel aperture will vary from type to type, as explained next for LCD and PDP. A further discussion on the quality of the spatial reconstruction in relation to the number of pixels and the pixel aperture will follow in Chapter 3.

For LCDs, a pixel is defined by the electrodes on both substrates, which are transparent to transmit light from the backlight. The LCD spatial reconstruction profile is determined by the shape of the transparent part of the matrix. A transistor switch is present in each pixel to perform the selection process needed for addressing, so the LCD spatial reconstruction profile typically has a shape as shown in Figure 2.11a. A part of the pixel matrix is shown in Figure 2.11b (see also Section 2.2.8 for the relation between pixels and color). Because it represents transmission of light, the LCD spatial reconstruction profile is simply called the pixel aperture. For a high brightness display, it is important to make the pixel aperture area as large as possible 7. In the remainder of this thesis, we will use the simple term aperture also to describe reconstruction profiles in general, so the apertures in this section are the spatial apertures. In this definition, an aperture represents light intensity as a function of space and/or time.

The PDP spatial reconstruction process is characterized by the PDP spatial pixel aperture.

6 At least not in non-projection (direct view) displays. If there is some form of optical manipulation, such as a projection onto a separate screen or an optical scattering foil on top of the screen, pixels can overlap. However, we shall focus on direct view displays.
7 With fixed size pixel circuitry, this means that the brightness of an LCD will decrease when the number of pixels increases.

Figure 2.11c shows a typical aperture, which follows from the construction of a PDP (Figure 2.9): PDPs contain separate cells or channels, of which the walls are coated with phosphors. The walls separate pixels in the horizontal direction, giving the aperture a sharp edge. In the vertical direction, the pixel aperture is defined by the profile of the gas discharge, which takes place between the two row electrodes in each pixel.

2.2.5 LCD temporal reconstruction

Figure 2.12 LCD temporal step response.

FPD temporal reconstruction is quite different from that in the CRT. In the CRT, the light emission decays rapidly after the addressing has taken place. A pixel in an active matrix display can emit light also at times during the frame period when the pixel is not addressed. This requires some form of memory in the pixel to store the driving signal. Temporal reconstruction in LCDs and PDPs is so different that they are treated separately: LCD in this section, and PDP in the next section.

As explained in Section 2.2.1, an LCD is addressed by establishing a connection between rows and columns in the matrix, which selects a row for addressing. All the pixels on the selected row can then be driven via the column electrodes to the voltage corresponding to their desired gray level. In reaction to a new voltage, the LC molecules in each pixel will re-orient. This changes the pixel transmission, which modulates the light from the backlight to create a certain intensity at the pixel. When the addressing moves to other rows, the pixels are disconnected from the columns again, to prevent the pixels from receiving driving signals intended for other rows in the matrix. Otherwise, the pixels would react to a different (average) voltage, and it would be impossible to reach sufficient contrast for high resolution panels [2]. High resolution LCDs use an active matrix, which stores the voltage on a pixel while it is disconnected from the driving signal, i.e. the voltage on the pixel follows a sample-and-hold characteristic [114]. The required memory effect is realized by including a capacitor and a switch, usually based on Thin Film Transistor (TFT) technology [17, 116], in each pixel. These are visible in Figure 2.11a,b as a dark notch at the corner of each pixel.
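A minimal numeric sketch of this behavior combines the sample-and-hold voltage described above with a first-order model of the slower LC re-orientation discussed next; the exponential form, frame time and time constant are assumptions for illustration only, not a model from the thesis:

```python
def lcd_response(drive_levels, frame_time=0.020, tau=0.012, steps=20):
    """Sample-and-hold drive (one held level per frame) followed by a
    first-order relaxation of the transmission toward the held level."""
    dt = frame_time / steps
    t, out = 0.0, 0.0
    times, trans = [], []
    for level in drive_levels:                # one held value per frame
        for _ in range(steps):
            out += (level - out) * dt / tau   # relax toward the held level
            t += dt
            times.append(t)
            trans.append(out)
    return times, trans

times, trans = lcd_response([0.0, 1.0, 1.0, 1.0])   # a black-to-white step
print(round(trans[2 * 20 - 1], 3), round(trans[-1], 3))
# ~0.82 after one driven frame, ~0.99 only after several frames
```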

The LC cell voltage represents only the first step in the LCD temporal display process. The total temporal response of the LCD pixel is also determined by the LC response and the backlight. The LC molecules in the pixel typically need several ms to re-orient after a change in voltage, so the cell transmission changes much slower than the cell voltage. The temporal response is completed by the backlight intensity, which is modulated by the LC cell transmission. Figure 2.12 shows the step response of a typical LCD pixel. In this section, we only outline LCD temporal response characteristics. In Chapter 5, we will further model the LCD temporal response, e.g. how to derive the LCD temporal aperture. In the example of Figure 2.12, the backlight intensity is constant, so the resulting temporal characteristic is mostly determined by the LC response: it takes more than a frame period for the light intensity to reach the maximum value of the step response. In Chapter 5 we will further discuss the consequences of the LCD temporal response characteristic.

2.2.6 PDP temporal reconstruction

Figure 2.13 The PDP temporal response has a complicated profile that depends on the gray level. a) The PDP response as a function of time and input image intensity (normalized gray level) for a PDP with 6 binary distributed subfields. b) Intensity scale for reference.

The active matrix memory effect is realized in PDPs with the charge that is deposited after the plasma discharge. The PDP charge and discharge procedure was introduced in Section 2.2.3. It explains the opto-electronic effect and the gray level transfer curve, and leads to the concept of subfields. However, the subfields cause the temporal addressing and reconstruction of PDPs to be intertwined in a complicated manner.

Temporal addressing in a PDP can be performed on a subfield basis. Each subfield represents a separate time instance in which an image can be displayed. A subfield image is reconstructed with a constant light emission with the duration of the sustain phase 8. The subfield image, however, only contains two gray levels: on or off.

8 We can neglect the influence of the individual sustain pulses because they are short compared to the phosphor decay time.

It is difficult to consider a subfield as a truly separate image, because the interpretation of its gray levels heavily depends on the other subfields in the field (see Equation 2.3). PDP temporal addressing can also be considered on a field basis, i.e. the more traditional view where the display is addressed with a new image signal once in each field. The subfields are then considered as part of the opto-electronic transfer, which allows the image to be displayed with a larger range of gray levels. In any case, the temporal response of a PDP has a complicated profile that changes with gray level. Figure 2.13 shows the temporal response for a PDP with 6 binary distributed subfields, as a function of time and input image intensity (normalized gray level). The complications surrounding temporal addressing and reconstruction in PDPs will be discussed in more detail in Chapter 5.

2.2.7 Interlace in FPDs

As mentioned in Section 2.1.2, interlace addressing in CRTs allows a doubling of the (perceived) number of lines. In matrix displays, however, the use of interlaced addressing is much less common. In this thesis, the topic of interlace, or spatio-temporal addressing, will not be discussed in detail. A short discussion on the relation between interlace and FPDs is nevertheless appropriate.

There are two main reasons for using a non-interlaced, i.e. progressive, addressing format in matrix displays. First, the spatial addressing format is fixed by means of the row and column electrodes. The (perceived) number of lines on the display cannot be increased by addressing only half of the lines in each field. This is a major difference with CRTs, where the spatial and temporal addressing format is determined electronically, by means of the deflection system. The spatial and temporal addressing format can be freely chosen in CRTs, without major changes to the display design. In particular, interlace does not increase the line frequency, which determines much of the complexity of the deflection system. It has been shown that in CRTs, an interlaced addressing format provides the best quality at a given line frequency [10]. Another reason for using progressive addressing in matrix displays is an increased visibility of interlace artifacts relative to the CRT. Opposing the advantage of increased resolution, interlace addressing also causes artifacts like line flicker, line crawl and resolution loss in moving image parts [47, 130]. Line crawl, for example, can be reduced in CRTs by increasing the spot size, i.e. changing the spatial reconstruction. In matrix displays the spatial reconstruction is fixed, and particularly the non-overlapping pixels aggravate line crawl. As an exception to this rule, matrix addressing is combined with interlace in ALiS type PDPs [70]. In these displays, a discharge can take place on either side of a row electrode, by applying appropriate voltages to either the upper or lower row electrode. This allows interlace addressing without increasing the number of physical rows on the panel. Although interlace may not seem a good addressing format for matrix displays, it is still advantageous for reducing transmission bandwidth [47], and in use in many transmission standards.

Figure 2.14 In a color matrix display (a) each pixel consists of several primary color subpixels, here in the vertical stripe arrangement (b) (repeated in color in Figure 3.13).

A consequence of the fact that FPDs are addressed progressively is that there is no difference between a field and a frame: each frame consists of only one field. Normally, frame refers to a full image (it originates from film technology), whereas field refers to a part of the image that is scanned/addressed as a continuous group, e.g. the subset of lines with interlace. Either of the two terms can be freely used to describe temporal addressing related topics in progressive FPDs, and this loose terminology is applied throughout the display industry. This can lead to some confusion, especially since subfields are used when a field is further divided into parts (as with PDP and temporal color synthesis). But as long as it is clear whether the display is addressed progressively, this should be no problem. We will therefore use frames and fields at our convenience.

2.2.8 Color reproduction

Color tri-stimulus theory (Section 2.1.4) requires that each point of the screen can generate a variable mixture of three different primary colors, which is achieved by the color synthesis method of the display. Both spatial and temporal color synthesis can be applied in matrix displays.

Just as the CRT, most matrix displays reproduce color by spatial color synthesis. Each pixel on the display contains three separate subpixels, each of which produces one of the primary colors. When the subpixels are close enough, color blending occurs, and each pixel can reproduce any color in the display gamut by driving its subpixels with the appropriate mixture of primary driving signals. Chapter 3 provides a more detailed discussion, as well as some exceptions to this simple view. Note that spatial color synthesis in matrix displays complicates the definition of what a pixel is, which is also discussed in Chapter 3. Figure 2.14 shows the most widely used arrangement of color subpixels: the vertical stripe arrangement. It consists of a rectangular array of more or less square pixels, each consisting of three horizontally adjacent, vertically elongated subpixels.

The method of generating three colors of light is different for LCDs and PDPs. LCDs incorporate a color filter in the optical stack of each subpixel [143]. This color filter only transmits the part of the light spectrum that corresponds to one of the three primary colors.

Figure 2.15 Color sequential display, or temporal color synthesis. Each field consists of three subfields that each produce a different primary color. The viewer cannot discern the individual fields, and blends the three fields into a full color image.

The lay-out of the (sub)pixel matrix, in combination with the lay-out of the color filter, determines the resulting color subpixel arrangement. PDPs use different types of phosphors, which emit light corresponding to the primary colors, in adjacent cells or channels. The combination of the cell or channel lay-out and the electrode lay-out (determining the discharge position) determines the resulting subpixel arrangement.

LCDs can also mix the three primary colors in time instead of space, i.e. temporal color synthesis, which is more commonly called color sequential display [153]. In this method, each pixel on the screen is able to produce the three primary colors at the same position, but not at the same time. Instead of dividing a pixel into spatial subpixels with the three primary colors, each field is divided into three subfields, each with one of the primary colors, as illustrated in Figure 2.15. This is achieved by having the backlight produce only one of the primary colors in each subfield, which has the advantage that no color filters are needed for direct view displays. In projection displays, color sequential operation is often used, because for these displays it gives the advantage of using only a single display panel instead of three. By driving each pixel to the correct transmission state at each color subfield, the whole gamut can be reproduced at this single pixel in each field. At video field rates, the viewer is not able to distinguish the individual primary color fields, and blends them into the intended color mixture, much like with spatial color blending. Nevertheless, as will be explained in Chapter 5, color sequential displays suffer from artifacts that can only be prevented using much higher field rates.
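A minimal sketch of temporal color synthesis, assuming ideal temporal integration by the eye (modeled as a plain sum over the three subfields; the function name is illustrative):

```python
import numpy as np

def color_sequential_subfields(rgb_frame):
    """Split one RGB frame into three sequential single-primary
    subfields; each subfield carries only one color channel."""
    subfields = []
    for c in range(3):                       # R, G, B subfields in turn
        field = np.zeros_like(rgb_frame)
        field[..., c] = rgb_frame[..., c]
        subfields.append(field)
    return subfields

frame = np.random.rand(4, 4, 3)
fields = color_sequential_subfields(frame)
perceived = sum(fields)                      # idealized temporal integration
print(np.allclose(perceived, frame))         # True: full color is recovered
```

The artifacts mentioned above arise precisely where this idealization breaks down, e.g. when eye movement shifts the three subfields relative to each other on the retina.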

2.3 Impact of different display properties

Table 2.1 summarizes the main differences between the CRT, LCD and PDP display principles that were discussed in the previous sections, categorized according to the display functions introduced earlier in this chapter [80].

Table 2.1 The main differences between CRT, LCD and PDP.
- Opto-electronic effect: CRT: electron beam + phosphor; LCD: liquid crystal + polarizers, modulating a backlight; PDP: gas discharge + phosphors.
- Opto-electronic transfer: CRT: gamma curve; LCD: S-curve; PDP: linear transfer.
- Spatial addressing: CRT: scanned electron beam; LCD: TFT active matrix; PDP: passive matrix.
- Spatial reconstruction: CRT: Gaussian spot; LCD: transparent part of matrix; PDP: vertical: discharge profile, horizontal: column channel.
- Temporal addressing: CRT: scanning beam with vertical flyback; LCD: repeated addressing of all rows; PDP: address-display separated subfields.
- Temporal reconstruction: CRT: < 1 ms pulse; LCD: sample & hold + liquid crystal response; PDP: subfield distribution.
- Color reproduction: CRT: shadowmask + phosphors; LCD: color filters; PDP: phosphor-coated cells.

Considering the implications of these differences, Section 2.3.1 first looks at the image quality that can be attributed to the display itself. We call this display image quality, which represents an upper limit to the total image quality that can be expected from the system, and which is fully determined by the display characteristics. However, the display is part of the whole video signal chain, which has two major consequences. First, image quality is not necessarily improved by improving the display itself. The final image quality also depends on the other parts in the system, notably the source and the signal processing for transmission, as is briefly discussed in Section 2.3.2. Second, the display properties may influence the functionality in the rest of the chain. Because display properties are diverging from source properties, signal processing is required to convert the source format to the display format. The underlying signal processing algorithms also have a big impact on image quality. A short overview of this topic is given in Section 2.4, leading up to the detailed investigation of the relation between display properties and video processing algorithms in the remainder of this thesis.

2.3.1 Image quality

The differences between the properties of CRT, LCD and PDP have an impact on image quality. The topic of image quality on displays has been widely investigated in the past, e.g. in [59, 136, 64, 54, 91, 168, 123].

In this thesis, we focus on spatial and temporal resolution, and how to improve them using the display-specific properties introduced in this chapter. To put these two aspects in the context of the other display properties, we discuss some general aspects of display image quality in this section, structured according to several attributes of image quality [34]:

Luminance, black level and contrast
The opto-electronic effect of a display produces light between a minimum level ("black") and a maximum level ("white"). The peak luminance of a display, i.e. the intensity of the white level, is an important aspect of its quality. Combined with the black level, it also determines the maximum contrast of the display, which is defined as the ratio of white level over black level.

Opto-electronic transfer and number of gray levels
The opto-electronic transfer determines the intensity of the gray levels between the black and white levels. In a display, it is related to image quality via the number of gray levels, i.e. to the quantization scale (number of bits) of the driving signal. With fewer gray levels, the distribution of their intensities from black to white becomes more important. The closer this distribution is to the perceptual optimum, which is close to a gamma curve (see Section 1.1.2), the less visible quantization artifacts will be (see the numeric sketch below). The opto-electronic transfer also influences the perceived brightness of the display, when the transfer is not accurately compensated earlier in the video chain. For example, with an overall γ-exponent (Equation 2.1) smaller than 1, the image appears brighter.

Geometry and uniformity
The image on the display should be geometrically undistorted. This means for example that the edges of the image form a perfect rectangle. Also, display characteristics such as brightness and color should vary as little as possible over the screen. This also holds in the temporal dimension: display characteristics should not vary over time, since image instability is detrimental to image quality.

Color gamut
The larger the color gamut, i.e. the more saturated the primary colors, the better the display quality. The gamut, however, does not need to be larger than the colors of the input image. This defines an important trade-off in display design, since saturated primaries are usually less bright.

Spatial resolution
The spatial addressing format and reconstruction determine the resolution of the display. The higher the resolution, the better the display quality. Increasing the number of lines and pixels on the display increases quality, whereas decreasing the size of the spatial reconstruction only increases resolution up to the limit where pixel structure artifacts cause a decrease of quality. Chapter 3 will deal with this in detail.
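As a numeric aside to the gray level distribution attribute above, the following sketch compares a linear and a gamma-shaped 6-bit gray scale using CIE L* as a stand-in for the perceptual response; the use of L* and the parameter values are assumptions for illustration:

```python
import numpy as np

def lightness(Y):
    """CIE L* lightness (a roughly perceptually uniform scale), Y in [0, 1]."""
    Y = np.asarray(Y, dtype=float)
    return np.where(Y > 0.008856, 116.0 * Y ** (1.0 / 3.0) - 16.0, 903.3 * Y)

for gamma in (1.0, 2.2):
    drive = np.arange(64) / 63.0             # 6-bit driving signal
    L = lightness(drive ** gamma)            # display transfer applied
    print(gamma, round(float(np.diff(L).max()), 1))  # largest perceptual step
```

With the linear transfer the largest step between adjacent gray levels is several times bigger than with the gamma curve, which is why the same bit depth can look contour-free on one display and banded on another.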

Temporal resolution
The relation between temporal addressing, temporal reconstruction and display image quality is quite complicated. Chapter 5 will deal with this in detail.

Other factors
There are many more display properties that determine display image quality, most of which we do not cover in this thesis because they do not directly relate to video processing. It is however appropriate to mention a few here. For example, many displays (such as LCDs) show a large variation of characteristics over the viewing angle. The brightness, contrast and color gamut can decrease substantially at large viewing angles, which has a large impact on the display quality. Also, the reflection of ambient light on the display can reduce the contrast of the display. Finally, an important aspect of display image quality is the display size. With everything else being equal, a larger display looks better, i.e. is more impressive, than a small display, which can also be regarded as an aspect of image quality.

Next, we will briefly explain the effects of the differences between CRTs and FPDs (Table 2.1) on these image quality attributes. Some of the biggest differences in display properties between CRTs and FPDs are found in the areas of spatial and temporal resolution, and these topics will be covered in detail in the remainder of this thesis (Chapters 3 and 5).

First of all, there is a large difference in the opto-electronic effects. CRTs and PDPs use emissive effects, while LCDs use a transmissive effect. In LCDs, the light-generating function is separated from the image intensity modulation function, which in principle allows light to be generated very efficiently, but this light must also be blocked if the local image intensity is low. Emissive displays only generate light where it is needed. Therefore, a major difference between LCDs on one side, and CRTs and PDPs on the other, is that the power consumption of the first is almost independent of image brightness, while the latter require more power for brighter images. In CRTs, for example, the maximum intensity decreases when the average intensity on the screen increases, because there is a power limitation in the electron gun. LCDs are limited by the maximum intensity of the backlight, independent of the image. Therefore, transmissive displays show low contrast in dark images, because the light cannot be blocked completely, which raises the black level. Emissive displays typically show low contrast in bright images, because the white level decreases. On the other hand, CRTs and PDPs can generate very high peak intensities if the average is low, which results in very high contrast in dark images. The opto-electronic transfer in CRTs is close to the perceptual optimum, which is not the case for LCDs and PDPs. Particularly in PDPs, the limited number of gray levels combined with a linear transfer gives visible noise and contouring artifacts in dark image parts.

There is also a main difference in the localization of the image intensity modulation effect. In CRTs, the intensity is modulated at one point, i.e. once per display, at the electron gun. LCDs and PDPs create the modulation in the display matrix, i.e. once per pixel, on the display screen. This has some major consequences for display image quality. LCDs and PDPs are for example more susceptible to image non-uniformities caused by pixel-to-pixel variations, or even defective pixels. Non-uniformities in a CRT are typically more gradual over the display. On the other hand, the picture on a CRT can vary geometrically, because the electron beam deflection can vary, for example with the average image intensity. The picture on an LCD or PDP is much more stable under varying conditions, because its geometry is fixed by the matrix. Moreover, the resolution of a CRT can decrease at high intensities or at the edges of the screen, due to electron spot size variations. The fixed pixels in FPDs do not change shape over the display or over time. Pixels in FPDs can be much smaller than typical spot sizes in CRTs, making FPDs perform much better in the area of spatial resolution. Moreover, the use of spatial color synthesis in FPDs means that spatial addressing and reconstruction are coupled to color reproduction. This has major consequences for spatial resolution, as will be further explained in Chapter 3.

The temporal responses of CRTs, LCDs and PDPs also show remarkable differences. CRTs create very short light emissions, whereas LCDs and PDPs distribute the light generation over much longer times. This has severe consequences for temporal resolution, e.g. resulting in FPDs having problems with rendering fast moving images, as is discussed in detail in Chapter 5.

2.3.2 System aspects

The major drive in the display industry has always been to push the envelope of display properties: higher resolutions, larger color gamuts, larger contrast and brightness, more gray levels, and of course also larger sizes. And as long as we can still distinguish the original from the displayed image, there seems to be room for improvement. However, it makes no sense to increase the capabilities of the display (far) beyond what can be recorded and/or transmitted. Therefore, each display property can only be related to overall image quality when the whole signal chain is taken into account. The display can be the limiting factor in one characteristic, whereas in another the transmission or camera can determine the image quality. The human viewer can also be seen as part of the chain. In a display system with ideal image quality, the human viewer would be the limiting factor. In this respect, brightness and contrast are different from the other properties, because the display is nearly always the limiting factor for brightness and contrast. This is because the actual black and white levels of the original scene are not transmitted, but only the gray levels between an essentially undefined maximum and minimum. This gives the display the freedom to put the black and white levels at as low and high levels as possible, which has a direct impact on displayed image quality. There is also a difference when we consider computer graphics: these are abstract representations of an image and, given enough compute resources, can be rendered at more or less any format.

Consequently, the display is always the limiting quality factor in such applications. It is therefore not surprising that the drive for ever increasing resolution has mainly come from computer applications, whereas (high-definition) TV applications have followed much more slowly.

2.4 Display- and video processing

Figure 2.16 Video and display processing chain. a) General function: convert from source to display format. b) Conversion chain for PAL to PDP: from 720x576, 50 Hz interlace, YUV color space, gamma pre-corrected, 8 bits per color, via de-interlacing, frame-rate conversion, spatial scaling, spatio-temporal (scanning) format conversion, color space conversion, electro-optical transfer and quantization, and color & tone conversion, to 854x480, 60 Hz progressive, EBU primaries, linear transfer, 6 bits per color.

A display system does not stand alone, because it needs a video signal from the preceding (TV) signal chain. Usually, the video signal has to be processed before it can be connected to the display. Typically, the video can be in a format that the display cannot handle 9, which implies that the video signal must undergo video format conversion processing. The next section presents a short overview, in the context of FPDs, of common format conversions, and Section 2.4.2 discusses some of the consequences of this processing for the resulting image quality.

2.4.1 Video format conversion

Whenever the display format and video input format differ, video format conversion processing in the display chain is needed, simply to allow the display to be connected to the input signal (see Figure 2.16). Therefore, the display processing chain grows in complexity depending on the ability of the display to adapt its format to the source. For CRTs, the display processing chain is relatively simple, not only because the (spatial and temporal) display format of a CRT can vary, but also because display format and video signal format were developed simultaneously. This is definitely not the case for the display format of FPDs, since their characteristics differ in almost all aspects from the CRT, and therefore also from the majority of video formats. The introduction of new digital video formats [119] has not decreased this need, but just introduced the need for more video processing in CRTs.

9 Video can also be in an encoded format after reception, e.g. in composite or digital compressed format [119], so it must be decoded first. This topic is outside the scope of this thesis.

Table 2.2 Format parameters and related video format conversion algorithms: from digital SD to an example PDP.
- Spatial addressing (pixels N_x x N_y): input 720x576, display 854x480; required algorithm: image scaling / spatial sample rate conversion.
- Temporal addressing (field rate): input 50 Hz, display 60 Hz; frame rate conversion.
- Interlace factor: input 2, display 1; de-interlacing.
- Opto-electronic transfer: input gamma corrected, display linear with 6 binary subfields; gamma correction and subfield generation.
- Color space: input YUV, display RGB; color space conversion.
- Number of gray levels: input 256, display 64 (8 vs. 6 bits per color, cf. Figure 2.16); quantization / halftoning.
- Aspect ratio: input 4:3, display 16:9; aspect ratio conversion.

The properties of the video format and the display format define the required conversion chain. Figure 2.16 and Table 2.2 show the example of a rec. 601 digital SD video [119] to example PDP 10 chain. The video format conversion chain represents a gradual transition from input video format to display format. Parts of the chain can be called display processing, but there is no definite boundary between video and display. The whole system is targeted at a single function: convert video signal to light. Table 2.2 also lists the most common format conversions. In this chapter, we will not discuss any of these algorithms in detail. For an extensive overview, see e.g. [46, 45, 47, 12, 67, 130, 156]. The next chapters will discuss video format conversion algorithms in their relation to static and dynamic resolution of FPDs.

2.4.2 Display system image quality

Displayed image quality largely depends on display properties such as contrast, color gamut, and resolution. We can divide the properties of a display into two categories: those that are fixed, and those that can be varied, within limits, according to the input signal. For all properties that are fixed, format conversion is required. These format conversions are equally important for the final image quality, and because FPDs have more fixed properties, this is only more the case for these displays. Figure 2.17 shows some typical artifacts that occur in low-quality format conversions. Whether or not the display is the limiting factor in some respect, image quality can be increased by applying higher quality format conversion algorithms. Display system image quality can therefore be improved in two main ways: by improving the display itself, or by improving the signal processing.

10 These parameters have been chosen as a simple example; this is not a modern PDP.
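Two stages from Table 2.2, gamma correction and quantization, can be sketched to show why their combination is critical for dark image parts; this is an illustrative toy pipeline under simple assumptions, not the thesis' implementation:

```python
import numpy as np

def gamma_to_linear(signal, gamma=2.2):
    """Undo the gamma pre-correction of the source for a linear display."""
    return np.clip(signal, 0.0, 1.0) ** gamma

def quantize(signal, bits=6):
    """Reduce bit depth by simple rounding; error diffusion would
    distribute the error and reduce visible contouring."""
    levels = 2 ** bits - 1
    return np.round(signal * levels) / levels

# The 64 darkest 8-bit input codes collapse onto only a few of the 64
# linear output levels, which is where contouring becomes visible.
src = np.arange(256) / 255.0
dark = quantize(gamma_to_linear(src[:64]), bits=6)
print(sorted({int(v) for v in np.round(dark * 63)}))   # e.g. [0, 1, 2, 3]
```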

Figure 2.17 Example images resulting from several format conversion algorithms, comparing a low-quality and a high-quality version of each conversion: gamma correction (no correction vs. corrected), quantization (rounding vs. error diffusion), image scaling (pixel dropping vs. polyphase filtering), de-interlacing (line repeat vs. motion compensated interpolation), and frame rate conversion (frame repeat vs. motion compensated interpolation).

Note that it is also possible to improve image quality by means of what we will here call image enhancement [46], i.e. by actually making the image look better than the original. For example, viewers generally prefer images with increased sharpness, contrast and color saturation over the original. It is difficult to totally separate format conversion and image enhancement. It is not always clear whether an improvement in image quality at the end of the display chain is due to an enhancement (when the difference between original and displayed image is increased), or due to improved format conversion (when the displayed image is actually closer to the original).

In fact, it is not even possible to judge an image enhancement algorithm independently from the display chain, since we can only look at an image when it has been converted to light. Image enhancement is outside the scope of this thesis, but by means of the analysis and video processing described in this thesis, we also hope to contribute some insights in this area.

Format conversion processing can also be divided into up-conversion and down-conversion, when the output or the input side has better characteristics, respectively. Conversion algorithms are always bounded by the quality of the source and of the display, but should not limit the performance of the total system. Therefore, display processing is more a matter of getting a performance from the display system that approximates the display image quality, i.e. that was already there in potential (otherwise we might call it image enhancement).

2.5 Video processing for flat panel displays

Figure 2.18 Video processing for flat panel displays: how to take into account specific display properties in the video processing chain?

Video format conversion is simply required to match the source format to the display format. However, there are a number of characteristics of FPDs that are normally not taken into account in traditional format conversion. This thesis focuses on the following question (illustrated in Figure 2.18): what can be improved in image quality when FPD-specific characteristics are taken into account in the video processing chain?

In order to answer this question, we will make use of the description of FPD properties from a signal processing perspective, as presented in this chapter. Looking at Table 2.2, we can see that format conversion deals with display properties like spatial addressing (number of pixels), temporal addressing (frame rate), interlace factor, opto-electronic transfer (gamma), color space (primary colors), bit depth and aspect ratio. However, there are many more display properties, for example the color subpixel arrangement and the spatial and temporal reconstruction, that are not used in traditional format conversion. In the following chapters we will further analyze the FPD properties, and investigate how these can be taken into account in the video processing chain, looking for algorithms that improve image quality. We will see that for some properties a relatively simple change to existing algorithms can do the job, but for other properties new algorithms are needed. We will also see that FPD properties can lead to image artifacts that require special algorithms to repair.

As a final note in this section, format conversions are not always separate. When doing a format conversion for one property, it can be advantageous to take into account the other display properties. One example, which is not covered in detail in this thesis (partly because it is not really FPD specific), is that halftoning can be improved when the display color primaries are taken into account [78].

2.6 Conclusions

This chapter has presented an overview of the most relevant display types, both old (the CRT) and new flat panel displays (LCD and PDP), starting from a signal processing perspective. The main function of a display has not changed, i.e. to turn a video signal into visible light, but FPDs use totally different opto-electronic principles, resulting in different characteristics than the CRT. These characteristics have been described according to the basic display properties of spatial and temporal addressing and reconstruction, opto-electronic transfer and color synthesis, as summarized in Table 2.1.

The different characteristics of FPDs have a number of consequences regarding image quality and system complexity. The image quality inherent to the display ("the display image quality") critically depends on these properties, for example regarding contrast, uniformity and viewing angle. Different characteristics of LCD and PDP, mainly in terms of handling varying addressing formats and compatibility with the opto-electronic transfer, require video format conversion processing in the display signal chain that was not needed with CRTs.

Display properties must be seen in the context of the whole (TV) signal chain, which also includes the human viewer. Display image quality only partly accounts for total image quality; the source and the signal processing form important parts too. The format conversions in the display signal processing chain convert the input video format into the display format, and as such there is no definite boundary between video signal processing and display signal processing. Furthermore, these video processing functions do not only have a large impact on the final image quality, but they also provide opportunities for display-specific video processing algorithms to further increase image quality. In the remainder of this thesis, we will further investigate the characteristics of FPDs, how they relate to image quality, and how image quality on these displays can be improved by taking these properties into account.

CHAPTER 3

Static display resolution

Understanding static resolution from display characteristics

As discussed in Chapter 2, the spatial addressing and reconstruction of FPDs are quite different from CRTs. CRTs have a continuous scan with a Gaussian spot profile, whereas FPDs use matrix addressing, which defines distinct pixels with more or less square profiles. This difference has two major consequences. First, spatial addressing in a CRT is controlled electronically, so it can be adapted to the input signal. The spatial addressing format of FPDs cannot be adapted to the input signal, since the pixel locations are fixed on the screen. In FPDs, the incoming video must be converted to the addressing format of the display. This requires video signal processing ("scaling"), which is the topic of Chapter 4. Second, the differences in spatial properties between FPDs and CRTs have an impact on display image quality, which is the topic of this chapter. Figure 3.1 illustrates the difference between constructing an image using scan lines with a Gaussian profile as in a CRT, or from sharply bounded pixels as in FPDs.

Figure 3.1 Simulation of slanted lines on a CRT (a) and FPD (b).

The spatial display properties determine the display's resolution, or more specifically, the spatial or static resolution. However, there is no unique definition of what resolution actually is. In Section 3.1 we first discuss different ways to describe resolution, such as number of pixels, pixel size, sharpness and perceived resolution. A model of the spatial aspects of the display signal chain is presented in Section 3.2, and applied to matrix displays in Section 3.3 to analyze FPD spatial properties in relation to static resolution. Spatial color synthesis, i.e. the shadowmask for CRTs and color subpixels for FPDs, is closely related to spatial addressing. Therefore, the display resolution is also influenced by the color synthesis method, and taking this into account results in improved resolution for FPDs. This topic forms the major part of this chapter, and it is covered starting from Section 3.4 with the analysis of the resolution of a color matrix display. Taking into account the color subpixels results in a different resolution, as analyzed in Section 3.5. Section 3.6 describes several alternative subpixel arrangements, and their resolution is analyzed in Section 3.7. Conclusions are finally drawn in Section 3.8.

3.1 Resolution

When considering the properties of a display in the spatial dimension, the concept of resolution stands central. The resolution of a display is an important parameter to describe the quality of images on a display, since it describes how well the display can resolve image details. Resolution can be characterized in multiple ways, by using one or more of the following characteristics [119, 64, 7]: the number of pixels on the display, the display size, the viewing distance, and the spatial pixel aperture, each leading to different types of resolution. In this thesis, we will use the term resolution to loosely describe several aspects related to the spatial display properties, and we will use one of the specific types below when appropriate.

Pixels
A display with more pixels can represent more details (Figure 3.2). Display resolution, in terms of number of details, is often measured by the number of line pairs that can be displayed. A line pair, or a cycle, is defined as a combination of a black and a white line.

Figure 3.2 Increasing resolution by increasing the number of pixels.

Figure 3.3 Increasing resolution by decreasing the pixel pitch at a constant number of pixels, i.e. by decreasing display size.

Two pixels are minimally required to reproduce one cycle 1. Therefore, the number of pixels 2 on the display is an important measure of resolution. A pixel count specification is common practice for computer monitors, where numerous standardized resolutions have appeared [69], e.g. VGA (640x480), XGA (1024x768), UXGA (1600x1200), etc. The introduction of matrix displays and digital transmission has also led to the use of pixel count as a display specification in TV applications, e.g. WXGA (1366x768), or several HD formats (1280x720, 1920x1080) [119].

Size
Next to the number of details, another measure of resolution is related to the size of the details. This type of resolution is determined by the pixel size, or more precisely, the pixel pitch or number of pixels per length (measured in pixels or dots per inch, i.e. "ppi" or "dpi"). It is not only determined by the number of pixels, but also by the display size. With decreased pitch at the same number of pixels, a display can reproduce smaller details, but not more (Figure 3.3).

Viewing distance
The viewing distance plays an important role when dealing with resolution as perceived by the viewer (Figure 3.4). A display can seem to have a high resolution from a distance, but only a low resolution when viewed from up close. The number of pixels, or the pixel pitch, are not the determining quantities here, but the number of pixels per degree of visual angle. For a viewer at a viewing distance d_v, a distance d_s on the screen subtends a visual angle α_vis:

\alpha_{vis} = 2 \arctan\left(\frac{d_s}{2 d_v}\right)   (3.1)

The visual angle α_p of one pixel with pitch p of this display gives us a relation between the number of pixels N, the display width W (or height H), and the viewing distance d_v:

\tan(\alpha_p) = \frac{p}{d_v} = \frac{W}{N d_v}   (3.2)

1 This corresponds to the highest spatial frequency that the display can reproduce, a.k.a. the Nyquist frequency (Section 3.3.1).
2 Measured in horizontal and/or in vertical direction, or as the total number of (Mega)pixels.
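A small sketch of Equation 3.2; the screen width and viewing distance below are arbitrary example values, not from the thesis:

```python
import math

def pixels_per_degree(n_pixels, width, distance):
    """Pixels per degree of visual angle via Equation 3.2:
    tan(alpha_p) = W / (N * d_v); width and distance in the same unit."""
    alpha_p = math.degrees(math.atan(width / (n_pixels * distance)))
    return 1.0 / alpha_p

# A 0.88 m wide screen viewed from 3 m, at SD and HD pixel counts.
for n in (720, 1920):
    print(n, round(pixels_per_degree(n, 0.88, 3.0), 1))
```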

Figure 3.4 Increasing resolution by increasing the number of pixels per degree of visual angle, i.e. increasing the viewing distance at a constant number of pixels and pitch.

The viewing distance relative to the pixel pitch of the display can be used as a visual resolution parameter R that describes this third type of resolution:

R = \frac{N d_v}{W}   (3.3)

Two displays with different size and number of pixels can visually have equal resolution if R is the same, i.e. by viewing them from a corresponding distance (see also Section 3.3.4). Since this last type of resolution is not strictly related to properties of the display, it is less useful as a display specification.

Modulation transfer function
Another way to characterize resolution is via the well known Modulation Transfer Function (MTF) [128, 28, 64, 130]. The MTF describes how well a display (or an optical system in general) can reproduce signals of varying spatial frequency 3, i.e. of varying detail. The MTF is measured by determining the modulation depth, i.e. the minimum and maximum intensity I_min and I_max, of the original and displayed version of a periodic (sinusoid or square wave) pattern (Figure 3.5):

MTF = \frac{(I_{max} - I_{min})_{display}}{(I_{max} - I_{min})_{orig}}   (3.4)

with, for example, a horizontal sinusoidal pattern of frequency f:

I_{orig} = I_{min} + (I_{max} - I_{min})\left(\frac{1}{2} + \frac{1}{2}\sin(2\pi x f)\right)   (3.5)

The MTF corresponds to the spatial frequency transfer of a display system, i.e. how much each spatial frequency is attenuated by the display. The resolution, as measured in number of cycles (per mm, picture width, etc.), is usually determined by the frequency where the MTF drops below a certain limit, for example 10%, where the frequency is considered to be no longer resolved. Therefore, resolution is represented by a single value 4, but the MTF is a function that can convey more information about the resolution-related quality of the display.

3 In Chapter 5, we discuss the MTF in the temporal dimension.
4 Actually, a resolution can be attributed to each dimension, so spatial resolution has two values, or is even a function of angle, as discussed later in this chapter.
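Equation 3.4 can be applied directly in a small simulation, here with a box-shaped blur as a stand-in for a display aperture; the kernel width and test frequencies are arbitrary illustration values:

```python
import numpy as np

def mtf(displayed, original):
    """Modulation transfer per Equation 3.4, from the min/max of a
    periodic test pattern before and after display."""
    return (displayed.max() - displayed.min()) / (original.max() - original.min())

x = np.linspace(0.0, 1.0, 1000, endpoint=False)
kernel = np.ones(9) / 9.0                            # box blur, 9 samples wide
for cycles in (5, 25, 125):
    orig = 0.5 + 0.5 * np.sin(2 * np.pi * cycles * x)   # Equation 3.5 form
    shown = np.convolve(orig, kernel, mode="same")[20:-20]  # trim edge effects
    print(cycles, round(mtf(shown, orig), 2))        # modulation falls with frequency
```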

Figure 3.5 The MTF measures the reduction of modulation by the display, as a function of frequency.

While the resolution only indicates how many lines a display can resolve, the MTF also shows how well it can resolve them. This corresponds to another aspect related to display resolution: image sharpness. Increasing the resolution of an image, according to any of the previous definitions, will make the image appear sharper, because details will be smaller and edges steeper. However, without increasing the resolution, images will also appear sharper when the MTF has a higher amplitude at high frequencies. This is illustrated in Figure 3.6, which shows two different MTFs, having equal MTF at 10%, and the corresponding images. The MTF is an objective measure, but the impression of sharpness is very subjective, and as such difficult to define accurately. Sharpness is perhaps better explained by its opposite, i.e. blurring, which corresponds to a low MTF at high frequencies. This also shows that sharpness and resolution are closely related, and we use the concept of perceived resolution to combine them.

Perceived resolution is the resolution as perceived by the viewer. It is related to sharpness as described above, but it also depends on what the viewer can actually see. Therefore, it also depends on the viewing distance and the spatial frequency sensitivity of the human visual system. Perceived resolution is therefore more a concept than an accurate measure, and it ultimately requires subjective tests to measure.

Figure 3.6 MTFs with different sharpness and corresponding images. a) Two MTFs that are equal at the 10% point. b,c) Images corresponding to the dashed and drawn curves, respectively.

Finally, the spatial reconstruction profile of a display is closely related to the MTF. In Section 2.2.4, and earlier in Figure 2.3, it was already indicated that a wide spatial reconstruction profile causes loss of details. Especially when adjacent pixels start to overlap, fewer details can be reproduced, or in the MTF context, details can be reproduced less well. Later in this chapter, we show that this is usually not the case for matrix displays. But in general, the spatial aperture should also be taken into account when determining resolution.

3.2 FPD spatial display signal chain

Clearly, a matrix display can only approximate the light intensities corresponding to the original (space-continuous) image, since it generates a space-discrete ("pixelated") signal. The display process is defined as the process that converts an image signal to a displayed image. To analyze images on a matrix display, in particular regarding resolution, we use a model of the display signal chain [81, 79], as shown in Figure 3.7. The important parts of the model are sampling, addressing and reconstruction.

Figure 3.7 System model of the display signal chain in the spatial dimension.

The display receives a sampled image signal (I_s), which is a discrete representation of the original, continuous image (I_c). Each sample in the input signal corresponds to a position \vec{x} = [x, y]^T in the image, as described by the scanning process (Section A.1). We assume that the number of image samples matches the number of pixels on the display in both dimensions, i.e. that the signal has been scaled to the display resolution. Chapter 4 deals with the combination of scaling and display resolution in more detail.

The addressing directs each sample to a position on the screen. These positions should correspond to those specified by the sampling process^5. Finally, the reconstruction process displays each sample. In other words, the light emitting/transmitting area (aperture) of the pixel at that position converts the sample value to the corresponding intensity of visible light I_d.

Footnote 5: In general, the sizes of the original image and the display have no relation, so all positions are usually expressed relative to the screen/image size (x \to x/W, y \to y/H).

In the following sections, this model will be used to calculate how an original input image is processed by the display, to result in the displayed image. This image is then analyzed in the frequency domain, to reveal the resolution and some more characteristics of matrix displays. We assume a monochrome display in these sections for simplicity, but the analysis is extended to color displays in Section 3.4.

Sampling

We start with an original, continuous image I_c(\vec{x}). This image is sampled in horizontal and vertical direction before it is received by the display. Although it is not strictly part of the digital display process, we do include the sampling step in the analysis, to be able to indicate the difference between the original image and the displayed image. The sampling typically takes place at the camera, but intermediate conversions during transmission, storage, etc., can also re-sample the signal. We simplify this by combining all (re-)sampling into a single sampling process.

The sampled signal is a discrete set (a matrix) of intensity values I_s(\vec{n}), with \vec{n} = [n_x, n_y], \vec{n} \in N^2, n_x < N_x and n_y < N_y. Each value represents the image intensity at one position in the continuous signal:

    I_s(n_x, n_y) = I_c(\vec{x})  if  \vec{x} = [n_x p_x, n_y p_y]    (3.6)

where p_x and p_y are the horizontal and vertical pixel pitches, respectively. This discrete set of points can also be represented by intensity as a function of space, I_s(\vec{x}), which is only defined at the sample points:

    I_s(n_x p_x, n_y p_y) = I_c(n_x p_x, n_y p_y)    (3.7)

Formally, this is described by multiplying the continuous signal with the sampling grid function, S(\vec{x}), which is a 2D series of Dirac \delta-impulses at intervals equal to the sample spacing p_x and p_y [130]:

    I_s(\vec{x}) = S(\vec{x}) \, I_c(\vec{x})    (3.8)

where the sampling grid function is defined as

    S(\vec{x}) = S(x, y) = \sum_{n_x, n_y} \delta(x - n_x p_x, \; y - n_y p_y)    (3.9)

In Equation 3.8, we have directly sampled the original image I_c, without further modeling the effects of the camera (Figure 3.7). This is equivalent to taking the original image to be the image^6 as it is formed on the imaging device in the camera, just before it is sampled. Therefore, the display actually reproduces this image, not the original scene.

Footnote 6: In fact, this is the first instance in the chain where there is an actual image, i.e. a projection of the real scene.

In the model, the sampled image is a function of a continuous variable, I_s(\vec{x}). This is related to the samples as a discrete set, I_s(\vec{n}):

    I_s(\vec{x}) = \sum_{n_x, n_y} \delta(x - n_x p_x) \, \delta(y - n_y p_y) \, I_s(\vec{n})    (3.10)
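As a concrete illustration of Equation 3.6, the sketch below evaluates a known continuous image function on the pixel grid. The test image, pixel counts and pitches are illustrative assumptions.

import numpy as np

def I_c(x, y):
    # an arbitrary continuous test image (a 2-D sinusoid)
    return 0.5 + 0.5 * np.sin(2 * np.pi * (3 * x + 2 * y))

Nx, Ny = 8, 6                          # number of pixels horizontally/vertically
px, py = 1.0 / Nx, 1.0 / Ny            # pixel pitches, image normalized to unit size

nx, ny = np.meshgrid(np.arange(Nx), np.arange(Ny), indexing='ij')
I_s = I_c(nx * px, ny * py)            # I_s(n_x, n_y) = I_c(n_x p_x, n_y p_y), Eq. 3.6
print(I_s.shape)                       # an (N_x, N_y) matrix of intensity values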

Addressing

The addressing process (Figure 3.8) directs each sample to a pixel on the display. For the discrete set of points I_s(\vec{n}), this requires interpretation of the signal to determine the sampled image coordinates. Although these coordinates are apparent from I_s(\vec{x}), they are implicit in the actual sampled signal I_s(\vec{n}), so knowledge of the scanning standard^7 is required during the addressing process. The process represents a transformation from sampled image coordinates \vec{x}_s to display coordinates \vec{x}_d. In general:

    \vec{x}_d = D_n(\vec{n}) = D(\vec{x}_s)    (3.11)

i.e. the addressed position is a function of the sampled position, or of the sample number. Ideally, the addressing should put the image on screen without geometric distortions, so that the position (normalized to image size) of a sample in the original image equals the position on the screen:

    \vec{x}_d = \vec{x}_s  \Leftrightarrow  D(\vec{x}) = \vec{x}, \quad D_n(\vec{n}) = [n_x p_x, n_y p_y]    (3.12)

so, in this case, addressing has actually no influence on the image signal:

    I_a(\vec{x}) = I_s(\vec{x})    (3.13)

where I_a(\vec{x}) is the signal after addressing, at position \vec{x} on the screen. When the addressing does not correspond with sampling as in this example, the image signal will be affected, which is discussed in Sections 3.4 to 3.5.

Footnote 7: The scanning format, as defined by the scanning curve \vec{x}(t), is included in the signal by means of synchronization signals at the beginning of each line and each field, and the convention that scanning is left to right and top to bottom.

Figure 3.8 Addressing and reconstruction on a matrix display.

Reconstruction

The reconstruction transforms I_a(\vec{x}) into the light intensities of the displayed image I_d(\vec{x}). This can be represented by a convolution of the addressed signal and an aperture function, A(\vec{x}):

    I_d(\vec{x}) = A(\vec{x}) * I_a(\vec{x})    (3.14)

The aperture is the intensity profile of a single pixel, which is also called the reconstruction profile, spatial impulse response, or point spread function. In the following, we will use the term aperture. For a matrix display, the aperture is approximately a 2-D box function (see Figure 2.11). As an example, we take the width and height of the aperture equal to the pixel pitch, p_x and p_y, respectively:

    A(\vec{x}) = A(x, y) = 1  if  |x| < p_x/2  and  |y| < p_y/2,  and  0 otherwise    (3.15)

This represents a display with pixels that completely fill the screen, i.e. the fill-factor is 100%. This is normally not possible, due to the matrix electrodes or the color synthesis method. We will nevertheless use this aperture for simplicity in this monochrome example, and discuss other apertures in later sections.

Since the aperture function corresponds to the transformation from video signal to light, it also includes the opto-electronic transfer characteristic of the display^8. In this analysis, we assume that a compensation for this transfer is included in the chain before the aperture function. This assures that, apart from the effects of sampling, the overall transfer from original intensity to displayed intensity is linear (see Section 1.1.2, and Figure 3.7).

Footnote 8: The reconstruction process is partly located before, during and after the O-E transfer: before by means of the electrode shape, during by means of the spatial extent of the physical light generation or modulation process, and after because of optical scattering.

3.3 The resolution of a matrix display

The spatial display process model gives us the intensity, I_d(\vec{x}), of an image that is displayed on a matrix display. By combining Equations 3.8, 3.13 and 3.14, the displayed image becomes

    I_d(\vec{x}) = A(\vec{x}) * [S(\vec{x}) \, I_c(\vec{x})]    (3.16)

We will next analyze this image, with respect to resolution and other characteristics, by calculating the frequency spectrum.
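Equation 3.16 can be simulated directly on a fine grid: the samples become impulses on the pixel grid, which are then convolved with the box aperture. A minimal 1-D (horizontal) sketch, with an assumed oversampling factor and random test data, is shown below.

import numpy as np

N, os = 16, 32                         # pixels per line, fine-grid samples per pitch
samples = np.random.rand(N)            # one line of the sampled image I_s
impulses = np.zeros(N * os)
impulses[::os] = samples               # S(x) I_c(x): impulses on the pixel grid

aperture = np.ones(os)                 # box of width p_x, 100% fill-factor (Eq. 3.15)
I_d = np.convolve(impulses, aperture)[:N * os]   # I_d = A * I_a (Eq. 3.14/3.16)

The result I_d is piecewise constant: each sample value is held over one pixel pitch, which is exactly the "pixelated" reconstruction of a full fill-factor matrix display.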

Frequency spectrum of images on a matrix display

To obtain the frequency spectrum of the displayed image, we apply the Fourier transform (F) to Equation 3.16:

    I_d^f(\vec{f}) = F(I_d)(\vec{f})    (3.17)
                  = [I_c^f(\vec{f}) * S^f(\vec{f})] \, A^f(\vec{f})    (3.18)

where \vec{f} = [f_x, f_y] is the 2D spatial frequency, I_c^f(\vec{f}) is the Fourier transform of the continuous image, and

    S^f(\vec{f}) = F(S)(\vec{f}) = \sum_{k,l \in Z} \delta(f_x - k f_{sx}, \; f_y - l f_{sy})    (3.19)

is the Fourier transform of the sampling lattice, which is also called the reciprocal lattice (its pitch is the reciprocal of the sampling pitch). Here, f_{sx} = 1/p_x and f_{sy} = 1/p_y are the horizontal and vertical sampling frequencies, respectively. Finally,

    A^f(\vec{f}) = F\{A(\vec{x})\}    (3.20)
                = \frac{\sin(\pi f_x p_x)}{\pi f_x p_x} \cdot \frac{\sin(\pi f_y p_y)}{\pi f_y p_y}    (3.21)
                = sinc(\pi f_x p_x) \, sinc(\pi f_y p_y)    (3.22)

is the Fourier transform of the pixel aperture.

Figure 3.9 Frequency spectrum of images on a matrix display: a) in two dimensions, b) horizontal only. The thin lines in (a) indicate the sampling frequencies f_sx and f_sy. In b), the original image spectrum is plotted as a reference.

Figure 3.9 shows the resulting displayed image spectrum according to Equation 3.17, in 2D and in the horizontal dimension. In order to show the amplitude of the baseband and the repeats, the spectrum (I_c^f) of the original image has been assumed flat and unity inside the baseband, and zero for all frequencies above (and just below) the Nyquist frequency (f_N = \tfrac{1}{2} f_s).

Figure 3.9 shows that a matrix display is not able to perfectly reconstruct the original image. This would be the case if two conditions were satisfied: the frequencies in the baseband (|f| < \tfrac{1}{2} f_s) should be passed without attenuation, and all repeat spectra (|f| > \tfrac{1}{2} f_s) should be eliminated. Clearly this is not the case. Firstly, the baseband is attenuated, which corresponds to loss of high frequency contrast in the image, i.e. a lower MTF, resulting in less sharp images. Figure 3.9 shows that the reduction of MTF is caused by the sinc-shape of the pixel aperture in the frequency domain. Furthermore, repeats are present in the displayed image spectrum at all frequencies except the sampling frequency. These repeat spectra correspond to pixel structure, and when present, they allow a distinction between continuous images and "discrete" ones.
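The horizontal cut of Figure 3.9 can be computed in a few lines. The sketch below reproduces Equation 3.18 for the flat test baseband; the frequency range and number of repeats included are illustrative choices.

import numpy as np

p_x = 1.0                                    # pixel pitch; f_s = 1/p_x
f = np.linspace(-2.5, 2.5, 2001)             # horizontal frequency in units of f_s

A_f = np.sinc(f * p_x)                       # np.sinc(u) = sin(pi u)/(pi u): Eq. 3.21

# flat, unity baseband |f| < f_s/2, repeated at every multiple of f_s (Eq. 3.19)
repeats = sum(np.where(np.abs(f - k) < 0.5, 1.0, 0.0) for k in range(-3, 4))
I_d_f = A_f * repeats                        # displayed spectrum, Eq. 3.18

print(round(float(np.sinc(0.5)), 2))         # baseband attenuation at Nyquist: 0.64

Note that the box aperture attenuates the baseband by at most a factor of about 0.64 (at Nyquist), a number that returns later in this chapter when the aperture influence is compared to the sampling influence.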

We can roughly distinguish two types of pixel structure, which we will call DC and AC pixel structure. DC pixel structure appears as a raster over the image (the "screen door effect"), and it is most visible in flat image areas, which only contain frequencies close to DC (f \approx 0). DC pixel structure does not appear in Figure 3.9, because the pixel fill-factor is 100%. Therefore, for this particular example, there is no modulation for an input at DC, and flat areas are perfectly reconstructed. AC pixel structure appears on image details, i.e. high ("AC") input frequencies. Figure 3.9 shows that the repeat spectra are non-zero outside the sampling frequency, so any input signal with 0 < f < f_N will generate non-zero repeat components at k f_s \pm f. These components appear as jagged edges, mosaicking, or "pixelation" of the image (see for example Figure 3.1 or 3.10).

In fact, perfect reconstruction is not possible for any practical display, because the required ideal post-filter corresponds to a pixel aperture with a sinc-shaped intensity profile. Such an aperture would not only have to be of infinite size, but would also require negative light intensities. With physically realizable apertures, the transition from pass- to stop-band is finite, so the amplitude of the repeat spectra is always in trade-off with the suppression of the baseband. Varying the aperture, e.g. the size of the pixels, can therefore trade image blurring for pixel structure. This practical limitation is reflected in the so-called Kell factor.

The Kell factor for matrix displays

The Kell factor [58] is based on experiments by Ray Kell in 1934. He measured, in a subjective experiment, the number of lines (cycles) that is effectively resolved in a CRT display system, to balance the horizontal bandwidth with the number of scanned lines. The Kell factor is the ratio of resolvable cycles to the theoretical maximum of half the number of display lines, i.e. it represents the fraction of the baseband that is effectively used. Typical Kell factors reported for CRTs are around 0.7. The Kell factor is closely related to what we call perceived resolution. However, there is no unique definition of the Kell factor, and we will also use it as a qualitative rather than quantitative measure (it is therefore also called the Kell effect [119]). Moreover, the Kell factor can relate to a complete system, e.g. including the camera, or to a sub-system only, e.g. the display. We will use it to describe the perceived resolution of displays.

Figure 3.10 illustrates the signal-theoretic origin of the Kell factor. The figure shows a zoneplate image that is reconstructed with square pixels. A zoneplate is an image that contains a 2D frequency sweep pattern, so each position in the image relates to a frequency. It is very useful for visualizing display effects related to sampling and reconstruction. Figure 3.10 shows a beat pattern, i.e. a low-frequent distortion, around the Nyquist frequency (the pixel-alternating black and white lines). At some point, this beat pattern will dominate the perception, and reduce the practical use of passing this part of the baseband spectrum to the display. This is the origin of the Kell factor.
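A zoneplate is simple to generate. The sketch below uses one common quadratic-phase formulation (the thesis does not specify its exact zoneplate equation, so the formula and the edge frequency are assumptions): the instantaneous frequency grows linearly from DC at the center toward the edges, so each position probes one 2-D frequency.

import numpy as np

N = 512                                   # zoneplate size in samples
u = np.linspace(-1.0, 1.0, N)
X, Y = np.meshgrid(u, u)
fmax = N / 4                              # edge frequency (illustrative choice)
zone = 0.5 + 0.5 * np.cos(np.pi * fmax * (X ** 2 + Y ** 2))   # quadratic phase

Subsampling this array to, say, 80 by 80 pixels and holding each sample over a square block reproduces the right-hand side of Figure 3.10, including the beat pattern near Nyquist.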

Figure 3.10 The zoneplate image. On the left the continuous image, on the right the same image sampled and reconstructed with (80 by 80) square pixels.

The beat frequency is caused by the imperfect reconstruction of the display. Baseband frequency components (f < \tfrac{1}{2} f_s) still have a repeat version, which is mirrored in the Nyquist frequency: f' = f_s - f. These two components^9 interfere, resulting in a low frequency beat pattern:

    \cos(a) + \cos(b) = 2 \cos(\tfrac{1}{2}(a + b)) \cos(\tfrac{1}{2}(a - b))
    \cos(2\pi x (f_s - f)) + \cos(2\pi x f) = 2 \cos(\pi x f_s) \cos(2\pi x (\tfrac{1}{2} f_s - f))    (3.23)

Footnote 9: The two components are assumed to have equal amplitude for simplicity, which is a good approximation around Nyquist.

The frequency of the beat pattern (\tfrac{1}{2} f_s - f) decreases when the baseband component (and its mirror) approach the Nyquist frequency. The beat pattern reflects that the highest frequencies cannot be represented with equal modulation in all phases. This is not a form of aliasing, but is due to the imperfect reconstruction, i.e. the post-filtering that is required to reconstruct the continuous signal from the discrete signal is non-ideal.

We can see that the imperfect reconstruction also accounts for a Kell factor in matrix displays. Although the Nyquist frequency can even be reproduced without loss of modulation, the baseband is not free of distortions. Therefore the Kell factor is smaller than 1, i.e. the effective resolution is lower than the number of pixels. However, to further analyze and improve FPD performance related to the Kell factor, or to sampling and reconstruction in general, we need to include more display properties, as will be done in Sections 3.4 to 3.5.

In CRTs, the Kell factor also includes bandwidth limitations in the video chain before the display. When this bandwidth is chosen appropriately, the effect of the imperfect post-filter is reduced, i.e. bandwidth is not wasted where the display is the limiting factor. Since we aim at maximizing the display resolution, we will assume in the following that the bandwidth in the preceding video chain is not the limiting factor for overall resolution.
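The trigonometric identity of Equation 3.23 above can be verified numerically; the parameter values in this sketch are arbitrary illustrations.

import numpy as np

fs = 1.0                                   # sampling frequency (normalized)
f = 0.45 * fs                              # a baseband component close to Nyquist
x = np.linspace(0.0, 40.0, 8001)

two_tones = np.cos(2 * np.pi * (fs - f) * x) + np.cos(2 * np.pi * f * x)
beat      = 2 * np.cos(np.pi * fs * x) * np.cos(2 * np.pi * (fs / 2 - f) * x)

print(np.allclose(two_tones, beat))        # True: the identity of Equation 3.23

With f = 0.45 f_s, the envelope term has frequency 0.05 f_s: a slow beat of the kind visible near Nyquist in Figure 3.10.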

Figure 3.11 2-D frequency spectrum of images (from Figure 3.1) on a matrix display (a) and on a CRT, with optimal spot size (b), and with practical TV spot size (c). The box indicates the baseband area.

Spatial quality of matrix displays vs. CRTs

Let us compare the spatial properties of the matrix display in the example above to a typical CRT. The spectrum of images on a CRT display (see Section 2.1.3), and particularly in the vertical dimension, is also described by Equation 3.17, where the aperture is the Gaussian spot profile (Equation 2.2). Figure 3.11 shows a comparison between the matrix display of Figure 3.9, a CRT with a small spot profile (\sigma \approx 0.4 p_y) as is common in CRT desktop monitors^10, and a CRT display with a typical spot dimension for TVs (\sigma \approx 0.7 p_y).

With the larger spot, pixel structure is effectively eliminated, because repeat spectra are suppressed, i.e. there is no undesired signal close to the baseband. However, this happens at the cost of severe loss of image detail, because the baseband signal is also suppressed. The smaller spot shows only moderate AC pixel structure, and retains more details than the large spot. The matrix display is able to reproduce even higher contrast details, at the cost of more AC pixel structure. AC pixel structure, however, is almost invisible at normal viewing distances (see also Sections 3.1 and 3.3.4). What remains is the performance in the baseband, where the matrix display shows a higher MTF than CRTs^11.

Of course, if the spot size of the CRT were decreased even further, sharpness (MTF) would increase, but so would the pixel structure. Moreover, there are many practical reasons why CRTs cannot decrease the spot size to the level of FPD apertures (see also Section 3.5.1). The perfect flat areas combined with the high contrast details of a square pixel matrix display cannot be obtained with the trade-off provided by a Gaussian aperture. However, matrix displays in practice do not have a 100% fill factor [7], and the spatial color synthesis method further prevents the display from producing perfect flat areas. We will show in the next sections that this actually gives FPDs a further advantage over the CRT in the spatial dimension.

Footnote 10: A Gaussian spot with \sigma \approx 0.4 p_y can be regarded as being close to the ideal trade-off for a Gaussian profile [51], with maximum image sharpness and minimum pixel structure.

Footnote 11: Poynton [119] states that a Gaussian reconstruction gives the best quality, but his example is a zoomed-in image, i.e. the viewing distance is much smaller than what is normal for displays.

Figure 3.12 Matrix display spectrum and HVS response [92] at optimal viewing distance.

Perception aspects

At the end of the display signal chain, the human visual system (HVS) can also be regarded as a filter that processes the image signal [161, 99, 6]. In the spatial domain, the HVS acts as a low-pass filter. The cut-off frequency of this filter corresponds to the detail visibility limit, i.e. the highest spatial frequency that is just visible. The range of visible frequencies depends primarily on the relationship between viewing distance and pixel pitch (Equation 3.2), but also for example on image contrast and brightness [6]. The relation between the display spectrum and its perception therefore also depends on the viewing distance and the pixel pitch [29].

Figure 3.12 illustrates how the sampling frequency, f_s, relates to the optimal viewing distance. This frequency, which corresponds to the frequency of the pixel structure itself, will ideally be (slightly) higher than the frequency at the threshold of visibility. On the other hand, the Nyquist frequency (\tfrac{1}{2} f_s), i.e. the highest frequency that still carries valid, non-aliased, image information, should be (well) within the range of visibility. Figure 3.12 illustrates the frequency response of the human visual system [5, 92] at a viewing distance according to this criterion.

The resolution limit of the human visual system is approximately 60 cycles per visual degree [161, 124, 6], i.e. the eye cannot see a cycle smaller than 1 arc minute. Invisible pixel structure therefore corresponds to a pixel pitch that is smaller than 1 arc minute. Using Equations 3.2 and 3.3, this results in R \approx 3400 pixels [119]. This corresponds, for example, to the well-known rule of thumb for PAL displays (N = 576 lines) that the viewing distance should be 6 times the screen height. Similarly, for full HD resolution (N = 1080 lines) we find a distance of 3H. In this viewing condition, the display has neither too little nor too much resolution. In other words, the pixel structure is invisible, and all valid image information is visible.

The relationship between displayed images and the human visual system characteristics is well studied [161], also with respect to the perceived quality of color matrix displays [84, 137]. A detailed discussion concerning HVS characteristics is considered to be outside the scope of this thesis. We will focus on signal processing aspects, using only the very basic properties of the HVS.
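The 6H and 3H rules of thumb follow directly from the 1 arc-minute limit. The back-of-envelope sketch below (a simplification that ignores screen aspect and curvature details) computes the viewing distance at which the line pitch subtends exactly 1 arc minute.

import math

for lines in (576, 1080):                  # PAL and full-HD line counts
    # the line pitch p = H / lines should subtend 1' = 1/60 degree at distance d,
    # so d / H = 1 / (lines * tan(1/60 degree))
    d_over_H = 1.0 / (lines * math.tan(math.radians(1.0 / 60.0)))
    print(f"{lines:4d} lines: d = {d_over_H:.1f} H")

This prints approximately 6.0 H for PAL and 3.2 H for full HD, consistent with the rules of thumb and with R \approx 3400.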

Figure 3.13 In a color matrix display (a) each pixel consists of several primary color subpixels, here in the vertical stripe arrangement (b) (copy of Figure 2.14).

Figure 3.14 Addressing and reconstruction using color subpixels.

3.4 The resolution of a color matrix display

In the previous section, the spatial display process was discussed for an exemplary monochrome matrix display with square pixels. In this section, we extend this to color matrix displays. As described in Section 2.2.8, most color matrix displays (CMDs) apply spatial color synthesis to reproduce color. This means that a pixel is not what its name suggests: the most elementary part of the picture. Each pixel on the screen actually consists of three primary color subpixels (red, green and blue), so the subpixel really is the atomic element (Figure 3.13). Because of the subpixels, the spatial addressing format of CMDs^12 is different from that of a monochrome display. Since spatial addressing and resolution are closely related, we can expect that the color subpixel structure has an effect on the display resolution.

Footnote 12: From here on, we will denote spatial color selection CMDs, or subpixel CMDs, simply as CMDs.

Color matrix display addressing

The addressing process for a CMD is illustrated in Figure 3.14 for the most common subpixel arrangement (SPA), which is the vertical stripe (VS) SPA.

Figure 3.15 The question arises whether the color subpixels can give extra resolution when the grouping into full color pixels is released, as illustrated by this simple example. Photo taken from a vertical stripe LCD: the left line shifts one pixel right per 3 pixels down, the right line shifts one subpixel right per pixel down.

During the addressing process, the drive signals for each primary color in a full color pixel are directed to color subpixels that are located at different positions. The colors of the subpixels in each full color pixel blend into the full, three-dimensional color if the number of subpixels per (visual) degree is large enough. At the viewing distance used in Section 3.3.4, this condition is easily satisfied.

The spatial offset between the colors can be interpreted as a color misconvergence error of \tfrac{1}{3} p_x. Such color misconvergence is very common in CRTs due to misalignment of the three electron beams, which typically occurs toward the edges of the screen. In CMDs, there apparently is also such an error, but it is so small that it poses no problem in practice. Therefore, it is common practice to neglect the arrangement of color subpixels inside each pixel. Each pixel is simply considered a full-color pixel, assuming that the display can generate all three primary colors at the same location.

Although this pixel addressing does not cause any serious problems, there is a good reason for considering the actual subpixel arrangement inside the pixel, i.e. to use subpixel addressing.

Figure 3.16 The RGB input (R_c, G_c and B_c) is sampled (to R_s, G_s and B_s), and finally reconstructed by the display (R_d, G_d and B_d).

This is illustrated by the following simple example. Constructed with full-color pixels, a diagonal two-pixel wide line may appear on a VS-CMD as shown in Figure 3.15 (left). However, when the position of each subpixel in the full-color pixel is known, the line may also be drawn as in Figure 3.15 (right). Apparently, the subpixels give the display a higher perceived spatial resolution than the traditional full-color pixel resolution [3, 125, 13, 21, 24, 42]. For the arrangement shown in Figure 3.15 the pixel size appears three times smaller, so resolution seems to increase by a factor of three in the horizontal direction. The following questions immediately arise:

- Is this extra resolution really there?
- Is the simple example only a very special case, or is this resolution also present in general?
- If there is extra resolution, how much can we gain relative to the traditional pixel resolution?
- How do we process the signal in order to maximally profit from any extra resolution?

In order to answer these questions, the next section first extends the frequency domain analysis of Section 3.3.1 to include the subpixel structure in the case of pixel addressing. Thereafter, Section 3.5 introduces the basics of subpixel addressing, followed by further analysis of the display process that shows how this can increase the perceived resolution.

Analysis of images displayed on a color matrix display

The display process, from sampling, via addressing to reconstruction, was discussed for a monochrome matrix display in Section 3.2. In this section, we analyze an image that is displayed on a color matrix display. We will use the example of the VS SPA, but the general approach is also applicable to other arrangements, as discussed in Section 3.6.

To deal with color images, the first change to the monochrome system from Section 3.2 (Equations 3.8, 3.13, 3.14) is to substitute the monochrome intensity I with the three-dimensional color signal \vec{I}:

    I(\vec{x}) \to \vec{I}(\vec{x}) = [I_n(\vec{x})]^T = [R(\vec{x}), G(\vec{x}), B(\vec{x})]^T    (3.24)

The display process for a VS-CMD is illustrated in Figure 3.16. As in Section 3.2, we analyze images on the display by calculating the effect of each part of the chain.

Sampling

First, the original signal \vec{I}_c is sampled at positions \vec{x} = [x, y] that correspond to each full color pixel on the display, as in Equations 3.8 and 3.9:

    \vec{I}_s(\vec{x}) = S(\vec{x}) \, \vec{I}_c(\vec{x})    (3.25)

Addressing

The sampled components, \vec{I}_s = [R_s, G_s, B_s]^T, form the input to the addressing process. For a CMD, this process translates into a spatial offset (delay), \Delta\vec{x}, for each color (Figures 3.16 and 3.14). For the VS SPA, the delays are:

    [\Delta\vec{x}_R, \; \Delta\vec{x}_G, \; \Delta\vec{x}_B]^T = [-\tfrac{1}{3} p_x, \; 0, \; +\tfrac{1}{3} p_x]^T    (3.26)

So the signal after addressing, \vec{I}_a(\vec{x}), becomes

    \vec{I}_a(\vec{x}) = \vec{I}_s(\vec{x} - \Delta\vec{x})
      = [ R_s(\vec{x} + [\tfrac{1}{3} p_x, 0]) ]
        [ G_s(\vec{x})                         ]
        [ B_s(\vec{x} - [\tfrac{1}{3} p_x, 0]) ]
      = [ S(\vec{x} + [\tfrac{1}{3} p_x, 0]) \, R_c(\vec{x} + [\tfrac{1}{3} p_x, 0]) ]
        [ S(\vec{x}) \, G_c(\vec{x})                                                ]
        [ S(\vec{x} - [\tfrac{1}{3} p_x, 0]) \, B_c(\vec{x} - [\tfrac{1}{3} p_x, 0]) ]    (3.27)

Color delay matrix

We can also write \vec{I}_a in Equation 3.27 in a matrix form, by writing the delay in the sampling function as a convolution with a shifted delta impulse:

    I(\vec{x} - \Delta\vec{x}) = I(\vec{x}) * \delta(\vec{x} - \Delta\vec{x})    (3.28)

which becomes in matrix form^13

    [ I_1(\vec{x} - \Delta\vec{x}_1) ]   [ \delta(\vec{x} - \Delta\vec{x}_1)   0   0 ]   [ I_1(\vec{x}) ]
    [ I_2(\vec{x} - \Delta\vec{x}_2) ] = [ 0   \delta(\vec{x} - \Delta\vec{x}_2)   0 ] * [ I_2(\vec{x}) ] = D(\vec{x}) * \vec{I}(\vec{x})    (3.29)
    [ I_3(\vec{x} - \Delta\vec{x}_3) ]   [ 0   0   \delta(\vec{x} - \Delta\vec{x}_3) ]   [ I_3(\vec{x}) ]

where D(\vec{x}) is a color delay matrix. Using this, \vec{I}_a becomes:

    \vec{I}_a(\vec{x}) = D_a(\vec{x}) * \vec{I}_s(\vec{x}) = D_a(\vec{x}) * [S(\vec{x}) \, \vec{I}_c(\vec{x})]    (3.30)

where

    D_a(\vec{x}) = [ \delta(\vec{x} + [\tfrac{1}{3} p_x, 0])   0   0 ]
                   [ 0   \delta(\vec{x})   0                         ]
                   [ 0   0   \delta(\vec{x} - [\tfrac{1}{3} p_x, 0]) ]    (3.31)

Footnote 13: The convolution of two matrices, A(\vec{x}) * B(\vec{x}) = C(\vec{x}), is defined similarly to matrix multiplication: C_ij(\vec{x}) = \sum_k a_ik(\vec{x}) * b_kj(\vec{x}). Note that, just like with scalar convolution and multiplication, in general [A(\vec{x}) * B(\vec{x})] C(\vec{x}) \neq A(\vec{x}) * [B(\vec{x}) C(\vec{x})]. Also note that the convolution used here is two-dimensional, since \vec{x} is a 2-D variable. Finally, note that D(\vec{x}) convolves \vec{I}(\vec{x}) from the left, because the latter is a column vector.
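On a discrete fine grid, convolving with a shifted delta impulse is just a shift, so the addressing delay of Equations 3.28 to 3.31 can be illustrated with np.roll. The grid step, test signal and sign convention (R shifted to the left subpixel, B to the right) follow the reconstruction above and are otherwise illustrative assumptions.

import numpy as np

os = 12                                    # fine-grid samples per pixel pitch p_x
x = np.arange(20 * os)
sig = np.sin(2 * np.pi * x / (5 * os))     # a test signal, identical for R, G, B
rgb = np.stack([sig, sig, sig])            # rows R, G, B: a gray input

I_a = np.empty_like(rgb)
I_a[0] = np.roll(rgb[0], -os // 3)         # R: delta(x + p_x/3) shifts left
I_a[1] = rgb[1]                            # G: no shift
I_a[2] = np.roll(rgb[2], +os // 3)         # B: delta(x - p_x/3) shifts right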

Reconstruction

The reconstruction of the addressed signal into a displayed image is characterized by the subpixel aperture. The subpixel aperture for the VS SPA (Figure 3.14), A_vs(\vec{x}), is approximately a 2-D box function (Equation 3.15) with width \tfrac{1}{3} p_x and height p_y for all three components, resulting in the displayed signal:

    \vec{I}_d(\vec{x}) = A_vs(\vec{x}) * \vec{I}_a(\vec{x}) = A_vs(\vec{x}) * [D_a(\vec{x}) * [S(\vec{x}) \, \vec{I}_c(\vec{x})]]    (3.32)

Frequency spectrum

Computing the Fourier transform of Equation 3.32 results in the frequency spectrum of the displayed image:

    \vec{I}_d^f(\vec{f}) = A_vs^f(\vec{f}) \, [D_a^f(\vec{f}) \, [S^f(\vec{f}) * \vec{I}_c^f(\vec{f})]]    (3.33)

where A_vs^f(\vec{f}) is the Fourier transform of the vertical stripe pixel aperture:

    A_vs^f(\vec{f}) = \frac{\sin(\tfrac{1}{3}\pi f_x p_x)}{\tfrac{1}{3}\pi f_x p_x} \cdot \frac{\sin(\pi f_y p_y)}{\pi f_y p_y}    (3.34)

The Fourier transform of a spatial delay matrix D(\vec{x}) (Equation 3.29) is

    D^f(\vec{f}) = F\{D(\vec{x})\} = [ e^{-2\pi i \vec{f} \cdot \Delta\vec{x}_1}   0   0 ]
                                     [ 0   e^{-2\pi i \vec{f} \cdot \Delta\vec{x}_2}   0 ]
                                     [ 0   0   e^{-2\pi i \vec{f} \cdot \Delta\vec{x}_3} ]    (3.35)

The phase factors (e^{...}) are the frequency domain equivalents of the spatial delays, in accordance with the space-shifting property of the Fourier transform:

    F\{I(x - \Delta x)\} = I^f(f_x) \, e^{-2\pi i f_x \Delta x}    (3.36)

Expanding \vec{I}_d^f(\vec{f}) into the three components gives

    \vec{I}_d^f(\vec{f}) = A_vs^f(\vec{f}) [ e^{+\frac{2}{3}\pi i f_x p_x} \, [S^f(\vec{f}) * R_c^f(\vec{f})] ]
                                           [ [S^f(\vec{f}) * G_c^f(\vec{f})]                                 ]
                                           [ e^{-\frac{2}{3}\pi i f_x p_x} \, [S^f(\vec{f}) * B_c^f(\vec{f})] ]    (3.37)

The display signal chain is easily recognized from Equations 3.33 or 3.32, by reading from the inside brackets outward: first the sampling of the input RGB signals, then the RGB-dependent addressing delay, and finally the reconstruction aperture.

Color image perception: luminance and chrominance

In order to be able to indicate the perceptual impact of the subpixel structure on the displayed image (spectrum), it is not sufficient to analyze the three components of \vec{I}_d or \vec{I}_d^f separately. Particularly since we consider the perceived resolution aspects, we must combine the three components in a way that reflects how the HVS reacts to details in color images. It is well known from vision science that the HVS is much more sensitive to details in luminance than in chrominance [161, 106]. Therefore we transform the displayed image signal, \vec{I}_d, to a color space that separates luminance and chrominance.

A detailed treatment of the perceived resolution of color matrix displays typically involves psychophysical modeling of human vision [29], or subjective testing [24, 84, 137]. We consider this outside the scope of this thesis, and simplify the behavior of the HVS by only considering the luminance-chrominance separation. We choose the YUV space [66] for simplicity, but similar results are obtained when other luminance-chrominance separated color spaces are used, such as CIE-XYZ or CIE-Lab [59]. The transform from RGB to YUV space is defined as follows:

    \vec{I}_y = [Y]       [  0.299   0.587   0.114 ] [R]
                [U] = M \vec{I} = [ -0.169  -0.331   0.500 ] [G]    (3.38)
                [V]       [  0.500  -0.419  -0.081 ] [B]

The displayed image spectrum from Equation 3.33, when transformed to the YUV space, becomes:

    \vec{I}_yd^f(\vec{f}) = M \, \vec{I}_d^f(\vec{f})
                         = M [A_vs^f(\vec{f}) \, [D_a^f(\vec{f}) \, [S^f(\vec{f}) * \vec{I}_c^f(\vec{f})]]]
                         = A_vs^f(\vec{f}) \, [M D_a^f(\vec{f}) \, [S^f(\vec{f}) * \vec{I}_c^f(\vec{f})]]    (3.39)

Equation 3.39 shows the display signal chain, by reading from the inner brackets outward. Because the aperture does not depend on the color, the color conversion matrix M can be moved up the chain, until the addressing delay matrix D_a^f. Equation 3.39 is very similar to the monochrome case of Equation 3.17. The main difference is the term M D_a^f(\vec{f}), which accounts for the influence of the color subpixels. To understand what this means, we further analyze the color image frequency spectrum.
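The transform of Equation 3.38 is straightforward in code. Note that the matrix entries were reconstructed above from the standard YUV coefficients, since they were lost in the source scan; a quick sanity check is that a gray input produces zero chrominance.

import numpy as np

M = np.array([[ 0.299,  0.587,  0.114],
              [-0.169, -0.331,  0.500],
              [ 0.500, -0.419, -0.081]])

rgb = np.array([1.0, 1.0, 1.0])            # a gray (monochrome) input pixel
print(M @ rgb)                             # Y = 1, U = V = 0: no chrominance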

Analysis of the displayed color image frequency spectrum

To discuss the properties of the displayed image spectrum as given by Equation 3.39, we first consider a monochrome input image (R_c = G_c = B_c = Y_c, U_c = V_c = 0), since luminance dominates the perceived resolution and it allows one to easily detect color errors. Using this simplification^14, the spectrum of the displayed image becomes (omitting the argument (\vec{f}) for compactness):

    \vec{I}_yd^f = [Y_d^f]
                   [U_d^f] = A_vs^f [ M D_a^f [ S^f * [Y_c^f, Y_c^f, Y_c^f]^T ] ]    (3.40)
                   [V_d^f]
                 = A_vs^f [  0.30 e^{+\frac{2}{3}\pi i f_x p_x} + 0.59 + 0.11 e^{-\frac{2}{3}\pi i f_x p_x} ]
                          [ -0.17 e^{+\frac{2}{3}\pi i f_x p_x} - 0.33 + 0.50 e^{-\frac{2}{3}\pi i f_x p_x} ] [S^f * Y_c^f]    (3.41)
                          [  0.50 e^{+\frac{2}{3}\pi i f_x p_x} - 0.42 - 0.08 e^{-\frac{2}{3}\pi i f_x p_x} ]
                 = A_vs^f [ \Phi_{YY}^f ]
                          [ \Phi_{YU}^f ] [S^f * Y_c^f]    (3.42)
                          [ \Phi_{YV}^f ]

where \Phi_{YY}^f, \Phi_{YU}^f and \Phi_{YV}^f represent a cross-talk from Y_c to Y_d, U_d and V_d, respectively. Although the input image is monochrome (R_c = G_c = B_c = Y_c, and U_c = V_c = 0), the displayed image will in general contain color (U_d \neq 0, V_d \neq 0), which already shows that color errors can occur.

Footnote 14: In Section 3.5.4, we release this assumption for more generality.

To further analyze the spectrum, Figure 3.17 shows a plot of the amplitude of the horizontal spectrum of the displayed image, \vec{I}_yd^f(f_x, 0), according to Equation 3.40. Figure 3.17 shows the spectra of the continuous (Y_c^f) and sampled (Y_s^f) images, and the luminance and chrominance of the displayed image (Y_d^f and U_d^f, V_d^f).

Figure 3.17 Horizontal frequency spectra of an image on a vertical stripe display. a) continuous (Y_c^f) and sampled (Y_s^f) images, b) luminance of the displayed image (Y_d^f), c) chrominance of the displayed image (U_d^f, V_d^f). The non-zero frequencies in Y_d^f, U_d^f and V_d^f outside the baseband demonstrate an imperfect reconstruction (see text).

Figure 3.18 a) The VS display (with RGB pixel luminances as illustrated in Figure 3.19), and two reference displays: b) reference display 1: full color pixels, and c) reference display 2: full color vertical stripe. This figure is an extreme zoom-in on the image; the corresponding viewing distance (Section 3.3.4) would be about 8 m.

Figure 3.19 Luminance profile of a vertical stripe full color pixel (solid lines) and pixels with non-displaced subpixels, with widths p_x and 1/3 p_x (dashed lines).

In Figure 3.17, the vertical stripe color matrix display (VS-CMD) is compared to two reference displays, shown in Figure 3.18. The first reference (R1) is the full color version of the square pixel display from Section 3.3.1, i.e. with red, green and blue subpixels at the same position^15. The three color components of the R1 display pass through the display process identically. Without showing the full calculation, the displayed image on the R1 display is found by repeating Equation 3.17 for RGB (and consequently also for Y). The second display (R2) is identical to the R1 display, but the pixel aperture is different: a rectangle with a width of 1/3 p_x, i.e. the aperture of a vertical stripe subpixel.

Footnote 15: This could e.g. be a projection or a color-sequential display.

From Figure 3.17 we can see that reconstruction is not perfect for the VS-CMD, because the baseband is attenuated and repeats are present. In contrast to the R1 display, the subpixels cause DC pixel structure, as is apparent from the non-zero amplitude, both in luminance and in color, at the repeats of the DC component, which are found at (multiples of) the sampling frequency (f/f_s = 1). We further observe that the luminance spectrum of the image has a more complex shape, which lies mostly in between the spectra of the R1 and R2 displays. This directly relates to the luminance of the VS aperture, shown in Figure 3.19 together with the R1 and R2 apertures (widths p_x and 1/3 p_x, respectively).

The VS aperture is also somewhere in between the R1 and R2 apertures, and it has a more complex shape, because the luminances of R, G and B differ.

Figure 3.17 also shows the effects on the chrominance. Since we started with a monochrome image (U_c = V_c = 0), any non-zero U, V constitutes an error. But in fact, all frequencies except f_x = 3n f_s (n \in Z) have non-zero U, V. This relates to the constant misconvergence between RGB of 1/3 p_x, which results in a color error that increases with frequency (misconvergence affects details more than large areas). On the other hand, we can expect such a color error for the frequencies close to the pixel sampling frequency, because this corresponds to the highly saturated pattern of the individual subpixels. With the assumption of color blending (see Section 2.1.4), the chrominance signal at these high frequencies is invisible to the human visual system^16.

Footnote 16: The luminance signal can still be visible at these frequencies, so although the color subpixel pattern is invisible, the viewer may still see pixel structure.

As explained in Section 3.3.1, the two main factors that determine the resolution of a matrix display are the (pixel) sample-rate and the pixel aperture. Figure 3.17 shows that the resolution of the three displays is comparable. The apertures are different, but so small that the sampling rate, which is equal for all three displays, mainly determines the resolution. This is apparent from the observation that the attenuation caused by the aperture is no more than a factor of 0.6 up to half the sampling frequency, which is the maximum frequency that the display can handle without introducing aliasing. This attenuation by the aperture is not enough to suppress these frequencies to the extent that we can consider them lost, so the resolution is limited by aliasing and other sampling ("digital") artifacts like beat patterns. In other words, the Kell factor is still lower than 1. This is an important observation, since we shall see in the following sections that the sampling artifacts can be reduced on color matrix displays.

3.5 The subpixel resolution of a color matrix display

As the previous section showed, there is not much difference in resolution between a subpixel CMD and a CMD with true full color pixels, besides a clear difference in appearance of pixel structure. Nevertheless, as illustrated in Figure 3.15, a SCMD has a potential gain in resolution by not addressing it as if it had full color pixels, but by providing a signal to the display that is adapted to the subpixel arrangement, i.e. by addressing each subpixel as a separate element. This could be called "subpixel addressing", but that is a somewhat confusing term. The process really relates to a change in the display signal chain before the addressing process, because each subpixel is always addressed with a separate signal (Figure 3.14 does not change). This section shows how the display signal chain changes to exploit the subpixels, and calculates the displayed image spectrum to further analyze the effect this has on resolution. The resolution of a CMD can now be divided into two types: pixel resolution and subpixel resolution, depending on whether the display signal chain is adapted to the subpixel arrangement.

Subpixel sampling

Taking into account the position of the subpixels is equivalent to compensating for the spatial addressing offset, which is related to the sampling process: subpixel sampling. This is illustrated in Figure 3.20, and corresponds to sampling each component according to the positions of the corresponding subpixels on the display. This seems very straightforward, and indeed it has been proposed before [3, 38, 126]. As a matter of fact, this principle has been applied for decades in CRTs, by means of the shadowmask [139, 140]. Yet for CRTs, the potential resolution gain is prevented in practice by the electron spot profile. The size of the electron spot, besides being limited by the gun focus, is also limited by scan Moiré, i.e. interference of the shadowmask and the scan-line structure [65]. However, for matrix displays, these limitations do not apply, and the subpixel structure can be exploited to optimize the resolution of these displays.

For the VS-CMD, the subpixels represent a resolution that is increased three times in the horizontal direction, relative to the full color pixel resolution. This triple resolution is, however, not available in practice, as illustrated by the following example:

Let us neglect the color of the subpixels, and assume that a vertical stripe matrix display indeed has a resolution that is three times the full-color resolution in the horizontal direction. We would then address the display with a monochrome image that has a number of pixels per line that is exactly three times the number of full-color pixels. However, Figure 3.21 shows that this will result in serious color artifacts (see also Section 3.5.5).

These color errors limit the resolution gain, which is apparently lower than a factor of three. In the next sections, we extend the display chain analysis to subpixel sampled images, to find the real resolution of CMDs. Our analysis differs from [13, 21, 42] in two main parts. First, we perform the general subpixel sampling on a continuous image, so we do not pose constraints on the input resolution. Second, we make an explicit split between the sampling process and the preceding signal processing. The analysis in the following sections will show that signal processing is necessary to maximize the resolution of CMDs, which is briefly explained in the section on signal processing below. Chapter 4 will deal with these video processing algorithms in more detail.

Displaced sampling

To calculate the spectrum of a subpixel sampled image, displayed on a subpixel color matrix display, we start again with a continuous image signal \vec{I}_c(\vec{x}) = [R_c(\vec{x}), G_c(\vec{x}), B_c(\vec{x})]^T. The display signal chain from Section 3.4 is now changed at the sampling process, because the color components are sampled differently, according to Figure 3.20b. This can be described by displaced sampling [13], which changes the sampling function S(\vec{x}) from Equations 3.8 and 3.9:

    \vec{I}_os(\vec{x}) = [R_os(\vec{x})]   [ R_c(\vec{x}) \, S(\vec{x} + [\tfrac{1}{3} p_x, 0]) ]
                          [G_os(\vec{x})] = [ G_c(\vec{x}) \, S(\vec{x})                         ]    (3.43)
                          [B_os(\vec{x})]   [ B_c(\vec{x}) \, S(\vec{x} - [\tfrac{1}{3} p_x, 0]) ]

where \vec{I}_os is the displaced sampled signal^17. We can also write \vec{I}_os using a delay matrix:

    \vec{I}_os(\vec{x}) = [D_a(\vec{x}) * S(\vec{x})] \, \vec{I}_c(\vec{x})    (3.44)

where D_a(\vec{x}) is identical to Equation 3.31, and the term [D_a(\vec{x}) * S(\vec{x})] indicates that the sampling function is displaced before the sampling process.

Footnote 17: Named "os" as in "offset sampling", to avoid confusion with other indices.

Concluding the display chain, the displayed image \vec{I}_ds can directly be obtained from the displaced sampled signal \vec{I}_os, by applying the aperture (Equation 3.32, Figure 3.20b):

    \vec{I}_ds(\vec{x}) = A_vs(\vec{x}) * \vec{I}_os(\vec{x}) = A_vs(\vec{x}) * [[D_a(\vec{x}) * S(\vec{x})] \, \vec{I}_c(\vec{x})]    (3.45)

Figure 3.20 a) Subpixel sampling and reconstruction of the RGB input, b) system diagram using displaced sampling.

Figure 3.21 Strong color errors occur when we neglect the color of each subpixel and address the display with a triple resolution monochrome signal. a) schematic sampling process. b) effect on a text image: from input (left) to displayed image (right, photo from an LCD).
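A 1-D sketch of the displaced sampling of Equation 3.43: R, G and B are taken from positions offset by -p_x/3, 0 and +p_x/3 within each pixel, matching the subpixel centers of the vertical stripe. The test scene, pixel count and function names are illustrative assumptions.

import numpy as np

def scene(x):
    # continuous test scene, identical for R, G and B (gray detail)
    return 0.5 + 0.5 * np.sin(2 * np.pi * 7.3 * x)

N = 64                                     # full color pixels on one line
px = 1.0 / N
n = np.arange(N)
R_os = scene(n * px - px / 3)              # R taken at the R subpixel positions
G_os = scene(n * px)                       # G at the pixel grid positions
B_os = scene(n * px + px / 3)              # B at the B subpixel positions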

However, by formulating the subpixel sampling process in this way, we have changed the model of the display addressing process as it was introduced earlier (Equation 3.30, Figure 3.16). After displaced sampling, the addressing delay for RGB is no longer needed, because displaced sampling combines the sampling and addressing processes. The description using displaced sampling is useful when we take the display to include sampling, having a continuous signal as input. Then it does not matter whether the addressing delay is included in the sampling, or in the addressing. However, we prefer not to include sampling in the display model.

Subpixel addressing in the display model

We choose to describe an FPD as being truly digital. The display receives a sampled signal input, i.e. an array of intensities, where each (RGB) entry corresponds to one (RGB) pixel on the display. A digital display does not sample the signal; it only distributes the discrete set of component intensities over the screen, as determined by the operation of the drivers and the electrode wiring. We shall therefore describe the subpixel sampling process in such a way that the display model is not changed. This can be achieved by adding a delay before sampling, resulting in the subpixel sampled signal \vec{I}_ss(\vec{x}) (Figure 3.22):

    \vec{I}_ss(\vec{x}) = [R_ss(\vec{x})]   [ R_c(\vec{x} - [\tfrac{1}{3} p_x, 0]) \, S(\vec{x}) ]
                          [G_ss(\vec{x})] = [ G_c(\vec{x}) \, S(\vec{x})                         ]    (3.46)
                          [B_ss(\vec{x})]   [ B_c(\vec{x} + [\tfrac{1}{3} p_x, 0]) \, S(\vec{x}) ]

In matrix form, \vec{I}_ss becomes:

    \vec{I}_ss(\vec{x}) = S(\vec{x}) \, [D_s(\vec{x}) * \vec{I}_c(\vec{x})]    (3.47)

where

    D_s(\vec{x}) = [ \delta(\vec{x} - [\tfrac{1}{3} p_x, 0])   0   0 ]
                   [ 0   \delta(\vec{x})   0                         ]
                   [ 0   0   \delta(\vec{x} + [\tfrac{1}{3} p_x, 0]) ]    (3.48)

After sampling, the display signal chain is continued with the addressing process.

Figure 3.22 a) Subpixel sampling conforming to the display model of Figure 3.16, b) sampling of delayed RGB signals.

This is now identical to Equation 3.30, i.e. an RGB-dependent delay:

    \vec{I}_as(\vec{x}) = [ R_ss(\vec{x} + [\tfrac{1}{3} p_x, 0]) ]
                          [ G_ss(\vec{x})                          ] = D_a(\vec{x}) * \vec{I}_ss(\vec{x})
                          [ B_ss(\vec{x} - [\tfrac{1}{3} p_x, 0]) ]
                        = D_a(\vec{x}) * [S(\vec{x}) \, [D_s(\vec{x}) * \vec{I}_c(\vec{x})]]    (3.49)

where D_a(\vec{x}) was already given in Equation 3.31. The chain ends with the reconstruction process, as in Equation 3.32, resulting in the displayed image after subpixel sampling, \vec{I}_ds:

    \vec{I}_ds(\vec{x}) = A_vs(\vec{x}) * \vec{I}_as(\vec{x}) = A_vs(\vec{x}) * [D_a(\vec{x}) * [S(\vec{x}) \, [D_s(\vec{x}) * \vec{I}_c(\vec{x})]]]    (3.50)

In Appendix B, it is shown that this description without a changed display model (Equation 3.50, using \vec{I}_as) is equivalent to displaced sampling (Equation 3.45, using \vec{I}_os). The difference is a matter of interpretation of the digital image signal and the function of a digital display. We shall use the simpler form of Equation 3.45 for the displayed image after subpixel sampling, \vec{I}_ds(\vec{x}).

Spectrum analysis in YUV space

Taking the Fourier transform of Equation 3.45 results in the spectrum \vec{I}_ds^f(\vec{f}):

    \vec{I}_ds^f(\vec{f}) = A_vs^f(\vec{f}) \, [[D_a^f(\vec{f}) \, S^f(\vec{f})] * \vec{I}_c^f(\vec{f})]    (3.51)

Analysis in the RGB color space does not teach us much about the effect of subpixel sampling. Just as in Section 3.3.1, a luminance/chrominance separated space is more useful. We therefore convert the displayed image from RGB to YUV space, where M is defined in Equation 3.38:

    \vec{I}_yds^f(\vec{f}) = M \, \vec{I}_ds^f(\vec{f})    (3.52)

and we also consider the input image, \vec{I}_yc(\vec{x}), in YUV space, i.e.:

    \vec{I}_c^f(\vec{f}) = M^{-1} \, \vec{I}_yc^f(\vec{f})    (3.53)

The displayed image spectrum in YUV space becomes:

    \vec{I}_yds^f(\vec{f}) = M [A_vs^f(\vec{f}) \, [[D_a^f(\vec{f}) \, S^f(\vec{f})] * [M^{-1} \vec{I}_yc^f(\vec{f})]]]    (3.54)

However, the matrices M and M^{-1} are not functions of \vec{f} or \vec{x}, so the order of calculation can be changed: M[A(\vec{x}) * B(\vec{x})] = [M A(\vec{x})] * B(\vec{x}), also for the convolution. Furthermore, S^f(\vec{f}) is not a matrix but a scalar function, so M^{-1} S^f(\vec{f}) = S^f(\vec{f}) M^{-1}, which also holds for A_vs^f(\vec{f}).

Using this, Equation 3.54 can be simplified to:

    \vec{I}_yds^f(\vec{f}) = A_vs^f [[M D_a^f(\vec{f}) M^{-1} S^f(\vec{f})] * \vec{I}_yc^f(\vec{f})]
                          = A_vs^f [[\Phi^f(\vec{f}) \, S^f(\vec{f})] * \vec{I}_yc^f(\vec{f})]
                          = A_vs^f [[\Phi^f S^f] * \vec{I}_yc^f]    (3.55)

where the variable (\vec{f}) has been omitted for compactness. Here,

    \Phi^f = M D_a^f M^{-1} = [ \Phi_{YY}^f   \Phi_{UY}^f   \Phi_{VY}^f ]
                              [ \Phi_{YU}^f   \Phi_{UU}^f   \Phi_{VU}^f ]    (3.56)
                              [ \Phi_{YV}^f   \Phi_{UV}^f   \Phi_{VV}^f ]

is a matrix of complex phase factors. These can be seen as cross-talk factors \Phi_{input,display}, that define (as a function of frequency), for each input YUV component, what the amplitude of each YUV component in the displayed image will be. They represent what we saw earlier in Section 3.4.2: the phase difference between RGB, caused by subpixel sampling and display, can result in non-zero chrominance components for a luminance-only input image.

In Section 3.4.2, we considered Y-only input images to obtain the displayed spectrum for a pixel sampled input image. This resulted in Equation 3.42. Also using the U, V components in the input, i.e. the full \vec{I}_yc^f, gives:

    \vec{I}_yd^f = A_vs^f \, \Phi^f \, [S^f * \vec{I}_yc^f]    (3.57)

This is very similar to the earlier result, but now \vec{I}_yc is a 3-element vector [Y, U, V]^T, and \Phi^f is a 3-by-3 matrix; Equation 3.40 only defined the first column of \Phi^f. In Appendix D, further generalized descriptions of subpixel sampling and addressing are calculated, but we only use Equation 3.57 in this chapter.

Comparing Equation 3.55 to Equation 3.57, we can see that subpixel sampling has moved the complex phase factor inside the convolution. This implies that each repeat is changed by the values of this factor at the repeat frequency. This is a key element in the effect of subpixel sampling: changing the repeat spectra corresponds to a difference in sampling artifacts, which relates to a difference in resolution. With pixel sampling, the repeat spectra are not changed. Instead, the RGB subpixels have an effect on the appearance of the pixel aperture of the display, which is much less related to resolution, because the aperture is smaller than the pixel pitch.
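The cross-talk matrix of Equation 3.56 can be evaluated numerically at any frequency. The sketch below assumes the reconstructed YUV matrix from Equation 3.38 and the sign convention of Equation 3.37 (R advanced, B delayed by p_x/3).

import numpy as np

M = np.array([[ 0.299,  0.587,  0.114],
              [-0.169, -0.331,  0.500],
              [ 0.500, -0.419, -0.081]])

def phi(fx_px):
    # Phi(f) = M D_a(f) M^-1 with the vertical stripe phase factors (Eq. 3.56)
    ph = 2j * np.pi * fx_px / 3.0
    D = np.diag([np.exp(ph), 1.0, np.exp(-ph)])    # R advanced, B delayed by p_x/3
    return M @ D @ np.linalg.inv(M)

print(np.round(np.abs(phi(0.5)), 2))       # |cross-talk| at the Nyquist frequency

At f_x = 0 the matrix is the identity (no cross-talk in flat areas); toward higher frequencies the off-diagonal luminance-to-chrominance terms grow, which is exactly the mechanism analyzed next.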

Analysis of the subpixel sampling and display spectrum

Figure 3.23 shows the frequency spectrum of the displayed image using subpixel sampling, according to Equation 3.55, for luminance and chrominance, where the input is again a flat baseband as in Figure 3.17a.

Figure 3.23 Horizontal frequency spectrum of an image on a vertical stripe display, using subpixel sampling. The input is a flat baseband (Figure 3.17a). a) luminance spectrum, b) chrominance spectrum (U only).

The correspondence between pixel and subpixel sampling is visible at multiples of the sample frequency (f_x/f_s \in Z). These frequencies correspond to a DC-only image, i.e. a uniform area, where it should not matter whether the RGB signals are sampled according to the subpixel positions or not (the DC pixel structure has not changed). Indeed the two spectra are identical there.

At other frequencies, the differences between the pixel and subpixel sampled spectra are substantial. Let us first consider the luminance spectrum. The subpixel sampled luminance spectrum is less attenuated in the baseband, and the repeats are more suppressed. The baseband attenuation of the pixel sampled image is caused by the larger reconstruction unit (see Figure 3.19). The lower repeat amplitude in the subpixel sampled luminance spectrum represents a shift of the repeats from the luminance to the chrominance signal, which is expressed by \Phi in Equation 3.55. In the pixel sampled image, the repeats are attenuated after the image has been sampled: the repeats are attenuated by the low-pass filtering corresponding to the pixel aperture, while the RGB-displaced addressing process distributes the repeats between luminance and chrominance. In the subpixel sampled image, the repeats are attenuated by distributing them between luminance and chrominance during the sampling process^18. Contrary to the pixel sampled case, this affects the aliased components in the sampled image, i.e. the frequencies f > f_s/2 that appear in the baseband at f_s - f < f_s/2.

Footnote 18: The addressing process now has no RGB-varying influence, as shown by Equation 3.45.

Reduction of the amplitude of aliased components is related to an increase in resolution in terms of number of pixels. For example, in a situation where the first order repeat is completely removed, there is no aliasing for frequencies up to f = f_s. This is equal to doubling the sample-rate, since the first order repeat can be removed by taking an extra sample halfway between each two existing samples. In that case, the resolution in terms of number of pixels has doubled. When the first order repeat is not removed, but only suppressed, the effective number of pixels will be less than doubled. That number is not as easily found, but it depends on the highest frequency that can be displayed with enough modulation, i.e. without loss of amplitude or other distortions. We need to resort to other measures of resolution than the number of pixels to fully describe these effects.

Increased resolution through chrominance aliasing

From Figure 3.23 it is clear that the subpixel sampling has not completely removed the repeats, so it is not straightforward to conclude that the resolution has increased. Due to the different sampling phases of RGB, the (lowest order) repeats in the frequency spectrum of the displayed image have partly^19 shifted from the luminance to the chrominance. The different spectra for luminance and chrominance account for a number of effects.

Footnote 19: Only if the luminance contributions of RGB were equal would the shift be complete: no luminance aliasing below spatial frequencies of 3 f_s/2.

First, Figure 3.23 shows that subpixel sampling has actually suppressed the chrominance errors inside the baseband. This is explained by the fact that sampling RGB at the subpixel positions corrects for the RGB dependent offset, i.e. the misconvergence that is introduced by addressing.

Second, the aliasing in the chrominance will result in false colors for image parts with frequencies outside the baseband, as shown in Figure 3.24. This figure shows a zoneplate image^20 on a vertical stripe display, using both types of sampling. This image is a simulation of an actual VS display of 120 by 120 pixels, as described in Appendix C. In Figure 3.24, the lowest input frequency (DC, a constant intensity) is at the center of the image, and frequency increases toward the edges. The lines indicate the pixel sampling frequency, i.e. where a cycle from black to white and back has the width of a pixel. The most severe sampling artifacts occur at this frequency, because it results in a DC, i.e. very coarse, alias pattern.

Footnote 20: Due to the limited resolution of the printed page, the resolution of the zoneplate images has to be rather low. In those cases where details of the zoneplate are concerned (such as pixel structure, and not the large features like the repeats) we zoom in in separate images.

When determining resolution, the zoneplate can be used to find all frequencies that can be displayed without being disturbed by artifacts such as pixel structure, (color) alias, etc. The highest of these frequencies corresponds to the resolution in terms of number of pixels. The modulation of each frequency, relative to the input and to distortions [159], determines the MTF. Note that this definition of resolution actually heavily depends on perception, i.e. one really is defining perceived resolution. With subpixel sampling, there is a balance between artifacts in luminance and in chrominance, which can only be made subjectively. The degree of disturbance caused by color artifacts is not easily objectively defined.

From Figure 3.24, it is clear that subpixel sampling does not increase the resolution of the display by the factor of three we might hope for. If that were the case, the display should be able to render a signal with a frequency of three times the original Nyquist limit: 3 f_s/2. Subpixel sampling has changed the color of the spectrum repeats, but they are still located at f_s. Figures 3.23 and 3.24 show that signal components at f_s will be aliased into a constant color error, so the display is already quite useless at this frequency. The color errors shown in Figure 3.21 are clear examples of this effect. In Section 4.3 and in [13, 52], this was mentioned as a potential problem, but it actually accounts for an increase in resolution.

Figure 3.24 A zoneplate image (radial frequency sweep; the lines indicate the frequency range: from DC ([f_x, f_y] = 0) at the center, to the pixel sampling frequency ([f_x, f_y] = [1/p_x, 1/p_y]) at the edges), on a (simulated) vertical stripe display (120 by 120 pixels). a) pixel sampling, b) subpixel sampling.

Figure 3.25 Zoneplate (part of baseband), a) pixel sampling, b) subpixel sampling.

To find a resolution improvement, we have to look at frequencies much closer to the baseband. The spatial frequency of the aliased color signal will increase as the signal frequencies come closer to the baseband. The Nyquist limit itself has not shifted: frequencies above the Nyquist limit still cause aliasing. However, the human visual system is less sensitive to high frequency chrominance errors than to high frequency luminance [29, 161]. This, combined with the reduced amplitude of the luminance aliasing, can cause frequencies above Nyquist to have a better visible baseband component than aliased components. So, although there is aliasing, some frequencies above Nyquist can be resolved by the display, so the resolution is increased. This is, however, not the only aspect of the resolution increase. There is also an effect for frequencies below Nyquist, and for this we consider the MTF and the Kell factor.

Increased Kell factor with subpixel sampling

Figure 3.25 shows the upper part of the baseband from Figure 3.24, where the beat patterns that determine the Kell factor are visible.

3.5.6 Increased Kell factor with subpixel sampling

Figure 3.25 shows the upper part of the baseband from Figure 3.24, where the beat patterns that determine the Kell factor are visible. With subpixel sampling, we can see that, due to the increased amplitude difference between baseband and repeat, the amplitude of the beat frequencies is reduced. Stated differently, the beat frequencies are shifted from the luminance to the chrominance, which reduces their visibility. This reduction varies from display to display, because it depends, for example, on the color primaries, the display brightness and the viewing distance [29]. We will not give a detailed quantitative analysis, but explain the effect with a focus on signal processing.

A reduced visibility of the beat frequencies translates into an increased Kell factor, without changing the physical properties of the display. Even though this does not increase the resolution in terms of resolvable frequencies, it does increase the signal amplitude (MTF) at these frequencies, which corresponds to an increased perceived sharpness. Subpixel sampling allows us to display the frequencies below the Nyquist limit of the display with less distortion and higher modulation. This was also found in [24], where the pixel structure noise, related to the beat patterns, was measured by determining the maximum frequency that a panel of viewers could recognize. The noise was found to be less for grayscale images than for images of a single primary color.

Comparing Figure 3.15 to the diagonal high frequency parts of Figure 3.25, we can see that the beat frequencies and the jagged lines that are usually associated with a low resolution (or poorly anti-aliased) image have the same origin.

3.5.7 Signal processing

The shift of aliasing components from luminance to chrominance can explain why a display that receives a subpixel sampled signal can have increased resolution. There are, however, still two questions unanswered: how much extra resolution can be gained, and is there any special processing needed? The answer to the first question is related to the answer to the second.

We have seen that subpixel sampling gives an increase in resolution for frequencies around the display Nyquist frequency, but that it results in very annoying color errors for frequencies around the display (pixel) sampling frequency. Figure 3.21 already illustrated these errors on a text image that was subpixel sampled to the VS SPA. To profit from the extra resolution, these color errors must certainly be prevented. This implies that we have to suppress the frequencies that lead to these color errors, while keeping intact the frequencies that represent the resolution increase. In other words, the input signal must be filtered before it is sampled, which is a well known signal processing function in sampling theory: anti-alias filtering. Chapter 4 will deal with this signal processing in more detail, but from the display analysis in this chapter, the need for signal pre-filtering is already evident. A complete answer to the question how much resolution can be gained can, however, not be given before this processing is discussed.

In this chapter, we will continue to analyze the resolution of various other subpixel arrangements. There, it will be even more evident that signal processing is required to fully profit from their subpixel resolution.

Therefore, we will take the analysis in this chapter only as far as we can without going into details of signal processing, which will be done in Chapter 4.

3.6 Subpixel arrangements

As introduced in Section 2.2.8, to make a SCMD, the subpixels of each color must be distributed over the display area in a regular pattern, to enable addressing and color synthesis [126]. The vertical stripe arrangement is the simplest and most often applied choice for distributing subpixels over the display area, but there are many more subpixel arrangements.

We can characterize a subpixel arrangement (SPA) by a unit pixel and a lattice. The full screen of a CMD is constructed by repeating the unit pixel over the lattice. The unit pixel is a set of primary color subpixels, each defined by their position and shape (D_a(x) and A_vs(x) in Equation 3.50, respectively). For some SPAs, the unit pixel will correspond to what we called the full color pixel before, but for other SPAs it is convenient to define the unit pixel differently. With each unit pixel, we associate a unit pixel pitch, u = [u_x, u_y], that describes the horizontal and vertical spacings in the lattice, i.e. the lattice can be converted to physical dimensions on the screen by giving the pitch in physical dimensions (mm, inches, etc.). Note that the unit pixel pitch can be different from the pixel pitch, p = [p_x, p_y], as will be explained in more detail later.

Lattice theory [98] teaches that a 2-D lattice Λ is described by two basis vectors v_1 = [x_1, y_1] and v_2 = [x_2, y_2] that span the lattice:

    Λ = LAT(L) = { nL : n ∈ Z² }    (3.58)

where L = [v_1^T v_2^T] is the lattice basis and n = [n_1, n_2] is the index of a point on the lattice. For example, the point x on the lattice with index n = [2, 3] is found as x = [2x_1 + 3x_2, 2y_1 + 3y_2].

There are many possible combinations of unit pixels and lattices that describe a particular SPA. For most SPAs, it is most convenient to use the smallest possible unit pixel, i.e. the unit pixel that can describe the SPA by repeating over the lattice with the smallest possible basis. We shall call this the primitive unit pixel.

3.6.1 Vertical stripe subpixel arrangement

Figure 3.26 shows the unit pixel and basis for the vertical stripe (VS) SPA. The basis is

    L_vs = \begin{bmatrix} u_x & 0 \\ 0 & u_y \end{bmatrix} = \begin{bmatrix} p_x & 0 \\ 0 & p_y \end{bmatrix}    (3.59)

The VS unit pixel is a primitive unit pixel, and it is the common choice for the VS SPA. This unit pixel is what we called the full color pixel, or simply pixel, before, i.e. u_x = p_x and u_y = p_y.

Figure 3.26 a) Vertical stripe subpixel arrangement (VS-SPA), b) unit pixel and corresponding lattice. (Color version in Figure 3.30.)

However, there are two^21 other primitive unit pixels for this SPA: those with blue or red in the center.

The VS lattice is rectangular^22, i.e. the basis vectors are parallel to the coordinate axes, as is apparent from the diagonal form of L_vs. This is an important property of a pixel lattice for a matrix display, because each pixel must be indexed by the rows and columns of the display matrix. The rows and columns define a 2D rectangular lattice with basis vectors parallel to the horizontal and vertical edges of the display. The position of a pixel at row n_r and column n_c, i.e. at matrix index n_m = [n_c, n_r], can be found by using Equations 3.58 and 3.59. In other words, the matrix index is equal to the pixel index.

3.6.2 Delta-Nabla subpixel arrangement

Figure 3.27 shows the so-called Delta-Nabla (DN) arrangement^23. While VS has color subpixels offset in the horizontal direction only, the DN-SPA can be regarded as a 2D-SPA. A primitive lattice basis and unit pixel for DN are shown in Figure 3.27b. The primitive lattice is given by

    L_dn = \begin{bmatrix} u_x & 0 \\ \frac{1}{2}u_y & u_y \end{bmatrix}    (3.60)

and a primitive unit pixel consists of the Delta structure of RGB subpixels.

The DN-lattice is non-rectangular^24, because its basis vectors are not parallel to the horizontal and vertical axes. This has a number of consequences. First, it is much less straightforward to define a pixel than for rectangular lattices like VS, not least because there exists no unique definition of what a pixel is [16] anyway. We avoid the pixel definition dilemma and use the (primitive) unit pixel, because it is associated with a lattice that does not have to be rectangular, and it can always be related to (the pitch of) a certain choice of pixel.

21 There are actually infinitely many choices for the primitive unit pixel, but if we require that subpixels cannot be split over adjacent unit pixels and we only consider subpixels within a certain distance (usually meaning that the unit pixel must be connected), the number of primitive unit pixels is easily countable.
22 This is called a factorable or separable lattice in lattice theory.
23 The Delta-Nabla arrangement is also known as Delta Triad, or Triad.
24 It is hexagonal if u_x/u_y = ½√3, and diagonal if u_x/u_y = ½.

Figure 3.27 a) The Delta-Nabla subpixel arrangement. Two choices for the unit pixel and corresponding lattice are shown: b) primitive lattice and unit pixel, c) rectangular lattice and unit pixel. (Color version in Figure 3.30.)

For DN, the pixel can be chosen equal to the unit pixel, i.e. u_x = p_x and u_y = p_y. However, measuring the display resolution in terms of number of pixels can be confusing for SPAs with non-rectangular lattices^25. This is because the lattice does not repeat with a horizontal and vertical pitch that directly corresponds to the pixel size. The number of pixels on the display is better found via the unit pixel size, which is given by the determinant |L| of the lattice basis. From now on we will choose the (unit) pixel pitch such that the lattice basis contains only simple (rational) multiples of u_x, u_y, p_x and p_y. In general, p_x · p_y will correspond to the rectangular full color pixel size.

The fact that the DN-lattice is non-rectangular^26 also makes the relation between the matrix index and the pixel index complicated. This is not a problem for the analysis of many properties of the SPA, but for application in a matrix display the pixel lattice must be aligned with the matrix lattice. This means that the SPA must be described by a rectangular lattice, i.e. by a diagonal basis like in Equation 3.59. In general, a lattice can be described by a diagonal basis if a non-primitive unit cell is chosen. Figure 3.27c shows how this works for Delta-Nabla, and also explains the origin of its name. We take a non-primitive unit pixel, consisting of a Delta and a Nabla of RGB subpixels, which has a diagonal basis^27

    L_dnr = \begin{bmatrix} 2p_x & 0 \\ 0 & p_y \end{bmatrix} = \begin{bmatrix} u_x & 0 \\ 0 & u_y \end{bmatrix},    (3.61)

where p_x and p_y are the same as in Figure 3.27b, but the unit pixel (including u_x and u_y) is chosen differently. The arrangement can be described as two horizontally interleaved rectangular lattices, each with a different pixel unit. The RGB subpixels in the Nabla units are vertically offset relative to those in the Delta units by [½p_y, ½p_y, ½p_y]. Any SPA with a basis containing only rational multiples of u_x and u_y can be described with a rectangular lattice [98], by choosing a non-primitive unit pixel^28. This non-primitive unit pixel will contain an integer multiple m of the subpixels in the primitive unit^29.

25 See footnote 31 for another example.
26 It is hexagonal if p_x/p_y = ½√3, and diagonal if p_x/p_y = ½.
27 This is the densest factorable sublattice [98] of L_dn.

Figure 3.28 PenTile subpixel arrangements: a) basic PT0, b) simplified layout PT1, c,d) corresponding units and lattices. (Color version of (b) in Figure 3.30.)

3.6.3 PenTile subpixel arrangements

Any combination of a primitive unit pixel and lattice can be used to build a color matrix display, as long as at least three primary colors are used, and the lattice can be described by a rectangular lattice of possibly non-primitive unit pixels. Some interesting alternatives are presented in this section, namely the PenTile SPAs introduced by Clairvoyante Labs [20, 21, 23]. The original idea behind the PenTile SPAs is to use unit pixels containing unequal numbers of subpixels for each color. Two PenTile layouts are shown in Figure 3.28. The SPA that was introduced first, which we call PT0, has half the number of blue subpixels, and red and green arranged in a quincunx pattern. It is described by a square lattice

    L_pt0 = \begin{bmatrix} u_x & 0 \\ 0 & u_y \end{bmatrix} = \begin{bmatrix} p_x & 0 \\ 0 & p_y \end{bmatrix}    (3.62)

and a unit pixel as indicated in the figure. The PT0 SPA contains 5 subpixels per unit pixel (hence the name PenTile). According to its inventors [21], it should be better matched to the HVS characteristics than other SPAs, because the HVS sensitivity for blue is much lower than for red and green. This would allow a PenTile display to reduce the number of subpixels, and therefore also reduce the number of drivers, without sacrificing resolution.

28 In this context, a suitable choice of u_x, u_y will simplify matters considerably.
29 The number m is equal to the index of the rectangular lattice in the primitive lattice.

Figure 3.29 a) PenTile layout 6, PT6, b) primitive unit pixel and lattice, c) rectangular unit pixel and lattice. (Color version in Figure 3.30.)

The number of column drivers per unit pixel can be as low as 2½^30. This is possible by connecting the blue subpixels in two adjacent unit pixels to the same column drivers and different rows.

Since the PT0 layout turns out to be difficult to manufacture, the alternative PT1 layout was introduced, with the same lattice as PT0, but a slightly different unit pixel. The subpixel apertures are square, and the large blue subpixel can be seen as a single subpixel twice the size of red and green, or as two separate blue subpixels. In the latter case, i.e. when all the connections in the matrix are present, the unit pixel contains 6 subpixels, and can be connected by 3 drivers and two rows.

Figure 3.29 shows an SPA that is another variation on the PenTile principle [22], which will be called PT6. It is found by also halving the number of red subpixels relative to the green. This is not so easily justified by HVS characteristics, but it puts the green pixels in a rectangular lattice, which simplifies many aspects of subpixel rendering (Section 4.6). The primitive unit pixel contains 2 green, a red and a blue subpixel, and has a quincunx lattice^31:

    L_pt6 = \begin{bmatrix} \frac{1}{2}u_x & 0 \\ \frac{1}{2}u_y & u_y \end{bmatrix} = \begin{bmatrix} p_x & 0 \\ p_y & 2p_y \end{bmatrix}    (3.63)

The rectangular unit pixel is found with [m_x, m_y] = [2, 1]. It contains 8 subpixels, and is connected with 4 drivers and two rows:

    L_pt6r = \begin{bmatrix} u_x & 0 \\ 0 & u_y \end{bmatrix} = \begin{bmatrix} 2p_x & 0 \\ 0 & 2p_y \end{bmatrix}    (3.64)

3.6.4 Parameters for comparison of subpixel arrangements

To compare the resolution of displays using different SPAs, we must try to keep other factors as equal as possible ('invariables'). Particularly when SPAs use different relative numbers of subpixels, we must carefully choose display characteristics such as the horizontal and vertical pixel pitch.

30 This is equivalent to using a double pixel unit with 10 subpixels, 2 rows and 5 column drivers, i.e. 2½ drivers per original unit pixel.
31 We choose the pixel pitch equal to half the unit pitch, which does not correspond to a full color pixel. For the PT6 arrangement it is nevertheless simplest to count pixels according to the green subpixels.

We can choose various invariables, such as an equal number of subpixels per (unit area of the) display, or the total number of drivers in the display, or even some cost function of the number of row and column drivers and subpixels. Note that all the unit pixels used from here on correspond to a rectangular lattice.

Equal number of subpixels per area

Comparing displays with an equal number of subpixels is both a simple and a fair comparison, since the number of subpixels is a main indicator of the cost and complexity of a display. We therefore choose the relative size of the unit pixel pitches u_x and u_y, as defined relative to the lattice for each SPA in the previous section, such that the number of subpixels per unit area equals that of the reference VS display. The number of subpixels per area, S_A, relates to the number of subpixels per unit, S_U, the number of units per area, U_A, and the unit pixel pitches:

    S_A = S_U · U_A    (3.65)

    U_A = 1 / (u_x u_y)    (3.66)

So, for SPA n to have equal subpixels per unit area to the reference SPA r:

    S_A^n = S_A^r  ⟹  S_U^r / (u_x u_y)^r = S_U^n / (u_x u_y)^n    (3.67)

    (u_x u_y)^n = (S_U^n / S_U^r) (u_x u_y)^r    (3.68)

With the unit aspect ratio AR_U = u_x/u_y, this can be related to the one-dimensional (horizontal) pixel pitch:

    u_x^n = u_x^r √( (S_U^n / S_U^r) · (AR_U^n / AR_U^r) )    (3.69)

This reflects the freedom to choose the pixel aspect ratio, AR_p = p_x/p_y, regardless of the number of pixels of the display or the display aspect ratio. In general, the pixel aspect ratio is chosen such that the horizontal and vertical resolution are balanced. Figure 3.30 shows the four SPAs with equal subpixels per area, and Table 3.1 shows the corresponding unit parameters (with VS as a reference). We will use these parameters in the analysis in the following sections.
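Equations 3.68 and 3.69 are easy to apply in practice; a minimal sketch (Python; the example values below are hypothetical, not the entries of Table 3.1):

    import math

    def unit_pitch_equal_subpixels(u_x_r, S_U_r, AR_U_r, S_U_n, AR_U_n):
        # Equation 3.69: horizontal unit pitch of SPA 'n' that gives the
        # same number of subpixels per unit display area as SPA 'r'.
        return u_x_r * math.sqrt((S_U_n / S_U_r) * (AR_U_n / AR_U_r))

    # Hypothetical example: a unit with twice the subpixels and twice the
    # aspect ratio of the reference must have twice its horizontal pitch.
    print(unit_pitch_equal_subpixels(1.0, 3, 1.0, 6, 2.0))  # 2.0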

Figure 3.30 SPAs with equal subpixels per unit area. From left to right: VS, DN, PT1 and PT6.

Table 3.1 Parameters of several SPAs, using rectangular unit pixels, compared under the equal subpixels per area condition (columns: VS, DN, PT1, PT6; rows: subpixels per unit pixel S_U, unit pixel aspect ratio AR_U, unit pitch u_x for equal S_A to VS, pixel pitch p_x for equal S_A to VS).

Equal number of drivers

In this example, we choose the pixel pitch such that the total number of drivers, i.e. rows plus columns, is equal for all displays^32. The total number of drivers per display, DR_D, is related to the number of drivers per unit, D_U horizontally and R_U vertically, and the number of units per display, U = [U_x, U_y]:

    DR_D = U_x · D_U + U_y · R_U    (3.70)

DR_D can also be related to U_x only, via the display aspect ratio AR and the unit pixel aspect ratio AR_U:

    DR_D = U_x (D_U + R_U · AR_U / AR)    (3.71)

So, for a display with SPA n to have equal drivers to the reference display r:

    U_x^n = U_x^r · (D_U^r · AR + R_U^r · AR_U^r) / (D_U^n · AR + R_U^n · AR_U^n)    (3.72)

32 In practice, column drivers are more expensive than row drivers. On the other hand, increasing the (relative) number of rows means decreasing the addressing time per row, which again increases cost. Therefore, approximating cost by the number of rows plus columns is both simple and reasonably accurate.

Figure 3.31 SPAs with equal drivers per display. From left to right: VS, DN, PT1 and PT6.

And, with equal display size and aspect ratio, W^r = W^n and U_x^r u_x^r = U_x^n u_x^n, so that

    u_x^n = u_x^r · (D_U^n · AR + R_U^n · AR_U^n) / (D_U^r · AR + R_U^r · AR_U^r),    (3.73)

    p_x^n = p_x^r · (m_x^r (D_U^n · AR + R_U^n · AR_U^n)) / (m_x^n (D_U^r · AR + R_U^r · AR_U^r))    (3.74)

The unit pixel pitch for equal drivers depends on the display aspect ratio, which was not the case for the equal subpixels per area condition. Since this dependency is only a few percent for aspect ratios near 4:3, and the equal driver condition has a somewhat arbitrary equal weighting of rows and columns, we shall use AR = 1 as an average condition^33. Nevertheless, the number of drivers is an important design parameter for a display. In cases where the aspect ratio and the cost of row and column drivers are known, the number of pixels on the display can be optimized, even though the cost is the same, by choosing an appropriate display orientation^34 and electrode connections (select/data electrode exchange per unit pixel).

Figure 3.31 shows the four SPAs with equal drivers per display, as obtained from the unit parameters listed in Table 3.2 (i.e. with VS as a reference). The equal driver condition is favorable with respect to S_A for the PenTile displays: the PT1 SPA has 28% more subpixels per area than the VS SPA in the equal driver condition^35.

3.7 Resolution of 2D subpixel arrangements

The main reason for introducing alternative pixel arrangements is that these can achieve a higher perceived resolution than the vertical stripe arrangement at comparable cost, i.e. the same number of subpixels or drivers. However, as we shall see, this can only be true when the subpixel positions inside each pixel are taken into account by video processing before the display addressing process: subpixel rendering or subpixel addressing. This processing is an important function in any display system that claims to increase resolution by means of an alternative SPA.

33 AR = 1 is the average when there is an equal chance that a display is used in portrait and in landscape mode.
34 With fixed orientation, this can also mean that the row and data drivers are exchanged, i.e. arranged horizontally and vertically, respectively.
35 This is caused by a more balanced distribution of data and row drivers, partly an artifact of the chosen equal weighting of rows and columns. If we take DN with D_U = 3 and R_U = 2, the relative S_A can be increased even further.

Table 3.2 Parameters of several SPAs, using rectangular unit pixels, compared under the equal drivers condition (columns: VS, DN, PT1, PT6; rows: data drivers per unit pixel D_U, row drivers per unit pixel R_U, unit pixel aspect ratio AR_U, unit pixel pitch u_x for equal DR_D to VS, horizontal pixel/unit factor m_x, pixel pitch p_x for equal DR_D to VS, (relative) subpixels per area S_A).

3.7.1 2D subpixel arrangements with pixel sampling

When pixel sampling is used, alternative SPAs have no resolution gain over VS, and sometimes resolution is even reduced. To illustrate this, Figure 3.32 shows the zoneplate image on a DN display with equal subpixels per area as the VS (Figure 3.24). This corresponds to using the Delta-Nabla addressing scheme (Section 3.6.2) with RGB signals that are sampled at the center of each delta or nabla pixel. Figure 3.32 shows that the DN display does not have more resolution than the VS: the repeats are at the same location as for the vertical stripe arrangement. Although DN has a less visible pixel structure than VS, there are actually more artifacts, such as jagged edges and magenta-green discoloration around the Nyquist limit. To exploit the extra resolution that the Delta-Nabla arrangement offers, we have to use subpixel sampling, as shown in the next subsection.

For the PenTile arrangements, the situation is a little more complicated. The SPAs are not easily associated with a full color pixel, and this already indicates that they are designed to be combined with SPA-specific addressing. When we do force a choice for a full color pixel, we directly see that this does not increase resolution. For example, for the PT1 SPA, the unit pixel (Figure 3.28b) can be used as a full color pixel, and Figure 3.33 shows the result of sampling according to this pixel choice on the zoneplate image. Resolution is much lower, i.e. the repeat frequencies are lower, than for the VS with equal subpixels per area.

Figure 3.32 a) Zoneplate on a Delta-Nabla display with pixel sampling, b) zoomed in on the horizontal Nyquist frequency (in between the vertical center repeats), c) zoomed in on DC (center of the zoneplate) to show the pixel structure, d) same as (c) but for the VS SPA.

Figure 3.33 a) Zoneplate on a PT1 display with pixel sampling, b) the same zoneplate on a VS display with pixel sampling, with equal subpixels per area.

Figure 3.34 Zoneplate on a Delta-Nabla display with subpixel sampling. Compared to the VS-SPA, the spectrum repeats move to a diagonal high frequency, which increases the vertical resolution. Zoomed in on vertical high frequencies: b) DN, c) VS.

This is a direct consequence of having more than one subpixel per color in the unit pixel ('redundant' red and green subpixels): each subpixel should be driven with its own signal to profit from the extra resolution. The PT6 SPA with pixel sampling shows similar behavior, but since it is even less straightforward to choose a pixel (containing all the colors) for PT6 than for PT1, we shall not discuss this here.

3.7.2 Subpixel sampling on 2D subpixel arrangements

As with the vertical stripe SPA, we can also take the subpixel positions into account in the sampling process for alternative SPAs: subpixel sampling. As we will see in the following examples, the general effect of subpixel sampling is the same: the lower repeat spectra are shifted from the luminance to the chrominance, giving less distortion inside, and reduced aliasing outside, the baseband. The main difference with the VS SPA is the position of the repeat spectra, i.e. their two-dimensional frequency, and their amplitude, i.e. their (complex-valued) YUV-YUV cross-talk matrix Φ.

Figure 3.34 shows the result of subpixel sampling on the zoneplate image for the DN SPA. Compared to the pixel sampled image in Figure 3.32, the spectrum repeats in the vertical direction have moved to a diagonal frequency. This is a consequence of the pixel lattice (Equation 3.60). The frequencies of the spectrum repeats are given by the reciprocal lattice [32, 25]:

    L*_dn = (L_dn)^{-1} = \begin{bmatrix} 1/p_x & 1/(2p_x) \\ 0 & 1/p_y \end{bmatrix}    (3.75)

L*_dn shows that there are repeats at the horizontal sampling frequency f_sx = 1/p_x, and at the diagonal frequency f = [1/(2p_x), 1/p_y] = [½f_sx, f_sy]. The fact that the vertical repeat has moved corresponds to a higher vertical resolution of the display.

Note that, for DN, the subpixel sampling is actually a combination of sampling according to the quincunx pattern of each color, and sampling according to the phase difference between RGB in each pixel. In fact, we can describe DN with only horizontal R/B displacements relative to G; the 2D subpixel pattern really results from the quincunx sampling per color. A large part of the increased resolution comes from sampling according to the quincunx pattern, which shifts the repeat position from vertical to diagonal frequencies; the repeat position is determined by the lattice of the (primitive) unit pixel. The subpixel sampling accounts for the shifting of the repeats from luminance to color. Both effects are covered by the principle of subpixel sampling: sample each subpixel at its displayed position.

Figures 3.35 and 3.36 show the subpixel sampled zoneplate for the PT1 and PT6 PenTile SPAs, respectively. The frequency range of the zoneplate is the same as in the VS and DN examples, but because we use an equal number of subpixels per area (see also Appendix C), the pixel pitch is different, and therefore the aliasing occurs at different frequencies. As indicated in Section 3.6.3, there are basically two different sampling schemes for PT1: with one or two blue samples per unit pixel.

Figure 3.35 Zoneplate on a PT1 display. The sampling phases for each color are shown below each figure: a) one blue sample per unit pixel, b) two blue samples per unit pixel.

Figure 3.36 Zoneplate on a PT6 display.

In the first case, both blue subpixels in the unit are regarded as one big subpixel (like PT0), and addressed with the same value. The two-blue subpixel case was used to calculate the subpixels per area in Section 3.6.4. The one-blue sampling is shown in Figure 3.35 to illustrate the PenTile effect as intended with the PT0 SPA, i.e. that blue has a different sampling grid and a lower pixel density than red and green.

Figures 3.35 and 3.36 show that, just as for VS and DN, also for PT1 and PT6 the lower order repeats have shifted from luminance to chrominance. Not all primary colors have the same sampling lattice, so each primary color will have different repeat positions.

The repeats are positioned at integer multiples of the reciprocal lattice frequencies of the unit pixel, but the repeat colors vary depending on the number of subpixels and their positions inside the unit pixel. In fact, the repeat positions reflect the sampling schemes, rather than the subpixel arrangement. For all examples used up to here, the sampling scheme was identical to the SPA (D_a(x) = D_s(x), i.e. D_s(x)D_a(x) = 1). Note that, when a unit pixel contains more than three subpixels, the corresponding delay matrix will also have more rows and columns: one for each subpixel in the unit.

The color and position of the repeat spectra in the subpixel sampled zoneplate correspond directly to the unit pixel. The color of the repeat is determined by the phase difference between RGB, as given by D_s^f(f) at the repeat frequency. For example, a phase difference of 120 degrees between R, G and B, as with the repeats in the VS and DN SPAs, will generate all possible hues (red, yellow, green, cyan, etc.) depending on the phase of the image signal. With PT6, the phase difference of the diagonal repeats is 180 degrees between R and B, while G has no repeat. The G signal will therefore have average intensity (combining the normal black and white modulation of the zoneplate). Combined with the 180 degree phase difference between R and B, this gives an orange-cyan color (RGB = [1 ½ 0] and [0 ½ 1], respectively). However, for the one-blue PT1 the sampling is slightly different than for the SPA, and the repeats actually correspond to the unit pixel of the PT0 SPA^36.

In the one-blue PT1 case, the resolution for blue is half that of red and green. The two-blue PT1 has different repeat positions for blue (square grid at double vertical density) and red/green (quincunx grid at double density), but their densities in pixels per area are the same. However, the number of blue samples has a very limited influence on the resolution, because the HVS is very insensitive to blue. Nevertheless, the blue sample density determines where and how severe the color errors (due to aliasing in the blue component) occur, so it does have an influence on the resolution when we consider filtering in Chapter 4.

3.7.3 Resolution comparison of different SPAs

We can now compare the resolution of different SPAs by identifying the area in the spectrum where the input signal is displayed without perceived distortions. This is much like defining a Kell factor in 2D. However, there are two reasons why this definition of 'without perceived distortions' is not trivial.

First, there is no objective measure for perceived distortions, in particular related to the difference in visibility between (low frequency) color artifacts and (high frequency) luminance artifacts. Furthermore, the perception of images is strongly related to the viewing distance. Especially the subtle balance between color errors and luminance detail that we encounter in subpixel sampling changes with viewing distance [29].

Second, as mentioned in Section 3.5.7, the process of subpixel sampling must be accompanied by suitable signal pre-processing (filtering) to prevent the worst artifacts.

36 The difference between one-blue PT1 and PT0 is the pixel structure, which would not be visible on the low-resolution zoneplate image anyway.

Figure 3.37 The procedure used to estimate the Kell area. a) The outline, given at the start of the experiment, was adjusted by means of the control points (indicated by the arrow). b) Example of a finally drawn outline.

The perceived distortions will also depend on this filtering. Nevertheless, we can use the 2D Kell factor, i.e. the Kell area, to estimate the resolution of several SPAs, and indicate their differences. In order to do this, we simulated subpixel sampled zoneplates on the VS, DN, PT1 and PT6 SPAs, and performed a subjective evaluation based on 8 participants. The participants were asked to adjust an outline on the simulated image until they felt that the baseband signal ('the circles in the center', i.e. the original signal) was as visible as any distortions ('the other circles', i.e. the repeat spectra and other sampling and reconstruction artifacts). Figure 3.37 shows the procedure: in (a) the outline at the start, in (b) an outline as drawn by one of the participants. Although the task seemed difficult to most participants at first, they quickly understood how to perform it. Of course, there remained a certain subjective judgment about when the distortions were equally strong as the original signal, especially when the distortions have non-zero chrominance, but this was the purpose of the test in the first place. The viewing distance was approximately 1000 times the (VS) pixel pitch.

Figure 3.38 shows the resulting Kell areas for several SPAs. In the figure, the simulated zoneplates are multiplied with the Kell area value, averaged over all participants. This value is obtained as follows: for each participant, the Kell area is represented by a value of zero outside, and one inside, the area. The average, therefore, is a value between zero and one. This value, and therefore also the amplitude of the simulated images in Figure 3.38, indicates the probability that a viewer will judge the signal to be inside the Kell area. Table 3.3 shows the size of the averaged Kell areas, relative to the VS baseband.

Figure 3.38 and Table 3.3 directly show that the Kell area varies substantially, both in shape and in area, between different SPAs. We further observe, when comparing the areas with pixel sampling and subpixel sampling (Figure 3.38a-d), that subpixel sampling increases the Kell area substantially: by 20% for VS, and by 27% for DN. With subpixel sampling, the Kell area for DN is 7% larger than for VS. The DN shape is very different, giving more resolution in the vertical direction^37.

37 The Kell area also shows that horizontal and vertical resolution are not fully balanced in DN; the unit pixel aspect ratio should be changed slightly to achieve this.

Figure 3.38 The 2D Kell factors for different SPAs, as obtained from a subjective test. Shown are the simulated zoneplates, multiplied by the average Kell area over all participants, for several SPAs (equal subpixels per area, subpixel sampled and unfiltered unless indicated): a) VS pixel sampled, b) DN pixel sampled, c) VS, d) DN, e) PT1, f) PT6, g) PT1 filtered, h) PT6 filtered. The lines indicate the baseband of the VS SPA.

Also, we see that the shape of the VS area gives a much higher resolution in the diagonal direction than in the horizontal and vertical directions, even higher than the factor of √2 that can be expected from the square baseband shape, and this effect is even more pronounced in the subpixel sampled VS. To our knowledge, this has not been observed before. The experiment shows that diagonal frequencies on VS suffer less from distortions due to imperfect reconstruction, i.e. the square pixel structure seems to interfere more with horizontal and vertical frequencies. This effect is also present in pixel sampled DN, showing that it effectively also has a more or less square (AC) pixel structure. The effect also shows that, although we regard the VS as a 1D SPA, there can still be effects in the 2D spectrum that are not easily explained in 1D. Furthermore, the effect explains why the diagonal lines in the basic example of Figure 3.15 show so much improvement: the VS SPA with subpixel sampling simply has a high resolution in that direction.

The experiment also indicates an absolute value of the Kell area for pixel sampled VS of 0.85, corresponding to a square baseband with a 1D Kell factor of about 0.92. This number is higher than typical 1D (CRT) Kell factors of 0.7, and the Kell factor increases to well over 1 for 2D subpixel sampled SPAs. Although this seems to verify the advantage of FPDs over CRTs, we expect that our experimental procedure gives a slight overestimation of the Kell factor. This is due to the fact that viewers are able to follow the circles in the zoneplate, which makes them better visible in areas (at frequencies) that do suffer from strong distortions. These frequencies would probably have been judged outside the Kell area if they had been shown separately, i.e. as an image with only a single frequency. Measuring the 2D Kell factor in that way should therefore be more accurate, and also correspond better to earlier 1D Kell factor measurement procedures^38, but it would also take considerably more time. From this experiment, we should therefore be careful to draw conclusions on the absolute value of the Kell area, and use the Kell area as a tool to investigate relative differences in shape and size between the SPAs. A more accurate estimation of the 2D Kell factor for different displays is an interesting topic for further research.

When comparing the Kell areas of DN and VS with those of the PenTile SPAs (PT1 and PT6, Figure 3.38e,f), it is striking to see that the latter actually perform worse. For PT1, the area is 0.81, compared to 1.02 (VS) and 1.07 (DN). Whereas PT1 should provide optimal resolution, its Kell area is even smaller than that of pixel sampled VS. For PT6, the Kell area is also lower than for VS and DN, but at least higher than for pixel sampled VS.

The reason for this is that PT1 and PT6 have unequal sampling rates for each color. Take the example of PT1, where the blue component has the fewest subpixels. As indicated in the previous sections, this means that the blue signal will cause aliasing at relatively low frequencies, which leads to very visible (yellow/blue) color artifacts. These artifacts reduce the Kell area significantly, while red and green still show considerable modulation depth. Therefore, it is expected that the Kell area for PT1 can be increased by filtering the blue signal (reducing alias), while leaving red and green unchanged.

38 Nevertheless, Kell factor measurements have never been very accurate, and there is no standardized measurement method.

SPA            Kell area
VS pixel       0.86
DN pixel       0.85
VS             1.02
DN             1.09
PT1            0.81
PT6
PT1 filtered   0.95
PT6 filtered   1.10

Table 3.3 Kell area (2D Kell factors), relative to the VS baseband, for different SPAs.

A similar argument holds for PT6, where red and blue cause aliasing at lower frequencies than green. Subpixel filtering is the topic of Chapter 4. Without going into details in this section, we also included a filtered version of PT1 and PT6 in the experiment. The filters are the basic 5-tap filters that will be further discussed in Section 4.6, which eliminate the most severe color errors. The Kell areas of the filtered PT1 and PT6 are shown in Figure 3.38g,h and have sizes of 0.95 and 1.10, respectively. This shows that the filtered PT1 Kell area is higher than that of pixel sampled VS, but not higher than that of subpixel sampled VS. A better performance is found with PT6, which gives a Kell area comparable to subpixel sampled DN. Also, the shapes of the PT1 and PT6 Kell areas, and to a lesser extent that of DN, are more circular than that of VS, which should be optimal for natural images.

The experiment shows that the Kell area, and therefore also the perceived resolution, depends on the applied signal processing, as indicated in Section 3.5.7. It appears that filtering increases the Kell area. However, the perceived resolution in terms of sharpness and MTF does not necessarily increase with filtering, since the suppression of color artifacts also means reduced modulation of other frequencies^39. We will discuss this further in Chapter 4, after which we can draw a better conclusion on the perceived resolution of different SPAs.

3.8 Conclusions

In this chapter, we have presented an analysis, based on signal processing principles, of the spatial properties of color matrix displays, specifically regarding resolution and addressing that takes into account the color subpixel arrangement.

39 We can actually enlarge the Kell area further by applying filters with stronger suppression, but this will not increase perceived resolution, as explained in Chapter 4.

The perceived resolution of these displays can be increased when the subpixel arrangement is taken into account by means of subpixel sampling, even though the number of subpixels is not changed. This makes color matrix displays quite different from CRTs: resolution in color matrix displays is limited by sampling artifacts, while in color CRTs it is limited by spot size.

In general, subpixel arrangements can be described with a unit pixel and a pixel lattice, where the pixel lattice must be rectangular in order to allow matrix addressing. The perceived resolution of a display, for a fixed number of (sub)pixels, can potentially be increased by introducing subpixel arrangements alternative to the vertical stripe, in particular two-dimensional arrangements. However, these can only profit from this extra resolution if appropriate subpixel sampling is used.

Subpixel sampling can increase the perceived resolution because it shifts aliasing from luminance to chrominance, to which the human visual system is less sensitive. This has a twofold effect on resolution. First, signal frequencies that exceed the Nyquist limit of the display suffer less from aliasing, which allows them to be passed to the display without causing artifacts. Second, frequencies below Nyquist are less affected by distortions due to the fundamentally non-ideal display aperture. This effectively increases the so-called Kell factor for color matrix displays with subpixel sampling, and shows that the perceived resolution increase can also be seen as an increase in sharpness, or MTF, rather than in the ability to display higher frequencies.

Subpixel sampling does not beat the sampling theorem: frequencies above Nyquist still cause aliasing. The fact that this aliasing is shifted from luminance to color has a positive effect near the Nyquist frequency, but causes severe color errors for signal components near the sampling frequency. Therefore, appropriate (pre-)filtering to remove these components before sampling is an essential signal processing function in a display system that applies subpixel sampling.

In this chapter we have estimated perceived resolution, based on the 2D Kell factor, as shown in Figure 3.38 and Table 3.3. Nevertheless, a full evaluation of the resolution increase that is practically attainable in color matrix displays can only be done by taking into account the signal processing, which is the topic of the next chapter.

CHAPTER 4

Subpixel image scaling
Video processing for increased static resolution

This chapter deals with signal processing related to the spatial addressing characteristics, including color synthesis, of matrix displays. We show that taking into account the spatial addressing format and color synthesis method of matrix displays during spatial format conversion, i.e. subpixel image scaling [81, 79], can increase perceived spatial resolution.

The chapter is organized as follows. First, Sections 4.1 and 4.2 introduce the need for, and the basics of, spatial format conversion for matrix displays. Section 4.3 reviews the basic signal processing method required when using subpixel sampling: subpixel rendering. Sections 4.4 and 4.5 discuss the generalization of subpixel rendering to 1D subpixel scaling. Subpixel scaling for 2D subpixel arrangements is discussed in Section 4.6. In Section 4.7 we evaluate the results of subpixel scaling. Finally, Section 4.8 determines the perceived resolution improvement from using subpixel scaling on several 2D subpixel arrangements. Conclusions are drawn in Section 4.9.

Figure 4.1 A CRT (a) can adapt its spatial addressing format to the incoming video; a matrix display (b) cannot.

Figure 4.2 A matrix display requires image processing (scaling) to convert the input video to the number of pixels required by the display addressing.

4.1 The spatial addressing format of FPDs

One of the properties in which color matrix displays (CMDs) behave distinctly differently from CRTs is the spatial addressing (see Figure 4.1). The addressing format of a CRT is variable, because the scanning parameters, i.e. line and field frequency, can be controlled electronically. A matrix display, on the other hand, has a fixed addressing format, because the numbers of rows and columns are hard-wired in the display matrix. Simply put, a CRT can adapt to many addressing formats, while a matrix display cannot. Therefore, a CMD should be addressed with a video signal with a number of samples that exactly matches the number of pixels and lines on the display^1.

The model of the display process from Chapter 3 assumed that the display receives a correctly sampled signal, which was incorporated into the model by sampling a continuous input signal to the right number of samples. However, as stated in Section 3.5.2, a matrix display does not sample the signal. In practice, the display receives a sampled video signal^2. There are many different existing video formats, differing in resolution and/or aspect ratio, which must all be displayed. Consequently, the spatial addressing format of a CMD will usually require a different number of samples than provided in the input signal, so the input must be converted to the display format before the display addressing process (Figure 4.2). This requires signal processing that is called image sample rate conversion (SRC), image scaling, or simply scaling. Since we will discuss many basic aspects of scaling, in order to discuss subpixel scaling in this chapter, we start with a short introduction to scaling in Section 4.2.

1 More specifically: the number of unit pixels multiplied by the number of subpixels per unit pixel on the display.
2 In Chapter 1 it was already shown that all video signals are sampled in at least one spatial dimension.
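For rational conversions, the required factors follow directly from the input and output pixel counts. As a minimal sketch (Python; the display formats below are just examples), the smallest integer up- and downsampling factors K and L used in the SRC scheme of Section 4.2 can be found as:

    from fractions import Fraction

    def src_factors(n_in, n_out):
        # Smallest integers K, L with n_out = (K / L) * n_in,
        # i.e. upsample by K and downsample by L (Figure 4.4).
        ratio = Fraction(n_out, n_in)
        return ratio.numerator, ratio.denominator

    print(src_factors(720, 1366))   # (683, 360): SD width to a WXGA panel
    print(src_factors(1920, 1366))  # (683, 960): HD width to the same panel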

Figure 4.3 Image scaling. b) shows a reference image (80x60 pixels), with a) the image after downscaling (40x30 pixels) and c) the image after upscaling (160x120 pixels). Scaling changes the number of pixels used to represent the image, which is independent of the physical image size (taken constant in this example).

Figure 4.4 The basic scheme of a sample rate converter (SRC), converting a signal from resolution [N_x, N_y] to K/L·[N_x, N_y]. The SRC chain consists of upsampling by a factor K, low-pass filtering H, and downsampling by a factor L.

4.2 Spatial format conversion

An image scaler basically changes the number of pixels that is used to represent the same image information, i.e. it re-samples the image, as shown in Figure 4.3. A scaling operation can increase or decrease the number of pixels: up-scaling and down-scaling, respectively. An image scaler does not control the actual scale of the image, since this depends on the physical size of the display. SRC is therefore a more accurate terminology than scaling, but we will use the latter since it is more common in the context of video processing and displays.

SRC is a well known technique in multirate digital signal processing [33, 120, 46]. It can be described, for rational scaling factors, as a cascade of upsampling (zero insertion), low-pass filtering, and downsampling (decimation), as shown in Figure 4.4. The process of SRC is illustrated in Figure 4.5, showing a signal in the SRC chain. The sampled input signal (sample spacing Δx_in) is upsampled to a higher sample rate by inserting zero samples. This signal is filtered, i.e. interpolated, to approximate the continuous (original) image signal. The filter kernel, H(x), is indicated in Figure 4.5 at two positions. The downsampling process then discards the samples that are not needed, to arrive at the desired resolution (sample spacing Δx_out). However, for downscaling (N_out < N_in, i.e. Δx_out > Δx_in), there are frequencies in the input that cannot be represented in the output signal. If they are not removed, they will cause aliasing, which can introduce undesired artifacts in the signal. In that case another purpose of the SRC filter is the removal of these frequencies: anti-aliasing.
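A direct, non-optimized implementation of the chain in Figure 4.4 can serve as a reference; the following is only a sketch (Python with NumPy; filter length and window are arbitrary choices), with the efficient polyphase form discussed in the next section:

    import numpy as np

    def resample(signal, K, L, taps_per_phase=6):
        # SRC chain of Figure 4.4: upsampling by K (zero insertion),
        # low-pass filtering H, downsampling by L (decimation).
        up = np.zeros(len(signal) * K)
        up[::K] = signal
        # Windowed-sinc low-pass at the upsampled rate; the cut-off is the
        # smaller of the input and output Nyquist frequencies, so the same
        # filter suppresses imaging (upscaling) and aliasing (downscaling).
        fc = 0.5 / max(K, L)
        n = np.arange(-taps_per_phase * K, taps_per_phase * K + 1)
        h = 2 * fc * np.sinc(2 * fc * n) * np.hamming(len(n)) * K
        return np.convolve(up, h, mode='same')[::L]

    out = resample(np.sin(np.arange(100) / 7.0), K=3, L=2)  # 100 -> 150 samples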

Figure 4.5 A signal during sample rate conversion, from top to bottom: input signal (sample spacing Δx_in), after upsampling, after filtering (the filter kernel H is shown at two positions, with different phase H_k), after downsampling (sample spacing Δx_out).

4.2.1 Polyphase scaling filters

Polyphase filters are commonly used as an efficient implementation of a sampling rate converter [120, 33]. In a polyphase filter, samples not used after downsampling are not calculated, and multiplications with zeros from upsampling are omitted, which greatly reduces the number of computations. To create a polyphase upscaler, the upsampling factor K is fixed, and the filter H(n) is decomposed into K polyphase components H_k(n) = H(Kn + k). To calculate one output sample, only one of the phases, H_k, of the filter is used. The filter phase depends on the relative position of the output sample with respect to the input samples. The filter kernels shown in Figure 4.5 show two different phases. Each filter phase H_k only contains the points in the kernel that correspond to the sample positions for that phase. The scaling factor is determined by varying the factor L, i.e. by calculating only the output samples that are needed.

A specific advantage of polyphase filters is that a single design of the interpolation filter H(n) can be used for a wide range of scaling factors. For upscaling, the filter should have a cut-off frequency near f_s/(2K), corresponding to the Nyquist frequency of the input signal. This cut-off frequency presents the main trade-off in scaling: a high cut-off frequency can lead to interpolation (image reconstruction) artifacts, while a low cut-off frequency will blur the image, and therefore reduce resolution.

For downscaling, the cut-off frequency should be near the Nyquist frequency of the output signal to suppress aliasing. To achieve this, a polyphase upscaler can be converted, through the process of network transposition, into a downscaler [27, 68]. In the transposed implementation, the downscaling factor L is fixed, and the upscaling factor K is variable. The transposed structure retains the advantages of the polyphase upscaler, such as efficiency and a single filter design for a wide range of downscaling factors. The filter should have a cut-off frequency near f_s/(2L), which corresponds to the Nyquist frequency of the output signal.

Figure 4.6 The display signal chain with subpixel sampling requires pre-filtering to suppress color (aliasing) artifacts.

The cut-off frequency controls the trade-off between resolution and aliasing artifacts. For downscaling this trade-off is even more critical than for upscaling, because the output image has a lower resolution than the input image, and therefore care must be taken not to reduce resolution further than necessary to prevent artifacts.

4.2.2 Video scaling

SRC theory can be applied to many types of signals, such as video and audio. Video SRC typically uses relatively short filters (4-6 taps per phase). Furthermore, images are two-dimensional signals, which requires the application of SRC in both dimensions. The SRC theory itself is extensible to more dimensions [25, 100, 32], which we shall use in Section 4.6. However, in practical image processing applications, two separate 1D conversions are applied [46]. For downscaling, the input is first scaled horizontally to the output N_x, and then scaled vertically to N_y, and vice versa for upscaling.

4.3 Subpixel rendering

Video scaling is a basic, and well established, video processing technique. However, the situation changes when we consider scaling for color matrix displays (CMDs). In Chapter 3 it was shown that the resolution of a CMD increases when the input signal is sampled according to the subpixel arrangement. Also, it was concluded in Section 3.5.7 that signal pre-filtering is required to prevent color artifacts from spoiling the resolution increase (Figure 4.6). The combination of subpixel sampling and filtering is usually called subpixel rendering. The challenge of subpixel rendering is to prevent the most serious color artifacts, while maintaining the resolution increase from subpixel sampling. This section will first discuss the basic method of subpixel rendering, followed by an analysis of subpixel filtering in Section 4.4. We will extend this to the more general subpixel scaling in Section 4.5, which is extended to 2D subpixel arrangements in Section 4.6.

The analysis in Chapter 3 assumed a continuous input signal that was delayed and sampled as required by the subpixel arrangement. In a practical system, however, the input signal is not continuous. The filtering, delays and (re-)sampling must be applied to a discrete video signal.

Figure 4.7 Two ways to look at the basic subpixel rendering method: a) normal: each subpixel is a (weighted) average of the triple resolution input; b) transposed: each input pixel (a 'virtual pixel') is distributed over three subpixels.

Previous subpixel rendering methods [11, 13, 21, 42] solved this by requiring that the input images have a resolution that corresponds to the subpixel resolution, for example three times the number of pixels in the horizontal direction for the VS-SPA. Using 'displaced sampling' on this signal is possible because the required samples at each subpixel position are available (and a similar argument holds for 'delaying before sampling'). In a sense, the input of a subpixel rendering algorithm is already subpixel sampled, but all samples still have all color components.

There are basically two ways to look at the process of subpixel sampling and filtering that together form subpixel rendering (corresponding to the normal and transposed forms in SRC), as illustrated in Figure 4.7. In the normal form, each display subpixel receives a weighted average of pixels in the high-resolution input image [13, 21, 42]. This is equivalent to the transposed form, where the intensity of each input pixel is divided over a number of subpixels. Color artifacts are prevented when the intensity of each input pixel is divided equally over the surrounding red, green and blue subpixels, because an achromatic (i.e. R = G = B) input pixel then results in equal intensities of red, green and blue.

Each pixel in the high resolution image, and therefore each subpixel on the display, is called a virtual or logical^3 pixel at the position of each primary color subpixel. A virtual pixel can still display a full color by using the neighboring subpixels to produce the other colors. Virtual pixels can also be used to explain the resolution increase, because they allow a more precise positioning of image information than full-color pixels.

3 The term logical pixel has a different meaning in [21] than in [126]. To avoid confusion, we will not use the term logical pixel.

Although virtual pixels are an illustrative way of 'selling' the higher resolution, the actual resolution of the display is not equal to the number of virtual pixels (i.e. the number of subpixels), as discussed in Chapter 3. Moreover, virtual pixels combine the two separate processing steps of subpixel filtering and subpixel sampling (Figure 4.6). Separation of filtering and sampling allows a general analysis as in Chapter 3, and it also allows the generalization of subpixel rendering and image scaling into subpixel scaling, which will be discussed in Section 4.5. The next section first discusses subpixel filtering.

4.4 Subpixel filtering

Aliasing artifacts can appear in any system that (re-)samples a signal. Therefore, the system must, before the sampling process, filter out signal components with frequencies that cause aliasing. So-called anti-alias filtering is a basic technique in digital signal processing theory [33, 120, 130]. With subpixel sampling this has not changed, only now it is aimed at removing the disturbing color errors that arise from aliased components: color anti-aliasing [42].

In [52] it was shown that the discoloration from subpixel sampling disappears when the image moves at a certain velocity. This velocity corresponds to a movement of 1/3 pixel per frame, where color errors disappear because each subpixel receives samples from all phases, so the average color is white. However, we would like to prevent color artifacts also in non-moving image parts, and in image parts that move at other velocities. Filtering, therefore, is an important feature in all subpixel rendering methods.

The ideal anti-alias filter from signal processing theory is a low-pass filter with unity response below, and zero response above, the Nyquist frequency. This ideal filter is not realizable in practice, because the very limited length of video filters results in a rather wide transition band. This requires an approximation of the ideal filter, which can be related to the Kell factor as follows: typically, the filter roll-off can start at the Kell fraction of the Nyquist frequency, because these frequencies cannot be displayed without distortions anyway. By starting the roll-off early, aliasing can be kept minimal in a practical filter. This is, however, not the appropriate choice for a subpixel sampled system, as we will see in this chapter, building on the conclusions from Chapter 3.

Chapter 3 has shown that the worst (color) artifacts are found at the repeat frequencies, so the filter should at least sufficiently suppress these. Furthermore, the cut-off frequency of the anti-alias filter can be higher than the Nyquist limit, because there can be resolvable frequencies there. The higher the filter cut-off frequency is chosen, the more low-frequent the color errors will become, but the more detail will be preserved. Effectively, a trade-off results between color errors and unsharpness. Furthermore, the reduced distortion of frequencies below the Nyquist limit allows the filter response to have a higher amplitude at these frequencies, increasing the perceived resolution in terms of sharpness.
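To make the processing order of Figure 4.6 concrete, a minimal 1D sketch for the vertical stripe SPA (Python with NumPy; the 3-tap kernel is illustrative only, not one of the filters designed in this thesis) filters each color channel of an input at triple horizontal resolution, and then samples each channel at its own subpixel phase:

    import numpy as np

    def subpixel_render_1d(rgb, taps=(0.25, 0.5, 0.25)):
        # rgb: three 1D arrays at triple horizontal resolution, assumed
        # to be linear in light output (i.e. after gamma removal).
        h = np.asarray(taps)                 # color anti-alias filter
        out = []
        for c in range(3):                   # R, G, B
            filtered = np.convolve(rgb[c], h, mode='same')
            out.append(filtered[c::3])       # phases 0, 1/3, 2/3 pixel
        return out                           # one value per subpixel

Omitting the filter (taps = (0, 1, 0)) gives plain subpixel sampling with its color errors; widening the filter trades color errors against sharpness, which is exactly the trade-off discussed above.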

Subpixel filtered displayed image spectrum

To further illustrate the effect of subpixel sampling and filtering, Figure 4.8 shows the resulting horizontal spectrum of an image displayed on a VS display, using this modified anti-alias filtering. This plot differs from the spectrum plots in Chapter 3 (e.g. Figure 3.23) in the spectrum of the image before sampling. In Chapter 3, we used an ideal band-limited spectrum, with no frequencies above Nyquist. In this example, we use a practical band-limited spectrum, as obtained with an anti-alias filter with a finite transition between pass- and stopband. Two filters are shown: a conventional anti-alias filter with cut-off well below Nyquist to create a nearly zero transfer at Nyquist, and a modified filter with cut-off much closer to Nyquist. Because the baseband and repeat spectra are not so nicely separated as in Chapter 3, the contributions of baseband and repeats are plotted separately [159], using drawn and dashed lines, respectively (the total spectrum is the sum of these two).

Figure 4.8 shows that the conventional filtering indeed results in hardly any aliasing, but also in considerable suppression of high baseband frequencies, and in a very small difference between pixel and subpixel sampling (besides the color signal). Note that repeat components above f_s/2 correspond to reconstruction errors, i.e. pixel structure. With extended cut-off filtering, the difference between pixel and subpixel sampling becomes larger. First of all, the attenuation of the high frequencies in the baseband is much stronger with pixel sampling, but the aliasing is also stronger. With subpixel sampling, the opposite occurs: less aliasing and less suppression, at the cost of (high-frequency) color aliasing. Figure 4.9 shows some different filters that can be used to set this trade-off. Carefully choosing the filter response is a crucial part of subpixel rendering. A filter that suppresses frequencies close to the Nyquist limit can even render the subpixel sampling useless (apart from the misconvergence correction), because the largest gains from subpixel sampling are found in this region. Given the subpixel arrangement, subpixel sampling can be applied to any display, but the filter design depends on the specific parameters of each display.

Subpixel filter design

In particular, the visibility of (color) artifacts will depend on the pixel pitch versus viewing distance [29], but also on the brightness of the display and the color points of the primaries. This area still leaves many opportunities for further research, and requires careful (and subjective) optimization to the specific characteristics of every display system where subpixel scaling is applied. Also, filtering should be performed on a signal that is linear with respect to light output, i.e. after removal of the gamma correction. Filtering a non-linear signal can introduce distortions, i.e. higher harmonics, that may alias and show up as color errors. This requires more filter suppression, and therefore reduces the resolution. Figure 4.10 shows the filters from Figure 4.9 applied to the zoneplate image. Compared to the images in Figure 3.24, the color aliasing has been suppressed by the filter.
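In practice, the linear-light requirement means wrapping the filter between a de-gamma and a re-gamma step. A minimal sketch, assuming a simple power-law gamma of 2.2 (a real panel needs its measured transfer curve):

```python
import numpy as np

GAMMA = 2.2  # assumed display gamma; the actual transfer curve is panel-specific

def filter_linear_light(rgb, h):
    """Apply a 1-D horizontal filter h in the linear-light domain.

    rgb: float array (height, width, 3) with gamma-corrected values in [0, 1].
    """
    linear = rgb ** GAMMA                              # remove gamma correction
    out = np.empty_like(linear)
    for c in range(3):                                 # filter each color plane
        out[..., c] = np.apply_along_axis(
            lambda row: np.convolve(row, h, mode="same"), 1, linear[..., c])
    return np.clip(out, 0.0, 1.0) ** (1.0 / GAMMA)     # re-apply gamma for drive values
```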

Figure 4.8: Display spectrum with (sub)pixel sampling and anti-alias filtering: a,c) conventional anti-alias filtering, b,d) extended cut-off anti-alias filtering. Luminance (Y) is shown in (a,b) and chrominance (U) in (c,d). Note that the difference between pixel and subpixel sampling is negligible for f_x < (1/2)f_s in (a), while it is substantial in (b).

The frequency responses of the filters also illustrate that it is difficult to separate the effects of subpixel sampling outside and inside the baseband (see also Section 3.3.2). This is because practical filter responses will always have a finite transition band between pass- and stopband. This means that increasing the amplitude of frequencies just below the Nyquist limit will simultaneously increase the amplitude of aliasing. For this practical reason, the effects of subpixel sampling inside and outside the baseband are strongly related.

Figure 4.9: Filters with different color error vs. resolution trade-offs. a) impulse response (amplitude vs. taps), b) frequency response (amplitude vs. f/f_Nyquist).
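The difference between pixel and subpixel sampling in Figure 4.8 comes down to where each color plane is sampled. A minimal sketch of both samplers for a vertical-stripe panel, operating on an already anti-alias-filtered image on the 3x horizontal grid (function and variable names are illustrative):

```python
import numpy as np

def subpixel_sample(hires, R=3):
    """Downsample a filtered high-resolution image (height, R*width, 3)
    to display pixels, taking each color at its own subpixel position
    (R at offset 0, G at offset 1, B at offset 2 on the fine grid, for
    an RGB vertical-stripe arrangement)."""
    h, w3, _ = hires.shape
    w = w3 // R
    out = np.empty((h, w, 3), hires.dtype)
    for c, phase in enumerate((0, 1, 2)):   # per-color sampling phase
        out[..., c] = hires[:, phase::R, c][:, :w]
    return out

def pixel_sample(hires, R=3):
    """Conventional sampling: all colors taken at the pixel center."""
    return hires[:, R // 2::R, :]
```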

Figure 4.10: Subpixel scaling for a VS display, with different filter color error vs. resolution trade-offs, on the zoneplate image. a) original, b) subpixel sampled without filtering, c,d,e) subpixel sampled and filtered with different filter cut-offs. The lines indicate the Nyquist frequency f_s/2.

4.5 Subpixel image scaling

The input image has to be scaled to match the number of pixels on the display, in order to provide the input to the addressing process (Figure 4.2). Furthermore, subpixel rendering (Figure 4.6) should be used to optimally use the resolution of the display. This section deals with the combination of these two processing functions: subpixel scaling. In other words, the input image is scaled to the display resolution and subpixel structure, such that the display reconstructs the image that the input represents in the best possible way.

The general model of subpixel sampling from Chapter 3 uses a delay and sampling of the continuous signal RGB_c. This can be adapted for non-continuous input signals, i.e. to subpixel scaling, because the (approximated) continuous original signal is present inside the scaler after the filtering (Figure 4.4). There the RGB-dependent delay can be inserted, followed by the (down)sampling. Since polyphase filters provide a very efficient and flexible scaling method, we will also use them for subpixel scaling. The subpixel filter can be included in the scaling filter, which makes the implementation more efficient. Not only does this reduce the number of filter operations, it also prevents the unnecessary blurring that would occur when the subpixel filter is applied after the scaler. In the polyphase scaler, the RGB-dependent delay corresponds to using different filter phases for R, G and B individually, as shown in Figure 4.11. Therefore, subpixel rendering can be included efficiently in an image scaler. The combination of scaler and subpixel renderer also illustrates the use of

the delayed sampling description from Subsection. A natural combination with other processing functions, without affecting the display model, is possible by putting the subpixel-dependent sampling in the processing chain instead of in the display addressing model.

Figure 4.11: Subpixel scaling with a polyphase filter uses three different phases for RGB. a) system diagram, b) filter phases indicated in the sampling process.

Since the signal frequencies that profit most from subpixel scaling are close to the Nyquist limit, the impact of subpixel scaling will be largest when the input signal contains enough energy at these frequencies, in other words, when the display is the limiting factor in the video chain. Subpixel scaling will not add resolution to a low-resolution input, for which only the less visible misconvergence correction remains. Therefore, subpixel sampling is most useful when the display resolution is less than the input resolution, i.e. for downscaling. An additional advantage in the subpixel downscaler is given by the transposed form implementation used in downscaling, which defines the scaling filter relative to the sampling grid of the display. Therefore, the filter can be designed once, according to Section 4.4, and the color error vs. resolution trade-off setting does not depend on the scaling factor. The additional hardware cost of subpixel scaling is low, and there is no need for an input signal at three times the resolution as with subpixel rendering, since subpixel scaling will work with any scaling factor.

Subpixel rendering has now become a part of the scaler, which reflects that the scaler is the video processing component that is responsible for converting the input image to the display resolution. A display panel that has this component is able to receive any input format and optimally display it, without unnecessary loss of resolution. If only the subpixel rendering function is present in the display, the interface between the display panel and the preceding part of the video chain becomes awkward: we need to supply this display with a fixed resolution, but this resolution is higher

than even the perceived resolution of the display. It is therefore not only a potentially inefficient use of bandwidth, but there is also no strong reason to use this particular fixed resolution. A more detailed discussion of this topic involves system architecture, interfaces and even standardization, which we consider outside the scope of this thesis. Nevertheless, this is an important topic which has not yet been solved: there is no industry standard for display interfaces that include subpixel scaling.

Figure 4.12: 2D scaling: from input grid (crosses), via intermediate grid (dots), to output grid (squares).

4.6 2D subpixel rendering and scaling

Subpixel scaling can be generalized for 2D subpixel arrangements. Image scaling is already a 2D operation, but it is usually separated into a horizontal and a vertical scaling step. For the VS-SPA, only the horizontal scaler has to be used in the subpixel mode. Subpixel scaling can be generalized for other subpixel arrangements, but for 2D-SPAs a separable scaler cannot directly be applied. Because 2D scaling is separated into horizontal and vertical scaling, the two-dimensional filter response is also separable. This means that the diagonal response cannot be chosen independently of the horizontal and vertical response. For many 2D SPAs, however, the spectrum repeats are in the diagonal direction (e.g. with DN, see Figure 3.34). The separable 1D scaling would put unnecessary constraints on the horizontal and vertical filter response to assure sufficient suppression of the diagonal frequencies. Non-separable 2D filters are therefore required. [Footnote 4: A filter can operate on a quincunx lattice separably, if the 1-D filters operate along the lattice basis vectors. For image processing, however, scanning in horizontal and vertical direction is preferred, to maintain a simple relation with the image border.]

The full 2D version of the 1D sample-rate converter is called sampling structure conversion [32]. Analogous to the 1D case, the 2D sample grid is first upsampled to a denser grid by inserting zeros. After filtering on this grid, the desired output samples are selected during downsampling. The intermediate grid is chosen such that it contains both the output and the input grid. The output grid can have a different density than the input, but also a different geometry. Take for example the sampling structure conversion from a rectangular lattice to a quincunx lattice of three times lower density, shown in Figure 4.12. This can be performed by first upsampling the input by a factor of 2 horizontally, and then downsampling in quincunx by a factor of 6. Comparable to the 1D sample

rate converter [68], the two-dimensional filter operations can be implemented in a polyphase structure [25]. For subpixel scaling to practical SPAs, we can however use a simpler implementation, which we discuss below. We assume that the input image is sampled on a rectangular grid, which we must convert to the output grid that corresponds to each color in the SPA. The scaling factor, i.e. the ratio of densities of the input and output sampling grids, depends on the input and display formats, and can therefore have almost any value. Particularly for generalized 2D sampling structure conversion, non-integer scaling factors can require awkward, very high density intermediate grids, leading to inefficient implementations.

Figure 4.13: Subpixel scaling chain for general 2D SPAs: input image, scaling (2D), then subpixel rendering (2D filter and subpixel sampling) producing subpixel drive values for the display; from input resolution, via intermediate resolution, to display resolution.

Figure 4.13 shows a simple alternative. We first scale the input image, with a traditional image scaler, to an intermediate grid with a density that is an integer multiple of, i.e. contains, the output subpixel grid. For some SPAs, we can allow a small offset between the intermediate grid points and the actual centers of the subpixels. For example, in PT1 we assume that the R and G subpixels are on a quincunx grid, whereas in reality they are slightly offset, and do not form a simple grid. The remaining filtering and (down)sampling required to get to the output grid is thereby reduced to simple ("single-phase") filters and integer downsampling, for each color at the position corresponding to its subpixel. The 2D stage in the subpixel scaling chain corresponds to subpixel rendering. The combination of the variable horizontal/vertical scaler and the 2D subpixel rendering will be called 2D subpixel scaling (Figure 4.13).

This is illustrated using the Pentile PT1 SPA in Figure 4.14. The inventors of the PT1 SPA have proposed an intermediate grid and corresponding simple 5-tap filter for red and green, and a 4-tap average for blue, also shown in Figure 4.14 [21]. The intermediate grid contains 4 samples per output unit pixel, where each sample corresponds to either a red or a green subpixel. The position of the blue subpixels is assumed in the center of the 4 intermediate pixels. The subsampling from the intermediate sampling grid to the quincunx grid formed by the red or green subpixels requires a filter that suppresses the diagonal repeats at f = [±f_sx, ±f_sy] in the intermediate grid (see also Figure 3.35). The filter in Figure 4.14 is actually the smallest symmetric filter that has zero transfer at these frequencies. [Footnote 5: For these simple geometries, there is a simple way to check whether a filter on the input lattice has zero transfer at all repeats corresponding to integer downsampling to an output lattice: copies of the filter impulse response, repeated at all output grid positions, should sum up to a constant at all input positions.]
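The check described in footnote 5 is easy to automate. A sketch, using an assumed 5-tap quincunx kernel (a plausible "smallest symmetric filter", not necessarily the exact coefficients of Figure 4.14):

```python
import numpy as np

def passes_repeat_null_check(h, offsets):
    """Footnote-5 check: copies of the impulse response h, repeated at all
    output-lattice positions, must sum to a constant at every input position.

    h: 2-D kernel on the input lattice; offsets: two integer (dy, dx)
    generator vectors of the output lattice."""
    h = np.asarray(h)
    size = 4 * max(abs(v) for off in offsets for v in off) + len(h)
    acc = np.zeros((size, size))
    ky, kx = h.shape
    for a in range(-size, size):            # enumerate output-lattice points
        for b in range(-size, size):
            y = a * offsets[0][0] + b * offsets[1][0]
            x = a * offsets[0][1] + b * offsets[1][1]
            if 0 <= y and y + ky <= size and 0 <= x and x + kx <= size:
                acc[y:y + ky, x:x + kx] += h
    center = acc[size // 2 - 1:size // 2 + 1, size // 2 - 1:size // 2 + 1]
    return np.allclose(center, center.flat[0])   # constant in the valid interior?

# Assumed 5-tap quincunx anti-alias kernel; quincunx lattice generated
# by the vectors (1, 1) and (1, -1):
h_quincunx = np.array([[0, 1, 0],
                       [1, 4, 1],
                       [0, 1, 0]]) / 8.0
print(passes_repeat_null_check(h_quincunx, [(1, 1), (1, -1)]))   # True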

Figure 4.14: 2D subpixel scaling on the PT1 SPA. a) intermediate grid and subpixel sampling, b) as (a), per unit pixel, c) filter impulse response (5-tap quincunx filter with coefficients in units of 1/8 for red and green; 4-tap average with coefficients 1/4 for blue), d) filter frequency response.

After filtering the red and green samples at the intermediate grid, the signal is downsampled according to the subpixel positions, to find the value for each subpixel during the addressing process. The blue color is obtained by simply interpolating between 4 intermediate samples. This corresponds to filtering with a 4-tap average and downsampling by a factor of two in both dimensions. The PT1 blue filter also shows that, if required by the SPA, the 2D filtering and downsampling can include interpolation, to create samples at a different position than the intermediate samples. Nevertheless, for 2D filtering that retains as much as possible of the high frequencies in the input, the intermediate grid should have a higher density than the output grid. Therefore, preferably not all subpixel positions in a unit pixel should be interpolated like the blue pixel in PT1, but interpolation can help to reduce the density of the intermediate grid. Alternatively, we can treat the phase shift relative to the intermediate grid for one of the subpixels in the unit in the same way as for 1D subpixel scaling (Section 4.5): the phase shift is included in the phase selection of the polyphase filter of the horizontal and/or vertical scaler that precedes the 2D filter. With an appropriate choice of intermediate grid, the phase shift can be constant over the whole image. The non-separable 2D filter is then only used to suppress diagonal aliasing. The intermediate grid for subpixel rendering will generally have the same periodicity as the unit pixel, i.e. it looks the same for each unit pixel. Therefore, we can describe the subpixel rendering process per unit pixel, by giving the

intermediate grid, the subpixel sampling per color, and the filter per color. For the PT1 subpixel rendering, this is given in Figure 4.14b+c. For non-symmetric filters, i.e. when the filter implements an additional phase shift relative to the sample position, the coefficient at the sample position is also indicated. The numbering of the samples in Figure 4.14b corresponds to the rendering format used by the simulation process described in Appendix C.

Figure 4.15: Zoneplate on PT1. a) subpixel sampled (identical to Figure 3.35a), b) with filtering to suppress color aliasing.

Figure 4.16: Subpixel rendered zoneplate on Pentile PT6 (a) and DN (b) displays.

Figure 4.15 shows the result of the 2D filtering and downsampling as described for the PT1 arrangement, for a zoneplate image. The filtering has removed the color aliasing. The remaining spectrum repeats now correspond to the sampling frequency of the intermediate grid (2[p_x, p_y] in this case). These frequencies should normally be removed before sampling to the intermediate grid. Regarding the PT1 SPA, it must also be noted that the lower density of the blue subpixels makes appropriate gamma correction very important. High frequencies that are suppressed for the blue component can still be present in red and green. Therefore, red and green can have extreme intensities (black/white), while those of blue are average (gray). This is a very sensitive case for gamma correction: using incorrect gamma correction will cause a mismatch between the blue light output and the average light output of red and green, which

results in color errors.

Figure 4.17: Pentile PT6 rendering. a) sampling and intermediate grid, b) filter coefficients (R and B filters identical to the PT1 R and G filters, in units of 1/8; no filter for green).

For the PT6 arrangement, 2D filtering and subsampling similar to that of PT1 can be used, starting from a double-density intermediate grid as shown in Figure 4.17. For PT6, red and blue are subsampled to quincunx and green is unaltered. The quincunx anti-alias filter can be the same 5-tap filter as for PT1. This sampling scheme neglects the offset between R + B and G, as if the green subpixels were at the same position as red and blue. In the filters described in the next section, we also include this offset.

For the DN arrangement, red, green and blue are all subsampled in quincunx, so in principle the same filter as for PT1 can be used. However, R + B are horizontally offset relative to G. A simple filter is no longer sufficient, so as mentioned before, this offset can be achieved with a higher horizontal density intermediate grid. A possible grid that exactly matches the subpixel positions has 6x2 samples per unit pixel. However, we use a smaller horizontal density, with 4x2 samples per unit pixel, as shown in Figure 4.18. This can also achieve an offset between R, B and G, without discarding too much of the resolution that we are trying to preserve. Figure 4.18 also shows a simple filter with 5 taps, but it is slightly different from the PT1 and PT6 filters. The DN grid still needs a half-sample phase shift for R + B, which is accounted for in the filter. An alternative, e.g. for efficient implementations, is to include all the horizontal phase shifts in the scaling step before subpixel rendering. This requires only a factor of 2 quincunx subsampling in the rendering, as with PT1 and PT6. Figure 4.16 shows the zoneplate on PT6 and DN displays, using the filters described above. Clearly, the filters have suppressed the color aliasing, but the resolution is still not optimal, as we show in the next section.

4.6.1 Higher order 2D filtering: sharpening

Although the simple 5-tap quincunx anti-alias filter used in the previous section has the desired suppression of the diagonal repeat spectra, it also shows considerable suppression at baseband frequencies. This is not necessary, and it can be prevented, at the cost of extra line memories, by applying higher order 2D filters. A simple way to construct such a filter is to multiply the 5-tap (3-line) frequency response with a high-frequency boosting filter. This is equivalent to applying a sharpness enhancement ("peaking") filter in cascade with the

anti-alias filter, for example

f = 1 + α · h_hp (4.1)

where f is the peaking filter, formed by adding α times a 3-by-3 high-pass kernel h_hp (with coefficients in units of 1/16) to the identity (all-pass) response, and α is the peaking factor (α = 0 means no extra sharpness). By using this 3-by-3 filter kernel, the filter footprint is increased to 5 lines.

Figure 4.18: DN rendering. a) sampling and intermediate grid, b) filter coefficients (in units of 1/24), c) filter responses. To get R + B horizontally displaced from G, the intermediate grid uses 4x2 samples per unit pixel, and R + B have an extra phase shift in the filter.

Figures 4.19 to 4.21 show sharpened 5-line 2D filters for PT1, PT6 and DN, respectively. Figure 4.22 shows these filters on the zoneplate. Regarding the amount of sharpening, care has been taken to approximate a flat pass-band, i.e. to prevent excessive overshoots in the impulse response, or amplification larger than 1 in the frequency response. Scaling filters should always be conservative in the amount of extra sharpening they apply, since there may be a sharpness enhancement elsewhere in the system. Nevertheless, the amount of sharpening in subpixel rendering can have a significant effect on the final image quality. Therefore, for a proper comparison, subpixel rendering methods should apply a flat filter, meaning the filter response does not significantly amplify the signal at any frequency (filter transfer ≤ 1). Furthermore, part of the sharpening/flattening function can be taken over by the scaler. This allows a reduction of the number of line memories in the non-separable 2D filter, possibly back to the three lines of the basic quincunx filter. The scaler can only influence the flatness of the overall filter in a separable way, i.e. only the horizontal and vertical response can be changed independently, but for some applications this could be sufficient.
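The cascade of Equation (4.1) can be written down directly. A sketch with assumed kernels: the base quincunx filter and the 1/16-unit high-pass below are plausible stand-ins, not the coefficient sets of Figures 4.19 to 4.21.

```python
import numpy as np
from scipy.signal import convolve2d

# Assumed base kernel: a 5-tap quincunx anti-alias filter with zero
# transfer at the diagonal repeats (see footnote 5).
h_base = np.array([[0, 1, 0],
                   [1, 4, 1],
                   [0, 1, 0]]) / 8.0

def sharpened_filter(alpha, h_hp=None):
    """Build the Eq. (4.1)-style cascade (delta + alpha * h_hp) * h_base.
    h_hp is an assumed zero-sum 3x3 high-pass kernel in units of 1/16."""
    if h_hp is None:
        h_hp = np.array([[-1, -2, -1],
                         [-2, 12, -2],
                         [-1, -2, -1]]) / 16.0
    delta = np.zeros((3, 3))
    delta[1, 1] = 1.0
    peaking = delta + alpha * h_hp
    return convolve2d(peaking, h_base)   # 5x5 kernel: a footprint of 5 lines

h_sharp = sharpened_filter(alpha=0.8)
print(h_sharp.shape, h_sharp.sum())      # (5, 5); DC gain stays 1.0
```

Because h_hp sums to zero, the DC gain of the cascade remains 1, so the flat-field response is untouched while the high baseband frequencies are boosted.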

Figure 4.19: 5-line 2D filters for PT1 (panels R+G and B): flat response versions obtained by sharpening the basic 3-line filters.

Figure 4.20: Sharpened filters for PT6 (panel labels R+B, G, no filter), including the phase shift between R + B and G.

Figure 4.21: Sharpened filters for DN (panels R, G and B).
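Applying such a filter is then a matter of 2D filtering on the intermediate grid followed by quincunx subsampling per color. A minimal sketch (the kernel and names are illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def render_quincunx_plane(plane, kernel, parity):
    """One color plane of a 2D subpixel rendering step: filter on the
    intermediate grid, then keep only the quincunx positions whose
    (row + column) parity matches this color's subpixel sites."""
    filtered = convolve2d(plane, kernel, mode="same", boundary="symm")
    mask = (np.add.outer(np.arange(plane.shape[0]),
                         np.arange(plane.shape[1])) % 2) == parity
    return filtered, mask            # drive values are filtered[mask]

# Hypothetical usage on an intermediate-grid image (linear-light values):
rng = np.random.default_rng(0)
intermediate = rng.random((8, 8))
filtered, sites = render_quincunx_plane(intermediate,
                                        kernel=np.array([[0, 1, 0],
                                                         [1, 4, 1],
                                                         [0, 1, 0]]) / 8.0,
                                        parity=0)
drive_values = filtered[sites]       # one value per subpixel of this color
```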

Figure 4.22: Zoneplates scaled with sharpened 2D filters. a) PT1, b) PT6, c) DN.

4.7 Results and discussion

In this section, some examples of subpixel scaling on natural images and graphics are shown. Since there are differences between natural images and graphics, these are discussed in separate sections. The examples are simulations of images on a display, as described in Appendix C.

Natural images

Figure 4.23 shows the results of downscaling a natural image with different scaling methods, on different SPAs: a) VS with pixel scaling, b) VS with pixel scaling (extended cut-off), c) VS with subpixel scaling, d) DN with subpixel scaling, e) PT1 with subpixel scaling, and f) PT6 with subpixel scaling. Let's first compare the different scaling methods on the VS SPA, shown in Figure 4.23a,b,c. Without subpixel scaling, there is a trade-off between sharpness and aliasing/reconstruction errors. Figure 4.23b uses a scaling filter with a higher cut-off frequency than Figure 4.23a. Therefore, Figure 4.23b is sharper than Figure 4.23a, but it also shows increased jaggedness, i.e. image distortion in the form of AC pixel structure and aliasing, e.g. seen in the diagonal ropes. A reduction of these distortions is accompanied by a loss of sharpness, as in Figure 4.23a.

When we turn to subpixel scaling, the improvement can be explained in two ways, depending on the amount of jaggedness and sharpness. First, the comparison of Figure 4.23a and Figure 4.23c shows an increased sharpness, for comparable jaggedness. Although there is no apparent increase in resolution in terms of number of pixels, i.e. of highest reproducible frequency, there is an increase in sharpness, i.e. MTF, or perceived resolution. Second, Figure 4.23b and Figure 4.23c show comparable sharpness, but the subpixel sampled image has reduced jaggedness. This shows that subpixel sampling decreases the pixel-structure related distortions, ultimately allowing a higher MTF/sharpness for the same amount of distortion.

Figure 4.23d,e,f shows the same image, scaled to the three 2D SPAs (DN, PT1 and PT6), with an equal number of subpixels per area. These SPAs also increase the sharpness in the vertical direction, whereas the extra resolution of the vertical stripe arrangement only occurs in the horizontal direction. The figure shows that DN has significantly higher sharpness and resolution. This is not just

a perceived resolution increase, but the DN display is able to show more detail than the VS, especially in the vertical dimension (e.g. in the car). The differences between DN and PT6 are much smaller than between DN and VS. The biggest difference here is that PT6 has more visible pixel structure, especially in the red areas. The difference between DN and PT1 is much bigger, where PT1 shows significantly more pixel structure, caused by the low-frequency blue subpixel structure, which becomes visible because it is much darker than red and green. PT1 resolution is better than VS, but the pixel structure makes it a difficult trade-off: to make the pixel structure disappear, the viewer has to stand so far back that the resolution gain is also not very visible anymore.

Figure 4.23: Subpixel scaling on (detail of) a natural image. a) unsharp & low jagginess, b) sharp & jagginess (e.g. in the diagonal ropes), c) subpixel scaling: sharp & low jagginess, d,e,f) subpixel scaling on the DN, PT1 and PT6 SPAs, respectively.

Text and graphics

Subpixel scaling is also applicable to graphics, notably text. Previous subpixel rendering methods (e.g. ClearType and its variations [13, 42, 1]) were targeted at this category. These methods use oversampled rendering and a fixed subpixel downscaling filter. This is usually referred to as anti-aliasing, but we prefer to look at it as downscaling an oversampled rendered image. Text is particularly suited for improvement using subpixel techniques, not only because it contains very high frequencies [8], but also because traditional, non-subpixel based rendering methods suffer heavily from aliasing. A quality improvement by applying better anti-alias filtering, as in image and video processing, should be expected. This quality improvement, which is also applicable to CRTs [140], is actually only partly caused by the subpixel rendering. Improved rendering at a higher resolution ("oversampling") combined with more advanced anti-aliasing also plays a role. In fact, subpixel rendering methods are now in common use in computer text rendering, for example Microsoft's ClearType [101], or Adobe's version, called CoolType [1]. These implementations are fully integrated into the text rendering engine, so it is actually possible to adapt the rendering, e.g. font line thickness, to the subpixel pattern.

In this chapter, we have combined rendering with video scaling, and in this section we show that it can also improve (oversampled) text. As an advantage, the proposed flexible subpixel scaling eliminates the need for a particular input resolution. Therefore, it can be implemented in the display, independent of the graphics rendering process. Figure 4.24 shows the results of subpixel scaling on a text image. The black and white text was not scaled, but directly rendered to the display resolution, a common practice in computer applications. [Footnote 6: This font rendering is more of an art than a science. Fonts are usually hand-optimized at each font size, to assure that all lines in the font are placed exactly on a pixel [102].] Here, we clearly see the jaggedness that is caused by a total absence of anti-alias filtering, but which does result in very sharp edges. With anti-alias filtering, applied by downscaling oversampled text, the jaggedness decreases but remains clearly visible. This is a direct result of the imperfect reconstruction of the display (see Section 3.3.2). More blurring is required to also suppress the distorted frequencies. This may be no problem on a CRT display that already suffers from blurring by the electron spot, but on a matrix display this is undesirable. With subpixel scaling, the jagginess is much reduced, and we can maintain a high level of sharpness without distortions. Obviously, it can never become as sharp as the black and white image, but this is the price to pay for an undistorted image. Finally, with the Delta-Nabla arrangement, there is an additional improvement in the vertical direction.

Subpixel scaling vs. CRT addressing

As mentioned in Section 3.5.1, color CRTs have always been using subpixel sampling, by virtue of the shadowmask. The application of subpixel sampling/scaling in matrix displays, however, results in a much bigger gain in resolution.

Figure 4.24: Subpixel scaling on text images. a) black and white text, b) with anti-alias filtering, c) with subpixel scaling, d,e,f) with subpixel scaling on the DN, PT1 and PT6 displays, respectively.

Figure 4.25: Comparison between a) a CRT (photo) and b) a subpixel sampled DN display (simulated), with equal subpixels/phosphor dots per area.

This is for two main reasons. First, the intensity of each subpixel in a matrix display can be controlled exactly, whereas the intensity of a phosphor dot in a CRT can only be controlled to the extent permitted by the size of the electron spot, which in practice is not small enough to address single phosphor dots (due to e.g. Moiré). Second, the deflection and focusing systems in CRTs are never perfect, so convergence errors between the RGB beams can limit resolution, even though each beam is subpixel sampled. [Footnote 7: A convergence error equal to the phosphor dot spacing (- for R and + for B) would actually make the CRT pixel sampled.] Both effects combined prohibit giving each phosphor dot on a CRT a unique, optimized intensity, which is possible on matrix displays. Figure 4.25 shows a comparison of an image on a CRT (photo from a Philips 107T4 monitor) and on a DN display (simulated), with an equal number of subpixels/phosphor dots per area.

4.8 Two-dimensional subpixel arrangement comparison

The two Pentile SPAs have been introduced in order to improve the perceived resolution. However, the published resolution claims [21] are based on virtual pixels (Section 4.3), which have no direct relation to resolution. These claims are based on a comparison with the VS SPA, made in two ways, neither of which is a completely fair comparison. The first way uses a VS display with (supposedly) equal resolution, but having many more subpixels per area. For example, the PT1 SPA is claimed to have resolution equal to a VS with double the number of unit pixels in both dimensions. This would mean that a PT1 display has equal resolution to a VS display with 40% fewer drivers and 50% fewer subpixels. However, this claim is based on each PT1 unit pixel having four virtual pixels (counting only red and green subpixels), and on the assumption that virtual pixels represent the same resolution as full color pixels. The results from Chapter 3 (Figure 3.38) show that this is not true. The second comparison uses a VS display with an equal number of unit pixels. This can mean that the Pentile display has a higher resolution, but it also has many more subpixels and drivers per display, so this is not a fair comparison either.

Subjective comparison

In order to evaluate the resolution of alternative SPAs, subjective evaluations were performed using SPAs with an equal number of subpixels per area, and with an equal total number of drivers, as discussed in Section. In total, 5 different SPAs were used: VS as a (lower) reference; DN, PT1 and PT6; and a VS with 50% more subpixels per area and 23% more drivers as an upper reference ("VSH"). For the 2D SPAs, two types of subpixel scaling were used: simple filtering (Section 4.6) and higher order filtering (Section 4.6.1). Displays were simulated (see Appendix C) on an IBM T221 [57] high resolution LCD monitor (22 inch diagonal, 3840x2400 pixels). This display enables

simulation of all SPAs in a line-up, with a sufficiently large simulated display resolution (280x260 pixels for VS), and sufficient pixels per simulated display pixel (4.5x4.5 for VS). The viewing distance was 1.5 meters. This corresponds to 2700 times the pixel pitch (see Section 3.3.4) for the VS simulation, which puts the pixel structure just on the edge of visibility, but allows the resolution improvements of the other arrangements to be within the limits of visibility (according to Section 3.3.4, a resolution improvement of 25% should still be visible). Several types of image content were used, such as natural images and graphics, as illustrated in Figure 4.26. Input images had at least a factor of 2 higher resolution than the display, so (subpixel) downscaling was applied, to make sure that display resolution is the limiting factor in the chain.

Figure 4.26: Images used in the perceptual evaluation.

A total of 797 viewers participated in the experiment, in a between-subject comparison with approximately 40 participants per condition. Viewers were asked to rank displays on image quality (which also includes effects of e.g. pixel structure visibility). Results are plotted on a quality scale, which is based on the z-score [148, 121]. On this relative scale, higher scores correspond to higher quality, and differences are normalized according to their significance. For this test, a difference of 0.5 is considered statistically significant, but perceptually not yet very relevant (about 60% viewer preference). A difference of 1 is also considered perceptually relevant (about 80% viewer preference).

Simple 2D subpixel scaling

Figure 4.27 shows the z-scores, for simple subpixel scaling, collapsed over all images, for the equal subpixels per area (EQA) and the equal drivers (EQD) conditions. In these conditions, no alternative SPA gives significantly better perceived quality than the VS. PT1 even performs significantly worse than VS in

the EQA condition. From EQA to EQD, PT1 and PT6 have improved slightly, which can be attributed to their slightly higher number of subpixels per area compared to VS and DN (see Table 3.2). Nevertheless, none is better than the upper VS reference (VSH).

Figure 4.27: Relative subjective quality scores, collapsed across all images, a) for the equal subpixels per area (EQA) condition and b) for the equal drivers per display (EQD) condition.

Figure 4.28: Relative subjective quality scores, collapsed across all images, for the equal drivers per display condition with higher order filtering (EQDS).

Flat response 2D subpixel scaling

In order to fully profit from the extra resolution inherent to each SPA, subpixel scaling with better filters must be used, i.e. with a flatter response as described in Section 4.6.1. This is shown in Figure 4.28, for the equal drivers condition. Now, PT6 and DN give quality equal to the VSH, and significantly higher quality than VS. PT1 is slightly worse than VSH, but also slightly better than VS. Therefore, PT6 gives approximately equal resolution to a VS with 23% more drivers, which corresponds to 27% more subpixels per area. The resolution increase of PT1 is even less, and nowhere near what is claimed [19]. The DN SPA gives approximately equal resolution to a VS with 23% more drivers, which corresponds to a VS with 50% more subpixels per area.

Dependence on image content

These results are qualitatively the same for natural images and text/graphics, as shown in Figure 4.29 for the "nets" and "text" images. The other images show similar results. Nevertheless, differences are smaller for natural images than for graphics/text. This can be explained by the relatively high amount of high spatial frequencies present in graphics/text, which makes resolution differences more visible.

Figure 4.29: Relative subjective quality scores, for the images "nets" (a) and "text" (b), for the EQDS condition.

Virtual pixel comparison

Finally, the text image was used in a condition with equal units (EQU), shown in Figure 4.30. This condition directly tests the claims that PT1 and PT6 have a resolution equal to four times the number of unit pixels (i.e. that "virtual pixels are real pixels"). The relative sizes of the unit pixels are shown in Figure 4.30a. The text image was scaled to the VS resolution, and not scaled after that (hence the "f0" label). This image was displayed directly (pixel-mapped) on VS and DN. On PT1 and PT6, subpixel rendering was used, since the image matched the intermediate resolution for PT1 and PT6. As a reference, the PT1 and PT6 EQDS conditions were added (see Figure 4.28), to see how their resolution compares to the EQU condition in the fairer, equal drivers condition.

Figure 4.30: Comparison using the equal unit (EQU) condition. a) relative unit pixel sizes, b) relative subjective quality scores. EQU conditions are labeled with "f0". PT1 and PT6 are added as a reference, using the EQDS condition.

Figure 4.30b shows the quality scores for the EQU condition. The quality of PT1 and PT6 in the EQU condition is significantly lower than the reference VS, so clearly the virtual resolution is not real. Also, the quality of PT1 and PT6 with EQDS (i.e. with subpixel scaling) is lower than VS, contrary to the previous conditions. This is because the input image was scaled to the VS resolution. To get from this resolution to the PT1 and PT6 resolution, (up-)scaling is required,

which cannot increase the quality beyond that of the input image. Also, the quality of VS and DN is equal, reflecting that DN does not increase resolution, but apparently also does not decrease resolution, when pixel scaling ("Delta-Nabla addressing") is used.

4.9 Conclusions

Subpixel image scaling is an attractive method to exploit the perceived resolution increase obtainable from the subpixel structure of matrix displays, particularly for downscaling applications. The method is applicable to any subpixel arrangement, poses no constraints on the input resolution, and is applicable to natural images as well as graphics and text rendered at a higher resolution. Subpixel scaling is most suitable for downscaling applications, i.e. where the display is the limiting factor for spatial resolution in the display chain, and also for text/graphics images. This is because the advantages of subpixel scaling are found mostly in the upper part of the baseband, which is typically empty after upscaling, but which is relatively filled for downscaled images and text images.

A particular subpixel arrangement can only offer better perceived resolution than the vertical stripe arrangement if the image resampling is adapted to this arrangement. This means that subpixel scaling is a crucial part of the signal processing chain if the display is supposed to offer resolution benefits with special subpixel arrangements. The resolution increase from subpixel scaling is bigger for two-dimensional SPAs than for the one-dimensional VS SPA. The resolution advantage of alternative SPAs can only be reached with higher order filters, designed to preserve sharpness as much as possible over the displayable spatial frequencies without visible color errors. With these filters, the Pentile layout PT6 from Clairvoyante Inc. gives resolution equal to a reference VS display with 20% fewer drivers and 30% fewer subpixels per area. The PT1 layout is slightly worse, but still better than VS. DN gives resolution equal to VS with 20% fewer drivers and 50% fewer subpixels. PT6 and DN are approximately equal in terms of resolution at equal drivers, but this corresponds to DN having 15% fewer subpixels per area. Using the Pentile layouts with good subpixel rendering does increase the perceived resolution, but clearly, the resolution claims as published by Clairvoyante Inc. are exaggerated.


CHAPTER 5

Dynamic display resolution
The temporal response in relation to motion artifacts

As introduced in Sections and 1.3.2, the main function of a display is to reconstruct the original image from the video signal. In the temporal dimension, this relates to the conversion from the time-discrete video signal to the time-continuous light domain. It takes a finite time to produce the amount of light that is prescribed by the video signal. Analogous to the spatial dimension, this time represents a trade-off between reconstruction artifacts and resolution. In the spatial dimension, as discussed in Chapter 3, the sharpness and resolution of images on a display are largely determined by its spatial aperture. Wide apertures can cause resolution loss, while narrow apertures can result in line- and pixel-structure artifacts. In the temporal dimension, similar effects take place. A display that has a narrow temporal aperture, i.e. produces light in a very short time, suffers from flicker. On the other hand, a display that takes too much time to produce light in each pixel has reduced dynamic resolution.

The dynamic resolution has a close correspondence to the perceived resolution of moving images. The ability to faithfully reproduce moving images is an important requirement for a TV display. Therefore, the relation between dynamic resolution, moving images, and the temporal properties of the display is an important topic, which is addressed in this chapter. This topic has become even more important with the introduction of flat panel displays (FPDs). A CRT has little problem with moving images, because it produces light in a very short time, but it has its weak point in the spatial dimension, due to the electron spot limitations. FPDs, on the contrary, are strong in the spatial dimension, as shown in the previous chapter, but their temporal properties result in artifacts in moving images, i.e. motion artifacts, that are not present on CRTs. To illustrate that LCD and PDP outperform the CRT for still images, but not for moving images, Figure 5.1 shows simulations (the details of which are explained later in this chapter) of still and moving images

on CRT, LCD and PDP.

Figure 5.1: Simulation of still images (top) and moving images (bottom) on, from left to right, CRT, LCD and PDP.

The chapter is organized as follows. In Section 5.1, we first introduce some basic aspects of displays related to the temporal dimension, by reviewing the temporal properties of CRT displays. The rest of the chapter concerns the temporal properties of LCDs and PDPs, and their relation to motion artifacts. Section 5.2 discusses a property of the human visual system called eye tracking, which stands central in this chapter. Section 5.3 introduces a model of the temporal behavior of displays, and its relation to motion artifacts. Sections 5.4 to 5.8 present an analysis of the temporal properties of LCDs. Section 5.4 starts with the LCD temporal properties. Section 5.5 relates these properties to the LCD motion artifact: motion blur. A temporal frequency analysis of the display chain is presented in Section 5.6. Based on the preceding analysis, Section 5.7 discusses methods to characterize the amount of motion blur on a display, and Section 5.8 presents an overview of several methods to reduce motion blur on LCDs, which are based on modifications of the display design and driving. Finally, the dynamic performance of PDP displays is covered in Section 5.9, and the chapter ends with conclusions.

5.1 CRT response and flicker

The CRT temporal response is determined by the phosphor decay, as was illustrated in Figure 2.4. The phosphor decays to 1% intensity in approximately 1 ms, which is much shorter than the frame time of 20 ms for 50Hz video formats. [Footnote 1: We assume de-interlaced signals in this chapter, so we will discuss temporal addressing using frames instead of fields.] The image as reproduced on a CRT is therefore nowhere near the continuous light profile of the original, but consists of short light flashes with dark periods in

between (Figure 5.2). This temporal modulation of the light intensity results in image flicker, the visibility of which depends on the field frequency, the screen size (viewing angle) and the display brightness [37]. For modern, bright and large-area TVs, the 50Hz frame rate results in visible flicker, and frame rates above approximately 75Hz are needed to reproduce pictures without flicker at practical TV brightness levels. To this end, higher frame rate displays, such as 100Hz TVs, were introduced around 1990 [46]. Because all current video source formats have frame rates lower than 75Hz, video processing is required to convert the input video frame rate to the display frame rate (see also Section 2.4.1).

Figure 5.2: Temporal light reproduction on a CRT: short flashes (intensity scaled to fit in the figure) with dark periods in between.

Besides increasing the frame rate, flicker can also be reduced by reducing the dark period in each frame period, i.e. by increasing the duration of light emission in each frame. This is completely analogous to the spatial dimension, where line/pixel structure can be reduced by increasing the size of the spatial reconstruction profile. [Footnote 2: To emphasize this analogy, display flicker could, therefore, also be called "frame structure".] However, the dark period must be almost zero to completely eliminate flicker at frequencies in the range of 50-60Hz [37].

On LCDs, PDPs, and many other display types, the temporal behavior is very different from the CRT. This was already introduced in Chapter 2, and the simulated images in Figure 5.1 illustrate that this has an impact on moving images. In order to analyze this, we need some background on a special property of the human visual system, eye tracking, which is introduced in Section 5.2. In Section 5.3 we introduce a temporal display model, which we will use to give a general explanation of motion artifacts in relation to display properties. The remaining part of this chapter discusses this topic in more detail for LCDs and PDPs.
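Returning briefly to the dark-period trade-off above: for a rectangular temporal emission profile at fixed mean luminance, the amplitude of the component at the frame frequency is |sinc(d)|, with d the duty cycle, so it only vanishes as d approaches 1. A small numerical illustration (a sketch under this rectangular-profile assumption):

```python
import numpy as np

def frame_rate_component(duty):
    """Relative amplitude of the frame-frequency (e.g. 50 Hz) component of a
    rectangular temporal emission profile with the given duty cycle, at
    constant mean luminance. numpy's sinc(x) is sin(pi*x)/(pi*x)."""
    return np.abs(np.sinc(duty))

for duty in (0.05, 0.25, 0.5, 0.9, 1.0):   # CRT-like flash ... full-frame hold
    print(f"duty cycle {duty:4.2f}: frame-rate component = "
          f"{frame_rate_component(duty):.2f}")
```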

Figure 5.3: Eye tracking: moving images on the screen are transformed to stationary images on the retina.

5.2 Eye tracking

The Human Visual System (HVS) is insensitive to high temporal frequencies, i.e. we cannot see very fast variations in light intensity. This behavior can be described by modeling the HVS with a temporal low-pass filter, with a cut-off frequency in the range of 50-75Hz [161]. Temporal intensity variations (in nature and in video) are mostly caused by motion. A moving object generates a time-varying intensity at a fixed position in space. Moreover, a moving object can exhibit temporal frequencies far beyond the HVS cut-off. Nevertheless, humans are able to sharply perceive moving objects. This is possible because moving objects are tracked by the human eye [161, 162, 151, 130], by means of rotation of the eyeballs and head movements. Eye tracking projects the image of moving objects to a fixed position on the retina, as illustrated in Figure 5.3. This assures that there are no high temporal frequencies at the retina, and that moving objects can be perceived sharply.

For our purpose, eye tracking can be divided into two distinct phenomena: smooth pursuit eye movement and saccades [85, 31]. Smooth pursuit eye movement occurs when the viewer is following a moving object, resulting in a smooth movement of the point of view. Saccades are very fast eye movements, occurring when the point of attention jumps from one object to another. Saccades can result in angular speeds of up to 1000 degrees per second, and take less than 200 ms. During saccades, visual sensitivity is reduced, so the viewer cannot always clearly perceive what goes on during the saccade. Both effects appear to be driven by a very sensitive, and unconsciously operating, feedback and feedforward control mechanism.

In the eye, the display coordinates (x_d, t_d) are transformed to retinal coordinates (x_e, t_e). With eye tracking, this transformation becomes:

(x_d, t_d) → (x_e, t_e), with x_e = x_d + v·t_d and t_e = t_d (5.1)

where v is the speed of the moving objects on the screen, which equals the speed of the point of attention in case of eye tracking. [Footnote 3: This equation neglects size differences between display and retina, i.e. it uses spatial coordinates normalized to image size.] The next sections discuss

how and why this transformation can lead to various artifacts in moving images, depending on the particular temporal properties of the display.

Figure 5.4: Temporal display model: from time-discrete video signal to time-continuous displayed intensity.

5.3 Temporal display model and motion artifacts

The display converts the video signal to light: a sequence of discrete fields I_v(n) is converted to continuous light I_d(t). A model of this behavior is shown in Figure 5.4. The display is addressed with the video signal at a certain time (I_a(t), see Equation 5.2), after which the opto-electronic effect reconstructs the light intensity, which is represented by the opto-electronic transfer and the temporal reconstruction profile. The model is the temporal equivalent of the spatial model presented in Section 3.2. Just as in the spatial dimension (Section 3.2), the temporal addressing delay should match the space-time varying delay introduced by scanning at the source. This delay, together with the reconstruction, should result in as good an approximation of the continuous original as possible. The temporal addressing delay is described by:

I_a(nT + t_d) = I_v(n) = I_c(nT + t_c), so that I_a(nT) = I_c(nT + t_c - t_d) (5.2)

where t_d is the delay at the display, t_c is the delay at the camera, n is the frame number, T is the frame period, and I is the normalized intensity. The delay is added to the (starting) time of each frame. [Footnote 4: Delay is measured relative to each frame. There is obviously an extra (fixed) delay between the camera and the display, depending on the intermediate transmission and storage. This extra delay merely affects to what extent the displayed images are said to be "live".]

Next to pure temporal artifacts like flicker, displays can suffer from a second type of artifact related to the temporal display response. These artifacts occur with moving images, so they are generally called motion artifacts. Motion artifacts are caused by a combination of the display response and eye tracking by the viewer. We will see that motion artifacts are not purely temporal artifacts, because they involve a combination of spatial and temporal effects. Depending on the specific characteristics of the display, the artifacts can have a different appearance. In this section, motion artifacts due to temporal addressing characteristics are discussed. This provides an introduction to motion artifacts due to temporal reconstruction characteristics, which are discussed for LCD and PDP in the remainder of this chapter.
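The model of Figure 5.4 translates directly into a few lines of code. A sketch, with hypothetical aperture shapes chosen to resemble the CRT- and LCD-like profiles of Figures 5.2 and 5.8:

```python
import numpy as np

def displayed_intensity(frames, T, t_d, aperture, t):
    """Evaluate the reconstructed light output I_d(t) of the Figure 5.4
    model: frame n is addressed at n*T + t_d and reconstructed with the
    given temporal aperture (emission profile vs. time since addressing,
    normalized here to equal time-integrated light per frame)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    for n, intensity in enumerate(frames):
        out += intensity * aperture(t - (n * T + t_d))
    return out

T = 0.020                                            # 20 ms frame period (50 Hz)
hold  = lambda dt: ((0 <= dt) & (dt < T)).astype(float)           # LCD-like
flash = lambda dt: ((0 <= dt) & (dt < 0.001)).astype(float) * (T / 0.001)  # CRT-like

t = np.linspace(0, 0.1, 1000)
frames = np.array([0.2, 0.8, 0.5, 1.0, 0.1])         # per-frame drive values I_v(n)
lcd_like = displayed_intensity(frames, T, t_d=0.0, aperture=hold, t=t)
crt_like = displayed_intensity(frames, T, t_d=0.0, aperture=flash, t=t)
```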

Figure 5.5: Motion artifacts due to temporal delay variations. a) original image: a vertical line, moving horizontally; b) perceived image, when using an FT-CCD camera and a line-scan display; c) perceived image, when using an FT-CCD camera and a tiled line-scan display.

Motion artifacts due to temporal addressing characteristics

Let's first consider the effect of motion on the temporal addressing [49, 118]. [Footnote 5: We assume the temporal reconstruction has a fixed delay.] When the delays in the camera and display do not match, i.e. there is a non-zero delay difference Δt = t_d - t_c, spatial distortions will occur in moving images. The reason for this is eye tracking (Equation 5.1), which transforms the temporal delay on the display into a spatial delay (offset) Δx_e on the retina:

Δx_e = v·Δt (5.3)

Because not all positions on the screen are addressed at the same time, the delay varies at least with position, but it can depend on more parameters. Figure 5.5 shows some examples of temporal delay variations, for an example image that contains a vertical line moving horizontally. Figure 5.5b shows an example where Δt varies smoothly with x, as is the case with a frame-transfer CCD camera (t_c ≈ 0) and a regular line-scanning display (t_d = T·y/H). In this case, the distortions also vary smoothly over the screen, and this is generally not even noticeable. Artifacts become more severe when Δt varies more abruptly [49]. In tiled displays, for example, Δt can make a sudden step at the boundary of two tiles. This results in a break-up of moving objects at the tile boundary, because Δx_e will also change abruptly (Figure 5.5c).

As a further example, consider color sequential displays (Section 2.2.8). There, the addressing varies with the color component, Δt = [Δt_R, Δt_G, Δt_B], much like the spatial delay variation in subpixel color matrix displays. On moving images, this gives a spatial delay that varies with color, as shown in Figure 5.6. This artifact is called color breakup [105], and causes colored edges to appear on moving objects. Especially for unsaturated images, this is very annoying. Moreover, during fast saccades, the spatial offset between the colors is even larger. Even though visual sensitivity is lower during a saccade, saccadic color breakup is typically very visible and annoying [90].

To prevent artifacts related to these temporal addressing mismatches, the temporal addressing can obviously be chosen such that it matches the source, and such that there are no abrupt variations in the delay. Since there are

many types of sources and displays, in practice it is only possible to choose either of them such that there are no abrupt variations. As an alternative, Chapter 6 discusses how video processing can be used to compensate for temporal addressing mismatches. PDPs have a rather exotic temporal addressing, which will be discussed in Section 5.9. In (non-color sequential) LCDs, the temporal addressing is no cause for concern. There, it is the temporal reconstruction that is far from ideal. This also results in artifacts in moving images, but of a different type than those described up to here. The LCD temporal characteristics and their relation to motion artifacts are discussed in the next four sections.

Figure 5.6: Color breakup in color sequential displays. The color components are perceived with a spatial offset when the point of attention moves over the screen. a) space-time plot, b) still image, c) perceived moving image.

5.4 LCD temporal characteristics

Figure 5.7 shows a model of the temporal aspects of the LCD display process [74]. As introduced in Section 2.2.5, the LCD temporal properties are determined by three main parts: the active matrix, the LC optical stack, and the backlight. The active matrix is characterized by a sample-and-hold (S&H) behavior: each pixel receives one video sample per frame time, and holds it for the frame period, resulting in a voltage over the LC cell. In response to this voltage, the LC molecules in the LC cell change orientation, resulting in a certain transmission of the cell. The cell transmission modulates the backlight intensity to create the display light output.

Figure 5.7: Model of the temporal aspects of the LCD display chain.
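A minimal sketch of this chain, with the LC modeled as a first-order lag (an assumption; as discussed below, real LC transitions are gray-level dependent) and a constant backlight. With tau ≈ 15 ms this reproduces the 33 ms (10%-90%) response of the example in Figure 5.8, since the 10%-90% rise time of a first-order system is tau·ln(9) ≈ 2.2·tau:

```python
import numpy as np

def lcd_response(drive, T=1/60.0, tau=0.015, dt=1e-4):
    """Sample-and-hold addressing followed by first-order LC dynamics
    with time constant tau. Returns (t, transmission)."""
    n_steps = int(len(drive) * T / dt)
    t = np.arange(n_steps) * dt
    held = drive[np.minimum((t / T).astype(int), len(drive) - 1)]  # S&H stage
    trans = np.empty(n_steps)
    level = 0.0
    for i, target in enumerate(held):        # first-order LC relaxation
        level += (target - level) * (dt / tau)
        trans[i] = level
    return t, trans

# Step from black to white, applied at frame 2, as in the Figure 5.8 example:
t, s = lcd_response(np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0]))
t10 = t[np.searchsorted(s, 0.1)] - 2 * (1 / 60.0)
t90 = t[np.searchsorted(s, 0.9)] - 2 * (1 / 60.0)
print(f"10%-90% response time: {(t90 - t10) * 1000:.1f} ms")   # about 33 ms
```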

Figure 5.8: The step response of an LCD with an LC response time of 2 frames (33 ms at 60Hz).

There are several ways to look at the response, such as the step, impulse or frequency response. The last two will prove to be very useful to quantify motion blur. Nevertheless, the step response is discussed first, because it is the easiest to explain and measure, and it is used in common practice throughout today's display industry. Figure 5.8 shows the step response of a typical LCD (see also Section 2.2.5). [Footnote 6: In this example, we have neglected the so-called capacitive effect [14], which can cause a small drop in intensity at the end of each frame, for simplicity.] The temporal response, i.e. the intensity over time, can be measured at a single pixel, over a certain area, or as the response of the whole frame, and typically all of these will be the same. Therefore it does not matter for our purpose whether we use the pixel response or the display response. We will consider the temporal response separately from the spatial response ("pixels").

The temporal step response, S_t(t), is the response to a video signal that is zero for all frames before t = 0, and 1 for all frames from t > 0 onwards. We use the normalized step response. A general step signal S_t^I(t) from I_start to I_end is transformed to the normalized step signal via:

S_t(t) = (S_t^I(t) - I_start) / (I_end - I_start) (5.4)

Note that the input signal is discrete: intensity values (for each pixel) are only known at integer multiples of the frame time, not in between. [Footnote 7: This neglects the addressing delay, because it does not cause (serious) artifacts in LCDs.] In Figure 5.8, we also assume the backlight to emit light continuously (in Section we discuss non-continuous backlights), and therefore the LCD also emits light during the complete frame period. Comparing Figure 5.8 to Figure 5.2, it is clear that the LCD has no large-area flicker, because still image parts are rendered without temporal modulation.

A second important observation is that the LC response can have a big influence on the total LCD response. [Footnote 8: Note the difference between the LC response (only the LC layer) and the LCD response (the total display). Section further discusses this.] The LC molecules need some time to react to changes in the cell voltage, which can be significantly longer than the frame period, as in the example of Figure 5.8. The LC response time (LC-RT) is

Figure 5.9 a) Still image, b) moving image (left to right) on an LCD with the response of Figure 5.8, c) as (b), but with overdrive (see Section 5.4.1).

The LC response time (LC-RT) is commonly measured as the time between 10% and 90% of the step intensity. The example in Figure 5.8 has an LC-RT of 33 ms, which is relatively slow, since current LC-RTs can be well below 16 ms. With such a slow response, the LCD cannot follow fast changes in the image content. This results in an annoying smearing (blurring) of moving objects, as illustrated in Figure 5.9b. In Section 5.5.1, we explain how these images were made.

In most LCDs, the response time depends on the gray-level [111], and more specifically on the gray-level transition. For example, transitions from black to white can be slower than from white to black, and also small transitions (gray to gray) are typically slower than large transitions (black to white). Therefore, a full characterization of the LC response time requires measurement of a number of different gray-level transitions, where the response time of the display is e.g. taken as the average of these [158].

5.4.1 LC response time improvement

It is not necessary to understand the relation between motion artifacts and the temporal response in detail to conclude that the LC response must be improved to be able to faithfully reproduce moving images. Therefore, LCD panel manufacturers have put a lot of effort into speeding up the LC response, mainly by applying better materials or by improved LC cell design (e.g. smaller cell gap, other LC effects, etc.). There are currently large screen LCDs with responses around 1 ms [87], but these are certainly not dominating the market.

A widely applied method for response time improvement, called overdrive [111], is based on video processing. Overdrive improves the response speed of the LC pixels by exaggerating gray-level transitions between frames, as shown in Figure 5.10. Depending on the previous and current frame intensities, an overdriven drive value is applied, such that at the end of the frame period, the desired intensity is reached. This currently enables a reduction of the response time to less than the frame period. Overdrive can only speed up responses between intermediate gray-levels, since the limited signal dynamic range prevents overdriving of white and black. Nevertheless, LC gray-to-gray responses are typically much slower than the black-to-white response [95], so overdrive can provide a large improvement of the response speed. As a result of improved panel design and (over)drive schemes, currently the best displays available exhibit response times below 16 ms, i.e. below the 60 Hz frame period.
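The overdrive principle can be sketched as follows for an assumed first-order LC response; actual panels use lookup tables measured per gray-level transition, so the closed-form drive value below is only an idealization with example parameters.

    import numpy as np

    def overdrive(prev, target, T=1/60.0, tau=0.010):
        # Fraction of a transition that a first-order LC completes in one frame
        k = 1.0 - np.exp(-T / tau)
        # Exaggerate the transition so the target is reached at the frame end
        d = prev + (target - prev) / k
        # The limited signal range prevents overdriving of black and white
        return np.clip(d, 0.0, 1.0)

    print(overdrive(0.4, 0.6))   # about 0.65: overshoot beyond the target 0.6
    print(overdrive(0.0, 1.0))   # clipped to 1.0: no headroom for black-to-white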

Figure 5.10 LCD overdrive. By adding an overshoot in the drive value after a change in gray-level, the LC response time can be reduced to less than the frame period.

A response time below the frame period is a crucial value, since the worst blurring artifacts are prevented for an LCD that can respond to image changes within the frame period (Figure 5.9c). However, speeding up the LC response time to lower values is not enough to eliminate motion blur. Even an infinitely fast LC (LC-RT = 0, i.e. an ideal S&H display) will suffer from motion blur. From the LCD response model of Figure 5.7, we can quickly see why: the LC response is only a part of the LCD response; the Sample & Hold and backlight also contribute. To fully explain the relation between the total LCD response and motion blur, we first combine the temporal response, motion and eye tracking in the next section. This approach has been followed by many authors [88, 39, 56, 132, 167, 93], leading to the previously mentioned conclusion. We will also discuss alternative ways of characterizing motion blur, directly from the response, without using motion.

5.5 LCD temporal response in relation to motion blur

The process that leads to motion blur, including the S&H effect, can be summarized as follows: the human visual system tracks moving objects across the screen. Thereby, the HVS temporal low-pass characteristic integrates (averages) the light intensity along the motion trajectory. This turns the temporal light profile of the display into a spatial profile. As a result, a sharp, moving object in the image is transformed into a stationary, but blurred, object on the retina. This is illustrated in Figure 5.11, which shows a space-time intensity diagram on a (zero LC-RT) LCD, and the intensity perceived by an eye-tracking viewer. A stationary object (top) is perceived as sharp, while (the edges of) a moving object are blurred, i.e. spread out in space (middle). The higher the speed, the more blurring occurs (bottom).

Figure 5.11 Space-time intensity diagram on a (zero LC-RT) LCD, and the intensity perceived by an eye-tracking viewer, illustrating the process that leads to motion blur. Top: stationary single pixel wide line. Middle: same line, moving at one pixel per frame. Bottom: effect of varying speed.

The integration of intensity along the motion trajectory causes the temporal intensity profile of the display response to be mapped to a spatial profile on the retina. An object that moves over the screen is spread out over a wider area on the retina than when the object is stationary. Figure 5.12 shows how an originally sharp edge, i.e. a spatial intensity step in the image, is transformed, on an infinitely fast LCD, into a blurred perceived edge. Depending on the speed, the edge transition becomes wider, so the edge is more blurred. These blurred edges are found by integrating the time dependent intensity of each pixel along the motion trajectory (Figure 5.11, [167]). In Section 5.7.2, we will show an alternative method for determining the blurred edge profile from the display response. From the step response, it may seem that the display is very fast and should have no problems with moving images, but Figure 5.11 proves the opposite.

Figure 5.12 The perceived intensity profile of an edge on an ideal hold LCD, moving at speeds of 0, 1, 2, 3, and 4 pixels per frame (ppf). At zero speed, the edge is sharp (it equals the original), but the edge transition becomes wider, i.e. the edge is more blurred, at increasing speeds.

Figure 5.13 The temporal impulse response corresponding to the step response from Figure 5.8.

A reason for the contradiction is that the signal after S&H can be mistaken for the step input, which neglects the S&H contribution to the total display response. The step response must be combined with motion in a space-time description to find the cause of motion blur. Although this provides an explanation of motion blur, we will show in the next section that the relation between motion blur and display response is more directly described by the temporal impulse response. It allows characterization of motion blur introduced by the display, without resorting to the space-time description, i.e. without using motion.

5.5.1 The temporal aperture

Instead of the temporal step response, we use the temporal impulse response of a display to analyze motion blur. The impulse response represents the light emission profile from the display that results from a single input frame of the video signal. From a signal theoretic point of view, this directly represents the reconstruction part of the display model from Section 5.3, i.e. the temporal display aperture. Figure 5.13 shows a typical LCD impulse response, corresponding to the step response in Figure 5.8. Figure 5.14 shows impulse responses for various LC-RTs, with and without overdrive. The faster the LC response, the more the impulse response resembles a block function, i.e. the ideal S&H profile. With overdrive, the impulse response extends over less than two frames, even for slow responses. The faster the response, the less overdrive is needed 9.

9 We use normalized responses, so the (also normalized) overdrive signal can go below 0 and above 1 for gray-to-gray transitions. Only the extreme black to white transitions will actually result in out-of-range overdrive values, i.e. in a less effective overdrive.

A direct measurement of the impulse response involves addressing the display with only a single frame with a certain gray-level. In practice, however, it is easier (and common practice) to measure the display step response. Moreover, since the LC response time usually depends on the gray-level transition, the step response enables measurement of each transition separately. Therefore, we derive the impulse response, which we will from now on call the temporal display aperture $A_t(t)$, from the step response $S_t(t)$, by subtracting a shifted version of the step response, according to the following formula 10:

$$S_t(t) = \sum_{n=0}^{\infty} A_t(t - nT) \qquad (5.5)$$

$$A_t(t) = \sum_{n=0}^{\infty} A_t(t - nT) - \sum_{n=1}^{\infty} A_t(t - nT) = S_t(t) - S_t(t - T) \qquad (5.6)$$

Figure 5.14 The (normalized) temporal impulse response for an LCD (driven at 50 Hz) with different LC response times (5, 10, 15, 20 and 25 ms): a) without overdrive, b) the same LCD as in (a), but with overdrive. The dashed lines show the applied driving signal (including overdrive).

This way, the temporal aperture corresponds to a single gray-level transition, and it is easily normalized. This makes it easier to measure the full range of gray-level transitions. Directly measured impulse responses will correspond to two transitions: from the starting gray-level to the applied gray-level and back. In general, the directly measured impulse response will not be equal to the temporal aperture; the difference will depend on the variations of the response with gray-level. In those cases where very non-linear responses are concerned, the full range of gray-level transitions is needed to analyze display behavior from the display response. Nevertheless, the application of overdrive will greatly reduce gray-level response dependencies, and the response of fast panels will also vary less with gray-level. Therefore we will analyze single apertures, using normalized intensity, that represent the average response of the display.

10 By calculating the impulse response from (measured) step responses, the black level of the display is discounted from the response, such that the impulse response always vanishes toward infinity. This is needed (in Section 5.7.2) to determine the integrated intensity of the response.
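As a practical sketch of Equation 5.6, the temporal aperture can be computed from a uniformly sampled, normalized step response by subtracting a one-frame-shifted copy; the sampling rate and the exponential example below are illustrative choices, not measured data.

    import numpy as np

    def temporal_aperture(s_t, samples_per_frame):
        # Equation 5.6: A_t(t) = S_t(t) - S_t(t - T)
        shifted = np.concatenate((np.zeros(samples_per_frame),
                                  s_t[:-samples_per_frame]))
        return s_t - shifted

    # Example: exponential step response, 100 samples per frame period
    t = np.arange(0, 500) / 100.0        # time in frame periods
    s_t = 1.0 - np.exp(-t / 0.7)         # normalized step (tau = 0.7 frames)
    a_t = temporal_aperture(s_t, 100)    # light resulting from a single frame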

5.5.2 The motion aperture

As shown in Figure 5.11, motion and eye tracking represent a transformation of the temporal display response into a spatial effect ([49, 83, 74], Equation 5.3). Consequently, the temporal impulse response can be transformed to a spatial impulse response. The following mathematical derivation shows that this results in a motion-dependent spatial aperture $A_m(x, v)$, which we will call the motion aperture.

Let's start with the spatio-temporal aperture $A(x, t)$ of the display 11, which is separated into the spatial and temporal apertures $A_s(x)$ and $A_t(t)$:

$$A(x, t) = A_s(x) A_t(t) = (A_s(x)\delta(t)) * (A_t(t)\delta(x)) \qquad (5.7)$$

which can also be written as a convolution using δ-functions. Eye tracking with speed $v$ corresponds to a transformation of coordinates $(x, t)$ at the display, to coordinates $(x_e, t_e)$ at the retina 12:

$$(x_e, t_e) = (x + vt, t) \qquad (5.8)$$

So, the spatio-temporal aperture $A(x, t)$ of the display is perceived as a spatio-temporal aperture $A_e(x_e, t_e)$ on the retina:

$$A_e(x_e, t_e) = (A_s(x_e - vt_e)\delta(t_e)) * (A_t(t_e)\delta(x_e - vt_e)) \qquad (5.9)$$

This perceived aperture is still a function of time, and also depends on the speed. To analyze motion blur, which is a perceived spatial effect, we are mainly interested in the spatial properties of the perceived aperture. Therefore, we remove the time dependence by only considering the temporal average ("DC"). This average is a simplification of the temporal low-pass characteristic of the HVS (see also Section 5.6.2), i.e. we assume that the eye cannot perceive high temporal frequencies. The average corresponds to the case where the image itself has no temporal variation, i.e. it is a scrolling still image. When this image is tracked, the viewer perceives a still image again, so its properties in the spatial dimension can be analyzed by discarding all temporal variations 13. In other words, image blur relates to the perceived spatial display aperture $A_{se}(x_e)$, which is the temporal average of the perceived aperture $A_e(x_e, t_e)$. This is obtained by integrating the aperture over time 14:

$$A_{se}(x_e) = \int A_e(x_e, t_e)\,dt_e = \int A_s(x_e - vt_e)\delta(t_e)\,dt_e * \int A_t(t_e)\delta(x_e - vt_e)\,dt_e \qquad (5.10)$$

Due to a property of the δ-function, $\int f(x)\delta(x - a)\,dx = f(a)$, the spatial aperture is not affected by this integration, so Equation 5.10 reduces to:

$$A_{se}(x_e) = A_s(x_e) * \int A_t(t_e)\delta(x_e - vt_e)\,dt_e = A_s(x_e) * A_m(x_e, v) \qquad (5.11)$$

11 1D space is used for now for simplicity. The extension to 2D, including arbitrary motion direction, will be made later.
12 This neglects size differences between display and retina, i.e. the spatial coordinates are normalized to image size.
13 These temporal variations are the pure temporal artifacts, i.e. flicker.
14 The formulation using convolution and δ-impulses allows the order of convolution and integration to be changed.

Figure 5.15 Due to eye tracking, the temporal aperture of the display is transformed to the motion aperture in the spatial domain on the retina. Left: the spatial and temporal display aperture in display coordinates; right: perceived spatial apertures in retina coordinates. The total perceived spatial aperture is the combination of the display spatial aperture and the motion aperture.

This shows that the temporal display aperture is transformed to the spatial domain, where it becomes the motion aperture $A_m(x_e, v)$, which has a width proportional to the speed, and an amplitude inversely proportional to it:

$$A_m(x_e, v) = \int A_t(t_e)\delta(x_e - vt_e)\,dt_e = \frac{1}{|v|}\int A_t(t_e)\,\delta\!\left(t_e - \frac{x_e}{v}\right)dt_e = \frac{1}{|v|} A_t\!\left(\frac{x_e}{v}\right) \qquad (5.12)$$

which also uses another property of the δ-function: $\delta(ax) = |a|^{-1}\delta(x)$. In case $v = 0$, $A_m$ is found by direct substitution in Equation 5.12: $A_m(x_e, 0) = \delta(x_e)$. The calculation is straightforwardly extended to two-dimensional space $\vec{x}$, apart from the construct $\frac{\vec{x}}{\vec{v}}$. This denotes the time $t$ at which $\vec{x} = \vec{v}t$, which can only be solved if $\vec{x}$ and $\vec{v}$ are parallel. This relates to the fact that motion blur occurs only in the direction of motion (see also Equation 5.26).

Figure 5.15 shows the motion aperture $A_m(x, v)$ for the LCD response from Figure 5.13, and how it is constructed from the temporal impulse response. The resulting spatial aperture, as perceived by the viewer, is the combination, i.e. the convolution (Equation 5.11), of the motion aperture and the spatial aperture. Here, we use a square aperture of one pixel wide for simplicity, referring to Chapter 3 for more details of the LCD spatial aperture. Figure 5.16 shows the motion aperture, and the total perceived spatial aperture, for several speeds. The higher the speed, the wider the motion aperture extends spatially. This perceived spatial extension of the essentially temporal effect results in motion blur.
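A minimal sketch of Equations 5.11 and 5.12: the temporal aperture is resampled into the motion aperture for a given speed, and a sharp edge is convolved with it; the spatial aperture is ignored here, and all grids and values are illustrative.

    import numpy as np

    def motion_aperture(t, a_t, v, x):
        # Equation 5.12: A_m(x, v) = (1/|v|) * A_t(x / v) on spatial grid x
        return np.interp(x / v, t, a_t, left=0.0, right=0.0) / abs(v)

    t = np.linspace(0, 2, 400)                    # time in frame periods
    a_t = np.where(t < 1.0, 1.0, 0.0)             # ideal S&H temporal aperture
    x = np.arange(-16, 17, dtype=float)           # position in pixels
    a_m = motion_aperture(t, a_t, v=8.0, x=x)     # speed: 8 pixels per frame
    a_m /= a_m.sum()                              # normalize to unit gain

    edge = (np.arange(64) >= 32).astype(float)    # sharp spatial step
    blurred = np.convolve(edge, a_m, mode='same') # perceived blurred edge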

Figure 5.16 The motion aperture (a) and total perceived spatial aperture (b) for several speeds.

An illustration of this motion blur was already given in Figure 5.9b. This figure uses the temporal aperture from Figure 5.13, transformed into the motion aperture, to filter the original image. Figure 5.9c was made with a temporal aperture including overdrive (as in Figure 5.14). For motion blur-free images as on a CRT, the perceived spatial aperture should depend as little as possible on the speed of motion. Therefore, the motion aperture should be as narrow as possible, so the impulse response should be as short as possible. The step response of an ideal hold LCD may seem infinitely fast if the S&H is neglected, but the response should approximate a series of narrow pulses to avoid motion blur. The impulse response reveals the amount of motion blur that is introduced by a display more directly than the step response, as explained by the motion aperture concept.

5.5.3 The temporal MTF and dynamic resolution

Instead of analyzing the display response in the temporal domain, we can also analyze the response in the frequency domain. This leads to the concept of the temporal Modulation Transfer Function (MTF). The MTF [28] is a well known method to describe the spatial characteristics (sharpness and resolution) of various optical components, including displays. The spatial MTF is the Fourier transform of the spatial display aperture, which is also called the pixel aperture, point-spread function or spatial impulse response. Analogous to the spatial MTF calculation, we can also take the Fourier transform of the temporal display aperture to give the temporal MTF. The temporal MTF reveals how much each temporal frequency is suppressed by the display. The temporal MTFs of several types of LCD (ideal hold, normal LCD with LC-RT of 10 and 25 ms, and overdriven LCD with LC-RT of 10 and 25 ms) are shown in Figure 5.17. The circles indicate the (-3dB) bandwidth, which is discussed in Section 5.7.3. From the MTFs, we can see that all these LCDs suppress high temporal frequencies considerably, which indicates a loss of temporal resolution. Also, the faster the response, the higher the MTF becomes.

Figure 5.17 The MTFs corresponding to the responses in Figure 5.14. Drawn lines are without OD (10 and 25 ms LC-RT), dashed lines are with OD (10 and 25 ms LC-RT), the dotted line is zero LC-RT. Faster response corresponds to a higher MTF. The circles indicate the (-3dB) bandwidth.

These MTFs also show that there is no large-area flicker in any of these LCDs, because the responses are zero at multiples of the frame rate ($f_t T = 1$). This is a result of the fact that the S&H is applied at the frame rate (50 Hz), combined with a continuously emitting backlight. We will see later how the MTF changes when either of these is different. The temporal MTF of LCDs was already introduced in [89], although there the S&H was falsely attributed to the HVS instead of the display. In [110], an MTF was calculated as the Fourier transform of the step response, which is very different from the MTF as introduced above, for example because it leads to infinite amplitude at zero frequency.

Figure 5.17 also shows that, at least for the faster LCDs, the MTF is still substantially high (less than 6 dB down) for frequencies up to half the frame rate. This means that the highest temporal frequency below the Nyquist limit can still be reproduced. In the spatial dimension, this is considered to be quite good (Section 3.3.3): the display can reproduce pixel-on/pixel-off patterns with considerable modulation. In the temporal dimension, however, this is not enough. The reason for this is explained in Section 5.6, which presents a frequency analysis of the display signal chain.
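Numerically, the temporal MTF is simply the magnitude of the Fourier transform of the temporal aperture. The sketch below, for an ideal one-frame hold, should show the characteristic zeros at multiples of the frame rate mentioned above; the grid choices are ad hoc.

    import numpy as np

    T = 1.0                                  # frame period (normalized units)
    dt = T / 256.0
    t = np.arange(0.0, 16.0 * T, dt)         # long window for a fine frequency grid
    a_t = np.where(t < T, 1.0, 0.0)          # ideal hold temporal aperture

    mtf = np.abs(np.fft.rfft(a_t))
    mtf /= mtf[0]                            # normalize to DC
    f = np.fft.rfftfreq(len(t), d=dt)        # temporal frequency in cycles/frame
    # mtf approximates |sinc(pi f T)|: zero at f = 1/T, 2/T, ... (no flicker)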

5.6 Temporal frequency analysis of the display chain

The very basic representation of the signal chain from source to displayed image is shown in Figure 5.18.

Figure 5.18 Basic display chain: original image - sampled image - displayed image.

The original scene, represented as a time-varying image, is a space-time-continuous intensity function $I_c(\vec{x}, t)$, where $\vec{x}$ has two dimensions: $\vec{x} = (x, y)^T$. This original image is sampled (by the camera) in time and space. The spatio-temporal sampling process is described by:

$$I_s(\vec{x}, t) = I_c(\vec{x}, t) \cdot \Lambda(\vec{x}, t) \qquad (5.13)$$

where $\Lambda(\vec{x}, t)$ is a three-dimensional lattice of δ-impulses [130]. We can assume a rectangular sampling lattice, which is described by sampling intervals $[p_x, p_y]$ and $T$:

$$\Lambda(\vec{x}, t) = \sum_{k,l,m} \delta(x - kp_x)\,\delta(y - lp_y)\,\delta(t - mT) \qquad (5.14)$$

The reconstruction of the physical light emission by the display can be described by a convolution with the display impulse response. This aperture is also a function of space and time: $A(\vec{x}, t)$. The image, as produced by the display, becomes:

$$I_d(\vec{x}, t) = I_s(\vec{x}, t) * A(\vec{x}, t) = (I_c(\vec{x}, t)\,\Lambda(\vec{x}, t)) * A(\vec{x}, t) \qquad (5.15)$$

The two operations of sampling and reconstruction account for a number of characteristic differences between the displayed image and the original. These are best described in the frequency domain, so we apply the Fourier transform $\mathcal{F}\{F(\vec{x}, t)\} = F^f(\vec{f}_x, f_t)$ to Equation 5.15:

$$I_d^f(\vec{f}_x, f_t) = \left(I_c^f(\vec{f}_x, f_t) * \Lambda^f(\vec{f}_x, f_t)\right) \cdot A^f(\vec{f}_x, f_t) \qquad (5.16)$$

where the Fourier transform $\Lambda^f(\vec{f}_x, f_t)$ of lattice $\Lambda(\vec{x}, t)$ is the reciprocal lattice, with spacings $p_x^{-1}$, $p_y^{-1}$ and $T^{-1}$ (the frame rate). The spatial sampling and reconstruction process was already discussed extensively in Chapter 3. This chapter focuses on the temporal behavior. Because many of the temporal characteristics can be described without details of the spatial sampling (i.e. as if the signal was spatially continuous), we will refer to the spatial sampling only when necessary.

5.6.1 Frequency spectrum of displayed images

The resulting spectrum of the displayed image is shown in Figure 5.19 for a display with an infinitely short impulse response, i.e. an ideal CRT, which is commonly called an impulse-type display. We start with a still image, i.e. an image that contains only low temporal frequencies. To simplify the illustration, we omit the spatial repeats, as if the signal was continuous in the spatial dimension. For the displayed images, this is equivalent to assuming that the spatial dimension has been reconstructed perfectly, i.e. the original continuous signal was spatially band-limited according to the Nyquist criterion, and the reconstruction effectively eliminates the repeat spectra. In the temporal dimension, the short duration of the light emission gives a flat reconstruction spectrum.

Figure 5.19 Basic display chain for an impulse-type display. Shown is the spatio-temporal frequency spectrum for, from left to right: original image $I_c$ - sampled image $I_s$ - displayed image $I_d$ with impulse display - perceived image after eye low-pass $I_p$.

Figure 5.20 Basic display chain for a hold-type display. The spatio-temporal frequency spectrum is shown for, from left to right: sampled image $I_s$ - aperture function $A$ (note: white=low, black=high) - displayed image $I_d$ - perceived image after eye low-pass $I_p$.

As a consequence of this flat spectrum, the temporal frequencies in the baseband ($f_t < (2T)^{-1}$) are not attenuated, but at least the lowest order repeats are also passed. The image, as it is finally perceived by the viewer, is also determined by the characteristics of the HVS, notably its temporal low-pass characteristic. Figure 5.19 shows that, if we assume that the HVS suppresses all frequencies $f_t \geq T^{-1}$, the perceived image is identical to the original image. This assumption is not always true, given the existence of large area flicker on CRTs. This is caused by the first repeat spectrum (at low spatial frequencies) that is not completely suppressed for frame rates < 75 Hz.

Section 5.5.3 introduced the frequency response of LCDs, which suppresses considerably more high frequencies than the CRT response. Here, we use the example of a zero LC-RT LCD, of which the frequency response is described by the Fourier transform of the box function with a width $T_h = T$, i.e. a sinc characteristic:

$$A^f(\vec{f}_x, f_t) = \mathrm{sinc}(\pi f_t T_h) \qquad (5.17)$$

The displayed image according to Equations 5.16 and 5.17 is shown in Figure 5.20, which shows the image spectra in the display signal for the hold-type display. This figure shows again that LCDs eliminate large area flicker at all frame rates, because the sinc characteristic has zero amplitude at the sampling frequency.

Figure 5.21 A simple HVS model, added to the display signal chain. From left to right: original (still) image - moving image - sampled image - displayed image - eye tracking - low-pass filtering.

Figure 5.22 Moving image spectrum: original (still) image $I_c$ - moving image $I_m$ - sampled moving image $I_s$.

5.6.2 Frequency spectrum of displayed moving images

Now we consider the effect on moving images. The whole chain from original image to perceived image is shown in Figure 5.21. The separate parts in the chain are explained next. First, an original still image $I_c$ is transformed to a moving image $I_m$, according to the speed 15 $\vec{v}$:

$$I_m(\vec{x}, t) = I_c(\vec{x} + \vec{v}t, t) \qquad (5.18)$$

The motion is modeled as a simple translation. Equation 5.18 can also be transformed to the frequency domain [130], where it becomes:

$$I_m^f(\vec{f}_x, f_t) = I_c^f(\vec{f}_x, f_t - \vec{v} \cdot \vec{f}_x) \qquad (5.19)$$

This results in a shearing of the spectrum as shown in Figure 5.22, reflecting that spatial variations in a moving object will generate temporal variations in the image [46, 130]. This moving image is then sampled and reconstructed in the display chain, resulting in the displayed moving image 16 (similar to Equation 5.16):

$$I_d^f(\vec{f}_x, f_t) = \left(I_m^f(\vec{f}_x, f_t) * \Lambda^f(\vec{f}_x, f_t)\right) \cdot A^f(\vec{f}_x, f_t) \qquad (5.20)$$

This image reaches the eye, where eye tracking will transform the moving image back to a stationary image on the retina ($I_e$), which represents the inverse transformation of Equations 5.18 and 5.19:

15 Here the speed is measured in the units used for $\vec{x}$ and $t$. When the sampling intervals $p_x$, $p_y$ and $T$ are known, $\vec{v}$ can also be expressed in pixels per frame. This corresponds to the motion vector or frame displacement vector.
16 We use the symbol $I_d$ (and $I_e$, $I_p$) without subscript $m$, for both still and moving images. Motion is indicated in the formulas, either by $I_m$, or by (non-zero) $\vec{v}$.

$$I_e(\vec{x}, t) = I_d(\vec{x} - \vec{v}t, t) \qquad (5.21)$$

and

$$I_e^f(\vec{f}_x, f_t) = I_d^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x) \qquad (5.22)$$

Substituting Equation 5.20 in Equation 5.22 gives the image as projected onto the retina of the eye-tracking viewer:

$$I_e^f(\vec{f}_x, f_t) = \left(I_m^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x) * \Lambda^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x)\right) \cdot A^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x) = \left(I_c^f(\vec{f}_x, f_t) * \Lambda^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x)\right) \cdot A^f(\vec{f}_x, f_t + \vec{v} \cdot \vec{f}_x) \qquad (5.23)$$

The perceived image $I_p^f(\vec{f}_x, f_t)$ after low-pass filtering by the HVS is shown in Figure 5.23 for an impulse-type display, and in Figure 5.24 for a hold-type display 17. From Figure 5.24 we can see that high spatial frequencies in the perceived image are suppressed, i.e. the perceived image is blurred. Therefore, the effect of the temporal aperture of the display, combined with eye tracking, can be described as a perceived spatial filtering of moving images.

Figure 5.23 Perceived moving image on an impulse-type display. The first part of the chain is shown in Figure 5.22. The second part is shown here, from left to right: displayed image $I_d$ - after eye tracking $I_e$ - after eye low-pass $I_p$.

Figure 5.24 Perceived moving image on a hold-type display. The first part of the chain is shown in Figure 5.22. The second part is shown here, from left to right: aperture function $A$ - displayed image $I_d$ - after eye tracking $I_e$ - after eye low-pass $I_p$.

17 We again assume perfect reconstruction in the spatial domain.

Figure 5.25 Spatial frequency response (white is high, black is low) of the spatial filtering due to the temporal display aperture and eye tracking, as a function of speed (in pixels per frame, if we express $f_x$ in cycles per pixel, and $T = 1$ frame).

5.6.3 Motion blur as a perceived spatial filter

To further analyze motion blur in the frequency domain, we focus on the spatial properties of the perceived image. Therefore we simplify the temporal low-pass filtering of the HVS, and only consider the perceived image at temporal DC, i.e. $f_t = 0$:

$$I_p^f(\vec{f}_x) = I_p^f(\vec{f}_x, 0) \qquad (5.24)$$

and we also assume that the original image contains no temporal variation:

$$I_c^f(\vec{f}_x, f_t) = 0 \;\text{ for } f_t \neq 0, \qquad I_c^f(\vec{f}_x) = I_c^f(\vec{f}_x, 0) \qquad (5.25)$$

In other words, we consider the perceived blur of a scrolling still image. This is the same procedure we followed to derive the motion aperture in Section 5.5.2. The frequency description, however, shows the complete situation. The perceived image can still contain high temporal frequencies, which can be seen as flicker when they are below the visual threshold frequency. Therefore, although the main effect of non-ideal reconstruction is the spatial effect of blur, there are also reconstruction artifacts at higher temporal frequencies. Using also Equation 5.17, the observation that motion blur is a spatial filtering of moving images due to the temporal aperture of the display is described as follows:

$$I_p^f(\vec{f}_x) = I_p^f(\vec{f}_x, 0) = I_c^f(\vec{f}_x, 0) \cdot A^f(\vec{f}_x, \vec{v} \cdot \vec{f}_x) = I_c^f(\vec{f}_x) \cdot \mathrm{sinc}(\pi\, \vec{v} \cdot \vec{f}_x\, T_h) \qquad (5.26)$$

The filter of Equation 5.26 depends on the motion speed $\vec{v}$ and the hold time $T_h$. The vector dot-product $\vec{v} \cdot \vec{f}_x$ is zero for components $\vec{f}_x$ perpendicular to $\vec{v}$, and the filter has no effect on these frequencies. This shows that the motion blur only occurs in the direction of the motion, i.e. only frequency components parallel to the motion direction are attenuated (see also Section 5.5.2). The sharpness perpendicular to the motion is not affected. Therefore, the inherently two-dimensional behavior of the filter can be simplified by only considering frequencies parallel to the motion. Figure 5.25 shows the amplitude response of this filter as a function of the size of the speed ($|\vec{v}|$) and the spatial frequency along the motion direction ($\vec{f}_x \cdot \vec{v}/|\vec{v}|$). Although the temporal hold aperture is beneficial with respect to large area flicker, it will cause a spatial blurring of moving objects on the retina of the viewer. Higher spatial frequencies will be attenuated by the sinc characteristic, and this attenuation will increase with speed.
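A sketch of the perceived spatial filter of Equation 5.26, evaluating the sinc attenuation for a few speeds; spatial frequencies are in cycles per pixel, speeds in pixels per frame, and an ideal hold ($T_h$ = 1 frame) is assumed.

    import numpy as np

    def perceived_mtf(fx, v, t_hold=1.0):
        # Equation 5.26 for frequencies parallel to the motion:
        # sinc(pi * v * fx * T_h); note np.sinc(x) = sin(pi x) / (pi x)
        return np.sinc(v * fx * t_hold)

    fx = np.linspace(0.0, 0.5, 51)      # up to the spatial Nyquist limit
    for v in (1.0, 2.0, 4.0):           # speeds in pixels per frame
        attenuation = perceived_mtf(fx, v)
        # higher speed: first zero moves to lower fx, i.e. more motion blur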

5.7 Motion blur metrics

To compare two displays with different temporal characteristics, a measure for the motion blur is required, i.e. a motion blur metric. With very slow response LCD panels, e.g. response times exceeding the frame time, the LC response dominates the motion blur, and it is enough just to mention the LC response time. After years of ongoing LC response time reduction, the point has been reached where the S&H becomes a significant contribution to the motion blur. To accurately characterize the dynamic performance of LCDs, a response time measure is required that represents the total response of the LCD, including S&H. Moreover, several motion blur reduction methods have been introduced that target the S&H (see Section 5.8), which cannot be characterized by the LC-RT alone. Consequently, new motion picture response times (MPRT) have been proposed [62] that aim at measuring motion blur including S&H behavior.

5.7.1 Motion picture response time (MPRT)

Figure 5.26 shows one way to quantify the motion blur, which is the basis of MPRT proposals. These methods take a moving edge (a spatial step), and measure the blurred edge width (BEW [167, 62], typically the 10-90% response) for different speeds. Experiments have shown that the BEW correlates well with perceived motion blur [142], for a number of LCD types.

Figure 5.26 On an LCD, a moving edge appears blurred to the viewer. The width of the blurred edge profile (BEW) indicates the amount of motion blur.

Figure 5.27 Measuring motion blur using blurred edge widths and times. a) BET for ideal hold. b) BET as a function of LC-RT for an LCD with and without overdrive, and the ideal hold display (LC-RT = 0).

The BEW can be measured [110, 63] (e.g. in pixels), or calculated by integrating the temporal step response along the motion [167], as in Figure 5.11. Because the edge width depends on the speed, the first attempts to characterize motion blur by means of blurred edge profiles did not result in an unambiguous response value [88]. Normalizing the width by the speed does quantify motion blur with a single value, as was found later [63]. This value can be regarded as the blurred edge time (BET, e.g. in ms), which can be further normalized by the frame period, resulting in the N-BET (dimensionless, % of frame time), which forms the basis of the proposed MPRT [62]. For example, Figure 5.27a shows the blurred edge for the ideal hold, which will have an N-BET of 0.8, because the blurred edge profile takes 80% of the frame time to rise from 10% to 90%.

Figure 5.27b shows the BET as a function of LC-RT. A similar figure appears in [63]. This shows that the concept of MPRT provides an attractive method to investigate some of the aspects of motion blur. For example, the figure clearly shows that the S&H effect dominates the motion blur for low response times, because the MPRT has a lower limit at the ideal S&H display MPRT (13 ms at 60 Hz). Also, the figure shows that overdrive can reduce motion blur significantly, but its effect decreases if the response time is low, i.e. when S&H dominates the response. In further MPRT proposals [142], the N-BET is extended to the E-BET, with E-BET = 1.25 N-BET. This is intended to result in an MPRT that equals the frame time for an ideal hold. However, this destroys the backwards compatibility with the LC-RT, since the 10-90% BET asymptotically approaches the LC-RT for slow response LC, as Figure 5.27b shows. Therefore, the MPRT in the following analysis will use the simpler 10-90% BET, referred to as MPRT_BET, unless mentioned otherwise.

5.7.2 MPRT directly from impulse response

In this section, we use the motion aperture to relate blurred edges directly to the temporal response. This allows a determination of the MPRT without using blurred edges, i.e. without actually using motion information [76].

An edge is blurred when the perceived spatial display aperture has nonzero width. The blurred edge is obtained by convolving the perceived spatial aperture $A_{se}(x)$ (Equation 5.11) with a (unit) spatial step signal $S(x)$:

$$B_x(x) = S(x) * A_{se}(x) = S(x) * (A_s(x) * A_m(x, v)) \qquad (5.27)$$

where $B_x(x)$ is the blurred edge profile, and $S(x)$ is a spatial step:

$$S(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \geq 0 \end{cases} \qquad (5.28)$$

Equation 5.27 shows that the blurred edge is determined by a speed-dependent part, i.e. the motion aperture $A_m(x, v)$, but also by a speed-independent part, i.e. the spatial aperture $A_s(x)$. The spatial aperture can cause blurring even in static images (see Chapter 3), which is not motion blur. Therefore, the spatial aperture should be discounted from the blurred edge when measuring motion blur. In previous MPRT definitions this is not done, which implies the assumption that $A_s(x)$ has no influence on the blurred edge width, i.e. it has zero width. For LCDs this is almost correct, since the spatial aperture is much smaller than the motion aperture, even at low speeds, but this does not hold for all displays. Later in this section we show that this assumption is not needed, but for the following derivation we do assume that $A_s(x)$ has no influence, so $S(x) * A_s(x) = S(x)$, and $B_x(x) = S(x)$ for $v = 0$. The blurred edge now becomes:

$$B_x(x) = S(x) * A_m(x, v) = S(x) * \frac{1}{v} A_t\!\left(\frac{x}{v}\right) \qquad (5.29)$$

Equation 5.29 relates $B_x(x)$ to $A_t(t)$ via Equation 5.12. A convolution of a function $f(x)$ with the unit step $S(x)$ is equal to integrating $f(x)$ from $-\infty$ to $x$, so $B_x(x)$ becomes 18:

$$B_x(x) = \int_{-\infty}^{x} \frac{1}{v} A_t\!\left(\frac{u}{v}\right) du \qquad (5.30)$$

This again shows that the blurred edge depends on the speed $v$. The higher $v$, the wider $B_x(x)$ extends spatially, as was also demonstrated in the previous sections, e.g. in Figure 5.12. Since the width of $B_x(x)$ scales with speed, there is no unique blurred edge for a display. To arrive at a single number that characterizes motion blur, the blurred edge is normalized by substituting $x = vt$, which corresponds to the transformation from BEW to BET (Section 5.7.1). Thus, $B_x(x)$ is transformed into a function of time, $B_t(t)$:

$$B_t(t) = B_x(vt) = \int_{-\infty}^{vt} \frac{1}{v} A_t\!\left(\frac{u}{v}\right) du = \int_{-\infty}^{t} A_t(w)\,dw \qquad (5.31)$$

18 Assume $v > 0$; the case for $v < 0$ is found by using $A_t(-t)$ instead of $A_t(t)$.
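In code, Equation 5.31 reduces the BET measurement to a running integral of the temporal aperture; the sketch below uses an ideal hold aperture, for which the 10-90% BET should come out at about 0.8 frame periods (the N-BET of 0.8 mentioned in Section 5.7.1).

    import numpy as np

    def bet_from_aperture(t, a_t, lo=0.1, hi=0.9):
        b_t = np.cumsum(a_t)
        b_t /= b_t[-1]                    # normalized blurred edge profile B_t(t)
        t_lo = t[np.argmax(b_t >= lo)]
        t_hi = t[np.argmax(b_t >= hi)]
        return t_hi - t_lo                # the 10-90% blurred edge time

    t = np.linspace(0, 2, 2000)           # time in frame periods
    a_t = np.where(t < 1.0, 1.0, 0.0)     # ideal hold aperture
    print(bet_from_aperture(t, a_t))      # ~0.8 frame periods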

Figure 5.28 The MPRT_BET is equal to the time it takes to produce the central 80% of intensity in the temporal aperture.

Note that $B_t(t)$ does not depend on the speed. In the previous examples, the determination of the blurred edge time involved the use of motion at several places: integrating intensity along the motion vector, or convolution with the motion aperture, and scaling position with speed. However, Equation 5.31 shows that $B_t(t)$ is also calculated by simply integrating the temporal aperture. Note also that $B_t(t)$ is normalized, because $A_t(t)$ is also normalized. The MPRT_BET is determined by taking the 10% and 90% transition points in $B_t(t)$. Because $B_t(t)$ is the integral of $A_t(t)$, the MPRT_BET is equal to the time it takes to produce the central 80% of the intensity of the temporal aperture, as shown in Figure 5.28. In other words, using the motion to go from temporal display response and spatial step signal to BEW, and motion again to go back to BET, is in hindsight an unnecessary detour. Motion is not needed at all to characterize motion blur if we use the temporal impulse response. Furthermore, the specific measurement method of the 10-90% BET proposed in the MPRT_BET is directly found from the (integrated) impulse response.

In Sections 5.4 and 5.5.1, it was already mentioned that the LC response typically depends on the gray-level transition. Therefore, the MPRT_BET will also vary with gray-level, whether it is determined from the temporal aperture or not. This also means that, like with the LC-RT, the MPRT_BET of a display has to be derived from a set of response measurements for different gray-levels. Finally, using the temporal aperture to determine motion blur also takes out the influence of the spatial aperture. It is no longer necessary to assume that the spatial aperture is infinitely small, because it is not part of the calculation. The width of the temporal response directly corresponds to the motion blur, i.e. to the speed-dependent part of the perceived blur, even when the spatial aperture is not negligible. On the contrary, when the motion blur is measured from a (moving) spatial edge, the spatial aperture will have an influence. This will at least bias the MPRT to higher values at low speeds, but can severely affect the MPRT when the spatial aperture deviates more from the ideal, for example with CRTs.

5.7.3 Temporal bandwidth

We have described the motion blur effect as a convolution with the motion aperture in the spatial domain. We can also characterize the motion blur of a display by considering the temporal frequency domain. We aim to characterize the dynamic resolution of a display with a more direct measure of dynamic behavior than an edge blur based MPRT. Therefore, we use the temporal MTF of the display (Section 5.5.3), and introduce the temporal bandwidth based MPRT, MPRT_BW:

$$\mathrm{MPRT_{BW}} = \frac{0.35}{f_{3dB}} \qquad (5.32)$$

where $f_{3dB}$ is the frequency where the MTF has dropped 3 dB (to 71%) off the maximum (at DC), which is a measure of the bandwidth (BW) of the system. In Figure 5.17, bandwidths are indicated as a point on each MTF curve. Equation 5.32 is obtained from the following rule-of-thumb for the 10-90% rise-time of first order linear systems [104], e.g. the exponential approximation for the LC response:

$$t_{\mathrm{rise}} = \frac{\ln(9)}{2\pi f_{3dB}} \approx \frac{0.35}{f_{3dB}} \qquad (5.33)$$

We can consider the blurred edge width as the rise-time of the display. This is not only the rise-time of the LC, but of the whole LCD signal chain as illustrated in Figure 5.7. We have already shown that the temporal aperture characterizes all response parts in a simple way, so we can also use the temporal aperture to determine the rise-time, i.e. the bandwidth, of the total LCD system. Now, we can relate the dynamic resolution of the display in a direct way to its temporal properties, in particular the bandwidth of the temporal MTF.

Figure 5.29 shows the MPRT_BW (Equation 5.32) as a function of LC-RT for several LCD types (ideal hold, normal and overdriven). The MPRT_BW corresponds very well 19 to the MPRT_BET, which was shown in Figure 5.27b. Therefore, we conclude that for the display apertures in this figure, the bandwidth provides a measure for the motion blur of a display that is equivalent to the BET. This can be understood by realizing that motion blur corresponds to the dynamic resolution of the display, which is represented by the temporal MTF. A higher bandwidth gives a higher resolution, i.e. less motion blur. Moreover, the bandwidth calculation also shows that we do not need to assume a certain motion in order to determine the amount of motion blur. We can characterize motion blur by only using temporal display characteristics. Actual blurred edge widths do not have to be determined. Either the duration, or the bandwidth, of the temporal aperture can already give us this information.

19 The value of 0.35 is an approximation to the precise value of ln(9)/(2π) = 0.3497. This approximation seems justified, since the accuracy of perceptual experiments that relate BEW to blurring is much less. Moreover, the MPRT_BW for an ideal hold (sinc MTF) also shows a small deviation from the MPRT_BET, of ca. 1%. To make the ideal hold MPRT_BW exactly equal to the MPRT_BET, this value should be 0.354, which is also approximated by 0.35.
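A sketch of Equation 5.32: the -3 dB frequency is read off a numerically computed temporal MTF and converted to an MPRT_BW. For the ideal hold aperture used here this yields roughly 0.79 frame periods, close to the 0.8 of the BET, in line with footnote 19; the grid resolution is an arbitrary choice.

    import numpy as np

    dt = 1 / 256.0
    t = np.arange(0.0, 256.0, dt)         # time in frame periods (long window)
    a_t = np.where(t < 1.0, 1.0, 0.0)     # ideal hold temporal aperture

    mtf = np.abs(np.fft.rfft(a_t))
    mtf /= mtf[0]                         # normalize to DC
    f = np.fft.rfftfreq(len(t), d=dt)     # frequency in cycles per frame

    f_3db = f[np.argmax(mtf < 10 ** (-3 / 20.0))]   # first -3 dB crossing
    print(0.35 / f_3db)                   # MPRT_BW, about 0.79 frame periods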

Figure 5.29 MPRT_BW as a function of LC-RT for several LCD types (ideal hold, normal and overdriven).

5.7.4 Discussion

There are a number of reasons why measuring the MPRT via the temporal aperture would be preferred over the conventional blurred edge based method. First, measuring the BEW is not trivial [110], because it either involves a camera that must follow moving images, or a measurement of the temporal response, followed by an integration of the response along the motion ("time integration" [167]). The temporal impulse response provides an equivalent measure, either directly (Equation 5.31) or via the MTF (Equation 5.32), as shown in Figure 5.29, without using motion. The measurement of a pure temporal step/impulse response is much simpler than transforming the process to the spatial dimension using eye tracking. Note that we are determining an imperfection of the display, not of the HVS, so it should not be necessary to include HVS characteristics to get a number for the temporal performance of the display. Nevertheless, when the perception of motion blur is concerned, for example to determine when a certain amount of blur is visible or objectionable, or when a single number like the MPRT is not enough to characterize the motion blur, the transformation to the spatial domain is useful, because it relates to how we perceive these temporal display aspects. A detailed understanding of the perception of resolution of moving images should lead to better motion blur metrics, and this is still an interesting subject of further research.

Moreover, measuring motion blur via the spatial blurring of moving edges assumes that the original edge is infinitely sharp, i.e. that the spatial aperture of the display is very small. If it is not, the combined effect of spatial and temporal apertures is measured (see Figure 5.15), which does not result in a pure response time, and the measurement is again dependent on the speed. Especially when this measurement is used on CRTs (as in [110]), or on displays that apply (spatial) image enhancement, inconsistencies will occur. Therefore, the temporal impulse response and bandwidth are useful tools to describe the temporal characteristics of displays, and to quantify motion blur. No cameras or moving images are needed, only a (fast) light sensor. The method is less sensitive to timing, and the influence of the spatial aperture is taken out. Of course, a moving camera is still a useful tool to capture images from a display as they appear to an eye-tracking viewer.

Figure 5.30 Temporal response of frame rate increase on an LCD with 12 ms LC-RT: a) step, b) impulse, c) MTF.

5.8 Motion blur reduction methods

The previous section concluded, via several methods, that motion blur is proportional to the duration of the temporal display aperture, where this duration can be measured in several ways, in the time or in the frequency domain. Therefore, the aperture must be shortened in time to reduce motion blur. Referring to the LCD model from Figure 5.7, the temporal response can be improved by modifying any of the three parts: S&H, LC and backlight. The LC-based options were already covered in Section 5.4.1 but, because the LC is not the only contribution to the display response, an infinitely fast LC response is not enough to eliminate motion blur. Consequently, many methods have been proposed to modify the display in other ways in order to tackle this problem [88, 39, 49, 135]. All affect the temporal aperture in some way. In fact, all of these methods attempt to increase the bandwidth of the temporal aperture. This section presents an overview of display based motion blur reduction methods, i.e. methods that change the display response. We describe these in terms of the temporal response model and dynamic resolution, as presented in the previous sections. The next chapter discusses video processing methods for motion blur reduction, which are complementary to the display based methods.

5.8.1 Higher frame rates

To shorten the temporal aperture, there are several methods that change the S&H characteristics of the display. One of these methods is to increase the frame rate [88, 39, 49, 135]. Figure 5.30 shows the step and impulse response and MTF of a 60 Hz LCD, and of two high frame rate (HFR) LCDs 20: 90 and 120 Hz. From the impulse response and MTF it is clear that increasing the frame rate increases dynamic resolution. Doubling the frame rate reduces the hold time by a factor of two, which increases the bandwidth, even when combined with a typical LC-RT of 12 ms.

20 Similar results are found for 50, 75 and 100 Hz.

Figure 5.31 Motion blur (MPRT_BET) of frame rate increase from 60 to 120 Hz.

Figure 5.32 Motion blur simulation of frame rate increase: a) 60 Hz, b) 120 Hz.

Figure 5.31 shows the bandwidth/MPRT for these frame rates, as a function of LC-RT, and Figure 5.32 shows a perceived moving image simulation that illustrates the difference between 60 and 120 Hz. At short LC response times, the double rate gives half the motion blur of the normal rate, while it brings only a minor improvement if the response time is long, or if overdrive is not applied. In order to be effective, this method requires a motion compensated frame rate conversion [88, 49]. The application of simple frame repetition will not decrease the hold time. For example, when the frame rate is doubled by frame repetition, the response is not changed, because a pixel is still addressed with the same drive signal for the same time as when the frame rate is not increased. Moreover, frame repetition can cause annoying artifacts like irregular motion ("judder") for non-integer up-conversion ratios [46]. Besides the required, relatively complex, video processing [43], the higher frame rate also requires better panel performance, such as shorter line address times.

5.8.2 Black and gray frame insertion

An alternative to the motion compensated frame rate increase is Black frame insertion (BFI) [87, 56, 108, 107]. With BFI, each frame is divided in two subframes: a data and a black subframe. In the data subframe, the LCD is driven to the desired level, while the black subframe is always driven to black. BFI only changes the S&H part of the LCD, not the LC response or backlight.

Figure 5.33 Temporal response of black frame insertion: a) step, b) impulse, and c) MTF.

BFI has the disadvantage of halving the light output. Moreover, BFI requires high rate panels just as HFR, because it also increases the addressing speed. The duration of the black subframe does not have to be 50% of the frame. This is called Black Bar Insertion (BBI [56]), and the general method can be called Black Data Insertion (BDI). The actual response of BFI will depend on the specific driving scheme that is applied to drive the panel to black. In the following examples, we assume that the same amount of overdrive can be applied to the black subframe as is used for the data subframe. This represents an average condition between the two extreme driving schemes: no overdrive or full overdrive. The full overdrive scheme should result in a response that can reach black at the end of the frame time. Especially for low black bar ratios, this requires a large amount of overdrive, which is not realistic. With less than the full overdrive applied, the response will not reach black within one frame time for low black bar ratios. Normally, however, it is not possible to apply overdrive to black, so a fast response panel is needed to successfully apply BFI [87], in which case the overdrive is not significant. A detailed comparison of all BFI variations is outside the scope of this thesis, so we continue with the average "same overdrive" condition.

The step, impulse and frequency responses of BFI with 50% and 33% black bar ratio are shown in Figure 5.33. The corresponding impulse response, and consequently also the MTF, are actually identical to the HFR response with frame period equal to the data subframe period. There is one major difference, however, related to the first zero in the MTF. Both BFI and HFR change the position of the zero-crossing in the temporal MTF. With BFI, this corresponds to a dark period (see the step response), which results in large-area flicker, i.e. the MTF at the frame rate is unequal to zero. The HFR display still gives a continuous light emission, i.e. the MTF zero occurs at the frame rate. HFR does not produce large-area flicker, and also preserves the display light output.

A variation on BFI is Gray Frame Insertion (GFI) [73], which prevents the light output reduction of BFI by also using the second subframe to produce light for higher gray values. This also reduces flicker, because high intensities show less temporal modulation. GFI presents a trade-off between motion blur and flicker, by varying the amount of light in the second subframe. The response of GFI depends on the gray-level. It is equal to 0% BBI for high gray values, and equal to 50% BBI for low gray values. Therefore, the amount of motion blur is somewhere in between that of 0% and 50% BBI, depending on the settings of the blur/flicker trade-off.
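The subframe splitting that GFI performs, in the simple setting analyzed in the next example (maximal intensity in the first subframe, the remainder in the second), can be sketched as follows; the half-intensity split point follows from the equal subframe durations assumed here.

    import numpy as np

    def gfi_subframes(gray):
        # Split a gray-level (0..1) over two subframes of equal duration,
        # filling the first subframe before the second produces any light
        first = np.minimum(2.0 * gray, 1.0)
        second = np.maximum(2.0 * gray - 1.0, 0.0)
        return first, second

    print(gfi_subframes(0.25))   # (0.5, 0.0): behaves like 50% BBI
    print(gfi_subframes(0.75))   # (1.0, 0.5): light extends over both subframes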

Figure 5.34 Impulse response of Gray frame insertion. a) Different gray-levels have a different response, e.g. gray-level 0.25 (subframe values 0.5 and 0) has a shorter impulse duration than level 0.75 (subframe values 1 and 0.5). b) 2D plot of impulse response vs. gray-level. The lines indicate the 10% and 90% MPRT_BET threshold times.

An extensive evaluation of GFI is outside the scope of this thesis because, besides the trade-off setting, it will also depend on the statistics of typical source material, i.e. the relative importance of bright and dark image parts. Nevertheless, Figure 5.34 shows the impulse response for a simple setting of GFI (maximal intensity in the first subframe, and minimal intensity in the second) for a number of gray values. The response is plotted in two ways: as a set of aperture curves (a), and in 2D (b), where the horizontal axis is time, the image intensity represents the response intensity, and the vertical axis represents the gray-level dependency. The 2D plot also shows lines that represent the 10% and 90% intensity points (see Figure 5.28), between which the MPRT is measured. This shows that the MPRT actually depends on the gray-level, ranging from approximately 50% of the frame period at low gray-levels to almost 90% of the frame period at high gray-levels.

Figure 5.35 shows the step response (as a 2D intensity vs. time and gray-level plot), the MTF, and the MPRT, where the MPRT is taken as the average MPRT over all gray-levels. The step response shows that GFI will produce flicker only for the low and medium gray-levels. The MTF depends on the gray-level. The MPRT is in between that of 60 Hz and 120 Hz HFR. The MPRT, however, illustrates a typical problem that comes with a display of which the response is variable. We have already discussed the dependence of the LC-RT on gray-level, but with GFI we see another type of variation.

Figure 5.35 Gray frame insertion: a) step response, b) MTF, c) the MPRT (average) lies between 60 Hz and 120 Hz HFR.

When the temporal characteristics (such as the dynamic resolution) of such a display must be summarized in a single number, it is inevitable that this number corresponds to an average value of the response range. Transferring further details of the response will require more statistics, or some sort of functional description of how the response changes, for example a plot like in Figure 5.34b.

5.8.3 Scanning backlight

The backlight can also be modified to reduce the duration of the display aperture. This is achieved by switching off the backlight during a part of the frame, using a scanning backlight (SB) [39, 141]. Figure 5.36 shows both an ideal and a realistic temporal backlight profile, where the realistic profile takes into account characteristics like lamp decay time, spatial light profiles and lamp cross-talk. Since backlights are constructed from a number of horizontal fluorescent tube lamps, and the display surface must be uniformly lit, the lamp cross-talk (the intensity contribution at each position from the different lamps) can be significant.

Figure 5.36 Ideal and realistic scanning backlight temporal intensity profile (35% duty cycle).

Figure 5.37 shows the resulting temporal impulse response, how it is composed of its parts, and how the aperture signal is built in the LCD model (Figure 5.7). The backlight intensity profile is modulated with the LC response, i.e. the pixel transmission. This shows that, due to the finite backlight duty cycle, the backlight cross-talk and the non-zero LC-RT, the first and last parts of the LC response also receive light.

Figure 5.37 The total temporal aperture is a combination of three parts: S&H, LC and backlight. a) The different parts overlayed on the aperture, b) the aperture as a signal in the LCD display model.
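A rough sketch of how the total aperture of Figure 5.37 is composed: an LC transmission curve (here a first-order response) is multiplied by a switched backlight window. The duty cycle, time constant and timing phase are invented example values, and lamp decay and cross-talk are ignored.

    import numpy as np

    t = np.linspace(0, 2, 2000)            # time in frame periods
    tau = 0.3                              # assumed LC time constant (frames)

    # LC transmission for a single input frame: rise during the frame, decay after
    rise = 1.0 - np.exp(-t / tau)
    decay = (1.0 - np.exp(-1.0 / tau)) * np.exp(-(t - 1.0) / tau)
    lc = np.where(t < 1.0, rise, decay)

    # Backlight: on-window of a given duty cycle, centered at a timing phase
    duty, phase = 0.35, 0.55
    backlight = ((t % 1.0) >= phase - duty / 2) & ((t % 1.0) < phase + duty / 2)

    aperture = lc * backlight              # light actually emitted by the pixel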

Figure 5.38 a) The scanning backlight LCD temporal aperture for different duty cycles: 50%, 35%, 25% and 12% (total intensity decreases with duty cycle, which is not corrected here). b) Corresponding MPRT_BET.

Figure 5.39 Motion blur simulation of a scanning backlight LCD. a) Continuous backlight (LC-RT 12 ms), b) scanning backlight (duty cycle 35%), c) as (b), but with sub-optimal timing.

These light contributions at the start and end of the LC transition lead to the so-called ghosting effect [141], discussed later in this section, but directly visible from the temporal aperture. Figure 5.38 shows the effect of varying the duty cycle, on the aperture (a) and on the MPRT_BET (using the 10-90% threshold) (b). Figure 5.39 shows a simulated moving image, with and without scanning backlight. It also shows the effect of using a sub-optimal setting of the backlight timing, i.e. the on-period of the lamps relative to the display addressing. This timing is important, which is the main reason for making the backlight scanning instead of "flashing": the addressing delay varies from top to bottom of the screen, and the lamp timing must be in phase with it [39]. The effect of wrong timing can be clearly seen from the temporal aperture. Figure 5.40 shows the temporal aperture as a function of lamp timing. Note that Figures 5.37a and 5.38a represent a horizontal cross-section in Figure 5.40. So-called ghosting artifacts, i.e. double images, can appear if the timing is wrong [141], i.e. when the backlight is on while the LC is halfway through its transition. This causes side-lobes next to the main impulse peak, which are visible, via the motion aperture, as double images. For each duty cycle and LC response time, the backlight timing can have a different optimum. The backlight timing dependence is another example where the temporal aperture can vary, like the gray-level variation of GFI. The difference is that the backlight timing will be removed by optimization during the design of the display 21, while the GFI variations are inherent.

21 If the number of backlight segments is too small, the timing cannot be optimized for every position exactly. Some timing variation can therefore remain, depending on the position on the screen.

Figure 5.40  The scanning backlight LCD impulse response, at 35% duty cycle, as a function of lamp timing (measured as a fraction of the frame period). The lines indicate the MPRT threshold points at transition percentages of 5, 10 (drawn line), 15, 20, 25, and 30%.

The 2D aperture plot is a useful tool for analyzing the effects of such variations.

MPRT sensitivity

In Figure 5.40, the upper and lower threshold points between which the MPRT is measured (see Figure 5.28) are also plotted for several choices of this threshold. This shows that, due to the complex profile of the impulse response, the MPRT value depends heavily on the MPRT threshold percentage. For example, around the optimum timing (0.55), the positions of the 10% and 90% thresholds vary substantially. This indicates that the MPRT value will be very sensitive to lamp timing.

MPRTs for the scanning backlight LCD and the high frame rate LCD are shown in Figure 5.41, as a function of the applied threshold (e.g. 10% indicates a BET measurement between 10% and 90% intensity). In order to compare different thresholds, the extended blurred edge time (E-BET) must be used. The figure shows the results for a scanning backlight with several duty cycles, and for a normal LCD at various frame rates.

Figure 5.41 also shows that there is a large variation in MPRT with the MPRT threshold percentage, especially for SB. This will certainly make it difficult to measure MPRT accurately in practice, especially using complicated equipment like the measurement system from [110]. Moreover, it shows that it is not possible to capture all the details of the display response in a single parameter. There is definitely room for more research on this topic, especially where complex responses like those of SB are concerned. Nevertheless, in the following section we use the MPRT to compare several motion blur reduction methods. In this comparison, the display parameters (i.e. SB as above) and the threshold (10%) are such that the MPRT sensitivity to

variations in these is moderate. To fully compare the motion blur performance of a display, we need to consider the full temporal response, remembering the conclusion from Section 5.7.4: the temporal response holds all information needed to characterize motion blur.

Motion blur reduction method comparison

Figure 5.41 also allows a comparison of the motion blur reduction methods presented in the previous sections [76]. Note again that the results for BDI (0%, 33% and 50%) are identical to those of HFR (60 Hz, 90 Hz and 120 Hz, respectively), and that GFI lies between 0% and 50% BDI. The figure shows that going from 60 to 120 Hz improves the E-BET (10-90) from 20 ms to approximately 10 ms. Applying a scanning backlight can reduce MPRTs to less than 10 ms at a low duty cycle, while the result at a higher duty cycle is comparable to 120 Hz. Table 5.1 summarizes the MPRTs of several LCD motion blur reduction methods discussed in this chapter.

Finally, it must be noted that the MPRTs presented here can still vary from display to display. First, there is still a large variation in reported system parameters of LCDs, such as response time and backlight characteristics, of which we used only typical values in this comparison. Also, we assumed overdrive is applied and fully functional, but in practice the effectiveness of overdrive heavily depends on the specific response characteristics of the display. Overdrive is typically not perfect for all transitions, so MPRTs may be a little higher in practice. Nevertheless, this effect is small for the 12 ms LC-RT in this comparison, as is apparent from the small difference between the MPRTs for the normal and overdriven LCD.

Figure 5.41  MPRT (E-BET) of several motion blur reduction methods at LC-RT = 12 ms, as a function of the MPRT (lower) threshold: scanning backlight with 15, 25 and 35% duty cycle (drawn, from top to bottom, respectively), and 60, 90 and 120 Hz LCD (dashed, from bottom to top, respectively).

Table 5.1  Comparison of several LCD motion blur reduction methods, using MPRT (BET and E-BET, in ms), for an LC-RT of 12 ms and a frame rate of 60 Hz (unless mentioned otherwise). The methods compared are: ideal hold display, normal LCD, overdrive, high frame rate (90 Hz and 120 Hz), scanning backlight (35%, 25% and 15% duty cycle), black data insertion (50% and 33%), and gray frame insertion.

5.9 PDP motion artifacts

Plasma Display Panels (PDPs) also suffer from motion artifacts. As with the other examples in this chapter, these motion artifacts appear because eye tracking transforms the temporal response into a spatial effect. In PDPs, the temporal response is a result of the subfield driving method introduced in Chapter 2 and shown in Figure 5.42. (The term subfield is common language in PDP literature, so we use it here, although sub-frame, or simply field, could be more appropriate. Although for most PDPs the video has been de-interlaced before the display input, we will use fields instead of frames, for consistency.)

A model for the temporal behavior of PDPs is shown in Figure 5.43. First, the subfield generation splits each video field into subfields, and for each subfield a state $S_n$ is determined using the gray-level and the subfield distribution. Each subfield has a different delay $t_n$, and is converted into light during the sustain time $W_n$. Normally, the viewer is unable to perceive the individual subfields because they change too fast to follow, so the light of the subfields is integrated by the HVS, for each pixel, to the correct luminance value. However, when there is movement in the scene, the viewer will track the motion and, hence, perceive the subfields according to:

$$ I_p(\vec{x}) = \sum_{n=1}^{N_s} W_n\, S_n(\vec{x} - t_n \vec{v}) \qquad (5.34) $$

where, for each (input) field, $I_p(\vec{x})$ is the perceived intensity of an image moving with speed $\vec{v}$, and $W_n$ and $S_n$ are the weight and state (0 or 1, depending on the gray-level) of subfield $n$ at position $\vec{x}$, respectively. The total number of subfields is $N_s$, and the delay of each subfield is given by $t_n$.
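Equation 5.34 is straightforward to evaluate numerically. The sketch below does so for a single image row, using linear interpolation to sample each subfield state at its motion-shifted position; the names and the 1-D simplification are ours.

```python
import numpy as np

# Sketch of Equation 5.34 for one image row: the tracking eye integrates each
# subfield at a position shifted by its delay times the speed.
def perceived_row(states, weights, delays, v):
    """states: (N_s, width) array of 0/1 subfield states; delays: t_n as a
    fraction of the field period; v: speed in pixels per field."""
    n_s, width = states.shape
    x = np.arange(width, dtype=float)
    out = np.zeros(width)
    for n in range(n_s):
        # sample S_n at (x - t_n * v), linearly interpolating off-grid
        out += weights[n] * np.interp(x - delays[n] * v, x, states[n],
                                      left=0.0, right=0.0)
    return out
```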

Figure 5.42  Temporal aperture of PDP subfield driving as a function of input intensity (normalized gray-level). a) An ideal hold display for reference, b) PDP with a 6 subfield binary distribution.

Figure 5.43  Model for the temporal behavior of PDPs.

The temporal delay of each subfield causes a spatial mis-positioning when the viewer tracks moving images in the scene. Consequently, the subfields corresponding to each image in the sequence are perceived at different positions. This leads to the artifact known as Dynamic False Contours (DFC) [103, 165]. DFC is illustrated in Figure 5.44, which shows a space-time plot and a simulated perceived image from the face sequence. The lady moves from left to right across the screen, which causes highly unnatural contouring at the locations of subfield transitions (e.g. where the highest weighted subfield switches on). The dynamic false contours are particularly visible in smooth gray scale areas. There, very large luminance differences can appear when the eye moves relative to the screen (not only in smooth pursuit tracking, but also with saccadic eye motion), because even very small gray-level transitions can change the subfield light profile substantially.

Moreover, the temporal response is spread out over the whole frame period, so PDPs also suffer from motion blur. Nevertheless, motion blur on PDPs is much less than on typical LCDs (the average temporal aperture duration is about 40% of the frame period). We can also see the PDP response as a rather strangely shaped temporal aperture that varies with the gray-level (compare with Gray Frame Insertion from Figure 5.34). However, regarding the temporal response as a subfield-dependent delay leads us to a video processing based compensation method, discussed in Chapter 6.

Figure 5.44  PDP subfield driving creates the dynamic false contour motion artifact. a) Simple space-time description with 6 binary weighted subfields. b) Simulated DFC on the face sequence (color version in Figure 6.7).

Figure 5.45  PDP DFC-reduced drive (spatially averaged) temporal profiles, as a function of gray-level, for a) Linear Drive, and b) Duplicated Subfields.

Since DFC is a very visible and highly unnatural artifact, the simple subfield driving introduced in Chapter 2 is not considered suitable for TV applications. Therefore, DFC reduction methods must be applied. We can distinguish two types of DFC reduction methods. The first type is the display based solution, which changes the subfield driving scheme of the display. The second type is video processing based, which processes the image such that, given the display driving scheme, artifacts are reduced. The next chapter discusses this last type in more detail. The next subsection outlines some of the alternative driving schemes that are applied in PDPs to reduce DFC.

PDP motion artifact reduction by alternative drive schemes

Most DFC reduction methods [97, 150, 134, 71, 166, 164] use a subfield distribution that differs from the simple binary one, to distribute the light emission more evenly in space and time, and to avoid sharp transitions between gray-levels. Besides simply changing the order of the subfields in the frame, two main methods, applied in consumer products, can be distinguished (Figure 5.45): Linear Drive (LD) [150, 71] and Duplicated Subfields (DSF) [97].

LD aims at eliminating the irregular changes in temporal profile between successive gray-levels. This is achieved by preventing subfields from being switched off when the gray-level increases to the next higher level. An extreme case is the full LD, which can also be seen as a pulse-width modulation, shown in Figure 5.45a. With LD, a subfield is only switched on if all the preceding subfields in the frame are also on. This increases the regularity of the temporal response over the gray-level range, and effectively reduces DFC. Nevertheless, LD comes at the penalty of a reduced number of gray-levels: the binary distribution has $2^N$ gray-levels, while the full LD has only $N + 1$. An advantage of LD is that the gray-levels can be chosen such that the opto-electronic transfer (gray-level to displayed intensity) approximates a gamma characteristic (as in Figure 5.45a). This gives a perceptually optimal distribution of the limited number of gray-levels (see Section 2.1.1).

With DSF, a redundant subfield distribution is used. In such a distribution, a gray-level can be obtained with more than one subfield state vector $\vec{S}(n)$ (see Equation 2.3). This is possible because the DSF subfield weight vector $\vec{W}(n)$ has non-binary weights, i.e. it uses a redundant base. In particular, the highest weighted subfields are duplicated several times in the frame period, as shown in Figure 5.45b. For a certain gray-level, there is now a choice of which of the duplicated subfields to use; gray-level 48, for example, can be made by several different subfield state vectors. By spatially dithering this choice, i.e. alternating the choice between adjacent pixels, the DFC is averaged out. Figure 5.45b shows this averaging in the temporal response, where for each gray-level the average of all possible subfield combinations is shown. The result is a temporal response that shows much less abrupt variation with small changes in gray-level. The redundant distribution means that DSF suffers from a reduced number of gray-levels, compared to the binary distribution with the same number of subfields, and the dithering scheme can also result in checkerboard noise patterns. Moreover, the temporal response is still extended over the frame time, so motion blur is also present, sometimes appearing as double images for DSF.
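To make the redundancy concrete, the sketch below enumerates, for a hypothetical redundant weight vector (the actual DSF weights of the thesis are not reproduced here), all subfield state vectors that produce a given gray-level; any of these could be selected by the spatial dither.

```python
from itertools import product

# Sketch: enumerate the redundant subfield codes of a DSF-style distribution.
# The weight vector is hypothetical, with duplicated high-weight subfields.
WEIGHTS = (1, 2, 4, 8, 16, 32, 48, 48, 48)

def state_vectors(level):
    """All on/off state vectors whose weighted sum equals `level`."""
    return [s for s in product((0, 1), repeat=len(WEIGHTS))
            if sum(w * b for w, b in zip(WEIGHTS, s)) == level]

# With these weights, state_vectors(48) returns several equivalent codes
# (16+32, or any single one of the three 48-weight subfields), which can be
# dithered between neighboring pixels to average out the DFC.
```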

A variation on the DSF drive scheme is the so-called Multiple Level Subfield (MLS) scheme [134]. With MLS, the subfield order is reversed for each alternate pixel, which is possible because each subfield, through a special drive scheme, can be set to two different weights (normal and reverse order). Furthermore, more complex subfield drive methods exist [35]. These aim at optimizing, given the number of subfields, both the weight vector and the subfield drive vectors. This optimization can be done off-line, storing the result in a LUT, or theoretically even in real time, by optimizing for all gray-level transitions that occur in an image. These methods are computationally very intensive, and rely on the selected optimization criterion, which should be a meaningful measure for DFC.

Although a lot of research has been done on understanding and reducing DFC, there has been much less effort into measuring PDP motion artifacts in a standardized way, especially when compared to LCD. One reason for this is that, by applying techniques like those described above, it is possible to reduce the DFC very effectively. The artifacts that remain are not motion artifacts, but rather loss of bit-depth, inaccurate gray-level response, increased noise and the like. We can also see from our analysis of the PDP response that motion blur is present, but to a much smaller extent than in LCDs, because the temporal aperture is much shorter than the frame time.

5.10 Conclusions

The image quality of a display is, besides the spatial properties, for a large part determined by its temporal properties. These properties can lead to purely temporal artifacts like flicker, but they also determine the quality of moving images, i.e. the motion portrayal quality. Due to the HVS property of eye tracking, temporal effects in the display are transformed to spatial effects in the perceived image. Although FPDs outperform the CRT in the spatial dimension, their non-ideal temporal properties result in motion artifacts.

We have modeled the temporal display response, in order to analyze the relation between temporal display properties and image quality. The temporal display response can be divided in two parts: temporal addressing and reconstruction. Artifacts in LCDs relate to a non-ideal temporal reconstruction, while artifacts in PDPs and, for example, color sequential displays are related to the temporal addressing. A temporal addressing mismatch between source and display can result in spatial position errors in the perceived image. In general, these are not very severe, but particularly for tiled and color sequential displays, artifacts can be very visible and annoying.

The relation between display temporal reconstruction characteristics and motion artifacts has been discussed for LCD and PDP. We introduced the temporal impulse response as a tool to characterize the dynamic performance of displays. The temporal impulse response has a direct relation to motion artifacts, specifically motion blur, via the motion aperture. LCDs suffer from motion blur, in the first place because of the slow LC response. Nevertheless, the sample-and-hold and backlight are essential parts of the response, which is the reason why decreasing the LC response time is not enough to eliminate motion blur. We presented a simple model of the temporal aspects of the LCD signal chain, which was used to discuss the relation between temporal display properties and motion artifacts. We introduced the temporal aperture of LCDs, which directly shows the importance of the sample-and-hold and backlight in the response.

The temporal impulse response was used to derive a method that can characterize motion blur without involving image motion. This emphasizes that motion blur is a spatial effect with a purely temporal cause. Analogous to the well known spatial MTF, the temporal impulse response relates to the temporal MTF of a display. In particular, the temporal bandwidth that is derived from the MTF has a direct relation to motion blur, which can therefore also be seen as a reduced dynamic resolution of the display. We showed that this method gives results equivalent to previously proposed motion picture response times (MPRT). The temporal impulse response and bandwidth can simplify the quantification of motion blur in displays to a purely time-based measurement and characterization.

Using the temporal impulse response, we compared several display based motion blur reduction methods: higher frame rate, black frame insertion, gray frame insertion and scanning backlight. With these methods, the dynamic resolution of displays is improved considerably, reducing the MPRT to under 10 ms. The comparison shows that a scanning backlight with a 35% duty cycle results in approximately the same reduction of motion blur as a frame rate increase from 60 to 120 Hz.

Finally, we described the temporal response of PDPs, where the subfield driving method results in the dynamic false contour motion artifact. A short overview of alternative subfield driving schemes for PDP motion artifact reduction was presented.

CHAPTER 6

Video processing for motion artifact reduction

Motion compensated filtering for LCD and PDP

In the previous chapter, we concluded that the temporal response of FPDs can lead to artifacts in moving images. With LCDs, motion blur is the dominant artifact, and the amount of motion blur is inversely proportional to the display's temporal bandwidth. To reduce display motion blur, the display characteristics can be improved using a number of methods, as described in the previous chapter. All these improvements are based on reducing the duration of the temporal impulse response, i.e. increasing the display's temporal bandwidth. PDPs also suffer from motion artifacts, not so much because of the temporal extent of their response, but because of its irregular behavior with gray-level. Here too, motion artifacts can be reduced by adapting display parameters.

In this chapter, we investigate video processing algorithms, as an alternative to modifications of the display, for reducing motion artifacts on FPDs.

Figure 6.1  Video processing for display motion artifact reduction consists of two components: temporal aperture correction and temporal delay compensation. Both components require motion estimation.

Contrary to the previous chapter, where we concluded that motion is not needed to characterize motion artifacts like motion blur, we will show that the motion of objects in the image plays an essential role in these algorithms.

Using the display model from Section 5.3, we can define the required processing to compensate for the display, as shown in Figure 6.1. It basically consists of two components: delay compensation and temporal aperture compensation. In Section 6.1, general temporal delay compensation is discussed. Section 6.2 presents algorithms for PDP motion artifact reduction, which form a particular case of temporal delay compensation. Video processing for LCD motion blur reduction, which is the most relevant case for temporal aperture correction, is discussed in Sections 6.3 to 6.5.

6.1 Video processing for temporal delay compensation

Compensation for the delay part of a temporal display response simply consists of adding a temporal delay to the video signal (in so far as this delay is not already introduced on the source side), which is the inverse of the delay that is introduced by the display. In moving images, a temporal delay translates to a spatial delay, as explained in Section 5.3.1. To apply this transformation in a video processing algorithm, the basic techniques of motion estimation and motion compensation [46] are required. Motion estimation finds the motion of objects (which can also be blocks or pixels in the image) in each frame. Motion compensation corresponds to applying a spatial shift, $\Delta \vec{x} = \vec{v}\, \Delta t$, to all objects in a frame according to their motion $\vec{v}$ and the required delay $\Delta t$.

Temporal delay compensation is not very relevant for most displays, since the delay variation does not cause serious artifacts (Section 5.3.1). There are, however, a number of exceptions. The next two subsections discuss two of these: color sequential displays and higher frame rate displays, respectively. PDPs are a third display type that requires temporal delay compensation, which is discussed in Section 6.2.

Color sequential display compensation

In color sequential displays, the temporal delay depends on the color component: $\Delta \vec{t} = [\Delta t_R, \Delta t_G, \Delta t_B]$. Therefore, compensation of this display delay involves shifting all objects in the image [49, 90] according to $[\Delta \vec{x}_R, \Delta \vec{x}_G, \Delta \vec{x}_B] = \vec{v}\,[\Delta t_R, \Delta t_G, \Delta t_B]$, as illustrated in Figure 6.2. After the delay at the display, the RGB components of the image are aligned along the motion vector, so the viewer perceives all color components of a moving image at the same location. Note, however, that color break-up due to eye saccades (Section 5.3.1) is not prevented, since this also occurs on stationary images. The only way to prevent this extreme artifact is to change the display characteristics, since it is not related to image motion.
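A minimal sketch of this per-color compensation, for a single global motion vector, is given below; the plane-shift direction (a pre-shift opposite to the displacement the display delay introduces) and all names are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import shift

# Sketch: motion compensation for a color sequential display. Each color
# plane c is displaced by v * dt[c] at the display, so it is pre-shifted by
# the opposite amount here (sign convention illustrative).
def compensate_csd(rgb, v, dt):
    """rgb: (3, H, W) float image; v: (vy, vx) in pixels per frame;
    dt: per-color delays as fractions of the frame period."""
    out = np.empty_like(rgb)
    for c in range(3):
        dy, dx = v[0] * dt[c], v[1] * dt[c]
        out[c] = shift(rgb[c], (-dy, -dx), order=1, mode='nearest')
    return out
```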

Figure 6.2  Motion compensation for color sequential displays.

Delay compensation for higher frame rates

A classic example of a display that requires temporal delay compensation, although it is usually not referred to as such, is one that runs at a higher frame rate than the input video. In that case, each input frame is converted into one or more output frames, where each output frame has a different delay (relative to the nearest input frame). For example, doubling the frame rate converts one input frame into two output frames, of which one has a delay of half an input frame period. This delay is compensated by a motion dependent spatial offset.

In the previous chapter, higher frame rate driving was presented as a display modification for LCD blur reduction. However, in the context of this chapter, we can question whether a frame rate increase is a pure display modification. Although a higher frame rate reduces the S&H time, which is considered a display parameter, and although we were able to describe the motion blur reduction from higher frame rate driving using the temporal display response, the display frame rate is not independent of the source and of the video processing chain. For FPDs, the frame rate is about the only display characteristic that is adaptable to the source rather than fixed by the display (some displays, however, do require physical modifications to run at higher frame rates).

A second reason why the frame rate cannot be considered a display parameter alone is the influence of the motion compensated video processing. In the previous chapter this was neglected, i.e. it was assumed that the display receives input frames at the required rate. But frame rate conversion might introduce artifacts that reduce the quality improvement obtained by the improved display response. Nevertheless, for most cases where motion blur is most visible, such as large pans over a scene, motion estimation and compensation can be almost artifact free. We consider a detailed discussion of frame rate conversion outside the scope of this thesis, and refer to [46].

Figure 6.3  Motion compensated subfield generation: to reduce motion artifacts on PDPs, the temporal delay of each subfield is compensated.

6.2 PDP motion artifact reduction

The non-ideal temporal response of PDPs also leads to motion artifacts, particularly the Dynamic False Contour (DFC) artifact. Section 5.9 described how DFC is related to the subfield driving method of PDPs, and showed some alternative driving methods that can reduce DFC. These driving methods prevent DFC by changing the subfield distribution, but they do not actually remove the basic cause of DFC: the subfield-dependent temporal delay (see Figure 5.43).

When the motion of objects in the scene is taken into account, we can try to actually put the subfields at the correct position in space and time, i.e. to compensate for the temporal delay of each subfield, as illustrated in Figure 6.3. This is called motion compensated subfield generation, and is described in the following sections.

Subfield delay compensation

As with other types of temporal delay compensation algorithms (Section 6.1), subfield delay compensation involves a motion dependent spatial offset $\Delta \vec{x}_n = \vec{v}\, t_n$. In this context, it is better to describe the PDP response using subfield-dependent delays, rather than as a gray-level dependent aperture (Section 5.9). The subfield-dependent delays enable a compensation of each subfield independently, instead of the field as a whole. In the following, however, it will become clear that special compensation algorithms are required that can deal with the highly nonlinear nature of the subfield drive scheme.

Motion dependent techniques for PDPs have been reported [149], but they were not investigated widely, probably due to the difficulty of finding the correct motion of all objects in a video signal. However, high quality motion estimation and compensation were implemented several years ago in scan conversion ICs for consumer applications [48, 44]. These ICs prove the feasibility of true-motion estimation and robust up-conversion at a consumer price level, enabled by the 3D recursive search block-matcher motion estimation algorithm [43]. Based on this motion estimation technology, we can develop motion compensated algorithms that reduce the motion artifacts of PDPs. The feasibility of this concept has been reported [30], and the algorithm presented here [77] builds on this work to arrive at an optimal motion compensation, applying the motion vectors from [44].

In the following sections, we discuss the basic algorithm for PDP motion compensation.

In Section 6.2.2, we show a method to prevent some of the problems encountered in the basic algorithm. In Section 6.2.3, we propose a new algorithm to further reduce artifacts and arrive at a more optimal subfield generation for PDPs. Finally, we show some results, compare the new algorithm to other methods, and draw conclusions.

Figure 6.4  Rounding errors when shifting subfields. a) With a (1-dimensional) displacement of objects equal to 4 pixels between two pictures, 4 out of the 6 bits from position x are shifted to non-integer pixel positions, and the ignition of a subfield has to be rounded to the nearest pixel as illustrated. b) Simulated perceived image (color version in Figure 6.7).

Figure 6.4 shows the motion compensation method from [30]. This method uses the direct application of subfield delay compensation: the subfield drive value of each pixel in each subfield is shifted over a fraction of the motion vector:

$$ \Delta \vec{x}_n = \vec{D}(\vec{x})\, \frac{t_n}{T} \qquad (6.1) $$

where $T$ is the frame period, and $\vec{D}$ is the motion vector, i.e. the displacement of each object between adjacent frames ($\vec{D} = \vec{v}T$). This works perfectly for certain discrete speeds, but generally the positions after compensation are non-integer. Therefore we have a rounding problem, as we cannot ignite pixels at less than full intensity in a subfield. Moreover, there may be areas in the picture to which multiple vectors point (double assignments), or to which no vectors point (holes). In these cases, ad hoc decisions are required, i.e. which information to use, or how to fill the areas, respectively. A compensation method for these rounding errors is proposed in [55], but in the next section we propose a method to avoid the rounding problem altogether.

Preventing rounding errors

Rather than shifting the information to a position in the current subfield along the vector, we start from a position in the current subfield and look along the motion vector to fetch the input information (located at the start of the picture period). Figure 6.5a shows the procedure. Via the fetch method, we can interpolate on the original intensity signal to get information from non-integer positions, and take the value corresponding to the current subfield using the conventional bit-slicing or look-up table methods.

Figure 6.5  The fetch algorithm. a) The motion compensated intensity value at position x in each subfield is fetched by interpolation between existing pixels in an available picture. The state of that pixel in the subfield is determined from this interpolated intensity. b) The subfields that contribute to the perceived intensity are still taken from different values. c) Simulated perceived image (color version in Figure 6.7).

Furthermore, we can use more advanced interpolation methods that are robust against vector errors, e.g. using multiple fields or non-linear filtering [45, 109]:

$$ I_{MC}(\vec{x}, n) = \mathrm{Med}\!\left( I\!\left(\vec{x} - \tfrac{t_n}{T}\vec{D},\; n_f - 1\right),\; I\!\left(\vec{x} + \left(1 - \tfrac{t_n}{T}\right)\vec{D},\; n_f\right),\; I_{av}(\vec{x}, n_f) \right) \qquad (6.2) $$

where $I_{MC}$ is the motion-compensated luminance valid at position $\vec{x}$ in subfield $n$ (delay $t_n$) between field $n_f$ and field $n_f - 1$. $I_{av}$ is defined as:

$$ I_{av}(\vec{x}, n_f) = \tfrac{1}{2}\left( I(\vec{x}, n_f) + I(\vec{x}, n_f - 1) \right) \qquad (6.3) $$

and Med is the median function, defined as:

$$ \mathrm{Med}(A, B, C) = \begin{cases} A & \text{if } B < A < C \text{ or } C < A < B \\ B & \text{if } A < B < C \text{ or } C < B < A \\ C & \text{otherwise} \end{cases} \qquad (6.4) $$

However, the creation of unintended intensities along the motion vector is not completely prevented, because the subfields along the motion vector will still be taken from different (interpolated) values, as shown in Figure 6.5b. Even a small change in these values can cause unwanted subfield transitions, particularly when the interpolation is around a gray-level where a higher weighted subfield switches, as shown in Figure 6.5c. The following section describes a method to prevent this.
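For a single pixel on one image row, the robust fetch of Equations 6.2-6.4 can be sketched as follows; the helper names and the 1-D simplification are ours.

```python
import numpy as np

def sample(img, pos):
    """Linear interpolation on a 1-D intensity signal (assumed helper)."""
    return float(np.interp(pos, np.arange(img.size), img))

# Sketch of Equations 6.2-6.4 for one pixel at position x in a subfield with
# fractional delay a = t_n / T; D is the displacement per field along the row.
def fetch_mc(I_prev, I_cur, x, D, a):
    backward = sample(I_prev, x - a * D)          # I(x - (t_n/T) D, n_f - 1)
    forward = sample(I_cur, x + (1.0 - a) * D)    # I(x + (1 - t_n/T) D, n_f)
    average = 0.5 * (sample(I_cur, x) + sample(I_prev, x))   # I_av(x, n_f)
    return float(np.median([backward, forward, average]))    # Med(., ., .)
```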

Accumulate and switch subfields

Our new algorithm combines the advanced interpolation strategy of Equation 6.2 with a method to prevent the creation of unintended intensity values along the motion vector. The principle is that, for each pixel in a subfield, we have to choose between making no light, or a fixed amount of light corresponding to the weight of the subfield. With this choice, we can decide upon ignition of the current pixel such that the intended intensity (e.g. from Equation 6.2) is approached as closely as possible by the actual integration along the motion vector.

Figure 6.6  a) The switch criterion decides upon ignition of a pixel in the subfield with the given weight, such that the accumulated value is as close as possible to the desired value. b) The accumulated value along the motion vector over one preceding field period is calculated using interpolations between the subfield pixels.

Figure 6.6a shows the essence of this new algorithm. The switch criterion uses three inputs: the desired (intended) value valid at the time of the subfield, the accumulated (integrated) intensity along the motion vector from previously switched subfields, and the weight of the current subfield. The accumulated value can be calculated using motion-compensated interpolation between the pixels in the previous subfields, to get the contribution of non-integer positions on the subfield grid to the integrated intensity. Now, accumulating the light pulses, e.g. over the last complete field time (see also Figure 6.6b), the accumulated intensity $I_{acc}$ at time $t_n$ (e.g. the time corresponding to the current subfield) in the event of motion is:

$$ I_{acc}(\vec{x}, t_n) = \sum_{j=1}^{r} W_j\, S_j\!\left(\vec{x} - \left(\frac{t_n - t_j}{T} + m\right)\vec{D}(\vec{x}, n_f),\; n_f - m\right) \qquad (6.5) $$

where $S_j(\vec{x}, n)$ is the state of subfield $j$ at position $\vec{x}$ and time $t_j$ in field $n_f$. Because the fetch position is not necessarily integer, $S_j$ is obtained by interpolation, and can therefore have any value between 0 and 1. The accumulation runs over a number of subfields, $r$. The integer $m$ indicates in which field the particular subfield is located ($m = 0$ for the current field, $m = 1$ for the previous field, etc.).

In this general form of the algorithm, any subfield distribution can be used, and the integration (accumulated value) can be calculated over any number of preceding subfields, as long as the switch criterion guarantees that the integration along the motion vector yields the correct intensity. However, to ensure the stability of the algorithm, it is limited to the subfields belonging to the current field only, i.e. the algorithm cannot look beyond the field period boundaries:

$m = 0$ and $r = N_s$ in Equation 6.5. The processing is done in order of decreasing weight, and the switch criterion becomes:

$$ S = (I_{acc} + W < I_{MC}) \qquad (6.6) $$

where $S$ is the state of the pixel in the current subfield (weight $W$), $I_{acc}$ is the value accumulated over the previously calculated, more significant (higher weighted) subfields (Equation 6.5), and $I_{MC}$ is the desired value for the pixel at the time instance of the current subfield (Equation 6.2). In words: the pixel in the current subfield is switched on when its weight added to the accumulated intensity is smaller than the desired value. At the start of the field, the accumulated value is zero, because the accumulation is limited to within the current field, which is empty at the start. The pixels in the field can only add to the accumulated value, and there is no means to decrease the amount of light whenever there is an overshoot. Therefore, the intensity is filled up to the desired value, as defined by Equation 6.6.

Note that, although the order in which the subfields are processed is sorted by decreasing weight, the algorithm does not specify the subfield distribution, which is separately specified by the combination of subfield times $t_n$ and weights $W_n$, where the index $n$ specifies the processing order. Therefore, this algorithm can be combined with any subfield distribution. To decrease disturbances whenever the estimated motion does not equal the tracking by the viewer (either due to a wrong estimation or, for instance, saccadic eye movements), the algorithm can be combined with other artifact reduction methods, such as those employing a special subfield distribution. This is, however, considered to be outside the scope of this thesis.
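For a pixel without motion, where the accumulation of Equation 6.5 reduces to a running sum, the fill-up principle can be sketched in a few lines. Note that a `<=` is used here so that exact levels can be reached, whereas Equation 6.6 is written with a strict inequality; the weights are again hypothetical.

```python
# Sketch of the fill-up switch criterion (Equation 6.6) for one pixel with no
# motion. Subfields are processed in order of decreasing weight, and each is
# switched on only while the accumulated light stays below the desired value.
def fill_up(desired, weights):
    """weights: subfield weights sorted in decreasing order (hypothetical)."""
    acc = 0.0
    states = []
    for w in weights:
        on = acc + w <= desired           # switch criterion
        acc += w if on else 0.0
        states.append(int(on))
    return states, acc                    # acc approaches `desired` from below

# Example: fill_up(48, [48, 48, 48, 32, 16, 8, 4, 2, 1])
# -> states [1, 0, 0, 0, 0, 0, 0, 0, 0], accumulated value 48
```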

Results

Figure 6.7 compares the uncompensated binary weighted subfields (BWS) with the shift, fetch and fill-up motion compensated subfield generation algorithms, using simulations of the face sequence (see Figure 5.44) as perceived by a viewer tracking the motion.

Figure 6.7  Results of motion compensated subfield generation (simulated perceived images) on the face sequence, using 8 binary weighted subfields. a) No reduction (BWS), b) shifting, c) fetch, d) fill-up.

Figure 6.8 compares the fill-up algorithm to the basic BWS distribution, and to several (DFC-reduced) subfield driving methods, using an error metric that takes into account the characteristics of the human visual system, including motion tracking [55]. This metric essentially calculates the accumulated intensity along the motion trajectory for each pixel in the image, much like Figure 6.6b. The DFC reduction methods are DSF and MLS (see Section 5.9.1), subfield shifting, and an optimized shift method that compensates for rounding errors [55].

Figure 6.8  Comparison of perceived error [55] on the face sequence (45 fields), for several DFC reduction methods (BWS, DSF, MLS, shift, optimized shift and fill-up).

6.3 LCD temporal response compensation

As shown in Chapter 5, LCD motion artifacts are not caused by temporal delay effects, but by the reconstruction part of the LCD temporal response: the temporal aperture. The apertures that are typical for LCDs, i.e. relatively regular profiles with a duration approximately equal to the frame period, lead to motion blur as the main motion artifact. The following sections describe video processing algorithms that are targeted at reducing this motion blur.

To define the general properties of a video processing algorithm for motion blur reduction, we first repeat Equation 5.26, which shows that the perceived moving image $I_p^f(\vec{f}_x)$ is obtained from the input image $I_c^f(\vec{f}_x)$ by applying a spatial filter $A^f(\vec{f}_x, \vec{v})$:

$$ I_p^f(\vec{f}_x) = I_c^f(\vec{f}_x)\, A^f(\vec{f}_x, \vec{v}) = I_c^f(\vec{f}_x)\, \mathrm{sinc}(\pi\, \vec{v} \cdot \vec{f}_x\, T_h) \qquad (6.7) $$

Here, $A^f(\vec{f}_x, \vec{v})$ is the motion aperture in the frequency domain. This shows that motion blur can be seen as a speed-dependent spatial low-pass filtering of the image. We use the example of the ideal hold display throughout the following sections, but other responses will also be considered. The ideal hold response has a sinc-shaped frequency response; see Figures 5.17 and 5.25 for some plots of this response.

Figure 6.9  A pre-compensation (inverse) filter is applied to reduce the motion blur effect of the display+eye combination.

We are interested in a video processing algorithm that compensates for this low-pass filtering in the video domain [49, 15], as illustrated in Figure 6.9. The application of a video processing based method for LCD motion blur reduction has some specific advantages over panel based methods: as mentioned in Section 5.8, panel based methods require better panel specifications, can cause large area flicker, or reduce light output.

Inverse filtering

To find an algorithm that compensates the display characteristic, we define the inverse filter to Equation 6.7:

$$ H_{inv}^f(\vec{f}_x) = \frac{1}{\mathrm{sinc}(\pi\, \vec{v} \cdot \vec{f}_x\, T_h)} \qquad (6.8) $$

This is a purely spatial filter, reflecting the observation that the temporal aperture of the display, combined with eye tracking, results in a spatial low-pass filter. The cascade of inverse filter, display and viewer ideally results in a perceived image that approaches the original image as closely as possible.

A similar problem is known from image restoration, where the blurring of images due to camera motion is considered [145, 115, 113]. The solution involves a deconvolution of the signal with the blurring kernel, typically by direct execution of Equation 6.8 in the frequency domain. This is, however, not a good solution for our problem. First of all, these methods are computationally very intensive. Secondly, objects in the image can have many different speeds, which also change abruptly at object boundaries. This requires a locally adaptive method that responds quickly to motion vector changes. Therefore, we propose an alternative solution: Motion Compensated Inverse Filtering (MCIF) [83, 82, 49].

6.4 Motion compensated inverse filtering

From the characteristics of the motion aperture (the display + eye filter, Equation 6.7), a number of observations can be made. First, the suppression of high frequencies increases with increasing speed. Second, the vector dot-product $\vec{v} \cdot \vec{f}_x$ is zero for components $\vec{f}_x$ perpendicular to $\vec{v}$, so the filter has no effect on these frequencies.

Figure 6.10  Amplitude response of a) the display-eye filter (sinc) for a speed of three pixels per frame, the corresponding inverse filter, and a simple approximation; b) the inverse filter as a function of speed.

This shows that the motion blur only occurs in the direction of the motion, i.e. only frequency components parallel to the motion direction are attenuated. Therefore, the inherently 2D behavior of the filter can be simplified by only considering frequencies parallel to the motion. This requires that the inverse filter acts along the motion vector, which can be oriented in any direction. Figure 6.10a shows the 1D frequency response, parallel to the motion, of the display-eye combination and of the inverse filter, for a speed of three pixels per frame. Figure 6.10b shows the 1D response of the inverse filter as a function of speed.

We further observe that the sinc characteristic can cross zero, at frequencies that depend on the speed (integer multiples of $f_s/v$). These zeros correspond to an infinite gain in the inverse filter and cannot be compensated. Therefore, we can only approximate the original, as the inverse filter can never be perfect. Moreover, a practical solution should also limit the computational complexity. We therefore also look for an alternative to typical inverse filtering methods [145, 115, 113], since these require frequency transforms, recursive or iterative techniques, or large matrix inversions.

Basic implementation

Figure 6.10a also shows the response of a simple 3-tap $[-1, 2, -1]/4$ high-pass filter added to the original. When the gain of this filter is adjusted, the inverse filter can be approximated to a reasonable extent for different speeds. This results in a very simple compensation system, the basic implementation of a Motion Compensated Inverse Filter (MCIF), as shown in Figure 6.11. The speed dependent behavior is controlled by motion vectors from a motion estimator. We use a 3-D recursive search block-matcher [43]; ICs based on this estimator have been available at a consumer price for some years [44]. This type of motion estimator has the very important advantage over other popular estimators that it is able to estimate the true motion of objects. True-motion vectors are essential for this application, as this is the motion that best describes the eye tracking of the viewer for each separate object.
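A minimal sketch of this basic scheme, for a single global motion vector (per-pixel or per-block vectors work analogously): the outer taps of the $[-1, 2, -1]/4$ kernel are placed one pixel from the center along the motion direction by bilinear interpolation, and the gain grows with speed. The linear gain law used here is an illustrative assumption, not the tuned curve behind Figure 6.12.

```python
import numpy as np
from scipy.ndimage import map_coordinates

# Sketch of basic MCIF with one global motion vector v = (vy, vx) in px/frame.
# The [-1, 2, -1]/4 high-pass kernel is rotated onto the motion direction by
# bilinear interpolation; its gain grows with speed (illustrative gain law).
def mcif_basic(img, v, gain_per_pxf=0.5):
    speed = float(np.hypot(v[0], v[1]))
    if speed < 0.5:                         # (almost) static: nothing to do
        return img.copy()
    uy, ux = v[0] / speed, v[1] / speed     # unit vector along the motion
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
    fwd = map_coordinates(img, [yy + uy, xx + ux], order=1, mode='nearest')
    bwd = map_coordinates(img, [yy - uy, xx - ux], order=1, mode='nearest')
    hp = (2.0 * img - fwd - bwd) / 4.0      # [-1, 2, -1]/4 along the vector
    return img + gain_per_pxf * speed * hp  # speed-dependent boost
```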

Figure 6.11  a) Basic motion compensated inverse filtering. The high-pass filter (HPF) is oriented along the direction of the motion (b), and the gain is controlled by the speed of the motion.

Figure 6.12  a) Amplitude response of a simple MCIF filter, along the motion direction. b) Combined response of MCIF and display+eye.

The MC-inverse filter is oriented along the motion vector by interpolating the 1-D filter, as shown in Figure 6.11, which results in an essentially 2-D filter kernel. The (high-pass) filter can be very simple, i.e. low-order with very few taps, when we adjust the gain with the size of the motion. Figure 6.12 shows the amplitude response of this filter, and the combined result of filter and display+eye (see Figure 6.9). It shows that, compared to Figure 5.25, there is much less suppression of high frequencies.

Although we use the example of an ideal hold display, MCIF is not limited to this single response. Filters for other temporal display responses can be derived from the display temporal MTF (Figure 5.17). In particular, displays that have a modified response for motion blur reduction (Chapter 5) can still benefit from MCIF if the MCIF filter is adapted to fit the display MTF. In most cases, this simply means that the MCIF gain can be lowered, because the display MTF is higher.

Temporal MTF compensation

As discussed in Chapter 5, motion blur directly relates to a reduced temporal MTF of the display. MCIF can therefore be seen as compensating for a non-ideal temporal MTF [75].

Figure 6.13  Sharpness enhancement / spatial MTF correction. High frequencies are boosted by adding a spatial high-pass filtered version of the input signal.

In the spatial dimension, such video processing corrections are very common. The effect of a non-ideal spatial MTF is a loss of resolution, i.e. blurring. Spatial MTF compensation is commonly referred to as sharpness enhancement or unsharp masking [46], but in this respect it can also be seen as a form of spatial aperture correction. Figure 6.13 shows a typical system diagram. This type of processing is usually not able to fully compensate the MTF, because a very low MTF requires high signal amplification, which can either run into dynamic range limitations or suffer from noise amplification. In the next sections, we will discuss similar effects for the MCIF temporal MTF compensation.

The main difference between MCIF and spatial sharpness enhancement is its dependence on motion information. Motion and eye tracking make temporal MTF compensation quite different from the spatial variant. When the spatial approach from Figure 6.13 is directly transformed to the temporal dimension, the spatial HPF becomes a temporal filter. Overdrive is in fact an example of this, using a two-frame HPF, where the gain depends on the new and old frame intensities via a look-up table. However, overdrive can only improve the response up to the limit of the S&H (see Figures 5.14 and 5.29). When overdrive is applied with too high a gain, ghost images can appear, because the temporal response overshoots the desired value. Following the sampling theorem, a temporal filter can only affect frequencies below the temporal Nyquist frequency. Therefore, such a filter cannot correct motion blur beyond the S&H limit, because for this purpose higher temporal frequencies must also be reached. MCIF achieves this by transforming the temporal effect to the spatial domain with motion information, thereby letting the compensation filter operate in the spatial domain.

Results

The MCIF system of Figures 6.11 and 6.12 was tested on an LCD-TV simulation setup, which consists of a PC-based video streamer that can play back stored sequences in real time, a DVI to LVDS panel interface board, and a 30 inch LCD-TV panel (1280x768 @ 60 Hz, without additional processing). Although the display panel had a listed response time of 12 ms, we measured the response times for each gray-level transition and found an average response time of 20 ms. To further increase the response speed, overdrive was used to bring the response time to within one frame time.

In order to visualize the results in this thesis, we simulated the blur that is seen by an eye tracking viewer. This simulation is based on motion vectors obtained from the motion estimator described in Section 6.4.

Figure 6.14  Results of motion compensated inverse filtering. a) Original image: the scene pans from left to right across the screen. b) (Simulated) motion blur on a sample-and-hold LCD display, as seen by an eye tracking viewer. c) (Simulated) image as seen by an eye tracking viewer, after pre-correction with MCIF.

The light intensities as a function of time, resulting from the temporal response of the display, are integrated along the motion trajectory. Figure 6.14 shows an example of the perceived image, for an impulse-type display (the original image) and for a sample-and-hold display with and without MCIF correction. This simulation is able to show the effect of the sharpness increase. The improvement is comparable to the one observed on the LCD panel mentioned above. The simulations showed, as illustrated in Figure 6.14, that the sharpness of moving images is considerably improved.

Nevertheless, some limitations of the basic MCIF implementation can be observed. These limitations are caused by the fact that the filter amplification becomes very high even for moderate speeds. Although the simple filter already makes the necessary approximation to the infinite gains in the ideal filter of Equation 6.8, the filter amplification can still be very high, as shown in Figure 6.10. For example, at a speed of 3 pixels per frame, a signal at the Nyquist frequency is amplified by a factor of 4. This results in two problems: dynamic range limitations and noise amplification. These are discussed in the next section, where a robust version of MCIF is introduced.

6.5 Robust MCIF

The dynamic range limitation is fundamental: we cannot compensate a black-to-white transition with any video processing method without sacrificing (static) contrast. However, in typical video source material most transitions are between intermediate gray-levels, so there is still room for improvement of these. The modified version of MCIF described in the following also reduces the effects of limited dynamic range, by restricting the overall gain of the filter.

The noise amplification actually has two causes. Besides the high filter amplification, the motion estimator is likely to find an incorrect vector in image parts that contain no detail. These parts will contain noise, which can be undesirably amplified at high gains. Nevertheless, even when the motion has been estimated correctly, noise amplification is visible. If we assume that the viewer follows the motion, the noise appears blurred in the same way as the rest of the image. This should result in a perceived noise amplitude that is equal to the amplitude present in the (static) original image. This can be illustrated when we simulate the perceived image, i.e. when we integrate the intensity over time along the motion trajectory [127]. Figure 6.15 shows the MCIF processed images (the display input), and the simulated perceived images. Noise is clearly visible in the MCIF processed image (Figure 6.15b). In the simulated result after display and eye tracking (Figure 6.15d), this noise has been filtered out. In other words, the display filter compensated for the effect of MCIF, just as we intended (see Figure 6.9).

These simulations are very useful to illustrate the reduction of motion blur, but they do not show the noise problem. In practice, when the processed images are shown on an LCD display, noise amplification is clearly visible. The perceived result is sometimes even closer to Figure 6.15b than to Figure 6.15d.

Figure 6.15  a) Original image. b) Image processed with basic MCIF. c), d) Simulated perceived images of a), b), respectively.

The cause of this is not clearly understood, but it may be a reduced accuracy of the eye tracking. We suppose that, when the signal-to-noise ratio is low enough, i.e. particularly in low-detail moving areas, the human visual system is able to ignore the object motion and fix on the noise, since noise has no unique direction of motion. To further understand this effect, detailed perception studies would be required, which is an interesting area for further research. For our purpose this is not required, and we conclude that excessive noise amplification in the MCIF corrected images must be prevented. The next sections describe modifications of the basic MCIF algorithm to increase noise robustness.

Noise suppression

First, we add traditional noise suppression to the system, as is commonly applied in video noise reduction [46, 50]. Figure 6.16 shows the implementation.

Figure 6.16  Motion compensated inverse filtering, with noise suppression.

This noise reduction is based on a coring operation,

$$ \mathrm{Core}(x) = x - \min(\max(-x_t, x), x_t) \qquad (6.9) $$

which, controlled by a threshold $x_t$, suppresses only the low-amplitude (high frequency) signal.
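The coring operation itself is a one-liner; a sketch with our own names:

```python
import numpy as np

# Sketch of Equation 6.9: amplitudes inside [-x_t, x_t] are removed from the
# high-pass detail signal, so only sufficiently strong detail is amplified.
def core(hp, x_t):
    return hp - np.minimum(np.maximum(-x_t, hp), x_t)

# core(hp, 0.0) leaves hp unchanged; a larger x_t suppresses more of the
# low-amplitude (noise-like) high-frequency content.
```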

Figure 6.17  Amplitude response of speed adaptive MCIF, shown for speeds of 2, 4 and 8 pixels per frame.

This will apply the compensation only in regions where there is sufficient detail, as these are also the regions where motion blur is most objectionable. As a further measure to reduce noise amplification, we limit the compensation to a certain maximum speed, typically set around 10 pixels per frame (2.5 s/w at 1280 pixels/w).

Adaptive MCIF

A further modification of the basic MCIF to increase noise robustness is based on the following observations. First, the human visual system is more sensitive to the lower spatial frequencies, and the higher frequencies generally have a lower signal-to-noise ratio. Second, in common video material, moving objects will not contain the highest frequencies, due to the limitations of the camera (camera blur). For this reason, viewers are used to losing some detail at high speed, although not to the extent, i.e. down to such low spatial frequencies, as caused by LCD panels. Nevertheless, the basic MCIF system applies the highest gain to the highest spatial frequencies. Therefore, for higher speeds, we give priority to compensation of the lowest frequencies, and leave the highest frequencies unchanged. This transforms the high-pass filter (Figure 6.11) into a band-pass filter. After adding the filtered signal to the input, the final MCIF result is a medium-frequency boosting filter, as shown in Figure 6.17.

To limit the amplification of the higher frequencies at high speeds, and only compensate the lowest frequencies, we make the filter response speed adaptive as well. This extends the speed dependency of the MC-inverse filter from a simple varying gain with a rotating but fixed-response filter, to a varying filter response. To achieve this, we change the direction dependent interpolation, as shown in Figure 6.18a. The positions of the interpolated taps vary not only with the direction of the motion vector, but are also located at a larger distance from the central tap for higher speeds, as shown in Figure 6.18c. This shifts the response of the 1D high-pass filter to lower frequencies. As a side-effect, the gain of the filter no longer needs to be increased with speed.
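The speed-adaptive tap placement can be sketched as follows: the tap offsets lie along the motion vector, with a spacing that scales with the speed relative to the reference speed of 8 pixels per frame used in Figure 6.17. The tap count and names are illustrative assumptions.

```python
import numpy as np

# Sketch: speed-adaptive tap positions for the 1-D band-pass filter, placed
# along the motion vector. At the reference speed the spacing is one pixel;
# at higher speeds it widens, shifting the boost band to lower frequencies.
def tap_offsets(v, n_taps=7, v_ref=8.0):
    v = np.asarray(v, dtype=float)             # (vy, vx) in pixels per frame
    speed = np.hypot(v[0], v[1])
    unit = v / max(speed, 1e-6)                # direction of the motion
    spacing = speed / v_ref                    # tap distance in pixels
    k = np.arange(n_taps) - n_taps // 2        # symmetric tap indices
    return np.outer(k * spacing, unit)         # (n_taps, 2) sub-pixel offsets
```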

Figure 6.18  Motion compensated inverse filtering, with speed dependent filter. a) Schematic, b) basic tap rotation and interpolation, c) speed adaptive interpolation: the tap distance also varies with the size of the motion.

Furthermore, instead of using a simple 3-tap filter, we use a band-pass filter with more taps, designed to operate at a certain reference speed. The frequency response of this filter is shown in Figure 6.17, at the speed of 8 pixels per frame. At this speed, the speed-adaptive tap spacing coincides with the pixel spacing. The speed-dependent response results from the fact that, below this reference speed, the filter taps are spaced more closely than the pixels, so the frequency response is stretched to higher frequencies (and vice versa at higher speeds).

Results

The MCIF corrected images of the basic and adaptive MCIF methods are shown in Figure 6.19. Compared to the basic MCIF, noise is amplified much less in the adaptive version. The blur simulations in Figure 6.19d and f do not actually show this difference, as indicated earlier: the noise is visibly reduced in the MCIF image (the display input), but this does not show up in the simulation of the (perceived) display output image. However, on an actual panel, the difference in noise amplification is very visible, while the sharpness is comparable.

The adaptive MCIF algorithm has been applied to real video sequences, which were displayed on a fast LCD panel (see also Section 6.4.3). Motion blur was reduced to a level comparable to a CRT, and noise amplification of the adaptive MCIF was reduced compared to the basic MCIF. Only for very critical (graphics-like) sequences was motion blur still visible on the LCD.

Figure 6.19  Adaptive MCIF (b, c, e and f are identical to Figure 6.15). a) Image after adaptive MCIF, b) original image, c) after basic MCIF; d), e), f) simulated perceived images of a), b), c), respectively.

MCIF and MPRT

In Chapter 5, we concluded that the display temporal response (MTF) contains all information needed to characterize its dynamic resolution, i.e. the sharpness of moving images. However, this is not the case for a display system that applies MCIF. Measuring the temporal MTF of a display system including MCIF, e.g. to determine an MPRT value, is not very useful for determining the motion blur of the total system. Take, for example, the temporal step response measurement, using a temporally varying but stationary test patch. This measurement will not change when MCIF is applied, because the temporal change is not caused by motion. Moreover, and again similar to the spatial case, the algorithm is aimed at achieving a perceptual quality increase. As with other video enhancement algorithms, this is hard to measure physically, e.g. with test patterns. We therefore argue that the temporal MTF should be treated in the same way as the spatial MTF: as a measure of display performance, and not as a measure of the performance of a complex video processing system.

6.6 Conclusions

Motion artifacts in FPDs can be reduced by specially designed video processing algorithms. These algorithms involve motion estimation and compensation techniques. According to the general model of the temporal display response from Chapter 5, motion artifact reduction algorithms fall into two main categories: temporal delay compensation and temporal aperture compensation. The most interesting displays for temporal delay compensation are color sequential displays and PDPs, where the delay depends on the color component and the subfield number, respectively.

We have used the temporal display model from the previous chapter to develop a video processing algorithm that reduces the dynamic false contour artifacts in PDPs. The algorithm uses subfield motion compensation, combined with a fill-up subfield selection criterion. With this algorithm, it is possible to compensate a moving video sequence for display on a PDP. The algorithm does not suffer from rounding errors, holes and double assignments, and is flexible with respect to the subfield distribution. It is capable of preventing dynamic false contours without introducing trade-offs in display performance such as the number of gray-levels or motion blur.

Motion compensated inverse filtering (MCIF) is a video processing based compensation of the display temporal aperture/MTF. On LCDs, it is able to reduce motion blur, caused by the sample-and-hold effect, for common video material. MCIF provides a solution for motion blur reduction that is complementary to modifications of the display and backlight module. The compensation of the temporal MTF with video processing is different from its spatial counterpart, because it requires a transformation to the spatial domain using motion. Being an entirely different approach, the reduction of motion blur through video processing has different advantages and disadvantages with respect to panel-based methods. In particular, the advantages are the independence of frame rate, and the fact that it introduces no large-area flicker or light loss. Moreover, the MCIF filters can be adapted to a new temporal MTF if that has been changed by any of the panel-based methods. Noise amplification is the main drawback of MCIF, and we introduced an adaptive MCIF method that increases noise robustness.
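To illustrate the last point, the sketch below derives correction taps from a given temporal MTF, here the sinc response of a plain full-frame sample-and-hold aperture. This is a hedged sketch, not the filter design used in this thesis: the hold model, the reference speed, the gain clip and the use of scipy.signal.firwin2 are assumptions of the example; a temporal MTF changed by a panel-based method would simply replace the sinc term.

```python
import numpy as np
from scipy.signal import firwin2

def mcif_taps(n_taps=9, v_ref=8.0, max_gain=4.0, n_freq=128):
    """Derive MCIF taps from a temporal MTF (illustrative values).

    A full-frame sample-and-hold aperture has MTF(f_t) = |sinc(f_t)|,
    with f_t in cycles/frame. Tracked motion at v_ref pixels/frame
    maps spatial frequency fx (cycles/pixel) to f_t = v_ref * fx, so
    the target response is 1/MTF(v_ref * fx), clipped at max_gain to
    bound the noise amplification.
    """
    fx = np.linspace(0.0, 0.5, n_freq)         # up to spatial Nyquist
    mtf = np.abs(np.sinc(v_ref * fx))          # np.sinc(x) = sin(pi x)/(pi x)
    target = np.minimum(1.0 / np.maximum(mtf, 1e-6), max_gain)
    # linear-phase FIR approximating the clipped inverse response
    return firwin2(n_taps, fx * 2.0, target)   # frequency axis: Nyquist = 1

taps = mcif_taps()   # to be applied along the motion direction, as in MCIF
```

The gain clip embodies the noise versus sharpness trade-off noted above: an unclipped inverse would require unbounded gain near the zeros of the hold MTF.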

CHAPTER 7

Discussion, conclusions and further work

To conclude, this chapter first discusses the contributions of this thesis to the general topic of flat panel display signal processing. Then, the main conclusions from this thesis are summarized in Section 7.2. Finally, Section 7.3 gives an outlook on possible further work.

7.1 Discussion

The world has seen some remarkable developments in display technology over the past decade. Since the introduction of flat panel displays as an alternative to the CRT for television, progress has been so fast in LCD and PDP technologies that FPDs have surpassed CRTs in many aspects. Panel sizes have increased from 20 to over 100 inch, brightness increased from less than 200 to over 800 cd/m², contrast ratios went up from 1:100 to over 1:1000, and viewing angle problems have been eliminated. Moreover, FPD resolutions have increased from SD (less than 500 lines) to HD (over 1000 lines) in just a matter of years, paving the way for the introduction of HD-TV. In this respect, the superior performance of FPDs in the spatial dimension is key.

The analysis and video processing presented in Chapters 3 and 4 have shown that FPDs clearly have a strong point in the spatial dimension, not only regarding image geometry, but foremost regarding resolution. Making optimal use of the color subpixel arrangement by means of video processing, preferably combined with alternative arrangements, further increases this advantage. At present, the spatial performance of FPDs is already better than that of TV signals, so for this application there is no real need to further increase performance. Nevertheless, there will always be opportunities to apply resolution improvements.

First, the increasing use of multimedia will broaden the application range that a display must handle. Computer graphics, games, internet browsing and digital photo viewing are all applications that provide higher resolution than current FPDs. Second, instead of using improved video processing to increase resolution on a display with a certain number of pixels, it can also be used to decrease the number of pixels of a display, i.e. reduce its cost, without affecting resolution.

The last remaining aspect where FPDs do not outperform CRTs is motion portrayal. CRTs never really had problems in this area, so the introduction of FPDs has actually added a new category to display science and related signal processing. This thesis has presented some of the first work in this area. In Chapters 5 and 6, it was shown how the temporal characteristics of FPDs can be analyzed with respect to motion portrayal, and how to significantly improve FPD motion portrayal. Here, too, improvements can be achieved by modifying the display, or by video signal processing. In contrast to the spatial dimension, the temporal dimension has been largely underexposed in TV standards. With the improvements that were discussed in this thesis, FPDs have made a significant step forward. More specifically, FPDs are now no longer the major limiting factor in the total TV signal chain with respect to motion portrayal. This gives an ideal starting point to take the total static and dynamic performance of the TV chain to a next level.

7.2 Conclusions

In this thesis, the properties of FPDs have been analyzed from a signal processing perspective, and related video signal processing algorithms were developed, with a focus on static and dynamic resolution. The main findings are summarized below.

An overview of the basic properties of CRTs and FPDs (LCD and PDP) was presented in Chapter 2, showing that:

- Display properties can be categorized according to the basic functions of the display system signal chain: opto-electronic effect, spatial and temporal addressing and reconstruction, and color synthesis (Table 1.1).
- FPDs execute the same basic functions as the CRT, but with very different characteristics (Table 2.1).
- FPDs require video format conversion processing in the display signal chain that was not needed with CRTs. This video processing represents the conversion from source format to display format, so there is no definite boundary between video signal processing and display signal processing.
- Video processing functions in the display chain also provide opportunities for display-specific video processing algorithms, to further increase image quality.

In the main part of this thesis, Chapters 3 to 6, the relation between FPD properties, image quality and FPD-specific signal processing has been investigated for static and dynamic resolution.

Static display resolution

Chapter 3 has presented an analysis of the static resolution of color matrix displays (CMDs) by considering their spatial properties. A model of the spatial display signal chain has been presented, and used in the analysis. The analysis showed that:

- The addressing format, spatial aperture and color subpixel format all have an influence on the perceived resolution of a CMD.
- Perceived resolution on CMDs is increased when the subpixel arrangement is taken into account by means of subpixel sampling, because this reduces luminance aliasing in exchange for less visible color aliasing.
- Subpixel sampling reduces the distortions due to the fundamentally non-ideal display aperture for frequencies below Nyquist. This effectively increases the Kell factor for color matrix displays with subpixel sampling.
- The perceived resolution increase relates to an increase in the modulation transfer function of the display, resulting in higher sharpness.
- Signal pre-filtering, aimed at removing the most severe color aliasing, is an essential function in a display signal chain with subpixel sampling.

Several alternative (2D) subpixel arrangements were described using unit pixels and lattices, and combined with the display signal model to analyze static resolution. It was shown that:

- The unit pixel lattice must be rectangular in order to allow matrix addressing.
- (2D) subpixel arrangements can only increase perceived resolution when subpixel sampling is used.
- Subpixel sampling on 2D arrangements also increases the Kell factor, which was generalized into a Kell area.
- For 2D arrangements, the combination of subpixel sampling and signal filtering is even more important to optimize perceived resolution.

A video processing algorithm to exploit the increased perceived resolution on CMDs was investigated in Chapter 4, combining subpixel sampling, filtering and scaling into subpixel scaling. Experiments based on simulated displays with several subpixel arrangements showed that:

- Subpixel scaling is an attractive method to exploit the perceived resolution increase obtainable from the subpixel arrangement of matrix displays. The proposed method is applicable to different subpixel arrangements, poses no constraints on the input resolution, and can be applied to natural images as well as text/graphics.

- Subpixel scaling is most attractive for downscaling applications, i.e. where the display is the limiting factor for spatial resolution in the display chain.
- Alternative subpixel arrangements, in particular 2D subpixel arrangements, can only improve perceived resolution if subpixel scaling is part of the signal processing chain.
- Subpixel scaling gives a relatively larger improvement on 2D subpixel arrangements than on the (1D) vertical stripe arrangement.
- 2D subpixel arrangements can provide resolution equal to that of the vertical stripe subpixel arrangement, with up to 20% fewer drivers and 50% fewer subpixels.

Dynamic display resolution

In Chapter 5, an analysis of the dynamic resolution of FPDs has been presented by considering their temporal properties. A model of the temporal aspects of the display signal chain has been presented, and used in the analysis. The temporal display model is divided into two parts: temporal addressing and reconstruction, characterized by a temporal delay and a temporal aperture, respectively. The temporal aperture was introduced as a tool to characterize the dynamic resolution of displays. The model showed that:

- Temporal characteristics of a display can lead to artifacts that are either purely temporal (flicker), or that are perceived as spatial effects in moving images ('motion artifacts'). With moving images, temporal characteristics of the display are transformed to spatial effects in the perceived image, due to the human visual system property of eye tracking.
- A temporal addressing (delay) mismatch between source and display results in spatial position errors in the perceived image. In general, these are not very severe, but particularly for tiled, color sequential and subfield driven displays, the artifacts can be very visible and annoying.
- PDPs exhibit the dynamic false contour motion artifact, because the subfield driving method introduces a subfield dependent response that changes abruptly with gray-level. A short overview of alternative subfield driving schemes for PDP motion artifact reduction was presented.

The major part of Chapter 5 concerned the dynamic resolution of LCDs. The general display signal chain model was used to derive a signal chain model for LCDs, which was used to investigate the temporal characteristics of LCDs in relation to motion artifacts. It was shown that:

- The temporal aperture of a display has a direct relation to motion artifacts, specifically motion blur, via the motion aperture.
- The temporal aperture of LCDs shows in a direct way that the display response not only consists of the LC response, but that the active matrix addressing (sample-and-hold) and the backlight are also important.

This clearly shows why decreasing the LC response time is not enough to eliminate motion blur.
- The temporal aperture can be used as a measure for motion blur, without involving image motion. In other words, motion blur is a spatial effect which is caused by the temporal characteristics of the display.
- The temporal aperture directly relates to the temporal modulation transfer function (MTF) of a display. The temporal bandwidth, which is derived from the temporal MTF, has a direct relation to motion blur, which can therefore also be seen as a reduced dynamic resolution of the display.
- The temporal bandwidth gives results equivalent to those of the proposed motion picture response time (MPRT) standard, allowing the quantification of motion blur in displays to be simplified to a purely time-based measurement and characterization.
- Several motion blur reduction methods, based on modified display design and driving, were compared using the temporal aperture. These methods, e.g. higher frame rate, black frame insertion, and scanning backlight, improve the dynamic resolution of displays considerably, as indicated by a reduction of the MPRT from 20 to under 10 ms.

Motion artifacts can be reduced by changing the temporal display properties, but also by applying video processing algorithms in the display chain. In Chapter 6, the temporal display model from Chapter 5 was used to show that:

- These video processing algorithms involve motion estimation and compensation techniques, and fall into two main categories: temporal delay compensations and temporal aperture compensations.
- Displays that benefit from temporal delay compensation are, for example, color sequential displays, where the delay depends on the color component.
- A video processing method for motion artifact reduction in PDPs involves motion compensated subfield generation, since the temporal response of PDPs can best be described as a subfield dependent delay.
- A video processing method for motion artifact reduction in LCDs can be regarded as a temporal aperture (MTF) correction. Temporal MTF correction is different from its spatial counterpart, because it requires a transformation to the spatial domain, using motion estimation and motion compensated filtering.

Several video processing algorithms for display motion artifact reduction, on LCD and PDP, were developed. The PDP motion artifact (DFC) reduction algorithm uses subfield motion compensation, combined with a fill-up subfield selection criterion. The fill-up algorithm:

- Enables compensation of a moving video sequence for display on a PDP.

- Does not suffer from rounding errors, holes and double assignments, and is flexible with respect to the subfield distribution.
- Is capable of preventing dynamic false contours, without introducing trade-offs in display performance such as the number of gray-levels or motion blur.

The LCD motion artifact (motion blur) reduction algorithm uses Motion Compensated Inverse Filtering (MCIF). The MCIF algorithm:

- Can reduce motion blur, caused by the sample-and-hold effect, for common video material.
- Has noise amplification as its main drawback. An adaptive MCIF method was developed that increases noise robustness.
- Is complementary to motion blur reduction methods that require modification of the display and backlight module.
- Has different advantages and disadvantages compared to panel-based methods. The advantages are, in particular, the independence of frame rate, and the fact that it introduces no large-area flicker or light loss.
- Can be adapted to a different temporal MTF by adapting the filter responses.

7.3 Further work

FPDs have reached the quality level of CRTs and even surpassed it in some areas. This has been achieved by improvements in brightness, contrast and viewing angle, and by the static and dynamic resolution improvement methods described and introduced in this thesis. However, with all these improvements, the displayed image can still be distinguished from a real life scene. As already mentioned in the introduction, this means that there is still room for improvement. For this reason, there will be more work in the area of display systems for many more years to come. This thesis is merely an intermediate point in these developments.

In the area of static display resolution, major new pixel arrangements or subpixel rendering methods are considered unlikely. Further work will be targeted at actually implementing optimized subpixel arrangements and related processing in a cost-effective manner in displays. On the one hand this will require changes to standard display manufacturing processes, but there can also be a further efficiency improvement by fully integrating 2D subpixel rendering and scaling. Moreover, there seems to be a trend toward so-called multi-primary displays, which use more than three primary colors. This thesis has focused on RGB displays, but a multi-primary display can, for example, have an extra white (RGBW) or cyan (RGBC) subpixel next to RGB [94, 26]. For these displays, the color conversion is no longer described by a simple matrix as in Chapter 3, but introduces an extra degree of freedom. This will couple the color conversion, subpixel sampling and filtering in a complex manner. Although the principles of subpixel sampling and filtering should still work on these displays, a further gain can be expected by fully integrating color conversion and subpixel scaling for these displays.

Regarding dynamic resolution, there are still numerous directions for further research. In PDPs, an optimized combination of subfield distribution and motion compensated processing is still not implemented. This is mainly due to the closed architecture of PDPs, which does not allow the subfield order to be changed from the outside, and also does not expose details of the (content-adaptive) distribution. In LCDs, there is at the moment a large array of proposed solutions, as described in Chapters 5 and 6. There seems to be no clear winner yet, but to provide further improvements, the combination of a number of these methods can be considered. In particular, the combination of higher frame rate driving and MCIF promises to be very effective. Also, the continuing trend toward higher computational complexity of video processing algorithms might make it worthwhile to consider applying some of the traditional inverse filtering methods to the video domain. In particular, the investigation of the observed noise amplification in temporal display aperture correction deserves more attention. It could provide basic insights into the psychophysics behind noise and motion perception. Moreover, the trend toward higher frame rates for LCDs will enable a combination with many other forms of video signal processing, both for format conversion and for image quality enhancement in general.

Furthermore, there is still a need for more research into simple and effective methods for motion artifact measurement and characterization. The analysis in Chapter 5 has given some input to this process, mainly for LCDs, but there is still no simple metric for motion blur that describes all relevant aspects of the temporal behavior of LCDs in relation to human perception. Also, there is no simple and standardized measure for characterizing dynamic false contours on PDPs.

In general, we can also see that the display is no longer a limiting factor in the whole TV chain. Increasing display quality has made the capture and transmission parts more critical, even with the advent of HDTV and digital transmission. In practice, the increased capability of displays to reconstruct real life images is often only used to accurately reconstruct artifacts from recording and transmission. Therefore, if we ever want to reach the ideal of real life images on electronic screens, the whole chain must be improved, not just the display. Already we see that large, bright, high-resolution FPDs bring out artifacts like noise and digital compression artifacts. We expect that this aspect of display and video processing will receive more attention in the coming years, for example by better adjusting display properties (driving) and processing to the transmission parameters (noise, bit-rate), or even vice versa.

Finally, this thesis did not cover all display properties in detail. Developments in the bit-depth, color gamut and dynamic range of displays are ongoing, and there too, video processing is likely to have as large an impact on total image quality as improvements in the display itself.


References

[1] Adobe Systems Inc., Adobe Cooltype, products/acrobat/cooltype.html.
[2] P. M. Alt and P. Pleshko, Scanning limits for liquid-crystal displays, IEEE trans. Electron Devices, vol. 21, p. 2,
[3] Y. Amano, A flat-panel TV display system in monochrome and color, IEEE transactions on Electronic Devices, vol. ED-22(1), pp. 1-7,
[4] J. Anderson and B. Anderson, The myth of persistence of vision revisited, J. Film and Video, vol. 45(1), pp. 3-12,
[5] P. Barten, Spatio-temporal model for the contrast sensitivity of the human eye and its temporal aspects, in proc. SPIE, vol. 1913, pp
[6] P. Barten, Physical model for the contrast sensitivity of the human eye, in proc. SPIE, Human Vision, Visual Processing, and Digital Display III, vol. 1666, pp ,
[7] P. Barten, Effects of quantization and pixel structure on the image quality of color matrix displays, Journal of the SID, vol. 1(2), pp ,
[8] B. Baxter and P. Corriveau, PC display resolution matched to the limits of visual acuity, Journal of the Society for Information Display, vol. 13(2), pp ,
[9] E. B. Bellers and G. de Haan, De-interlacing, A key technology for scan rate conversion, Elsevier Science, 2000, ISBN
[10] E. B. Bellers et al., Optimal television scanning format for CRT displays, in IEEE Transactions on Consumer Electronics, pp ,
[11] T. Benzschawel and W. Howard, Method of and apparatus for displaying a multicolor image, US patent No. 5,341,153,
[12] G. Berbecuel, Digital Image Display, John Wiley & sons, 2001, ISBN

[13] C. Betrisey et al., Displaced filtering for patterned displays, in SID Digest, vol. 31, pp ,
[14] S. Bitzakidis, Improvements in the moving image quality of AM-LCDs, Journal of the SID, vol. 1(3), pp ,
[15] S. Bitzakidis, Speed dependent high pass filter, European patent No. EP ,
[16] J. F. Blinn, What is a pixel?, in SID Digest, vol. 31, pp ,
[17] W. den Boer, Active Matrix Liquid Crystal Displays: Fundamentals and Applications, Newnes, 2005, ISBN
[18] J. P. Boeuf, Plasma display panels: physics, recent developments and key issues, J. Phys. D: Appl. Phys., vol. 36, p. R53-R79,
[19] C. H. Brown Elliott, T. L. Credelle, S. Han, M. H. Im, M. F. Higgins, and P. Higgins, Development of the PenTile Matrix™ color AMLCD subpixel architecture and rendering algorithms, Journal of the Society for Information Display, vol. 11(1), pp ,
[20] C. H. Brown Elliott et al., Active matrix display layout optimization for sub-pixel image rendering, in Int. Display Manuf. Conf., pp ,
[21] C. H. Brown Elliott et al., Co-optimization of color AMLCD subpixel architecture and rendering algorithms, in SID Digest, vol. 33, pp ,
[22] C. H. Brown Elliott et al., Color flat panel display sub-pixel arrangements and layouts for sub-pixel rendering with increased modulation transfer function, World patent application No. WO 03/060870,
[23] C. H. Brown Elliott et al., Arrangements of color pixels for full color imaging devices with simplified addressing, US patent application No. 6,903,754,
[24] L. M. Chen and S. Hasegawa, Influence of pixel-structure noise on image resolution and color for matrix display devices, Journal of the SID, vol. 1(1), pp ,
[25] T. Chen and P. P. Vaidyanathan, Recent developments in multidimensional multirate systems, IEEE Transactions on Circuits and Systems for Video Technology, vol. 3(2), pp ,
[26] E. Chino et al., Development of wide-color-gamut mobile displays with four-primary-color LCDs, in SID Digest, vol. 37, pp ,
[27] T. A. C. M. Claassen et al., On the transposition of linear time-varying discrete-time networks and its application to multirate digital systems, Philips Journal of Research, pp ,
[28] J. C. Dainty and R. Shaw, Image Science, Principles, Analysis, and Evaluation of Photographic-Type Imaging Processes, Academic Press, London,

[29] S. Daly, Analysis of subtriad addressing algorithms by visual system models, in SID Digest, vol. 32, pp ,
[30] R. van Dijk and T. Holtslag, Motion compensation in plasma displays, in Int.Displ.Workshop, pp ,
[31] R. Dodge, Five types of eye movement in the horizontal meridian plane of the field of regard, Am. J. Physiol., vol. 8, pp ,
[32] E. Dubois, The sampling and reconstruction of time-varying imagery with application in video systems, IEEE, vol. 73(4), pp ,
[33] A. van den Enden and N. A. M. Verhoeckx, Discrete-time signal processing, Prentice Hall, 1989, ISBN
[34] P. G. Engeldrum, Psychometric Scaling: A Toolkit for Imaging Systems Development, Imcotek Press, Winchester, MA, 2000, ISBN
[35] Y.-P. Eo, S.-J. Ahn, and S.-U. Lee, Histogram-based subfield LUT selection for reducing dynamic false contour in PDPs, in SID Digest, vol. 36, pp ,
[36] H. S. Fairman, M. H. Brill, and H. Hemmendinger, How the CIE 1931 color matching functions were derived from Wright-Guild data, Color Res. Appl., vol. 22(1), pp ,
[37] J. E. Farrell, Predicting flicker thresholds for video display terminals, SID Digest, vol. 18, pp ,
[38] R. Feigenblatt, Full color imaging on amplitude color mosaic displays, in proc. SPIE, vol. 1075, pp ,
[39] N. Fisekovic, T. Nauta, H. Cornelissen, and J. Bruinink, Improved motion-picture quality of AMLCDs using scanning backlight, in Asia Display/Int.Displ.Workshop, pp ,
[40] T. Genova, Television history - the first 75 years,
[41] N. Gershenfeld, The Physics of Information Technology, Cambridge University Press, 2000, ISBN
[42] S. Gibson, Subpixel font rendering,
[43] G. de Haan, True motion estimation with 3-D recursive search block-matching, IEEE Transactions on Circuits and Systems for Video Technology, vol. 3(5), pp ,
[44] G. de Haan, IC for motion-compensated de-interlacing, noise reduction, and picture-rate conversion, in IEEE Transactions on Consumer Electronics, vol. 45, pp ,
[45] G. de Haan, Large-display video format conversion, Journal of the SID, vol. 8(1), pp ,
[46] G. de Haan, Video Processing for Multimedia Systems, University Press, Eindhoven, the Netherlands, 2000, ISBN
[47] G. de Haan and E. B. Bellers, Deinterlacing: an overview, Proc. of the IEEE, vol. 86(9), pp ,

[48] G. de Haan, J. Kettenis, and B. Deloore, IC for motion compensated 100 Hz TV, with a smooth motion movie-mode, IEEE Transactions on Consumer Electronics, vol. 42, pp ,
[49] G. de Haan and M. A. Klompenhouwer, An overview of flaws in emerging television displays and remedial video processing, in IEEE Transactions on Consumer Electronics, vol. 47(3), pp ,
[50] G. de Haan and M. A. Klompenhouwer, Anti motion blur display, US patent No. 6,930,676,
[51] J. Hagerman, Optimum spot size for raster-scanned monochrome CRT displays, Journal of the SID, vol. 1(3), pp ,
[52] Z. Hara and N. Shiramatsu, Improvement in the picture quality of moving pictures for matrix displays, Journal of the SID, vol. 8(2), pp ,
[53] C. Hentschel, Video-Signalverarbeitung, B.G. Teubner, Stuttgart, Germany, 1998, ISBN X.
[54] I. Heynderickx and E. H. A. Langendijk, Image quality comparison of PDP, LCD, CRT and LCoS projection, in SID Digest, vol. 36, pp ,
[55] T. H. A. M. Holtslag, J. Hoppenbrouwers, and R. van Dijk, A comparison of motion artifact reduction methods in PDPs, in Int.Displ.Workshop, pp ,
[56] S. Hong et al., Enhancement of motion image quality in LCDs, in SID Digest, vol. 35, pp ,
[57] Y. Hosoya and S. L. Wright, High resolution LCD technologies for the IBM T220/T221 monitor, in SID Digest, vol. 33, pp ,
[58] S. C. Hsu, The Kell factor, past and present, SMPTE journal, pp , Feb
[59] R. W. G. Hunt, The Reproduction of Colour, Voyageur Press, MN, 2004, ISBN
[60] R. W. G. Hunt, Imaging performance of displays: Past, present, and future, Journal of the SID, vol. 13(12), pp ,
[61] J. Hutchinson, Plasma display panels: The colorful history of an Illinois technology, ECE alumni news, University of Illinois, vol. 36(1),
[62] Y. Igarashi et al., Proposal of the perceptive parameter motion picture response time (MPRT), in SID Digest, vol. 34, pp ,
[63] Y. Igarashi et al., Summary of moving picture response time (MPRT) and futures, in SID Digest, vol. 35, pp ,
[64] C. Infante, On the modulation transfer function of matrix displays, Journal of the SID, vol. 1(4), pp ,
[65] T. Inoue, H. Motousu, and S. Mikoshiba, Simulation and analysis of overall raster moiré patterns on color CRTs, in SID Digest, vol. 33, pp ,

[66] International Telecommunication Union, recommendation ITU-R BT.601-5,
[67] K. Jack, Video Demystified, LLH Technology Publishing, Eagle Rock, VA, 2001, ISBN
[68] J. Janssen, J. Stessen, and P. de With, An advanced sampling rate conversion technique for video and graphics signals, in Image Processing and Its Applications, vol. 2, pp ,
[69] John Walker, How many dots has it got?, documents/howmanydots/.
[70] K. Kariya, Y. Kanazawa, and T. Hirose, ALIS method for high-resolution color PDPs, Journal of the SID, vol. 10(1), pp ,
[71] I. Kawahara and K. Sekimoto, Dynamic gray scale control to reduce motion-picture disturbance for high-resolution PDPs, in SID Digest, vol. 30, pp ,
[72] P. A. Keller, The Cathode Ray Tube, Technology, History, and Applications, Palisades Press, 1991, ISBN
[73] N. Kimura et al., New technologies for large-sized high-quality LCD TV, in SID Digest, vol. 36, pp ,
[74] M. A. Klompenhouwer, Temporal impulse response and bandwidth of displays in relation to motion blur, in SID Digest, vol. 36, pp ,
[75] M. A. Klompenhouwer, Temporal MTF of displays and related video processing, in Int.Conf.Imag.Proc., vol. II, pp ,
[76] M. A. Klompenhouwer, Comparison of LCD motion blur reduction methods using temporal impulse response and MPRT, in SID Digest, vol. 37, pp ,
[77] M. A. Klompenhouwer and G. de Haan, Optimally reducing motion artifacts in plasma displays, in SID Digest, vol. 31, pp ,
[78] M. A. Klompenhouwer and G. de Haan, Color error diffusion: Accurate luminance from coarsely quantized displays, in SID Digest, vol. 32, pp ,
[79] M. A. Klompenhouwer and G. de Haan, Subpixel image scaling for color matrix displays, Journal of the SID, vol. 11(1), pp ,
[80] M. A. Klompenhouwer and G. de Haan, Video, display and processing, in SID Digest, vol. 35, pp ,
[81] M. A. Klompenhouwer, G. de Haan, and R. A. Beuker, Subpixel image scaling for color matrix displays, in SID Digest, vol. 33, pp ,
[82] M. A. Klompenhouwer and L. Velthoven, LCD motion blur reduction with motion compensated inverse filtering, in SID Digest, vol. 35, pp ,
[83] M. A. Klompenhouwer and L. Velthoven, Motion blur reduction for liquid crystal displays: Motion compensated inverse filtering, in proc. SPIE, Vis.Comm.Imag.Proc., vol. 5308, pp ,

[84] J. H. Kranz and L. D. Silverstein, Color matrix display image quality: The effects of luminance and spatial sampling, in SID Digest, vol. 21, pp ,
[85] R. J. Krauzlis, Recasting the smooth pursuit eye movement system, J. Neurophysiol., vol. 91, pp ,
[86] K. E. Kuijk, Combining passive- and active-matrix addressing of LCDs, Journal of the SID, vol. 4(1), pp. 9-17,
[87] K. Kumagawa and A. Takimoto, Fast response OCB-LCD for TV applications, in SID Digest, vol. 33, pp ,
[88] T. Kurita, Moving picture quality improvement for hold-type AM-LCDs, in SID Digest, vol. 32, pp ,
[89] T. Kurita et al., Consideration on perceived MTF of hold type display for moving images, in Int.Displ.Workshop, pp , Society for Information Display,
[90] T. Kurita et al., Effect of motion compensation on color breakup reduction and consideration of its use for cinema displays, in Int.Displ.Workshop, pp ,
[91] E. H. A. Langendijk and I. Heynderickx, Optimal and acceptable color ranges of display primaries for mobile applications, in SID Digest, vol. 34, pp ,
[92] Lee et al., A color halftoning algorithm for low-bit flat panel displays, in Int.Conf.Imag.Proc., vol. 2, pp ,
[93] B. W. Lee et al., LCDs: How fast is enough?, in SID Digest, vol. 32, pp ,
[94] B. W. Lee et al., Implementation of RGBW color system in TFT-LCDs, in SID Digest, vol. 35, pp ,
[95] E. Lueder, Liquid Crystal Displays: Addressing Schemes and Electro-Optical Effects, John Wiley & Sons, 2001, ISBN
[96] D. L. MacAdam, ed., Selected papers on Colorimetry - Fundamentals, SPIE Optical Engineering press, Bellingham, WA, 1993, ISBN
[97] T. Makino et al., Improvement of video image quality in AC-plasma display panels by suppressing the unfavorable coloration effect with sufficient gray shades capability, in Asia Display, pp ,
[98] R. Manduchi et al., Multi-stage sampling structure conversion for video signals, IEEE Transactions on Circuits and Systems for Video Technology, vol. 3(5), pp ,
[99] D. Marr, Vision: a computational investigation into the human representation and processing of visual information, Freeman, San Francisco, 1982, ISBN
[100] J. M. Mersereau et al., The processing of periodically sampled multidimensional signals, IEEE transactions on Acoustics, Speech and Signal Processing, vol. ASSP-31(1), pp ,

[101] Microsoft Corporation, ClearType information,
[102] Microsoft Corporation, Font hinting, typography/truetypehintingintro.mspx.
[103] S. Mikoshiba, Dynamic false contours on PDPs - fatal or curable?, in Int.Displ.Workshop, pp ,
[104] C. Mittermayer and A. Steininger, On the determination of dynamic errors for rise time measurement with an oscilloscope, IEEE Tr. on Instr. and Meas., vol. 48(6), pp ,
[105] M. Mori et al., Mechanism of color breakup in field-sequential-color projectors, Journal of the SID, vol. 7(4), pp ,
[106] K. T. Mullen, The contrast sensitivity of human colour vision to red/green and blue/yellow chromatic gratings, J. Physiology, (359), pp ,
[107] H. Nakamura et al., A novel wide-viewing-angle motion-picture LCD, in SID Digest, vol. 29, pp ,
[108] T. Nose et al., A black stripe driving scheme for displaying motion pictures on LCDs, in SID Digest, vol. 32, pp ,
[109] O. A. Ojo and G. de Haan, Robust motion-compensated video upconversion, IEEE Transactions on Consumer Electronics, vol. 43(4), pp ,
[110] K. Oka and Y. Enami, Moving picture response time (MPRT) measurement system, in SID Digest, vol. 35, pp ,
[111] H. Okumura, A new low-image-lag drive method for large size LCTVs, Journal of the SID, vol. 1(3), pp ,
[112] W. C. O'Mara, Liquid Crystal Flat Panel Displays: Manufacturing Science & Technology, Van Nostrand Reinhold, 1993, ISBN
[113] M. K. Ozkan, A. M. Tekalp, and M. I. Sezan, POCS-based restoration of space-varying blurred images, IEEE Transactions on Image Processing, vol. 3(4), pp ,
[114] D. Parker, The dynamic performance of CRT and LCD displays, in Display systems, design and application, A. L.W. MacDonald, ed., ch. 18, pp , John Wiley & sons, Chichester, England,
[115] A. Patti, A. Tekalp, and M. Sezan, A new motion compensated reduced order model Kalman filter for space-varying restoration of progressive and interlaced video, IEEE Transactions on Image Processing, vol. 7(4), pp ,
[116] M. J. Powell et al., A 6-in. full-color liquid-crystal television using an active matrix of amorphous-silicon TFTs, Journal of the SID, vol. 29(3), pp ,

[117] C. A. Poynton, Gamma and its disguises: The nonlinear mappings of intensity in perception, CRTs, film and video, SMPTE journal, pp , Dec
[118] C. A. Poynton, Motion portrayal, eye tracking, and emerging display technology, 30th SMPTE Advanced Motion Imaging Conference, pp ,
[119] C. A. Poynton, Digital Video and HDTV, Elsevier Science, San Francisco, 2003, ISBN
[120] J. G. Proakis, Advanced Digital Signal Processing, Macmillan publishing company, 1992, ISBN
[121] R. Rajae-Joordens and J. Engel, Paired comparisons in visual perception studies using small sample sizes, Displays, vol. 26(1), p. 1,
[122] G. Rajeswaran et al., Active matrix low temperature poly-Si TFT / OLED full color displays: Development status, in SID Digest, vol. 31, pp ,
[123] H. de Ridder, Naturalness and image quality: Towards perceptually optimal color reproduction of natural scenes, in SID Digest, vol. 32, pp ,
[124] J. G. Robson, Spatial and temporal contrast sensitivity functions of the visual system, J. Opt. Soc. Amer., vol. 56(8), p ,
[125] B. E. Rogowitz, The psychophysics of spatial sampling, in proc. SPIE/SPSE conf. Electronic Imaging and Devices, pp ,
[126] R. Samadani et al., Periodic plane tilings: Application to pixel layout simulations for color flat-panel displays, Journal of the SID, vol. 2(2), pp ,
[127] D. Sasaki, M. Imai, and H. Hayama, Motion picture simulation for designing high-picture-quality hold-type displays, in SID Digest, vol. 33, pp ,
[128] O. H. Schade Sr., Image quality: A comparison of photographic and television systems, J. SMPTE, (100), pp ,
[129] W. F. Schreiber, Introduction to color television - part I, in Proc. of the IEEE, vol. 87(1), pp ,
[130] H. Schröder and H. Blume, One- and Multidimensional Signal Processing, John Wiley & sons, Chichester, 2000, ISBN
[131] P. Seats and B. Gnade, The evolution of flat panel cathode ray tubes, in SID Digest, vol. 32, pp ,
[132] K. Sekiya and H. Nakamura, Eye-trace integration effect on the perception of moving pictures and a new possibility for reducing blur on hold-type displays, in SID Digest, vol. 33, pp ,
[133] G. Sharma, Digital color imaging, IEEE Transactions on Image Processing, vol. 6(7), pp ,

[134] T. Shigeta, Improvement of moving-video image quality on PDPs by reducing the dynamic false contour, in SID Digest, vol. 29, pp ,
[135] Y. Shimodaira, Fundamental phenomena underlying artifacts induced by image motion and the solutions for decreasing the artifacts on FPDs, in SID Digest, vol. 34, pp ,
[136] L. D. Silverstein et al., Image quality and visual simulation of color matrix displays, in Soc.Autom.Eng., paper ,
[137] L. D. Silverstein et al., A psychophysical evaluation of pixel mosaics and gray-scale requirements for color matrix displays, in SID Digest, vol. 20, pp ,
[138] L. D. Silverstein et al., Effects of spatial sampling and luminance quantization on the image quality of color matrix displays, J. Opt. Soc. Am. A, vol. 7(10), pp ,
[139] A. A. S. Sluyterman, Resolution aspects of the shadow-mask pitch for TV applications, in SID Digest, vol. 24, pp ,
[140] A. A. S. Sluyterman, Improved character representation by non-square pixel grids, in SID Digest, vol. 31, pp ,
[141] A. A. S. Sluyterman and E. P. Boonekamp, Architectural choices in a scanning backlight for large size LC-TVs, in SID Digest, vol. 36, pp ,
[142] J. Someya and Y. Igarashi, A review of MPRT measurement method for evaluating motion blur of LCDs, in Int.Displ.Workshop, pp ,
[143] T. Sugiura, EBU color filter allows LCDs to catch up with CRTs in color, in SID Digest, vol. 32, pp ,
[144] K. Sunohara et al., Reflective color LCD composed of stacked films of encapsulated liquid crystal (SFELIC), in SID Digest, vol. 29, pp ,
[145] A. M. Tekalp, Digital Video Processing, Prentice Hall, 1995, ISBN
[146] The early television foundation, Early color television, color.html.
[147] The early television foundation, Mechanical TV sets of the 20s and 30s,
[148] L. L. Thurstone, A law of comparative judgment, Psychological Review, vol. 34, p. 273,
[149] K. Toda, An equalizing pulse technique for reducing gray scale disturbances of PDPs below the minimum visual perception level, in Euro Display, pp ,
[150] T. Tokunaga, H. Nakamura, and H. Suzuki, Development of new driving method for AC-PDPs, in Int.Displ.Workshop, pp ,
[151] G. J. Tonge, Time sampled motion portrayal, Conference on Image Processing and its Applications, (265), pp ,

[152] H. van Trigt and A. Siepel, Behind the Picture, Philips Components BV, Eindhoven, the Netherlands, 1998, ISBN
[153] R. Tuenge et al., A field-sequential color VGA AMEL display, Journal of the SID, vol. 5(4), pp ,
[154] H. Uchiike and T. Hirakawa, Color plasma displays, Proc. of the IEEE, vol. 90(4), pp ,
[155] H. Uchiike and T. Hirakawa, Historical view and current status of plasma displays, in IEEE Industry Applications Conference, vol. 1, pp ,
[156] R. A. Ulichney, Dithering with blue noise, Proc. of the IEEE, vol. 76(2), pp ,
[157] N. C. van der Vaart et al., Towards large-area full-color active-matrix printed polymer OLED television, Journal of the SID, vol. 13(1), pp. 9-16,
[158] Video Electronics Standards Association, Flat panel display measurements, version 2.0, VESA (FPDM2),
[159] R. H. Vollmerhausen and R. G. Driggers, Analysis Of Sampled Imaging Systems, SPIE, 2000, ISBN
[160] J. J. Vos and P. L. Walraven, On the derivation of the foveal receptor primaries, Vision Research, vol. 11, pp ,
[161] B. A. Wandell, Foundations of Vision, Sinauer Associates Inc., Sunderland, Massachusetts, 1995, ISBN
[162] J. Westerink, Perceived Sharpness in Static and Moving Images. PhD thesis, Technical University of Eindhoven, the Netherlands,
[163] G. Wyszecki and W. S. Stiles, Color science, John Wiley & sons, 2000, ISBN
[164] B. L. Xu et al., Improvement in PDP image quality by suppressing dynamic false contours while keeping high brightness, in SID Digest, vol. 34, pp ,
[165] T. Yamaguchi et al., Degradation of moving-image quality in PDPs: Dynamic false contours, Journal of the SID, vol. 4(4), pp ,
[166] T. Yamaguchi, K. Toda, and S. Mikoshiba, Improvement in PDP picture quality by three-dimensional scattering of dynamic false contours, in SID Digest, vol. 27, pp ,
[167] T. Yamamoto et al., Guiding principles for high quality motion picture in AMLCDs applicable to TV, in SID Digest, vol. 31, p ,
[168] J.-H. Yu, Y.-C. Lin, and H.-P. D. Shieh, Display image quality evaluation with multi-scale human visual model, in SID Digest, vol. 32, pp ,

APPENDIX A

Scanning

Scanning is the process of converting the continuous space-time image intensity $I_c(\vec{x}, t)$ into a 1D video signal $I_v(t)$. The scanning process is mathematically described by the following formula:

$$\vec{x}_s(t) = (x(t), y(t)) = (W f_l t \bmod W,\; H f_{fd} t \bmod H) = f_{fd}\, t\, (W N_l, H)^T \bmod (W, H) \tag{A.1}$$

where $\vec{x}_s(t)$ is the scanning path ('raster'), $W$ and $H$ are the image width and height, respectively, and the position is kept within the image bounds by the modulo operator,

$$X \bmod Y = \{\, X - nY \mid n \in \mathbb{Z} \;\wedge\; 0 \le (X - nY) < Y \,\} \tag{A.2}$$

Figure A.1 shows some examples of a scanning raster in the 3D space-time image space, and also illustrates that scanning represents a sampling of the space-time continuous image intensity $I_c$ into discrete lines and fields:

$$I_v(\vec{x}, t) = I_c(\vec{x}, t) \quad \text{only if} \quad [\vec{x}, t] = [\vec{x}_s(t), t] \tag{A.3}$$

In CRTs, $\bmod\, W$ and $\bmod\, H$ in Equation A.2 represent the electron beam fly-back, horizontal and vertical respectively. This is indicated by the dashed lines in the figure. In practice, the fly-back in CRTs takes time, but this is neglected in this simple model [130]. Since the physical size of the recorded image has no direct relation to the size of the displayed image, the image width and height are often expressed in arbitrary units (such as $W = 4$, $H = 3$), or normalized units $H = 1$ [picture height], $W = 1.33$ [ph]. $W/H = AR$ is the image aspect ratio (see Section 4.2).

Scanning represents a sampling of the continuous space-time image intensity $I_c$ into separate lines and fields:

$$I_s(\vec{x}, t) = I_c(\vec{x}, t) \quad \text{only if} \quad [\vec{x}, t] = [\vec{x}_s(t), t] \tag{A.4}$$
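As a small numerical companion to Equation A.1, the sketch below traces the scanning spot; the 50 Hz, 312.5-lines-per-picture-height values are only example parameters, and the fly-back time is neglected as in the model above.

```python
import numpy as np

def scan_position(t, W=4.0, H=3.0, f_fd=50.0, N_l=312.5):
    """Scanning spot position of Eq. A.1 at time(s) t.

    W, H : image width and height (arbitrary units)
    f_fd : field frequency [Hz]; N_l : lines per picture height
    """
    f_l = f_fd * N_l                   # line frequency
    x = (W * f_l * t) % W              # horizontal scan; mod W = fly-back
    y = (H * f_fd * t) % H             # vertical scan; mod H = fly-back
    return x, y

# trace the raster of one 50 Hz field
t = np.linspace(0.0, 1.0 / 50.0, 10000, endpoint=False)
x, y = scan_position(t)
```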

Figure A.1 Scanning, an example with 9 lines and 4 fields. a) The scanning raster in space: the scan runs from left to right and from top to bottom. b) The continuous 3D space is reduced to 1D, where the scan defines a sequence of time-discrete fields. c) The scanned positions can vary in each field: interlace.

The signal after scanning is a mix of continuous and discrete signals: the vertical and temporal dimensions are discrete (sampled), while the horizontal dimension is still continuous (analogue). The video signal is therefore also referred to as analogue video. The video format is characterized by the number of fields per second (giving the field frequency $f_{fd}$) and the number of lines per image height ($N_l$, giving the line frequency $f_l = f_{fd} N_l$).

In Eq. A.1, the generation of an interlaced signal is simply described by setting the number of lines per picture height to a non-integer number $N_l = k/2$, with $k \in \mathbb{N}$ odd [130]. As shown in Figure A.1, this causes the vertical fly-back to occur in the middle of a horizontal line, splitting the line into a first half at the bottom of the screen and a second half at the top. The first full horizontal line is then positioned in between the top two lines of the previous frame, resulting in an interlaced scanning raster.

We are inclined to see each field as a separate time instance in which the intensity of all positions is specified. With a scan that is continuous in time, assuming a field is valid at only one instant is actually an approximation, similar to neglecting the tilt of horizontal scan lines.
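A concrete check of this interlace mechanism, using the 9-line example of Figure A.1: from Equation A.1, line $n$ starts at vertical position $(n/N_l) \bmod 1$ (for $H = 1$), so a half-integer $N_l$ interleaves the lines of successive fields.

```python
import numpy as np

N_l = 4.5                       # lines per field; 9 lines per frame
n = np.arange(9)                # nine consecutive lines = two fields
print(np.round((n / N_l) % 1.0, 3))
# -> [0.    0.222 0.444 0.667 0.889 0.111 0.333 0.556 0.778]
# lines 5..8 (the second field) fall halfway between lines 0..4
```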

APPENDIX B

Displaced sampling versus sampling displaced signals

In Section 3.5.1, two descriptions of the display signal chain with subpixel sampling were given. The first description, displaced sampling of the signal (Equation 3.44, $\vec{I}_{os}$, Figure B.1a), changes the sampling process and the addressing process, as introduced in the display model. The second description, sampling of the displaced signal (Equation 3.49, $\vec{I}_{as}$, Figure B.1b), changes the sampling process, but does not change the display model. In this appendix, it is shown that these two descriptions are equivalent, i.e. both are described by Equation 3.44. The difference is a matter of interpretation of the digital image signal and the function of a digital display.

Figure B.1 Two descriptions of the subpixel adapted color matrix display process: a) displaced sampling, b) subpixel sampling and addressing.

Equivalence of displaced sampling and sampling delay

To show that both descriptions are identical, we rewrite Equation 3.49 into Equation 3.44. This is possible because of the following property of the delay

element $D(\vec{x})$ ($C(\vec{x})$ is an arbitrary matrix):

$$C(\vec{x}) = A(\vec{x})B(\vec{x}) \;\Rightarrow\; C(\vec{x}-\Delta\vec{x}) = A(\vec{x}-\Delta\vec{x})\,B(\vec{x}-\Delta\vec{x})$$

so that

$$D(\vec{x}) * C(\vec{x}) = D(\vec{x}) * \left[ A(\vec{x})B(\vec{x}) \right] \tag{B.1}$$
$$= \left[ D(\vec{x}) * A(\vec{x}) \right] \left[ D(\vec{x}) * B(\vec{x}) \right] \tag{B.2}$$

which holds if $A(\vec{x})$ is a square matrix (and the left hand side exists). Equation 3.49 becomes:

$$\begin{aligned}
\vec{I}_{as}(\vec{x}) &= D_a(\vec{x}) * \left[ S(\vec{x}) \left[ D_s(\vec{x}) * \vec{I}_c(\vec{x}) \right] \right] \\
&= \left[ D_a(\vec{x}) * S(\vec{x}) \right] \left[ D_a(\vec{x}) * D_s(\vec{x}) * \vec{I}_c(\vec{x}) \right] \\
&= \left[ D_a(\vec{x}) * S(\vec{x}) \right] \vec{I}_c(\vec{x}) \\
&= \vec{I}_{os}(\vec{x})
\end{aligned} \tag{B.3}$$

since the cascade of delays $D_a(\vec{x})$ and $D_s(\vec{x})$ cancels. This shows that the subpixel sampled image after addressing, $\vec{I}_{as}$, is identical to the displaced sampled image, $\vec{I}_{os}$. Equation B.3 provides the simplest form for $\vec{I}_{as}(\vec{x})$.

Difference between displaced sampling and sampling delay

The difference between using $\vec{I}_{os}$ and $\vec{I}_{ss}$ + $\vec{I}_{as}$ is subtle. The signal after sampling is represented by a series of $\delta$-impulses. The pulses for the three components in $\vec{I}_{os}$ are at different positions, whereas in $\vec{I}_{ss}$, the pulses of RGB are at the same location. However, the series of $\delta$-impulses is merely a useful mathematical description of the sampled signal, to allow analysis of aliasing effects. The sampled signal is in fact a set of numbers. From this set, the sample positions are not immediately visible, as they are in the impulse representation. Therefore, $\vec{I}_{ss}$ more closely matches the array description than $\vec{I}_{os}$. It is equivalent to re-grouping the individual RGB arrays into an array of full color pixels, as expected by the addressing. Using the subpixel sampled signal does not change the addressing process, but it provides RGB signals that have been corrected for the addressing offset.

Equation 3.49 describes the sample/address system shown in Figure B.1b, which explicitly shows the discrete signal, $\vec{I}_{ss}$, that forms the input to the display. In the displaced sampled case (Equation 3.44, Figure B.1a), this is not so obvious, but that description provides a simplified equation, which is more practical for the following discussion.
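The delay property of Equations B.1 and B.2 is easy to verify numerically; in the sketch below, a discrete circular shift stands in for the convolution with $D(\vec{x})$.

```python
import numpy as np

# delaying a product equals the product of the delayed factors
rng = np.random.default_rng(0)
A, B = rng.random(16), rng.random(16)
k = 3                                              # delay in samples
assert np.allclose(np.roll(A * B, k),              # D * [A B]   (B.1)
                   np.roll(A, k) * np.roll(B, k))  # [D*A][D*B]  (B.2)
```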

APPENDIX C

Display simulation

To evaluate the results from subpixel scaling, the display itself is an essential part of the system. This is a large difference with traditional image processing, where images can be evaluated relatively independently of the reproducing device (computer monitor, print, etc.). Therefore, images processed with subpixel scaling must either be viewed on the actual display they were intended for, or we must simulate the display.

As a further consequence of this, it is very difficult, if not impossible, to construct an error measure to evaluate image quality. In the field of image processing, it is common to use the mean square error, or peak signal-to-noise ratio (PSNR), to evaluate the performance of processing algorithms. For our application, the display itself should be part of the calculation, i.e. the PSNR should be calculated between a discrete input image and the output light. This problem has yet to be solved (and the solution validated), so it is outside the scope of this thesis. We therefore have to limit the evaluation to simulated examples and subjective experiments.

Simulation is useful in those cases where the displays are not available, e.g. because they have not yet been produced, or in cases where we want to compare the effect of different SPAs and the available displays differ in too many other properties (color gamut, brightness, size, etc.). To simulate a display, we construct a simulated image of the display. This simulated image can be regarded as a virtual photograph of the display, and must have a resolution that is high enough to contain the details of the SPA. The display simulation receives the subpixel sampled image as input, and essentially simulates the display addressing and reconstruction process, as illustrated in Figure C.1.

The basic procedure is to use a number of simulated pixels per display subpixel. Each simulated pixel contains only one primary color, with the intensity of the corresponding subpixel on the display. For example, a VS display of $N_x \times N_y$ pixels can be simulated with an image of $3N_x \times 3N_y$ pixels, as shown in Figure C.2, which also shows the simulated unit pixels for DN, PT1 and PT6.

Figure C.1 Display simulation: input image → processing (subpixel scaling) → subpixel rendered image (subpixel drive values) → display → light → camera → display image; the display simulation replaces the display/camera part of this chain.

Figure C.2 (Left) rendered RGB pixels, used as input for (right) a simulated unit pixel: a) VS, b) DN, c) PT1, d) PT6.

The input for the simulation is a subpixel rendered image, which contains a (drive) value for each subpixel on the display. The rendered images could in principle just contain a list of subpixel values (for each unit pixel), but to store and view them using common image formats, we group the subpixels into RGB pixels, as also shown in Figure C.2. The grouping corresponds to an interface format for the display addressing, i.e. it defines which display subpixel belongs to which RGB value in the rendered image. This defines a number of rendered RGB pixels for each unit pixel. Note that (for PT6) some RGB values in the rendered image are not assigned to a subpixel in the unit. This is just a consequence of storing the rendered image, which does not have to contain the same number of subpixels for R, G and B, in an RGB format. The unused values are not calculated in practice, and can be set to zero in the rendered image¹.
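For the VS case, the basic procedure amounts to the following sketch (illustrative only: gamma and the dark inter-subpixel matrix are ignored, and one simulated pixel per subpixel column is used, matching the $3N_x \times 3N_y$ example above).

```python
import numpy as np

def simulate_vs(drive):
    """Virtual photograph of a vertical stripe (VS) display.

    drive : (Ny, Nx, 3) array of R,G,B subpixel drive values in [0,1]
    Returns a (3Ny, 3Nx, 3) simulated image: per pixel, three colored
    columns, each one simulated pixel wide and three tall.
    """
    ny, nx, _ = drive.shape
    sim = np.zeros((3 * ny, 3 * nx, 3))
    for c in range(3):                   # R, G, B stripes per unit pixel
        sim[:, c::3, c] = np.repeat(drive[:, :, c], 3, axis=0)
    return sim

# a 2x2-pixel checkerboard rendered as a 6x6 simulated display image
drive = np.zeros((2, 2, 3))
drive[0, 0] = drive[1, 1] = 1.0
image = simulate_vs(drive)
```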

Simulation scaling

The simulated unit pixels for different SPAs can have very different numbers of pixels, so the images of the simulated displays will also be of different size. To compare simulated images with, for example, an identical number of subpixels per area (Section 3.6.4), we scale the simulated images of different SPAs to the same size: 'scaled simulated images'. The scaled simulated images should still contain enough resolution to capture the essential features of the SPA. Moreover, the scaling should not introduce significant artifacts. In particular, the pixel structure of the simulated display can cause severe aliasing in the scaled simulated image². The scaling factor should therefore be chosen such that this aliasing is prevented. The larger the scaled simulated image, the smaller these artifacts. On the other hand, scaled simulated images should be as small as possible to allow simulated displays with a reasonable number of pixels³. We found that a VS scaled simulated unit size of approximately 4.5 × 4.5 pixels prevents scaling artifacts in the other SPAs with an equal number of subpixels per area.

Figure C.3 Scaling of a simulated displayed image. a) Unscaled simulated VS, b) scaled simulated VS, c) unscaled simulated PT1. All the simulated displays have the same number of subpixels, while the simulated images have the same pixel size. The scaled VS has an $S_A$ equal to that of the (unscaled) PT1 when both simulations are displayed side by side.

Figure C.3 shows the scaling of a VS simulated display image, to contain the same number of subpixels per area as a PT1 simulation. When the simulation scaling factor is larger, some of the features of the SPA are lost, but the resulting image can still convey a lot of the resolution impression that the display would have. This is because the useful displayed image information is still only at frequencies around the display Nyquist, which can be simulated with only a little more than one simulated pixel per display pixel.

Besides reduced image size, these low resolution simulations have another benefit. High resolution simulated images will typically only reach 1/3 of the maximum brightness, because each pixel uses only one of the primary colors.

¹ The simulation process can even be misused as a subpixel sampling step, by choosing the appropriate rendered image format.
² Very much the same effect as the color aliasing that is introduced by the subpixel scaling itself.
³ Depending on the resolution of the display that is used to view the simulated images. We use an LCD with 3840 × 2400 pixels (produced by IBM), to allow maximum simulated display resolution.

Figure C.4 a) Low and b) high resolution simulated displayed image, with respectively 2 × 2 and 5 × 5 simulated pixels per (primitive) unit pixel. The low resolution simulation has lost the details of the SPA, but the intensity is higher and the resulting displayed image is still largely unaltered (when viewed from a distance that represents the normal viewing distance, ca. 85 times the image width, i.e. ca. 4 m in this case).

At lower resolution simulations, the subpixels blend together, which allows the intensity to be increased. However, for simulations where accuracy is important, e.g. for use in perceptual experiments, high resolution simulation is preferred. Figure C.4 shows an example simulation with high and low resolution.

APPENDIX D

General subpixel sampling and display spectrum

In the calculations in Section 3.5.1, sampling and addressing were each other's inverses: $D_a(\vec{x}) = D_s^{-1}(\vec{x})$, i.e. $D_s(\vec{x}) * D_a(\vec{x}) = 1$. In general, we can sample the RGB components with one set of delays $D_s$, and have addressing with a different delay $D_a$. This basic system was already described by Equation 3.49, which we write in the frequency domain without variables, and add the aperture:

$$\vec{I}^f_{das} = A^f D^f_a \left[ S^{-1} * \left[ D^f_s \vec{I}^f_c \right] \right] \tag{D.1}$$

Transforming this to $YUV$ space, we obtain:

$$\vec{I}^f_{ydas} = M A^f D^f_a \left[ S^{-1} * \left[ D^f_s M^{-1} \vec{I}^f_{yc} \right] \right] = A^f \Phi^f_a \left[ S^{-1} * \left[ \Phi^f_s \vec{I}^f_{yc} \right] \right] \tag{D.2}$$

with

$$\Phi^f_a = M D^f_a, \qquad \Phi^f_s = D^f_s M^{-1} \tag{D.3}$$

This is the general form that covers the two special cases of Eqs. 3.55 and 3.57. To see this, we re-write Equation D.2:

$$\begin{aligned}
\vec{I}^f_{ydas} &= M A^f D^f_a \left[ S^{-1} * \left[ D^f_s M^{-1} \vec{I}^f_{yc} \right] \right] \\
&= A^f M D^f_a D^f_s\, D^{f\,-1}_s \left[ S^{-1} * \left[ D^f_s M^{-1} \vec{I}^f_{yc} \right] \right] \\
&= A^f M D^f_a D^f_s \left[ \left[ D^{f\,-1}_s S^{-1} \right] * \left[ D^{f\,-1}_s D^f_s M^{-1} \vec{I}^f_{yc} \right] \right] \\
&= A^f M D^f_a D^f_s \left[ \left[ D^{f\,-1}_s S^{-1} \right] * \left[ M^{-1} \vec{I}^f_{yc} \right] \right] \\
&= \left[ \left[ A M D^f_a D^f_s D^{f\,-1}_s M^{-1} S^{-1} \right] * \vec{I}^f_{yc} \right]
\end{aligned} \tag{D.4}$$

With pixel sampling (Equation 3.57), $D^f_s = 1$ and $D^{f\,-1}_s = (D^f_s)^{-1} = 1$, and $M^{-1}$ can be moved outside the convolution. Now $\Phi$ (Equation 3.56) is located at the addressing process only. With subpixel sampling (Equation 3.55), $D^f_s = D^{f\,-1}_a$, $M$ can be moved inside the convolution, and $\Phi$ is located at the sampling process only.

We see that in the general case, the cross-talk is split over the sampling and addressing process, leading to two crosstalk matrices, $\Phi_a$ and $\Phi_s$. Since we now have two instances of $\Phi$, we cannot collect the matrices $M$ and $M^{-1}$ around a single delay matrix. Even more generally, we can delay and sample in any color space $XYZ$, where $[XYZ]^T = N[YUV]^T$:

$$\vec{I}^f_{yds} = M A^f D^f_a M^{-1} N^{-1} \left[ S^{-1} * \left[ D^f_s N \vec{I}^f_{yc} \right] \right] \tag{D.5}$$
$$= A^f \Phi^f_{aN} \left[ S^{-1} * \left[ \Phi^f_{sN} \vec{I}^f_{yc} \right] \right] \tag{D.6}$$

with

$$\Phi^f_{aN} = M D^f_a M^{-1} N^{-1}, \qquad \Phi^f_{sN} = D^f_s N \tag{D.7}$$

Note that the display addressing process is still defined on RGB components; therefore the sampled signal is converted from $XYZ$ to RGB with the matrix $M^{-1} N^{-1}$. As expected, Equation D.7 reduces to Equation D.3 when $XYZ = RGB$, since then $N = M^{-1}$, and $M^{-1} N^{-1} = 1$.
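The reduction of Equation D.7 to Equation D.3 can be checked numerically; in this sketch, random invertible matrices stand in for $M$ and for the (frequency domain) delay matrices at one frequency.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.random((3, 3)) + 3 * np.eye(3)   # invertible color transform
Da = np.diag(rng.random(3))              # addressing delays D_a^f
Ds = np.diag(rng.random(3))              # sampling delays D_s^f
N = np.linalg.inv(M)                     # XYZ = RGB, so N = M^-1

Phi_aN = M @ Da @ np.linalg.inv(M) @ np.linalg.inv(N)
Phi_sN = Ds @ N
assert np.allclose(Phi_aN, M @ Da)                  # Phi_a of Eq. D.3
assert np.allclose(Phi_sN, Ds @ np.linalg.inv(M))   # Phi_s of Eq. D.3
```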

List of symbols and abbreviations

The numbers refer to the pages of first occurrence.

Abbreviations

AC   Alternating Current; indicates high frequency signals
BDI   Black Data Insertion (LCD drive method), 151
BET   Blurred Edge Time, 144
BEW   Blurred Edge Width, 143
BFI   Black Frame Insertion (LCD drive method), 150
BW   Bandwidth, 147
CMD   Color Matrix Display
CRT   Cathode Ray Tube
DC   Direct Current; indicates (very) low frequency signals
DFC   Dynamic False Contour (PDP), 158
DN   Delta-Nabla (SPA)
DSF   Duplicated Subfields (PDP drive method), 160
E-BET   Extended Blurred Edge Time, 144
FPD   Flat Panel Display
GFI   Gray Frame Insertion (LCD drive method), 151
HD   High Definition (>= 720 lines)
HFR   High Frame Rate (LCD drive method), 149
HVS   Human Visual System
IPS   In-Plane Switching (LC mode)
LC   Liquid Crystal
LCD   Liquid Crystal Display
LC-RT   Liquid Crystal Response Time, 128
LD   Linear Drive (PDP drive method)

MCIF   Motion Compensated Inverse Filtering, 172
MLS   Multiple Level Subfields (PDP drive method), 160
MPRT   Motion Picture Response Time, 143
MTF   Modulation Transfer Function
N-BET   Normalized Blurred Edge Time, 144
PAL   Phase Alternating Line (European video standard)
PDP   Plasma Display Panel
PT   PenTile (SPA)
SB   Scanning Backlight (LCD), 153
SD   Standard Definition (< 720 lines)
S&H   Sample and Hold, 127
SPA   Subpixel Arrangement
SRC   Sample-Rate Conversion
TFT   Thin Film Transistor
TN   Twisted Nematic (LC mode)
TV   Television
VA   Vertically Aligned (LC mode)
VS   Vertical Stripe (SPA)
Y/C   Luminance (Y) / chrominance (C) (color space)

Symbols

The symbols x, y and t, also used in subscripts, indicate the horizontal, vertical and temporal dimensions, respectively.

General

AR = W/H   Display aspect ratio
d_v   Viewing distance
f = [f_x, f_y]   Spatial frequency
f_f   Field frequency
f_l   Line frequency
f_N = f_s/2   Nyquist frequency
f_sx, f_sy   Spatial sampling frequency
f_t   Temporal frequency, 137
I   Intensity signal
I_a   Addressed intensity signal (id. for I(x), etc.)
I_c   Continuous (original) intensity signal
I_d   Displayed intensity signal
I_s   Sampled intensity signal

I(x)   (Still, monochrome) image signal
I(x) = [R(x), G(x), B(x)]^T   RGB color image signal
I_y(x) = [Y(x), U(x), V(x)]^T   YUV color image signal
M; I_y = M I   Color transform matrix (RGB to YUV)
N = [N_x, N_y]   Number of pixels
N_l = f_l / f_f   Number of lines
p = [p_x, p_y]   Pixel pitch
R = N d_v / W   Visual resolution
R, G, B   Red, Green, Blue color signal components
T   Frame period
V   Display drive signal (voltage)
W, H   Display width, height
x = [x, y]   Spatial coordinate
Y, U, V   Luminance-chrominance (YUV) color signal components
α_vis   Visual angle

PDPs

N_s   Number of subfields
S_n ∈ {0, 1}   Subfield state
t_n ∈ [0, T]   Subfield delay, 157
W_n   Subfield weight

Spatial display properties

A(x)   Aperture function
AR_p = p_y / p_x   Pixel aspect ratio
AR_U = u_y / u_x = (m_y p_y) / (m_x p_x)   Unit pixel aspect ratio
D(x)   Delay matrix
D_U   Data (column) drivers per unit pixel
DR_U = U_x D_U + U_y R_U   Total drivers per display
I_ds   Sampled delayed signal
I^f(f) = F{I(x)}   Image frequency spectrum
I_os   Displaced sampled signal
I_ss   Subpixel sampled signal
L = [v_1^T, v_2^T]   Unit pixel lattice basis
m   Unit pixel to primitive unit pixel multiplex factor
n = [n_1, n_2]   Index of a point on the lattice
R_U   Row (select) drivers per unit pixel
S(x)   Sampling function
S_A   Subpixels per area
S_U   Subpixels per unit pixel

u = [u_x, u_y] = [m_x p_x, m_y p_y]   Pitch of unit pixel
U = [U_x, U_y]   Number of unit pixels in display
σ   Gaussian spot width
Φ^f = M D_a^f M^{-1}   Phase crosstalk matrix

Temporal display properties

A_e(x, t)   Perceived aperture (projection on retina), 134
A_m(x, v)   Motion aperture, 135
A_s(x)   Spatial display aperture, 134
A_se(x)   Perceived spatial aperture, 134
A_t(t)   Temporal display aperture, 133
A(x, t) = A_s(x) A_t(t)   Spatio-temporal display aperture, 134
B_x(x)   Spatial blurred edge profile, 145
B_t(t)   Temporal blurred edge profile, 145
f_3dB   Bandwidth (-3 dB), 147
H_inv^f   Inverse filter transfer function, 172
I_acc   Accumulated intensity, 169
I_av   Temporal average intensity, 168
I_e   Intensity signal on retina, 140
I_m   Intensity signal of moving image, 140
I_MC   Motion compensated intensity, 168
I_p   Perceived intensity signal, 141
Med   Median, 168
MPRT_BET = BET   Blurred edge time based MPRT, 146
MPRT_BW = 0.35 / f_3dB   Bandwidth based MPRT, 147
S_t^I   General step signal, 128
S_t(t)   Normalized step signal, 128
S(x)   Spatial step signal, 145
T_h   Hold time, 142
(x_d, t_d)   Spatial position on display, 124
(x_e, t_e)   Spatial position on eye, 124
t_c   Addressing delay at the camera, 125
t_d   Addressing delay at the display
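For a sample-and-hold display, several of the temporal quantities listed above are connected in a simple way. The sketch below is our own illustration under common assumptions (a box-shaped temporal aperture of one hold time, 10%-90% thresholds for the blurred edge); it derives the blurred edge width and time for a tracked moving edge, and compares the result with the bandwidth-based estimate MPRT_BW = 0.35/f_3dB from the list:

```python
import numpy as np

T = 1 / 60                  # frame period [s], 60 Hz video
v = 8.0                     # motion speed [pixels per frame]
Th = T                      # hold time: a full frame for sample-and-hold

# A tracking eye integrates the held frame, turning a step edge into a
# linear ramp of width v * Th / T pixels.
x = np.linspace(-16, 16, 4001)
ramp_width = v * Th / T
edge = np.clip(x / ramp_width + 0.5, 0.0, 1.0)

# blurred edge width (BEW) between the 10% and 90% levels
x10 = x[np.searchsorted(edge, 0.1)]
x90 = x[np.searchsorted(edge, 0.9)]
BEW = x90 - x10                          # [pixels]
BET = BEW / v * T * 1e3                  # blurred edge time [ms]
print(f"BEW = {BEW:.2f} px, BET = {BET:.2f} ms")   # 6.40 px, 13.33 ms

# bandwidth-based estimate: a box of width Th has its -3 dB point at
# f_3dB = 0.443 / Th, and MPRT_BW = 0.35 / f_3dB
f3dB = 0.443 / Th
print(f"MPRT_BW = {0.35 / f3dB * 1e3:.2f} ms")     # 13.17 ms, close to BET
```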

Summary

Televisions (TVs) have shown considerable technological progress since their introduction almost a century ago. Starting out as small, dim and monochrome screens in wooden cabinets, TVs have evolved to large, bright and colorful displays in plastic boxes. It took until the turn of the century, however, for the TV to become like a picture on the wall. This happened when the bulky Cathode Ray Tube (CRT) was replaced with thin and light-weight Flat Panel Displays (FPDs), such as Liquid Crystal Displays (LCDs) or Plasma Display Panels (PDPs). However, the TV system and transmission formats are still strongly coupled to the CRT technology, whereas FPDs use very different principles to convert the electronic video signal to visible images. These differences result in image artifacts that the CRT never had, but at the same time provide opportunities to improve FPD image quality beyond that of the CRT. This thesis presents an analysis of the properties of flat panel displays, their relation to image quality, and video signal processing algorithms to improve the quality of the displayed images.

To analyze different types of displays, the display signal chain is described using basic principles common to all displays. The main function of a display is to create visible images (light) from an electronic signal (video), requiring display chain functions like the opto-electronic effect, spatial and temporal addressing and reconstruction, and color synthesis. The properties of these functions are used to describe CRTs, LCDs, and PDPs, showing that these displays perform the same functions, using different implementations. These differences have a number of consequences, which are further investigated in this thesis. Spatial and temporal aspects, corresponding to static and dynamic resolution respectively, are covered in detail. Moreover, video signal processing is an essential part of the display signal chain for FPDs, because the display format will in general no longer match the source format. In this thesis, it is investigated how specific FPD properties, especially those related to spatial and temporal addressing and reconstruction, affect the video signal processing chain.

A model of the display signal chain is presented, and applied to analyze FPD spatial properties in relation to static resolution. In particular, the effect of the color subpixels, which enable color image reproduction in FPDs, is analyzed.

The perceived display resolution is strongly influenced by the color subpixel arrangement. When taken into account in the signal chain, this improves the perceived resolution on FPDs, which clearly outperform CRTs in this respect. The cause and effect of this improvement, also for alternative subpixel arrangements, are studied using the display signal model. However, the resolution increase cannot be achieved without video processing. This processing is efficiently combined with image scaling, which is always required in the FPD display signal chain, resulting in an algorithm called subpixel image scaling. A comparison of the effects of subpixel scaling on several subpixel arrangements shows that the largest increase in perceived resolution is found for two-dimensional subpixel arrangements.

FPDs outperform CRTs with respect to static resolution, but not with respect to dynamic resolution, i.e. the perceived resolution of moving images. Life-like reproduction of moving images is an important requirement for a TV display, but the temporal properties of FPDs cause artifacts in moving images ('motion artifacts') that are not found in CRTs. A model of the temporal aspects of the display signal chain is used to analyze dynamic resolution and motion artifacts on several display types, in particular LCD and PDP. Furthermore, video signal processing algorithms are developed that can reduce motion artifacts and increase the dynamic resolution.

The occurrence of motion artifacts is explained by the fact that the human visual system tracks moving objects. This converts temporal effects on the display into perceived spatial effects, which can appear in very different ways. The analysis shows how addressing mismatches in the chain cause motion-dependent misalignment of image data, e.g. resulting in the dynamic false contour artifact in PDPs. Also, non-ideal temporal reconstruction results in motion blur, i.e. a loss of sharpness of moving images, which is typical for LCDs. The relation between motion blur, dynamic resolution, and the temporal properties of LCDs is analyzed using the display signal model in the temporal (frequency) domain. The concepts of temporal aperture, motion aperture and temporal display bandwidth are introduced, which enable characterization of motion blur in a simple and direct way. This is applied to compare several motion blur reduction methods, based on modified display design and driving.

This thesis further describes the development of several video processing algorithms that can reduce motion artifacts. It is shown that the motion of objects in the image plays an essential role in these algorithms, i.e. they require motion estimation and compensation techniques. In LCDs, video processing for motion artifact reduction involves a compensation for the temporal reconstruction characteristics of the display, leading to the motion compensated inverse filtering algorithm. The display chain model is used to analyze this algorithm, and several methods to increase its performance are presented. In PDPs, motion artifact reduction can be achieved with motion compensated subfield generation, for which an advanced algorithm is presented.
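The motion compensated inverse filtering idea mentioned above can be illustrated with a small sketch. This is our own simplified illustration, not the algorithm from the thesis: it assumes a box-shaped temporal aperture (pure sample-and-hold blur), purely horizontal motion known from a motion estimator, and a regularized inverse filter in the frequency domain; the function name mcif_line is hypothetical:

```python
import numpy as np

def mcif_line(line, v, eps=0.05):
    """Pre-compensate one image line for sample-and-hold motion blur.

    A tracking eye sees the moving image convolved with a box of |v| pixels
    (one frame of hold time). We divide the line's spectrum by the box's
    transfer function, regularized by eps to keep the noise gain bounded.
    """
    n = len(line)
    width = max(int(round(abs(v))), 1)
    kernel = np.zeros(n)
    kernel[:width] = 1.0 / width             # box blur along the motion
    kernel = np.roll(kernel, -(width // 2))  # center the kernel (no net shift)

    K = np.fft.rfft(kernel)
    H_inv = np.conj(K) / (np.abs(K) ** 2 + eps)    # regularized inverse filter
    return np.fft.irfft(np.fft.rfft(line) * H_inv, n)

line = np.zeros(256)
line[128:] = 1.0                   # a moving edge
pre = mcif_line(line, v=8.0)       # motion of 8 px/frame, assumed known

# Displaying `pre` on a sample-and-hold panel: the display's box blur and
# the inverse filter roughly cancel, leaving a sharper perceived edge.
perceived = np.convolve(pre, np.ones(8) / 8, mode="same")
```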

Samenvatting

Signal Processing for Flat Panel Displays: Analysis and Algorithms for Improved Static and Dynamic Resolution

Televisions (TVs) have made remarkable technological progress since their introduction almost a century ago. Starting out as small, dark, colorless screens in wooden cabinets, TVs have evolved into large, bright, colorful displays in plastic boxes. Only at the turn of this century did the TV change into a moving painting, made possible by the replacement of the bulky Cathode Ray Tube (CRT) with thin and lightweight Flat Panel Displays (FPDs), such as Liquid Crystal Displays (LCDs) or Plasma Display Panels (PDPs). However, the TV system and transmission format are still strongly tied to CRT technology, whereas FPDs use entirely different principles to convert the electronic video signal into visible images. This leads to artifacts in the image on FPDs that the CRT never suffered from, but also to the possibility of improving the image quality of FPDs beyond what is possible with the CRT. This thesis presents an analysis of the properties of flat panel displays, their relation to image quality, and video signal processing algorithms that improve the quality of the displayed images.

To analyze different types of displays, the display signal chain is described in terms of a few basic principles. The main function of a display is to create visible images (light) from an electronic signal (video). This requires functions in the display signal chain such as the opto-electronic effect, spatial and temporal addressing and reconstruction, and color synthesis. The properties of these functions are used to describe the CRT, LCD and PDP technologies, showing that these displays perform the same functions, but using different implementations. These differences have a number of consequences, which are investigated further in this thesis. Spatial and temporal aspects, corresponding to static and dynamic resolution respectively, are treated in detail. In addition, video signal processing is an essential part of the display signal chain for FPDs, because the display format no longer matches the source format. This thesis investigates how specific properties of FPDs, in particular those related to spatial and temporal addressing and reconstruction, affect the display signal chain.

A model of the display signal chain is presented and applied to investigate the relation between static resolution and the properties of FPDs. In particular, the effect of the color subpixels, which enable color reproduction, is analyzed. The perceived display resolution is strongly influenced by the color subpixel arrangement. When this arrangement is taken into account in the signal chain, the perceived resolution of FPDs increases, so that on this point they clearly surpass the CRT. The display signal chain model is used to investigate the cause and effect of this improvement, also for alternative subpixel arrangements. However, this resolution increase is not possible without video signal processing. The required processing can be efficiently combined with image scaling, which is always present in the FPD signal chain, resulting in the subpixel image scaling algorithm. A comparison of the effects of subpixel image scaling on different subpixel arrangements shows that the largest increase in perceived resolution is found for two-dimensional arrangements.

Whereas FPDs leave CRTs behind with respect to static resolution, this is not the case for dynamic resolution, i.e. the perceived resolution of moving images. Life-like reproduction of moving images is an important requirement for a TV display, but the temporal properties of FPDs cause artifacts in moving images ('motion artifacts') that do not occur on CRTs. A model of the temporal aspects of the display signal chain is used to investigate the dynamic resolution and motion artifacts of different display types, notably LCD and PDP. Furthermore, video signal processing algorithms are developed that reduce motion artifacts and improve dynamic resolution.

The occurrence of motion artifacts is explained by the fact that the human visual system tracks moving objects. This converts temporal effects on the display into perceived spatial effects, which can manifest themselves in various ways. An analysis shows that addressing errors in the chain cause motion-dependent displacements of image information; in the subfield generation of PDPs, for example, this leads to the dynamic false contour artifact. A non-ideal temporal reconstruction characteristic, as found in LCDs, also leads to a loss of sharpness of moving images. The relation between this motion blur, dynamic resolution, and the temporal properties of LCDs is analyzed using the display signal chain model in the temporal (frequency) domain. The temporal aperture, motion aperture and temporal display bandwidth are introduced as concepts that enable a simple and direct characterization of motion blur. This is applied in the comparison of a number of motion blur reduction methods that are based on modifications of the display design and driving.

This thesis further describes the development of a number of video signal processing algorithms that can reduce motion artifacts. It is shown that the motion of objects in the image plays an essential role in these algorithms, so that they require techniques such as motion estimation and motion compensation. Video processing for motion blur reduction in LCDs involves a compensation for the temporal reconstruction characteristic, which leads to the motion compensated inverse filtering algorithm. The display signal chain model is used to analyze this algorithm, and a number of methods to improve its performance are presented. In PDPs, motion artifact reduction can be achieved with motion compensated subfield generation, for which an advanced algorithm is presented.


Acknowledgments

Finally, the plan seems to come together. After a very long period of hard work, and many times when I was sure I was chasing the impossible, I have made it to writing this section. And although writing a PhD thesis is a very lonely job, it was not without the support of many people that I was able to pull it off. I would like to thank all of them.

First of all, I want to express my gratitude to my promotor, and mentor at Philips Research from the very first hour, prof.dr.ir. Gerard de Haan. I am grateful that you provided me the opportunity to write this thesis. Your enthusiasm got me started in the first place, but your guidance and wealth of experience in the field of video processing, as well as in writing books, made it possible to bring it to an end. I also want to thank my copromotor, dr.ir. Gerben Hekstra. We have worked together on display related topics for many years, which was always a great pleasure. You were only involved in the supervision of this thesis during the last stages, but I have nevertheless learned a lot from you during the many discussions we had.

Next, I want to thank prof.dr.ir. Ralph Otten, for kindly providing supervision as my second promotor, and the other members of the PhD committee: prof.dr. Ingrid Heynderickx, prof.dr. Hartmut Schröder and prof.dr.ir. Peter de With, for reviewing and approving the manuscript. Ingrid, your very careful reading of the manuscript, together with your experience in visual perception, was very helpful. And the enthusiastic reaction of prof. Schröder came at just the right moment.

Furthermore, I want to thank the management of Philips Research Eindhoven, in particular my previous and current group leaders, Geert Depovere and Marc op de Beeck, in the first place for enabling the research that led to this thesis, but also for providing the opportunity to write it.

This thesis would also not have been possible without the work I did together with my colleagues at the Nat.Lab. In particular, I want to thank Leojan Velthoven, for being my roommate and a member of the DISCO project for many years. Our effort to increase the image quality of LCDs has led to many inspiring and insightful moments. Without you, and in particular your skills in

programming, building demos and inventing, this work would have been a lot less successful. The cooperation with Erno Langendijk on the perception studies was highly appreciated, as were our many discussions on the interaction between displays and perception. Further, I want to thank Jurgen Hoppenbrouwers, Roy van Dijk and Roel van Woudenberg, Nebojsa Fisekovic, Tore Nauta and Frank Budzelaar for the work we did together on various aspects of FPDs. I am also grateful to the people I worked together with on the project: Nico Cordes, Arnold van Keersop, and to my current project members and roommates: Frank van Heesch and Yingrong Xie. Working together with you on a daily basis was always very rewarding, and I hope we continue to enjoy it, and be successful at the same time, in the future. The topic of display signal processing is far too interesting and far too big to deal with by myself, but without you it would also be much less fun.

The other colleagues from the video processing groups, and all the other people I had the pleasure of working with at the Nat.Lab., are far too many to mention. The magic moments together with you when inventions are created inspire me a lot, but the mindless chat at the coffee table also makes work just that bit more interesting. Also, I want to thank my colleagues from Philips CE, Philips Lighting and NXP, for many interesting discussions, for the trust you have placed in our research, but also for the hard work you put in to make real products out of our crazy ideas. At the risk of forgetting somebody, I want to thank Seyno Sluyterman, Rimmert Wittebrood, Aleksandar Ševo, Jeroen Stessen, Age van Dalfsen, Eric Funke, Rob Beuker, Anthony Ojo, Reinder Smid, Johan Janssen, Haiyan He, and Erwin Bellers (also for being my first roommate and an excellent example of how to combine work and a PhD). And I am grateful to our patent experts at Philips IP&S, in particular Alexander van Eeuwijk, Kees Gravendeel and Mark Mertens, for turning my ideas into patents, and for all their last-minute reviews. I also want to thank my carpool mates, Eddy Reniers, Han Dentener and Jan Snelders, for all the hours on the road and in traffic jams between Breda and Eindhoven, and for all your patience when I was yet again the last one to arrive.

However, the one who has suffered the most from the sometimes hopeless process of writing this thesis, and also the one who has, without a doubt, contributed the most, is my loving wife Muriëlle. During the many, many evenings and nights that I spent behind the computer, you not only had to miss me dearly, but you also ran the household by yourself, next to a demanding and turbulent time in your own job. Furthermore, you stood by my side with a lot of good advice, where your experience from your own PhD, and your incredible skills in planning, were most helpful. I realize that you are right when you have the feeling that you have just done a PhD for the second time. Your patience and trust in me certainly pulled me through. I will do the best I can to make it all up to you, and I hope I will have a very, very long time together with you to do it. Thanks!

Biography

Michiel Adriaanszoon Klompenhouwer was born in Purmerend, the Netherlands, on December 6. He studied Applied Physics at the University of Twente in Enschede, the Netherlands, from which he received an M.Sc. degree. Subsequently, he joined the Video Processing Systems group at Philips Research Laboratories in Eindhoven, the Netherlands, where he is currently a senior scientist. His research has focused on the combination of video signal processing and flat panel display technology. In this area, he has covered many different subjects, ranging from PDPs to LCDs, and from motion compensated display artifact reduction to digital color halftoning algorithms. In 2002 he received the first prize at the Philips Research Eureka inventors' fair and a Chester Sall award for best paper in the IEEE Transactions on Consumer Electronics. He has been awarded 4 patents and has about 30 more pending. He has published 17 papers in international conference proceedings and journals. He was an invited speaker at the SID, ICIP and IMID conferences. He is a reviewer for IEEE and SID journals, and a member of the technical program committees of the SID and IDW conferences.
