Summary

This project aims to produce software that can convert images of scanned pages of Braille into ASCII text. The Braille alphabet consists of 3x2 matrices of dots raised above the surface of the paper. Each letter, number and punctuation mark is rendered using either one or two matrices (in the two-matrix case the first acts as a modifier, indicating a capital letter, for example). Appendix B gives the exact details. Braille can come on double-sided pages, and has a number of different grammars; for instance, Grade 2 (also known as contracted Braille) is similar to shorthand in that a single character may have different meanings depending on the context. These are considered as possible extensions to the minimum requirements of this project.

The challenge belongs to the computer vision domain, specifically optical character recognition (OCR). Image processing techniques that have been in use over the last few decades can be applied to solve this problem successfully. The aims of this project are categorised as follows:

- A discussion of the problem, offering a review of the relevant literature in this area and of which techniques will be of use
- The creation of internal representations of each Braille character, which may then be used in the final software
- The creation of a system that can convert scans of Braille pages into ASCII text

Once created, the software will be tested and evaluated. Its limitations will be discussed, along with an outline of any possible future improvements. Finally, a conclusion is offered, along with a reflection on the entire project experience.

Acknowledgements

My thanks (in no particular order) go to the following people: my supervisors James Handley and Matthew Hubbard, for offering their help and advice and for coming up with the project idea; Roger Boyle, for posing more questions than answers in the Speech and Image Processing module; Andy Bulpitt, for making the Computer Vision module interesting and whose coursework gave me a head start on this project; and Nick Efford, who wrote the Java graphics libraries used herein.

quaere verum

Contents

Summary
Acknowledgements
Chapter 1: Introduction
  1.1 Motivation
  1.2 Report Structure
  1.3 Minimum Requirements
  1.4 Possible Enhancements
  1.5 Project Schedule
  1.6 Summary
Chapter 2: Background Research
  2.1 Introduction
  2.2 Image Processing Techniques
    2.2.1 Noise Removal and Suppression
    2.2.2 Contrast Enhancement
    2.2.3 Segmentation
    2.2.4 Morphological Image Processing
  2.3 Skew Detection
  2.4 Summary
Chapter 3: Implementation
  3.1 Introduction
  3.2 Initial Character Representation and Image Processing
    3.2.1 Internal Braille Character Representation
    3.2.2 File Format
    3.2.3 Image Pre-Processing
    3.2.4 Segmentation: Thresholding
    3.2.5 Morphological Operations: Dilation and Erosion
    3.2.6 Conclusion
  3.3 Image Understanding: Character Recognition
    3.3.1 Using Projection to Calculate Cell Coordinates
    3.3.2 Intermediate Braille Representation
    3.3.3 Character Matching
    3.3.4 Conclusion
  3.4 Extensions to the Minimum Requirements
    3.4.1 Alternative Braille Grammars: Braille Grade 2
    3.4.2 Skew Detection and Correction
    3.4.3 Strengthening of the Character Matching Algorithm: Backtracking
    3.4.4 Handling Large Braille Pages
    3.4.5 Conclusion
  3.5 Summary
Chapter 4: Testing
  4.1 Introduction
  4.2 Test Plan
    4.2.1 Minimum Requirements Testing
    4.2.2 Project Extensions Testing
  4.3 Summary of Test Results
  4.4 Analysis and Discussion of Test Results
  4.5 Summary
Chapter 5: Evaluation
  5.1 Introduction
  5.2 Project Evaluation
  5.3 Software Evaluation
    5.3.1 Minimum Requirements
    5.3.2 Extended Software Features
  5.4 Further Work and Possible Improvements
  5.5 Summary
Chapter 6: Conclusion
References
Appendix A: Project Reflection
Appendix B: The Braille Alphabet
Appendix C: Test Results

Chapter 1: Introduction

The project is essentially an OCR problem, where the aim is to get a computer to recognise characters (in this case, Braille matrices) in a digitised image. Images are obtained by scanning Braille pages using a flatbed scanner. Character recognition has been an active area of computer science research since the late 1950s. It was initially perceived as an easy problem, but turned out to be much harder than anticipated. Although OCR has been in use for some time (the United States Postal Service, for example, has been using OCR machines to sort mail since 1965), it will be many decades, if ever, before computers are able to read all documents with the same accuracy as human beings.

Optical Braille recognition is a simple subset of the OCR area in general. After initial image processing, the main challenge lies in recognising individual Braille character cells and matching them to the ones stored internally. The creation of internal representations of each letter or character is straightforward, as each one is simply a binary 3x2 matrix. Pre-processing can be done by applying a number of textbook algorithms, but it is the image understanding part that poses the biggest challenge. In this project, Nick Efford's com.pearsoneduc.ip graphics libraries from Digital Image Processing: a Practical Introduction Using Java (Pearson Education Limited, 2000) are used in the implementation of the software.

1.1 Motivation

The potential of OCR systems is enormous because they enable users to harness the power of computers to access printed documents. Such documents do not have to contain ordinary text; they can also include Braille. In the case of this project, anyone who works with blind people but does not read Braille can benefit from the software. This includes teachers, lecturers, organisations communicating with blind individuals, and computerised Braille libraries.

1.2 Report Structure

The structure is typical of a final year project report, and roughly follows the stages of developing the actual software. Chapter 2 discusses the research in this area, along with the relevant literature. Chapter 3 follows the actual implementation (coding) of the software in detail, and discusses which solutions and algorithms were used, and why. Chapter 4 summarises the test results. Chapter 5 evaluates the outcome of the project and the produced solution, and outlines possible future improvements. A conclusion is offered in Chapter 6, along with a reflection on the entire project experience in Appendix A. The Braille alphabet is reproduced in Appendix B. Finally, the full test results are shown in Appendix C.

1.3 Minimum Requirements

These are set out in the Mid-Project Report, and are as follows:

- A discussion of the problem to be solved, outlining the likely problems and detailing the solutions
- The creation of internal (to the computer) Braille character representations that can then be converted into ASCII text
- The creation of a system that can transliterate (under perfect conditions) scanned-in pages of Braille into ASCII text

1.4 Possible Enhancements

The possible enhancements concern the software itself, and include the addition of extra features to the program:

- The ability to handle double-sided Braille
- The ability to translate different types of Braille grammars
- Detection (and correction) of small amounts of skew present in the images
- Correction of defects (such as shadows on the page)
- Handling of large Braille pages that do not fit on an A4 flatbed scanner

1.5 Project Schedule

Table 1 sets out the projected schedule.

  Task                                            Start date   Expected completion
  Preparation                                     Oct 2003     Feb 2004
  Project preference form                         Oct 2003     Oct 2003
  Minimum requirements                            Oct 2003     Oct 2003
  Background reading                              Oct 2003     Feb 2004
  System implementation, testing and evaluation   Feb 2004     Apr 2004
  Design & coding                                 Feb 2004     Mar 2004
  Testing                                         Mar 2004     Apr 2004
  Evaluation                                      Mar 2004     Apr 2004
  Write-up                                        Dec 2003     Apr 2004
  Mid-project report                              Dec 2003     Dec 2003
  Draft chapter and table of contents             Feb 2004     Mar 2004
  Final report                                    Dec 2003     Apr 2004

  Table 1: Projected schedule of completion

1.6 Summary

To achieve the stated goals, a number of processing techniques chosen after background research will be applied to the acquired image. The Braille dots will be segmented out of the scan, processed, and translated into ASCII text. Any possible enhancements will then be implemented. Finally, the software will be tested and evaluated.

Chapter 2: Background Research

2.1 Introduction

The problem to be solved is a computer vision task, and involves areas of artificial intelligence. This chapter details some of the relevant literature in this area. The image processing algorithms and techniques are discussed, along with how and why they may be useful in this project. Virtually all of the algorithms discussed are implemented in the com.pearsoneduc.ip library. This has the obvious advantage of code reuse, allowing most of the effort to be concentrated on the choice of techniques to be used and leaving the specific implementation details aside.

Braille recognition is a simpler computer vision problem than some other pattern recognition tasks, such as handwriting or fingerprint identification. Typically, image processing is split into two parts: low-level image processing and high-level image understanding. Low-level methods usually use very little knowledge about the image contents and typically include noise filtering, feature extraction, and image sharpening. High-level processing is based on knowledge and goals (Sonka et al. 1999, p. 3). OCR is a subset of the general pattern recognition problem, and consists of roughly three parts: pre-processing, feature extraction, and discrimination. The principal problem in OCR can be narrowed down to understanding the concept of a character's shape and the mechanism that identifies any instantiation of this concept (Mori et al. 1999, p. 3).

2.2 Image Processing Techniques

2.2.1 Noise Removal and Suppression

The techniques discussed here are called neighbourhood operations, i.e., operations in which the new value calculated for a pixel depends on its neighbourhood as well as its original value. Such filtering techniques can be used effectively to remove various types of noise in digital images (Umbaugh 1998, p. 159).

An excellent technique for removing certain types of noise (such as impulse noise) is the median filter (Efford 2000, p. 175). This is a rank filtering technique that smoothes out noise in the input image. However, it generates a black border around the image when no special border handling algorithm is used, something that will have to be dealt with later. Mean filtering reduces the amplitude of noise within an image. It also has the effect of giving the image a softer appearance, effectively blurring it; it is essentially a low-pass filter, similar in effect to Gaussian smoothing.
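To make the neighbourhood idea concrete, the following is a minimal sketch of a 3x3 median filter for an 8-bit greyscale image held as a two-dimensional int array. It stands in for the library routine actually used in the project (the com.pearsoneduc.ip classes are not reproduced here), and it simply copies border pixels unchanged instead of generating the black border mentioned above.

    import java.util.Arrays;

    /** Minimal 3x3 median filter sketch for an 8-bit greyscale image stored as int[rows][cols]. */
    public final class MedianFilter3x3 {

        public static int[][] apply(int[][] in) {
            int h = in.length, w = in[0].length;
            int[][] out = new int[h][w];
            int[] window = new int[9];
            for (int y = 1; y < h - 1; y++) {
                for (int x = 1; x < w - 1; x++) {
                    int k = 0;
                    for (int dy = -1; dy <= 1; dy++)
                        for (int dx = -1; dx <= 1; dx++)
                            window[k++] = in[y + dy][x + dx];
                    Arrays.sort(window);
                    out[y][x] = window[4];   // the median of the nine neighbourhood values
                }
            }
            // Border pixels are copied unchanged here; the library used in the project
            // instead produces a black border, which has to be handled later.
            for (int x = 0; x < w; x++) { out[0][x] = in[0][x]; out[h - 1][x] = in[h - 1][x]; }
            for (int y = 0; y < h; y++) { out[y][0] = in[y][0]; out[y][w - 1] = in[y][w - 1]; }
            return out;
        }
    }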

Other noise filtering techniques include hybrid filters, such as the α-trimmed mean filter (which sorts the values from a neighbourhood into ascending order, discards a certain number of values from either end of the list and outputs the mean of the remaining values); adaptive filters (which change their behaviour in response to variations in local image properties); and minimum and maximum filters (where the bottom- or top-ranked grey level from the neighbourhood is selected as the output value).

2.2.2 Contrast Enhancement

This section deals with modifications of the grey level values of an image. The assumption is that the image is in black and white, with a number of different grey levels present. Depending on how the image is stored, the greater the number of bits used to represent the grey level value, the greater the number of grey levels present in the image. The relationship may be defined as

  n = 2^b

where n is the number of grey levels and b is the number of bits used.

Grey level mapping is one of the simplest, yet most useful, image processing techniques. It falls under the category of point processes because each new pixel's grey level value is calculated independently of its neighbours. The simplest mappings use a general expression for brightness and contrast modification:

  g(x, y) = a * f(x, y) + b

where f(x, y) is the input grey level at (x, y), g(x, y) is the output, b is a constant bias added to pixel values to change the brightness (if b < 0 the overall brightness is decreased), and a is a constant gain used to increase (a > 1) or decrease (a < 1) the contrast. The mapping itself is linear, although any one-to-one function could be used.

Histogram modification is a classic technique used in image processing. It uses the histogram of an image, which shows the distribution of grey levels within it. Generally speaking, an image whose histogram has a wide spread has high contrast, while one with a narrow spread has low contrast. Similarly, a histogram clustered at the low end of the range indicates a dark image, and vice versa. A number of processing methods are therefore available: histogram shrinking (which compresses the histogram); histogram sliding (which slides the histogram in one direction, thus brightening or darkening the image); and histogram stretching. A variation of the last technique is worth focusing on: also known as histogram equalisation, it defines a non-linear mapping of grey levels that results in optimal improvement in contrast (Efford 2000, p. 124). It redistributes the grey levels, allocating more of them where there are the most pixels and fewer where there are fewer pixels. This has the effect of flattening the frequency distribution, tends to increase the contrast in the most heavily populated regions of the histogram, and often reveals previously hidden detail (Efford 2000, p. 124). These are known as full-frame histogram equalisation techniques; their main drawback is that the global image properties may not be appropriate in a local context (Sonka et al. 1999, p. 100). It is possible to acquire a histogram for a local fixed-size neighbourhood of a pixel, equalise it, and then use the result to compute a new grey level value for that pixel (Pizer et al. 1987).
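The gain-and-bias mapping above is simple enough to sketch directly; the fragment below assumes 8-bit grey levels stored in an int array and clamps the results to the range 0-255.

    /** Linear brightness/contrast mapping g(x,y) = a*f(x,y) + b for an 8-bit greyscale image. */
    public final class GreyLevelMapping {

        public static int[][] map(int[][] f, double a, double b) {
            int h = f.length, w = f[0].length;
            int[][] g = new int[h][w];
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    int v = (int) Math.round(a * f[y][x] + b);   // gain a, bias b
                    g[y][x] = Math.max(0, Math.min(255, v));     // clamp to the 8-bit range
                }
            }
            return g;
        }
    }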

2.2.3 Segmentation

Image segmentation methods look for objects that either have some measure of homogeneity within themselves or some measure of contrast with the objects on their border. Their goal is to find regions that represent objects or meaningful parts of objects. Complete segmentation (which is rare) provides a set of disjoint regions, each corresponding to a different real-world object; partial segmentation provides regions which do not directly correspond to real-world objects (Umbaugh 1998, p. 80; Bulpitt 2003, p. 5).

Segmentation techniques fall into three categories: edge-based techniques (e.g., border tracing, Hough transforms), region-based techniques (region growing and splitting/shrinking), and global techniques. Since the images being dealt with in this project are relatively simple, the first two categories are likely to be unnecessary. The focus will fall on the last category, or more specifically on a technique known as thresholding.

Thresholding transforms a dataset containing values that vary over some range into a new dataset containing just two values. Input that falls below a specified threshold value is mapped to one of the output values; input above the threshold is mapped to the other output value. Sonka et al. (1999, p. 124) state that if objects do not touch each other, and if their grey levels are clearly distinct from background grey levels, thresholding is a suitable segmentation method. Both properties are satisfied in the case of scanned-in Braille pages. However, correct threshold selection is absolutely vital for successful threshold segmentation (Sonka et al. 1999, p. 124; Efford 2000, p. 253).

In general, there are two approaches to automatic threshold selection: statistical and model-based (Mori et al. 1999, p. 105). A simple method of threshold selection is to take the average of all the grey levels in an image; this simple choice often works well (Mori et al. 1999, p. 107). A method that may be worth considering is p-tile thresholding. It assumes some prior knowledge of a property of the image after segmentation, a good example being text printed on paper: if we know that the text covers 1/p of the sheet area, we may easily choose a threshold T, based on the image histogram, such that 1/p of the image area has grey level values less than T and the rest has grey level values larger than T (Sonka et al. 1999, p. 127). Finally, a good method for automatically selecting a threshold based on an approximation of the image histogram is optimal thresholding (Sonka et al. 1999, p. 128). It results in minimum-error segmentation (Chow and Kaneko 1972; Rosenfeld and Kak 1982); however, deciding whether a histogram is bi-modal (which is required for this method to work) may not be straightforward (Rosenfeld and de la Torre 1983).
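As an illustration of global thresholding, the sketch below selects a threshold as the mean grey level (the simple method mentioned above) and then maps every pixel to one of two output values. It is illustrative only; the threshold selection actually used in the implementation is described in Section 3.2.4.

    /** Global thresholding sketch: binarise an 8-bit greyscale image using the mean grey level as T. */
    public final class GlobalThreshold {

        public static int meanThreshold(int[][] f) {
            long sum = 0;
            long count = 0;
            for (int[] row : f) {
                for (int v : row) { sum += v; count++; }
            }
            return (int) (sum / count);                  // average grey level of the whole image
        }

        public static int[][] binarise(int[][] f, int t) {
            int h = f.length, w = f[0].length;
            int[][] out = new int[h][w];
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    out[y][x] = (f[y][x] < t) ? 0 : 255; // below T -> one value, at or above -> the other
            return out;
        }
    }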

2.2.4 Morphological Image Processing

Morphological techniques may be used for non-linear smoothing and feature enhancement (Efford 2000, p. 271). They operate on a binary image, and use a small template called a structuring element, which is positioned at all possible locations in the image and compared with the corresponding neighbourhood of pixels. Where the template intersects or fits within the neighbourhood (depending on the type of operation being performed), the resulting output pixel has a non-zero value. The most basic of these techniques are erosion and dilation. Erosion can remove unwanted, small-scale features from a binary image, while dilation has the opposite effect of enhancing features of interest. The two operations are complementary, that is, they have opposite effects.

2.3 Skew Detection

One of the possible enhancements to the minimum requirements is the ability to detect (and correct) small amounts of rotation present in images. Several methods exist for skew detection in the OCR domain. Most are based on the Hough transform (Hinds et al., 1990; Le et al., 1994; Yan, 1993), the Fourier transform (Hase and Hoshino, 1985; Postl, 1986), or projection (Ciardiello et al., 1988; Baird, 1987). Fourier transform-based techniques are unreliable and difficult (Nicel 2000, p. 15); Hough transform-based approaches are usually computationally expensive (Gatos et al. 1996, p. 1). Other approaches include connected component clustering and correlation-based algorithms. Nicel (2003) describes each method and summarises the relative advantages and disadvantages of each approach.

2.4 Summary

The various image processing techniques that exist and are relevant to this project have been outlined. In order to meet the minimum requirements, input images will have to be processed and correctly segmented. Chapter 3 describes in detail how the segmented image will then be used to calculate Braille dot coordinates. An intermediate representation of the Braille page will be constructed, which will finally be translated into ASCII text. As an extension to the minimum requirements, a technique for detecting the amount of rotation present in an image will be chosen and implemented.

At this stage (end of February), all the background research work has been completed. The project is on schedule.

Chapter 3: Implementation

3.1 Introduction

This chapter describes the implementation of the working system. The image processing algorithms and techniques used are discussed; their use is justified and linked to the background research. The following are examined in detail: problems that have arisen during this phase; their solutions; modifications to the original approach; and the reasons for not using certain methods. Section 3.4 examines extensions to the minimum requirements that have and have not been put into practice.

The implementation approximately follows the original framework. Internal representations of each Braille character are constructed. Image pre-processing in the form of a median filter is carried out to suppress or remove any impulse noise. Features of interest are segmented out and enhanced using thresholding and morphological operations (erosion and dilation). High-level image understanding is carried out with the use of projection, which also helps with detecting any rotation present in the original image. An intermediate representation of the Braille page is constructed. A search algorithm is then used to match the characters in the intermediate form to the ones stored internally.

3.2 Initial Character Representation and Image Processing

3.2.1 Internal Braille Character Representation

Each Braille character consists of a 3x2 matrix. The dots are numbered 1-6 in the manner shown in Appendix B. The character representation is a binary array, with 0 representing the absence of a dot and 1 representing its presence. This design is very simple, but effective. Each character is defined at the start of the program, and characters may be added or modified easily in the future. Arrays allow for efficient comparison, something that will be useful at later stages when a constructed Braille character is compared to every possible match in the alphabet.
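A minimal sketch of this representation is given below, with the six dots stored in a boolean array ordered 1-6 (left column top to bottom, then right column) and a dictionary mapping ASCII characters to their dot patterns. Only the first three letters are shown; the real program defines the full alphabet, digits and punctuation in the same way, and the class and method names here are illustrative rather than those of the actual code.

    import java.util.Arrays;
    import java.util.LinkedHashMap;
    import java.util.Map;

    /** Sketch of the internal Braille character dictionary: a binary 3x2 cell stored as dots 1-6. */
    public final class BrailleDictionary {

        // Dot order follows the standard numbering: 1, 2, 3 down the left column, 4, 5, 6 down the right.
        private static final Map<String, boolean[]> ALPHABET = new LinkedHashMap<>();
        static {
            ALPHABET.put("a", new boolean[] { true,  false, false, false, false, false }); // dot 1
            ALPHABET.put("b", new boolean[] { true,  true,  false, false, false, false }); // dots 1, 2
            ALPHABET.put("c", new boolean[] { true,  false, false, true,  false, false }); // dots 1, 4
            // ... the remaining letters, digits and punctuation are defined the same way
        }

        /** Returns the ASCII character for a cell, or "*" if no match is found. */
        public static String match(boolean[] cell) {
            for (Map.Entry<String, boolean[]> e : ALPHABET.entrySet()) {
                if (Arrays.equals(e.getValue(), cell)) return e.getKey();
            }
            return "*";
        }
    }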

3.2.2 File Format

Initially, the decision was taken to work with greyscale images only. As in traditional OCR problems (i.e., those involving printed text), there is little point in scanning full-colour pages of what is essentially a binary image. It was therefore assumed that the user will not use, for example, RGB images as input. Extra colour in such scans yields no useful extra information, for the obvious reason that Braille texts tend to come on paper of one colour only. Colour images are also typically three times larger in size.

The chosen file format was Portable Network Graphics (PNG). The majority of scanners allow the user to save images in this format. Using a format without compression was not considered, due to very large file sizes: a greyscale Windows bitmap scan of an A4 page at 600 dots per inch (dpi) is over 30 megabytes in size, and the same file in RGB colour is over 100 megabytes. The com.pearsoneduc.ip.io package provides an interface that allows a choice of different file formats, the most popular being JPEG and PNG. PNG was chosen over JPEG as it produces better quality images while file sizes remain reasonable, in the order of under one megabyte for an A4 scan at 150 dpi. Also, because of the way JPEG compression works, images degrade in quality each time they are modified. The figures below show the differences between the two file types.

Figure 3-1: JPEG. Note the continuous tones have been reproduced, but visible artefacts surround text. Also, there is ghosting outside vertical window edges.

Figure 3-2: PNG. Text is clear and sharp. Continuous tones are reproduced.

Following the initial scans, a possible project enhancement was rejected: working with double-sided Braille. Even a high resolution (600 dpi) scan in a lossless, full-colour format revealed very little difference between Braille dots that were raised above the page and those that were recessed below it. Figure 3-3 shows the image, along with a close-up of a relevant area (highlighted with a dashed border in Figure 3-4).

Figure 3-3: Original full-page, full-colour, double-sided scan at 600 dpi (over 100 megabytes).

Figure 3-4: Close-up of the dashed area from Figure 3-3.

As can clearly be seen, the difference between the raised and the recessed dots is negligible, especially on the right side of Figure 3-4.

3.2.3 Image Pre-Processing

A pre-processing stage is standard in OCR systems. In this case, a median filter with a 3x3 neighbourhood is applied first. It reduces impulse noise and has the effect of smoothing the grey levels in the scan, which is useful as it makes the page itself more uniform in appearance. This will assist in distinguishing the features of interest (i.e., the Braille dots) from the background. However, because of the way the median filter works, a black border is generated around the image: something that will have to be dealt with at a later stage. Initial tests showed that the dots are clearly distinct from the page. For this reason, contrast-enhancing techniques are not used.

3.2.4 Segmentation: Thresholding

Edge- and region-based segmentation techniques were found to be unnecessary. The former find borders between regions, while the latter construct regions directly. The images dealt with in this project are simple enough to use a global segmentation technique. The global knowledge about these images may be represented as a histogram of image features (Sonka et al. 1999, p. 123); here, grey levels are used as they distinguish the features of interest: Braille dots are darker (and hence have lower grey level values) than the rest of the page.

As previously stated, the correct choice of threshold level is absolutely vital to the success of this technique. A preliminary training set of test images showed little variation in shadows across the page. Scanners provide a uniform light source as they scan an image, so illumination variance is minimised. This simplifies the problem of choosing a correct threshold value compared to images from other sources (such as a webcam). It also suggests that regional thresholding, where an image is split into regions and each region is then thresholded independently, is not necessary.

Initially, the threshold value was chosen manually. Figure 3-5 shows a typical scanned image, together with a correct threshold resulting in successful segmentation (Figure 3-6). Note that some impulse noise remains at the bottom left (highlighted). Figure 3-7 shows an incorrect threshold value for comparison. Here, the value chosen was too high and some residual shadow (due to the page not being completely flat against the scanner face) remains.

Figure 3-5: A typical scanned image.

Figure 3-6: The result of a correct threshold.

Figure 3-7: The result of the threshold value being too high.

An automatic threshold selection method was then implemented. Optimal thresholding was initially considered a good candidate, as it results in the smallest number of pixels being mis-segmented (Gonzalez and Wintz, 1987; Rosenfeld and Kak, 1982). The threshold is set as the minimum grey level between the maxima of two or more normal distributions. Figure 3-8 (from Sonka et al., 1999) shows (a) the probability distributions of background and objects, and (b) the corresponding histograms and optimal threshold.

Figure 3-8: Grey level histograms approximated by two normal distributions.

The main difficulty with this technique, as stated in Chapter 2, is deciding whether or not the histogram is bi-modal. Figure 3-9 shows a histogram that is typical of the images being worked with.

Figure 3-9: Grey level histogram of the image in Figure 3-5.

At first glance it seems that the histogram is, indeed, bi-modal. Figure 3-10 shows a close-up of the relevant area.

Figure 3-10: The bi-modal histogram of Figure 3-5 (close-up).

Although this seems to confirm the choice of optimal thresholding as the preferred method for selecting the threshold level, actually using the calculated value (252 in this case) yields no useful result (Figure 3-11).

Figure 3-11: Threshold using the value calculated with optimal thresholding.

Figure 3-12 shows the histogram of another typical image.

Figure 3-12: Histogram of another typical image.

Although this histogram is broadly similar to Figure 3-10, in that it shows there are fewer dark pixels (those representing the Braille dots) than bright ones (those representing the rest of the page), it does not exhibit two clear maxima with a distinct minimum in between. The reason for this is that the grey level distributions of the foreground and the background are not well represented by normal distribution curves. Optimal thresholding was therefore found not to be a suitable method for selecting the global threshold value.

P-tile thresholding was the next approach considered for automatic threshold selection. The training set of initial scans was used to determine the proportion of the page covered by Braille. After some experimentation, it was found that just under 4% of the page belonged to the Braille characters, which shared the characteristic of being darker than the surrounding area. The threshold value T may now be chosen automatically using a cumulative frequency histogram of the grey level values, such that 4% of the image has grey level values less than T and the rest has grey level values larger than T. This method gave excellent results, and in the majority of cases resulted in correct automatic segmentation of the images.
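A sketch of this selection procedure is shown below. It assumes an 8-bit greyscale image and a proportion p of roughly 0.04, and walks the cumulative histogram until that fraction of the pixels has been covered; the names used are illustrative.

    /** P-tile threshold selection sketch: choose T so that a fixed fraction of pixels lies below it. */
    public final class PTileThreshold {

        public static int select(int[][] f, double p) {      // p is roughly 0.04 for the Braille scans
            int[] hist = new int[256];
            long total = 0;
            for (int[] row : f) {
                for (int v : row) { hist[v]++; total++; }
            }
            long target = Math.round(p * total);
            long cumulative = 0;
            for (int t = 0; t < 256; t++) {                   // walk the cumulative histogram
                cumulative += hist[t];
                if (cumulative >= target) return t;           // first grey level covering p of the image
            }
            return 255;
        }
    }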

Two problems surfaced at this stage, however. Firstly, some noise still remained; it was similar in nature to the impulse noise highlighted in Figure 3-6. Applying another 3x3 median filter virtually eliminated this problem (this operation makes the reasonable assumption that the Braille dots are larger than impulse noise). Figure 3-13 shows the result: note the absence of noise (circled) and the black border generated by the second median filter.

Figure 3-13: The result of a second median filter (after thresholding).

The second problem can be seen in Figure 3-14.

Figure 3-14: The top half of a segmented image.

A black, vertical line is present on the left side of this image, distinct from the border generated by the median filter. It is caused by a shadow cast on the page as it is being scanned, which results from a typical Braille page being larger than A4 size (pages tend to be too wide, rather than too long). Some of the page is therefore invariably not flat against the glass surface of the scanner as the image is acquired (Figure 3-15).

Figure 3-15: A typical Braille page too large to fit on a scanner.

In Figure 3-14, the left side of the page was sticking out. This lifted the page slightly, casting a shadow which was approximately as dark as the Braille dots. Although manually lowering the threshold value found previously with p-tile thresholding alleviated the situation somewhat, most of the shadow remained.

Reducing the threshold value further resulted in incorrect segmentation, with some of the dots in the output image simply disappearing or becoming very small. The shadow would have to be dealt with at a later stage, as a possible extension of the project. Of course, manually cropping the input image to exclude the affected area eliminated the problem entirely.

3.2.5 Morphological Operations: Dilation and Erosion

Binary images that result from segmentation may contain imperfections; morphological processing techniques can remove these imperfections (Efford 2000, p. 271). These techniques are used here to enhance the size of the Braille dots after thresholding. Any dots that have been flattened on the page tend to stand out less, which results in their reduced size in the output image. A decision was also made to invert the binary image after this operation, so that the dots are white and the background black. This simply makes the image conceptually more logical, as the presence of a dot is now indicated by an 'on' pixel (with a grey level value of 255 in an 8-bit image), and its absence by an 'off' pixel (with a grey level value of 0). Figure 3-16 shows an example output before dilation, with some problem areas highlighted.

Figure 3-16: A thresholded image before dilation.

Figure 3-17 shows the same image after dilation. Clearly, the features of interest have been enhanced. The structuring element used for this operation was a 5x5 disk, because dilation by a disk enlarges the dots and smoothes any convex corners. Still, this presented another problem: the dots became very bold, and most of them were now touching. It is obviously desirable to keep them separated, so another operation had to be performed.
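The two morphological operations involved can be sketched as follows, assuming a binary image holding 0 and 255 and a structuring element given as a list of (dy, dx) offsets from its centre; pixels outside the image are treated as background. A 5x5 disk or a cross shape can both be expressed as such offset lists, e.g. the cross is {(0,0), (-1,0), (1,0), (0,-1), (0,1)}.

    /** Binary dilation and erosion sketch; the image holds 0 (off) and 255 (on), and the
        structuring element is a list of (dy, dx) offsets from its centre. */
    public final class Morphology {

        public static int[][] dilate(int[][] in, int[][] se) {
            return transform(in, se, false);
        }

        public static int[][] erode(int[][] in, int[][] se) {
            return transform(in, se, true);
        }

        private static int[][] transform(int[][] in, int[][] se, boolean requireAll) {
            int h = in.length, w = in[0].length;
            int[][] out = new int[h][w];
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    boolean any = false, all = true;
                    for (int[] off : se) {
                        int yy = y + off[0], xx = x + off[1];
                        boolean on = yy >= 0 && yy < h && xx >= 0 && xx < w && in[yy][xx] == 255;
                        any |= on;
                        all &= on;
                    }
                    // Dilation: output is on if the element intersects the foreground.
                    // Erosion:  output is on only if the element fits entirely within it.
                    out[y][x] = (requireAll ? all : any) ? 255 : 0;
                }
            }
            return out;
        }
    }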

After experimenting with a further erosion operation and different structuring elements (using the disk again would be pointless, as it would effectively reverse the dilation), empirical evidence suggested an erosion with a cross-shaped structuring element. This showed good results: it separated the dots slightly while maintaining the enhancing effect of the original dilation, although some dots became more square in appearance.

Figure 3-17: The image after dilation.

3.2.6 Conclusion

The low-level processing techniques chosen have worked well. P-tile thresholding was found to be the best method of automatically selecting the threshold level and correctly segmenting the image. Median filtering worked well and removed most of the noise present. Special consideration will have to be given to shadows produced on the page due to Braille pages being wider than the scanner's surface. The morphological operations also worked well in enhancing the relevant features. The resulting output so far is a binary image with well-defined white dots on a black background. A possible project enhancement was rejected: double-sided Braille. This was due to the negligible difference between Braille dots that are raised and those that are recessed on a page.

The project is running on schedule, with unit testing of the methods implemented so far completed (middle of March). However, the next part (higher-level image understanding and character recognition) is where the main difficulties are likely to lie.

3.3 Image Understanding: Character Recognition

3.3.1 Using Projection to Calculate Cell Coordinates

Braille pages are very regular, that is, they have an ordered, consistent appearance. Each character's 3x2 matrix is exactly the same size.

This makes the recognition task easier than, for example, recognising printed characters on a page, where different letters have slightly different dimensions, and certainly easier than recognising handwriting. This regularity can be exploited to make the task of matching Braille characters easier.

Projection is a method of mapping a 2-dimensional shape onto one dimension. Typically, projections are generated parallel to the abscissa (horizontal projection) and the ordinate (vertical projection), and are performed on binary images. Mathematically, the horizontal projection p_h and the vertical projection p_v are defined as

  p_h(i) = Σ_j f(i, j)    and    p_v(j) = Σ_i f(i, j)

where (i, j) are image coordinates. A horizontal projection is calculated by scanning the image in a left-to-right, top-to-bottom order. For each y-coordinate of the image, a counter accumulates the total number of corresponding hits (in this case, white pixels) along the x-axis. Nicel (2000, p. 21) outlines the simple algorithm used:

  for every y coordinate do
      for every x coordinate do
          if pixel (x, y) is white then
              increment counterarray[y];
  output counterarray

A vertical projection is obtained in a similar manner, swapping the x and y coordinates in the first two lines of the algorithm. Two arrays are used to store the projections. Because of the borders around the image generated by the median filters used previously, a small modification of the above algorithm was made: scanning begins at (x+3) and (y+3), and terminates at (width-3) and (height-3) respectively, where width and height are the dimensions of the image.

Projection allows the Braille cell sizes and the positions of the Braille dots to be calculated automatically, assuming that the image is not skewed. A graph of each projection is plotted. The graph has a peaky appearance, with peaks in the vertical projection corresponding to the locations of Braille columns, and peaks in the horizontal projection corresponding to the locations of Braille lines. Figure 3-18 demonstrates the concept. Note that the image itself is scaled and rotated 90° anti-clockwise for illustration purposes, with its horizontal projection shown above it. A clear correlation between the peaks and the locations of the Braille lines can be seen. The height of a peak shows how many hits the projection made; for example, line 8 of the page (y-coordinates of around 350) has fewer characters on it and hence its peaks are smaller. The peaks are grouped in threes (each group representing a line of Braille), since each Braille character is 3 dots high. The vertical projection graph of the same image is shown in Figure 3-19. For clarity, the x-axis is scaled to the range [0:500]. As expected, the peaks are grouped in twos, representing the 2-dot width of each Braille character.
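The projection calculation itself amounts to little more than counting white pixels row by row and column by column. The sketch below assumes a binary image with white stored as 255 and skips the three-pixel margin described above.

    /** Horizontal and vertical projection sketch for a binary image (white = 255).
        A 3-pixel margin is skipped to avoid the borders left by the median filters. */
    public final class Projection {

        public static int[] horizontal(int[][] img) {
            int h = img.length, w = img[0].length;
            int[] counts = new int[h];
            for (int y = 3; y < h - 3; y++)
                for (int x = 3; x < w - 3; x++)
                    if (img[y][x] == 255) counts[y]++;   // hits along each row
            return counts;
        }

        public static int[] vertical(int[][] img) {
            int h = img.length, w = img[0].length;
            int[] counts = new int[w];
            for (int x = 3; x < w - 3; x++)
                for (int y = 3; y < h - 3; y++)
                    if (img[y][x] == 255) counts[x]++;   // hits along each column
            return counts;
        }
    }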

Figure 3-18: A binary image (rotated 90° anti-clockwise) and its horizontal projection.

Figure 3-19: Vertical projection of the image in Figure 3-18.

Each peak on the horizontal projection graph corresponds to the most likely y-coordinate of a line of Braille. Similarly, each peak on the vertical projection graph corresponds to the most likely x-coordinate of a Braille column. Using this information, each graph is scanned for maxima, and the corresponding x- and y-values are noted. Detecting the maxima was not trivial, because some peaks exhibited a flat appearance (Figure 3-20), partly due to the previous erosion with a cross-shaped element. There was also the possibility of local maxima that did not correspond to a true peak. The search algorithm therefore had to compare more than one value either side of the candidate currently being investigated; also, once a peak was found, the algorithm skipped forward a small number of values to ensure that a particularly flat peak did not get registered as two separate maxima.

Figure 3-20: The flat maxima of a horizontal projection.
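An illustrative version of this peak search is sketched below: a position counts as a peak only if it is at least as high as its neighbours within a small window and above a minimum height, and the scan then jumps forward so that a flat-topped peak is not recorded twice. The window, height and skip parameters are placeholders, not the values used in the project.

    import java.util.ArrayList;
    import java.util.List;

    /** Sketch of peak detection in a projection profile, tolerating flat-topped maxima. */
    public final class PeakFinder {

        public static List<Integer> findPeaks(int[] profile, int window, int minHeight, int skip) {
            List<Integer> peaks = new ArrayList<>();
            for (int i = window; i < profile.length - window; i++) {
                if (profile[i] < minHeight) continue;
                boolean isPeak = true;
                for (int d = 1; d <= window; d++) {          // compare several values either side
                    if (profile[i] < profile[i - d] || profile[i] < profile[i + d]) {
                        isPeak = false;
                        break;
                    }
                }
                if (isPeak) {
                    peaks.add(i);
                    i += skip;                               // jump past the rest of a flat peak
                }
            }
            return peaks;
        }
    }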

Fortunately, no local maxima were present in any of the projection graphs examined. The method also exhibits some robustness with respect to the original dilation causing some dots to connect with their neighbours, as separate peaks were still registered for each dot.

Braille letters cannot be printed arbitrarily at any location on a page. They are locked to a grid, and only a certain number of possible locations exists. If one were to imagine a character in which every dot of the cell is embossed (such a character does not exist, although if dot 6 were added to the letter q, this character would be the result; see Appendix B: The Braille Alphabet), and that character were repeatedly printed at every possible grid location on the page, the two arrays would contain the x- and y-coordinates of every cell's dots.

The result of analysing the projection graphs is therefore two arrays holding the possible x- and y-coordinates of all the Braille dots on the page. Note that this does not mean dots are actually present at these locations. Projection analysis merely provides hypothetical locations of each dot; however, it does reduce the search space quite substantially from every single pixel in the image. It is up to later stages of image analysis to determine whether or not a dot is actually present at each of the possible coordinates, and which character it belongs to. Figure 3-21 demonstrates the concept: the first six hypothetical dot locations are shown where the dashed lines cross. Clearly, dots are not present at those locations.

Figure 3-21: Using projection analysis to determine possible locations of Braille dots.

3.3.2 Intermediate Braille Representation

The intermediate representation of the page is a two-dimensional character array. This form was chosen because it allows quicker searching and matching of Braille letters than searching through the image itself. An 'o' in the array indicates the presence of a dot, and a '-' indicates its absence. A simple algorithm searches for Braille dots at all possible dot locations (which were determined using projection).

The algorithm checks whether the pixel at the current (x, y) position is white. If it is, it writes an 'o' to the array; if it is not, it writes a '-' instead. At first, an alternative algorithm that was thought to be more robust was used, since a single white pixel at a wrong location would otherwise result in a false positive. The alternative checked the values of the pixels at positions (x, y), (x-1, y), (x+1, y), (x, y-1) and (x, y+1), and only if they were all white would a positive match be recorded. After experimentation, however, it was found that the original version worked much better. The modified algorithm missed too many dots because of tiny (less than 0.1°) variations in page orientation, while the original version encountered no spurious positives, as almost no noise was present in the segmented images at the locations being checked. The intermediate representation of the image in Figure 3-21 is shown in Table 2.

  Table 2: Intermediate array representation of Figure 3-21.

Note that the intermediate form does not try to distinguish actual 3x2 Braille cells; it merely writes a flat form to the 2-dimensional array, which is now ready for the next stage: character matching.

3.3.3 Character Matching

The matching algorithm operates on the intermediate representation of the page. In pseudocode:

  initialise an empty, temporary Braille cell B;
  for every 3rd y array index do
      for every 2nd x array index do
          matchCharacter(x, y);

The matchCharacter routine checks for the presence of an 'o' character at the following indices, corresponding to the 3x2 Braille cell: (x, y), (x+1, y), (x, y+1), (x+1, y+1), (x, y+2), (x+1, y+2), and modifies B accordingly.
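A sketch of this cell-assembly loop is given below, assuming the intermediate page is held as a char array of 'o' and '-' entries and reusing the illustrative dictionary sketched in Section 3.2.1; the dictionary comparison and the handling of empty or unmatched cells, described next, are included for completeness.

    /** Sketch of matching 3x2 cells from the intermediate 'o'/'-' page array against the dictionary. */
    public final class CharacterMatcher {

        public static String translate(char[][] page) {
            StringBuilder out = new StringBuilder();
            for (int y = 0; y + 2 < page.length; y += 3) {          // every 3rd row of dot positions
                for (int x = 0; x + 1 < page[y].length; x += 2) {   // every 2nd column of dot positions
                    boolean[] cell = {                              // dot order 1..6, left column first
                        page[y][x] == 'o', page[y + 1][x] == 'o', page[y + 2][x] == 'o',
                        page[y][x + 1] == 'o', page[y + 1][x + 1] == 'o', page[y + 2][x + 1] == 'o'
                    };
                    boolean empty = true;
                    for (boolean dot : cell) empty &= !dot;
                    // An empty cell is treated as a space; otherwise the cell is looked up in the
                    // dictionary, which returns "*" when no match is found.
                    out.append(empty ? " " : BrailleDictionary.match(cell));
                }
                out.append('\n');
            }
            return out.toString();
        }
    }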

Finally, it compares B to every character stored in its dictionary (defined at the start of the program) and, if a match is found, writes the result to an output stream. If B is still empty, indicating the absence of a character, a space is written. To add some robustness, if a match is not made (or an unknown character is encountered), a '*' is printed.

This algorithm is somewhat naive; it takes advantage of the fact that no Braille letter has a completely empty left column in its cell (dots 1, 2 and 3), nor a completely empty top row (dots 1 and 4). It makes the reasonably accurate assumption that no single line of text will be composed entirely of punctuation marks, and that there are no peculiar features on the page such as line breaks (which, although rare, do exist). It also assumes that the user does not cut off half a column of Braille by placing the page on the scanner incorrectly. If any of these conditions does occur, however, the algorithm gets derailed and fails badly from that point onwards. A solution to this problem, along with skew detection and correction, is implemented at a later stage as an extension to the minimum requirements.

3.3.4 Conclusion

The approach taken was found to work very well with perfect images. It gave accurate translations of images that contained very little (less than 0.1°) rotation or skew, and that contained no shadows caused by an edge of the page not being completely flat against the scanner face as the image was acquired. This meant, however, that the images usually had to be doctored slightly, i.e., cropped to remove particularly dark edges, or artificially rotated to remove any skew. The software does not need to be told (as commercial software does) what Braille cell size is used in the document; it can calculate this automatically. One peculiarity of the Braille grammar used is that the question mark character is exactly the same as the opening quote character (dots 2, 3 and 6), meaning the software cannot distinguish between the two.

At this stage (end of March), the minimum requirements have been met. A working system has been implemented that, under perfect conditions, translates scanned-in Braille pages into ASCII text. One of the methods used (specifically, projection) also provides a way of solving the image skew detection problem (discussed in the next section). A decision was therefore made to implement a feature that can detect and correct small rotations in scanned images.

3.4 Extensions to the Minimum Requirements

3.4.1 Alternative Braille Grammars: Braille Grade 2

Initially, alternative Braille grammars were considered as a possible extension to the project. After some research it transpired that handling other grammars is no longer really a computer vision problem, where it would pose a slightly different challenge to the original solution, but rather a natural language processing task involving statistical analysis of the input and probably some sort of probability matching.

The reason for this is that in Braille Grade 2, for example (which is essentially shorthand Braille), one Braille character can have many different meanings. Also, a sequence of characters can mean more than one thing; for instance, 'cd' can mean 'could' and 'rcv' can mean 'receive'. The meaning depends on the context, so this is more of a statistical language processing problem. Alternative Braille grammars will therefore not be considered further as a possible extension.

3.4.2 Skew Detection and Correction

Ciardiello et al. (1988) use the horizontal projection to determine the rotation angle for which the mean square deviation of the profile is maximised. Because projection is already being used in this project, this approach is adopted here. To evaluate the projection method, a correctly segmented image with no skew was rotated by 1° clockwise (Figure 3-22, with horizontal lines drawn for comparison), and its horizontal projection calculated (Figure 3-23).

Figure 3-22: A segmented image with a clockwise rotation of 1 degree.

The image was then rotated anti-clockwise by 1°, so that no skew was present; its projection profile is shown in Figure 3-24. It is clear that where there is no rotation present, the graph has sharp, well-defined peaks and troughs. Although the maxima are approximately the same, the minima are much lower. This is because the Braille dots line up more accurately on the page when there is no skew, resulting in fewer hits in between the dots of each cell. A decision was taken to work with horizontal projections only.

Vertical projections exhibit similar properties depending on the amount of skew present in an image, but due to the spacing between the Braille cells, horizontal projections have clearer gaps between each maximum.

Figure 3-23: Horizontal projection profile of the image in Figure 3-22 (1° rotation).

Figure 3-24: Horizontal projection profile of the image in Figure 3-22 (0° rotation).

The example is somewhat artificial, in that the rotation present in real-world scans is likely to be lower (around 0.2°). The projection graphs still demonstrate the same properties (Figure 3-25; note that the abscissa has been restricted to show the relevant features), which can be exploited.

Figure 3-25: Difference in projection profiles for a small change in rotation (0° vs 0.2°).

A simple way of deciding which projection profile represents the least rotation would be to compare the averages of the minima and maxima; however, Nicel (2003) found this to be unreliable. Hence, following Ciardiello et al.'s work, the variance of the profile is used instead. This is a much more robust statistical measure, and was found to work well.

The skew detection method employed uses a brute-force approach which, although computationally expensive, works well. The input image is rotated from -n to +n steps of a fixed angular increment, with the aim of maximising the variance of the horizontal projection at each step. A bilinear (first-order) interpolation scheme is used for the rotation. It computes the output pixel grey level as a distance-weighted function of the grey levels of the four pixels surrounding the calculated point in the input image (Efford 2000, p. 240). As a result, it produces smoother and visually more pleasing results than zero-order interpolation. The number of steps and the increment are parameters that are hard-coded into the program, but may be varied. The initial values chosen were 5 steps and an increment of 0.2°, meaning the image is rotated from -1° through to +1° in 0.2° increments. These values were found experimentally and gave good results. The increment may be reduced to calculate the skew more precisely, and n may be increased to allow a greater range of skew detection than ±1°. This will, of course, be at the cost of increased computation time.
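The brute-force search can be sketched as follows. For brevity the sketch rotates a binary image with nearest-neighbour sampling and reuses the projection sketch from Section 3.3.1, whereas the actual implementation rotates the original greyscale image with bilinear interpolation before any pre-processing, for the reasons given below; the parameter names are illustrative.

    /** Sketch of skew estimation by maximising the variance of the horizontal projection. */
    public final class SkewDetector {

        /** Searches angles from -n*step to +n*step degrees; the returned angle is the rotation
            that best aligns the Braille lines, i.e. applying rotate(binary, result) deskews it. */
        public static double estimateCorrection(int[][] binary, int n, double stepDegrees) {
            double bestAngle = 0.0, bestVariance = -1.0;
            for (int i = -n; i <= n; i++) {
                double angle = i * stepDegrees;
                double v = variance(Projection.horizontal(rotate(binary, angle)));
                if (v > bestVariance) { bestVariance = v; bestAngle = angle; }
            }
            return bestAngle;
        }

        private static double variance(int[] p) {
            double mean = 0;
            for (int v : p) mean += v;
            mean /= p.length;
            double var = 0;
            for (int v : p) var += (v - mean) * (v - mean);
            return var / p.length;
        }

        /** Nearest-neighbour rotation about the image centre (illustration only). */
        private static int[][] rotate(int[][] in, double degrees) {
            int h = in.length, w = in[0].length;
            double rad = Math.toRadians(degrees), cos = Math.cos(rad), sin = Math.sin(rad);
            double cy = h / 2.0, cx = w / 2.0;
            int[][] out = new int[h][w];
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    // Sample the source pixel that maps onto (x, y) after rotation.
                    int sx = (int) Math.round(cx + (x - cx) * cos - (y - cy) * sin);
                    int sy = (int) Math.round(cy + (x - cx) * sin + (y - cy) * cos);
                    if (sx >= 0 && sx < w && sy >= 0 && sy < h) out[y][x] = in[sy][sx];
                }
            }
            return out;
        }
    }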

The main drawback of this method is its lack of speed. The image is rotated before any pre-processing is done, meaning that filtering, thresholding, etc., are repeated many times. It would be more efficient to perform the rotation on the final segmented binary image. The reason for not doing so is shown in Figure 3-26: white areas are clearly visible around the edges of the image, caused by the borders generated by the median filters and the way the rotation is calculated afterwards. These areas throw the projection off somewhat and reduce the accuracy of translation.

Figure 3-26: Image rotated after pre-processing.

In contrast, applying the rotation before pre-processing removes the problem entirely, as can be seen in Figure 3-27.

Figure 3-27: Image rotated before pre-processing.


More information

CONTENTS. Chapter I Introduction Package Includes Appearance System Requirements... 1

CONTENTS. Chapter I Introduction Package Includes Appearance System Requirements... 1 User Manual CONTENTS Chapter I Introduction... 1 1.1 Package Includes... 1 1.2 Appearance... 1 1.3 System Requirements... 1 1.4 Main Functions and Features... 2 Chapter II System Installation... 3 2.1

More information

Image Processing Lecture 4

Image Processing Lecture 4 Image Enhancement Image enhancement aims to process an image so that the output image is more suitable than the original. It is used to solve some computer imaging problems, or to improve image quality.

More information

MAV-ID card processing using camera images

MAV-ID card processing using camera images EE 5359 MULTIMEDIA PROCESSING SPRING 2013 PROJECT PROPOSAL MAV-ID card processing using camera images Under guidance of DR K R RAO DEPARTMENT OF ELECTRICAL ENGINEERING UNIVERSITY OF TEXAS AT ARLINGTON

More information

Non Linear Image Enhancement

Non Linear Image Enhancement Non Linear Image Enhancement SAIYAM TAKKAR Jaypee University of information technology, 2013 SIMANDEEP SINGH Jaypee University of information technology, 2013 Abstract An image enhancement algorithm based

More information

Real Time Word to Picture Translation for Chinese Restaurant Menus

Real Time Word to Picture Translation for Chinese Restaurant Menus Real Time Word to Picture Translation for Chinese Restaurant Menus Michelle Jin, Ling Xiao Wang, Boyang Zhang Email: mzjin12, lx2wang, boyangz @stanford.edu EE268 Project Report, Spring 2014 Abstract--We

More information

Applying mathematics to digital image processing using a spreadsheet

Applying mathematics to digital image processing using a spreadsheet Jeff Waldock Applying mathematics to digital image processing using a spreadsheet Jeff Waldock Department of Engineering and Mathematics Sheffield Hallam University j.waldock@shu.ac.uk Introduction When

More information

Vehicle Number Plate Recognition with Bilinear Interpolation and Plotting Horizontal and Vertical Edge Processing Histogram with Sound Signals

Vehicle Number Plate Recognition with Bilinear Interpolation and Plotting Horizontal and Vertical Edge Processing Histogram with Sound Signals Vehicle Number Plate Recognition with Bilinear Interpolation and Plotting Horizontal and Vertical Edge Processing Histogram with Sound Signals Aarti 1, Dr. Neetu Sharma 2 1 DEPArtment Of Computer Science

More information

License Plate Localisation based on Morphological Operations

License Plate Localisation based on Morphological Operations License Plate Localisation based on Morphological Operations Xiaojun Zhai, Faycal Benssali and Soodamani Ramalingam School of Engineering & Technology University of Hertfordshire, UH Hatfield, UK Abstract

More information

LAB MANUAL SUBJECT: IMAGE PROCESSING BE (COMPUTER) SEM VII

LAB MANUAL SUBJECT: IMAGE PROCESSING BE (COMPUTER) SEM VII LAB MANUAL SUBJECT: IMAGE PROCESSING BE (COMPUTER) SEM VII IMAGE PROCESSING INDEX CLASS: B.E(COMPUTER) SR. NO SEMESTER:VII TITLE OF THE EXPERIMENT. 1 Point processing in spatial domain a. Negation of an

More information

CHAPTER 4 LOCATING THE CENTER OF THE OPTIC DISC AND MACULA

CHAPTER 4 LOCATING THE CENTER OF THE OPTIC DISC AND MACULA 90 CHAPTER 4 LOCATING THE CENTER OF THE OPTIC DISC AND MACULA The objective in this chapter is to locate the centre and boundary of OD and macula in retinal images. In Diabetic Retinopathy, location of

More information

Images and Graphics. 4. Images and Graphics - Copyright Denis Hamelin - Ryerson University

Images and Graphics. 4. Images and Graphics - Copyright Denis Hamelin - Ryerson University Images and Graphics Images and Graphics Graphics and images are non-textual information that can be displayed and printed. Graphics (vector graphics) are an assemblage of lines, curves or circles with

More information

Malaysian Car Number Plate Detection System Based on Template Matching and Colour Information

Malaysian Car Number Plate Detection System Based on Template Matching and Colour Information Malaysian Car Number Plate Detection System Based on Template Matching and Colour Information Mohd Firdaus Zakaria, Shahrel A. Suandi Intelligent Biometric Group, School of Electrical and Electronics Engineering,

More information

Digital Image Processing

Digital Image Processing Digital Image Processing Part 2: Image Enhancement Digital Image Processing Course Introduction in the Spatial Domain Lecture AASS Learning Systems Lab, Teknik Room T26 achim.lilienthal@tech.oru.se Course

More information

Digital Imaging and Image Editing

Digital Imaging and Image Editing Digital Imaging and Image Editing A digital image is a representation of a twodimensional image as a finite set of digital values, called picture elements or pixels. The digital image contains a fixed

More information

Digital Image Processing 3/e

Digital Image Processing 3/e Laboratory Projects for Digital Image Processing 3/e by Gonzalez and Woods 2008 Prentice Hall Upper Saddle River, NJ 07458 USA www.imageprocessingplace.com The following sample laboratory projects are

More information

DIGITAL IMAGE PROCESSING (COM-3371) Week 2 - January 14, 2002

DIGITAL IMAGE PROCESSING (COM-3371) Week 2 - January 14, 2002 DIGITAL IMAGE PROCESSING (COM-3371) Week 2 - January 14, 22 Topics: Human eye Visual phenomena Simple image model Image enhancement Point processes Histogram Lookup tables Contrast compression and stretching

More information

Detection and Verification of Missing Components in SMD using AOI Techniques

Detection and Verification of Missing Components in SMD using AOI Techniques , pp.13-22 http://dx.doi.org/10.14257/ijcg.2016.7.2.02 Detection and Verification of Missing Components in SMD using AOI Techniques Sharat Chandra Bhardwaj Graphic Era University, India bhardwaj.sharat@gmail.com

More information

Computer Graphics Fundamentals

Computer Graphics Fundamentals Computer Graphics Fundamentals Jacek Kęsik, PhD Simple converts Rotations Translations Flips Resizing Geometry Rotation n * 90 degrees other Geometry Rotation n * 90 degrees other Geometry Translations

More information

More image filtering , , Computational Photography Fall 2017, Lecture 4

More image filtering , , Computational Photography Fall 2017, Lecture 4 More image filtering http://graphics.cs.cmu.edu/courses/15-463 15-463, 15-663, 15-862 Computational Photography Fall 2017, Lecture 4 Course announcements Any questions about Homework 1? - How many of you

More information

Evaluation of image quality of the compression schemes JPEG & JPEG 2000 using a Modular Colour Image Difference Model.

Evaluation of image quality of the compression schemes JPEG & JPEG 2000 using a Modular Colour Image Difference Model. Evaluation of image quality of the compression schemes JPEG & JPEG 2000 using a Modular Colour Image Difference Model. Mary Orfanidou, Liz Allen and Dr Sophie Triantaphillidou, University of Westminster,

More information

CSC 320 H1S CSC320 Exam Study Guide (Last updated: April 2, 2015) Winter 2015

CSC 320 H1S CSC320 Exam Study Guide (Last updated: April 2, 2015) Winter 2015 Question 1. Suppose you have an image I that contains an image of a left eye (the image is detailed enough that it makes a difference that it s the left eye). Write pseudocode to find other left eyes in

More information

A Fast Segmentation Algorithm for Bi-Level Image Compression using JBIG2

A Fast Segmentation Algorithm for Bi-Level Image Compression using JBIG2 A Fast Segmentation Algorithm for Bi-Level Image Compression using JBIG2 Dave A. D. Tompkins and Faouzi Kossentini Signal Processing and Multimedia Group Department of Electrical and Computer Engineering

More information

CSC 170 Introduction to Computers and Their Applications. Lecture #3 Digital Graphics and Video Basics. Bitmap Basics

CSC 170 Introduction to Computers and Their Applications. Lecture #3 Digital Graphics and Video Basics. Bitmap Basics CSC 170 Introduction to Computers and Their Applications Lecture #3 Digital Graphics and Video Basics Bitmap Basics As digital devices gained the ability to display images, two types of computer graphics

More information

Image Enhancement in Spatial Domain

Image Enhancement in Spatial Domain Image Enhancement in Spatial Domain 2 Image enhancement is a process, rather a preprocessing step, through which an original image is made suitable for a specific application. The application scenarios

More information

Table of contents. Vision industrielle 2002/2003. Local and semi-local smoothing. Linear noise filtering: example. Convolution: introduction

Table of contents. Vision industrielle 2002/2003. Local and semi-local smoothing. Linear noise filtering: example. Convolution: introduction Table of contents Vision industrielle 2002/2003 Session - Image Processing Département Génie Productique INSA de Lyon Christian Wolf wolf@rfv.insa-lyon.fr Introduction Motivation, human vision, history,

More information

MEASUREMENT OF ROUGHNESS USING IMAGE PROCESSING. J. Ondra Department of Mechanical Technology Military Academy Brno, Brno, Czech Republic

MEASUREMENT OF ROUGHNESS USING IMAGE PROCESSING. J. Ondra Department of Mechanical Technology Military Academy Brno, Brno, Czech Republic MEASUREMENT OF ROUGHNESS USING IMAGE PROCESSING J. Ondra Department of Mechanical Technology Military Academy Brno, 612 00 Brno, Czech Republic Abstract: A surface roughness measurement technique, based

More information

Number Plate Recognition Using Segmentation

Number Plate Recognition Using Segmentation Number Plate Recognition Using Segmentation Rupali Kate M.Tech. Electronics(VLSI) BVCOE. Pune 411043, Maharashtra, India. Dr. Chitode. J. S BVCOE. Pune 411043 Abstract Automatic Number Plate Recognition

More information

Chapter 8. Representing Multimedia Digitally

Chapter 8. Representing Multimedia Digitally Chapter 8 Representing Multimedia Digitally Learning Objectives Explain how RGB color is represented in bytes Explain the difference between bits and binary numbers Change an RGB color by binary addition

More information

Prof. Vidya Manian Dept. of Electrical and Comptuer Engineering

Prof. Vidya Manian Dept. of Electrical and Comptuer Engineering Image Processing Intensity Transformations Chapter 3 Prof. Vidya Manian Dept. of Electrical and Comptuer Engineering INEL 5327 ECE, UPRM Intensity Transformations 1 Overview Background Basic intensity

More information

Virtual Restoration of old photographic prints. Prof. Filippo Stanco

Virtual Restoration of old photographic prints. Prof. Filippo Stanco Virtual Restoration of old photographic prints Prof. Filippo Stanco Many photographic prints of commercial / historical value are being converted into digital form. This allows: Easy ubiquitous fruition:

More information

A Study On Preprocessing A Mammogram Image Using Adaptive Median Filter

A Study On Preprocessing A Mammogram Image Using Adaptive Median Filter A Study On Preprocessing A Mammogram Image Using Adaptive Median Filter Dr.K.Meenakshi Sundaram 1, D.Sasikala 2, P.Aarthi Rani 3 Associate Professor, Department of Computer Science, Erode Arts and Science

More information

Color and More. Color basics

Color and More. Color basics Color and More In this lesson, you'll evaluate an image in terms of its overall tonal range (lightness, darkness, and contrast), its overall balance of color, and its overall appearance for areas that

More information

An Efficient Color Image Segmentation using Edge Detection and Thresholding Methods

An Efficient Color Image Segmentation using Edge Detection and Thresholding Methods 19 An Efficient Color Image Segmentation using Edge Detection and Thresholding Methods T.Arunachalam* Post Graduate Student, P.G. Dept. of Computer Science, Govt Arts College, Melur - 625 106 Email-Arunac682@gmail.com

More information

Laboratory 1: Uncertainty Analysis

Laboratory 1: Uncertainty Analysis University of Alabama Department of Physics and Astronomy PH101 / LeClair May 26, 2014 Laboratory 1: Uncertainty Analysis Hypothesis: A statistical analysis including both mean and standard deviation can

More information

Fundamentals of Multimedia

Fundamentals of Multimedia Fundamentals of Multimedia Lecture 2 Graphics & Image Data Representation Mahmoud El-Gayyar elgayyar@ci.suez.edu.eg Outline Black & white imags 1 bit images 8-bit gray-level images Image histogram Dithering

More information

Automatic Locating the Centromere on Human Chromosome Pictures

Automatic Locating the Centromere on Human Chromosome Pictures Automatic Locating the Centromere on Human Chromosome Pictures M. Moradi Electrical and Computer Engineering Department, Faculty of Engineering, University of Tehran, Tehran, Iran moradi@iranbme.net S.

More information

VEHICLE LICENSE PLATE DETECTION ALGORITHM BASED ON STATISTICAL CHARACTERISTICS IN HSI COLOR MODEL

VEHICLE LICENSE PLATE DETECTION ALGORITHM BASED ON STATISTICAL CHARACTERISTICS IN HSI COLOR MODEL VEHICLE LICENSE PLATE DETECTION ALGORITHM BASED ON STATISTICAL CHARACTERISTICS IN HSI COLOR MODEL Instructor : Dr. K. R. Rao Presented by: Prasanna Venkatesh Palani (1000660520) prasannaven.palani@mavs.uta.edu

More information

Image Optimization for Print and Web

Image Optimization for Print and Web There are two distinct types of computer graphics: vector images and raster images. Vector Images Vector images are graphics that are rendered through a series of mathematical equations. These graphics

More information

Techniques for Generating Sudoku Instances

Techniques for Generating Sudoku Instances Chapter Techniques for Generating Sudoku Instances Overview Sudoku puzzles become worldwide popular among many players in different intellectual levels. In this chapter, we are going to discuss different

More information

image Scanner, digital camera, media, brushes,

image Scanner, digital camera, media, brushes, 118 Also known as rasterr graphics Record a value for every pixel in the image Often created from an external source Scanner, digital camera, Painting P i programs allow direct creation of images with

More information

COMPARATIVE PERFORMANCE ANALYSIS OF HAND GESTURE RECOGNITION TECHNIQUES

COMPARATIVE PERFORMANCE ANALYSIS OF HAND GESTURE RECOGNITION TECHNIQUES International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 9, Issue 3, May - June 2018, pp. 177 185, Article ID: IJARET_09_03_023 Available online at http://www.iaeme.com/ijaret/issues.asp?jtype=ijaret&vtype=9&itype=3

More information

MATLAB 6.5 Image Processing Toolbox Tutorial

MATLAB 6.5 Image Processing Toolbox Tutorial MATLAB 6.5 Image Processing Toolbox Tutorial The purpose of this tutorial is to gain familiarity with MATLAB s Image Processing Toolbox. This tutorial does not contain all of the functions available in

More information

An Improved Method of Computing Scale-Orientation Signatures

An Improved Method of Computing Scale-Orientation Signatures An Improved Method of Computing Scale-Orientation Signatures Chris Rose * and Chris Taylor Division of Imaging Science and Biomedical Engineering, University of Manchester, M13 9PT, UK Abstract: Scale-Orientation

More information

Preprocessing of Digitalized Engineering Drawings

Preprocessing of Digitalized Engineering Drawings Modern Applied Science; Vol. 9, No. 13; 2015 ISSN 1913-1844 E-ISSN 1913-1852 Published by Canadian Center of Science and Education Preprocessing of Digitalized Engineering Drawings Matúš Gramblička 1 &

More information

An Improved Bernsen Algorithm Approaches For License Plate Recognition

An Improved Bernsen Algorithm Approaches For License Plate Recognition IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) ISSN: 78-834, ISBN: 78-8735. Volume 3, Issue 4 (Sep-Oct. 01), PP 01-05 An Improved Bernsen Algorithm Approaches For License Plate Recognition

More information

Lane Detection in Automotive

Lane Detection in Automotive Lane Detection in Automotive Contents Introduction... 2 Image Processing... 2 Reading an image... 3 RGB to Gray... 3 Mean and Gaussian filtering... 6 Defining our Region of Interest... 10 BirdsEyeView

More information

Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester

Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester www.vidyarthiplus.com Anna University, Chennai B.E./B.TECH DEGREE EXAMINATION, MAY/JUNE 2013 Seventh Semester Electronics and Communication Engineering EC 2029 / EC 708 DIGITAL IMAGE PROCESSING (Regulation

More information

Capturing and Editing Digital Images *

Capturing and Editing Digital Images * Digital Media The material in this handout is excerpted from Digital Media Curriculum Primer a work written by Dr. Yue-Ling Wong (ylwong@wfu.edu), Department of Computer Science and Department of Art,

More information

Introduction to Image Analysis with

Introduction to Image Analysis with Introduction to Image Analysis with PLEASE ENSURE FIJI IS INSTALLED CORRECTLY! WHAT DO WE HOPE TO ACHIEVE? Specifically, the workshop will cover the following topics: 1. Opening images with Bioformats

More information

Automated Detection of Early Lung Cancer and Tuberculosis Based on X- Ray Image Analysis

Automated Detection of Early Lung Cancer and Tuberculosis Based on X- Ray Image Analysis Proceedings of the 6th WSEAS International Conference on Signal, Speech and Image Processing, Lisbon, Portugal, September 22-24, 2006 110 Automated Detection of Early Lung Cancer and Tuberculosis Based

More information

Colour Profiling Using Multiple Colour Spaces

Colour Profiling Using Multiple Colour Spaces Colour Profiling Using Multiple Colour Spaces Nicola Duffy and Gerard Lacey Computer Vision and Robotics Group, Trinity College, Dublin.Ireland duffynn@cs.tcd.ie Abstract This paper presents an original

More information

Unit 4.4 Representing Images

Unit 4.4 Representing Images Unit 4.4 Representing Images Candidates should be able to: a) Explain the representation of an image as a series of pixels represented in binary b) Explain the need for metadata to be included in the file

More information

5/17/2009. Digitizing Color. Place Value in a Binary Number. Place Value in a Decimal Number. Place Value in a Binary Number

5/17/2009. Digitizing Color. Place Value in a Binary Number. Place Value in a Decimal Number. Place Value in a Binary Number Chapter 11: Light, Sound, Magic: Representing Multimedia Digitally Digitizing Color Fluency with Information Technology Third Edition by Lawrence Snyder RGB Colors: Binary Representation Giving the intensities

More information

Image Capture TOTALLAB

Image Capture TOTALLAB 1 Introduction In order for image analysis to be performed on a gel or Western blot, it must first be converted into digital data. Good image capture is critical to guarantee optimal performance of automated

More information

TDI2131 Digital Image Processing

TDI2131 Digital Image Processing TDI2131 Digital Image Processing Image Enhancement in Spatial Domain Lecture 3 John See Faculty of Information Technology Multimedia University Some portions of content adapted from Zhu Liu, AT&T Labs.

More information

Sampling Rate = Resolution Quantization Level = Color Depth = Bit Depth = Number of Colors

Sampling Rate = Resolution Quantization Level = Color Depth = Bit Depth = Number of Colors ITEC2110 FALL 2011 TEST 2 REVIEW Chapters 2-3: Images I. Concepts Graphics A. Bitmaps and Vector Representations Logical vs. Physical Pixels - Images are modeled internally as an array of pixel values

More information

1.Discuss the frequency domain techniques of image enhancement in detail.

1.Discuss the frequency domain techniques of image enhancement in detail. 1.Discuss the frequency domain techniques of image enhancement in detail. Enhancement In Frequency Domain: The frequency domain methods of image enhancement are based on convolution theorem. This is represented

More information

TECHNICAL DOCUMENTATION

TECHNICAL DOCUMENTATION TECHNICAL DOCUMENTATION NEED HELP? Call us on +44 (0) 121 231 3215 TABLE OF CONTENTS Document Control and Authority...3 Introduction...4 Camera Image Creation Pipeline...5 Photo Metadata...6 Sensor Identification

More information

IMAGE ENHANCEMENT IN SPATIAL DOMAIN

IMAGE ENHANCEMENT IN SPATIAL DOMAIN A First Course in Machine Vision IMAGE ENHANCEMENT IN SPATIAL DOMAIN By: Ehsan Khoramshahi Definitions The principal objective of enhancement is to process an image so that the result is more suitable

More information

The Camera Club. David Champion January 2011

The Camera Club. David Champion January 2011 The Camera Club B&W Negative Proccesing After Scanning. David Champion January 2011 That s how to scan a negative, now I will explain how to process the image using Photoshop CS5. To achieve a good scan

More information

Image and Video Processing

Image and Video Processing Image and Video Processing () Image Representation Dr. Miles Hansard miles.hansard@qmul.ac.uk Segmentation 2 Today s agenda Digital image representation Sampling Quantization Sub-sampling Pixel interpolation

More information

Digital Image Processing. Lecture # 3 Image Enhancement

Digital Image Processing. Lecture # 3 Image Enhancement Digital Image Processing Lecture # 3 Image Enhancement 1 Image Enhancement Image Enhancement 3 Image Enhancement 4 Image Enhancement Process an image so that the result is more suitable than the original

More information

Image compression with multipixels

Image compression with multipixels UE22 FEBRUARY 2016 1 Image compression with multipixels Alberto Isaac Barquín Murguía Abstract Digital images, depending on their quality, can take huge amounts of storage space and the number of imaging

More information

Digitizing Color. Place Value in a Decimal Number. Place Value in a Binary Number. Chapter 11: Light, Sound, Magic: Representing Multimedia Digitally

Digitizing Color. Place Value in a Decimal Number. Place Value in a Binary Number. Chapter 11: Light, Sound, Magic: Representing Multimedia Digitally Chapter 11: Light, Sound, Magic: Representing Multimedia Digitally Fluency with Information Technology Third Edition by Lawrence Snyder Digitizing Color RGB Colors: Binary Representation Giving the intensities

More information

Compression Method for Handwritten Document Images in Devnagri Script

Compression Method for Handwritten Document Images in Devnagri Script Compression Method for Handwritten Document Images in Devnagri Script Smita V. Khangar, Dr. Latesh G. Malik Department of Computer Science and Engineering, Nagpur University G.H. Raisoni College of Engineering,

More information

Transform. Processed original image. Processed transformed image. Inverse transform. Figure 2.1: Schema for transform processing

Transform. Processed original image. Processed transformed image. Inverse transform. Figure 2.1: Schema for transform processing Chapter 2 Point Processing 2.1 Introduction Any image processing operation transforms the grey values of the pixels. However, image processing operations may be divided into into three classes based on

More information

Automatic Electricity Meter Reading Based on Image Processing

Automatic Electricity Meter Reading Based on Image Processing Automatic Electricity Meter Reading Based on Image Processing Lamiaa A. Elrefaei *,+,1, Asrar Bajaber *,2, Sumayyah Natheir *,3, Nada AbuSanab *,4, Marwa Bazi *,5 * Computer Science Department Faculty

More information

Visible Light Communication-based Indoor Positioning with Mobile Devices

Visible Light Communication-based Indoor Positioning with Mobile Devices Visible Light Communication-based Indoor Positioning with Mobile Devices Author: Zsolczai Viktor Introduction With the spreading of high power LED lighting fixtures, there is a growing interest in communication

More information

Image Extraction using Image Mining Technique

Image Extraction using Image Mining Technique IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719 Vol. 3, Issue 9 (September. 2013), V2 PP 36-42 Image Extraction using Image Mining Technique Prof. Samir Kumar Bandyopadhyay,

More information

Automatic Counterfeit Protection System Code Classification

Automatic Counterfeit Protection System Code Classification Automatic Counterfeit Protection System Code Classification Joost van Beusekom a,b, Marco Schreyer a, Thomas M. Breuel b a German Research Center for Artificial Intelligence (DFKI) GmbH D-67663 Kaiserslautern,

More information