AN INTEGRATED SYSTEM FOR HANDWRITTEN DOCUMENT IMAGE PROCESSING


International Journal of Pattern Recognition and Artificial Intelligence, Vol. 17, No. 4 (2003)
© World Scientific Publishing Company

E. KAVALLIERATOU, N. DROMAZOU, N. FAKOTAKIS and G. KOKKINAKIS
Wire Communications Laboratory, University of Patras, Patras, Greece
ergina@wcl.ee.upatras.gr

In this paper we address common problems of handwritten documents, such as nonparallel text lines in a page, hill and dale writing, and slanted and connected characters. To this end, an integrated system for document image preprocessing is presented. The system consists of the following modules: skew angle estimation and correction, line and word segmentation, and slope and slant correction. The skew angle correction, slope correction and slant removal algorithms are based on a novel method that combines the projection profile technique with the Wigner-Ville distribution. Furthermore, the skew angle correction algorithm can cope with pages whose text line skew angles vary, handling them by areas. Our system can be used as a preprocessing stage for any handwritten character recognition or segmentation system, as well as for any writer identification system. It was tested on a wide variety of handwritten document images of unconstrained English and Modern Greek text from about 100 writers. Additionally, combinations of the above algorithms have been used in the framework of the ACCeSS system (European project LE, aiming at the automatic processing of application forms of insurance companies), as well as in the processing of the GRUHD and IAM-DB databases for automating the procedure of extracting data.

Keywords: Image feature; document image; skew angle estimation; slant correction.

1. Introduction

A major problem of pattern recognition is Optical Character Recognition (OCR).
OCR systems attempt to facilitate the everyday use of computers by transforming large amounts of documents, either printed or handwritten, into electronic form for further processing. Since the early 1960s, numerous systems have been proposed. Today, the recognition of printed and handprinted isolated characters is performed with accuracy that exceeds 90% [22]. However, the recognition of unconstrained handwriting still remains an open research problem. Typical problems that complicate the automatic recognition of unconstrained handwriting are nonparallel text lines in a page, hill and dale writing, and slanted and connected characters. These problems are illustrated in Fig. 1.

[Fig. 1. Typical problems of handwritten documents: nonparallel text lines, hill and dale writing, slanted and connected characters.]

Currently, there are two main trends in the recognition of unconstrained handwriting. Some researchers perform character segmentation before the actual character recognition [6], while others avoid the segmentation stage [18]. Nevertheless, whatever the application and the methodology, the preprocessing stage, which faces the problems mentioned in the previous paragraph, is very important for both character segmentation and character recognition. An insufficient solution of these preprocessing tasks may cause a considerable loss of accuracy in any further processing.

Several methods have been proposed to solve the problem of skew detection [10,11,15,16,20,21,27,29]. Analytical reviews of these methods can be found in Refs. 2, 12 and 19. Although many of these approaches are satisfactorily accurate and fast, they are designed to handle documents in general. Consequently, they are not always able to deal with the particularities of a handwritten document, such as the correction of nonparallel text lines within the same page. Nevertheless, Spitz [25] presents a technique able to detect multiple skew angles, applied mostly to printed compressed documents. The few methods proposed for handwritten or mixed (i.e. handwritten and printed) documents concern specific applications [30]. Chin [7] analyzes this problem and underlines that the methods that handle printed pages successfully do not handle handwritten pages as well or, if they do, the computational cost is greater.

On the other hand, the most commonly used method for slant estimation is the calculation of the average angle of near-vertical strokes [5,6,18]. This approach requires the detection of the edges of the characters, and its accuracy strongly depends on the specific characters that compose the word.
Shridar [24] presents two more methods for slant estimation and correction. In the first one, the vertical projection profile is used, while the second one makes use of the chain code method. However, the evaluation of slant correction approaches may be subjective, since the slant can vary even within a single word. Additionally, in the relevant literature a slant correction procedure is rarely evaluated separately, so comparative results cannot be given.

In Refs. 23 and 24, two different slope correction methods are proposed. The first makes use of the least-squared error line fit to the bottom profile of the line image, and the second attempts to find the baseline of the word in order to correct its skew angle.

In this paper, we present an integrated system, shown in Fig. 2, for handwritten document image analysis, suitable as a preprocessor for handwritten character recognition systems.

[Fig. 2. The proposed system: document image, skew correction, line and word segmentation, slope correction, slant removal, word image.]

In particular, the presented system copes with three problems:

(i) Skew estimation and correction of the document image. The skew is the angle formed by the horizontal axis of the document image and the text lines. A new method based on the Wigner-Ville distribution of the horizontal projection profile of the document image is proposed. Our approach is capable of automatically localizing the areas of a handwritten document image that have different skew angles and handling each of them separately. Moreover, it is fast and accurate.
(ii) Line and word segmentation. A variation of an existing approach described by Shridar and Kimura [24] is followed, since it combines ease of implementation with high accuracy.
(iii) Slope and slant correction.
The slope of a word is the angle formed by the horizontal axis of the word and the corresponding text line, while the slant of characters is the inclination of their vertical strokes with respect to the vertical axis of the word. A method similar to the one used for skew estimation is followed. This approach is based on a simple algorithm and achieves a very low response time. Moreover, it does not depend on specific characters.

In the next section, a brief presentation of the Wigner-Ville distribution, as it is used in the proposed system, is given. The modules of the system mentioned above are described in detail in Secs. 3-5. Finally, some experimental results and the conclusions drawn from this study are included in Secs. 6 and 7, respectively.

2. Wigner-Ville Distribution

The Wigner-Ville (WV) time and frequency distribution, W(t, ω), is a well-known representative of the energy distributions and a member of Cohen's class:

$$W(t, \omega) = \frac{T}{\pi} \sum_{k=-\infty}^{\infty} s^{*}(t - kT)\, e^{-2j\omega kT}\, s(t + kT)$$

where s(t) represents the analytical signal and 1/T the sampling frequency. For a discrete signal s(n) and T = 1, the Wigner-Ville Distribution (WVD) can be expressed as:

$$W(n, \theta) = \frac{1}{\pi} \sum_{k=-\infty}^{\infty} s^{*}(n - k)\, e^{-2j\theta k}\, s(n + k).$$

Moreover, the horizontal and vertical projection profiles of a black-and-white document image f(x, y) are:

$$P_h(x) = \sum_{y} f(x, y), \qquad P_v(y) = \sum_{x} f(x, y).$$

By introducing the above equations in the WVD of a discrete signal, the space-frequency distribution is expressed as:

$$W(y, \theta) = \frac{1}{\pi} \sum_{k=-\infty}^{\infty} P(y - k)\, e^{-2j\theta k}\, P(y + k).$$

The Wigner-Ville function is a particularly popular distribution due to the large number of desirable mathematical properties it satisfies [26]. Mecklenbrauker [8] proves the uniqueness of the WVD, in the sense that it is the single energy distribution that possesses all the stated desirable properties. This justifies the numerous applications of the WVD in pattern recognition [3,9], synthesis [28], seismic signals [4] and optics [14]. In the case of our application, the superiority of the WVD over the other distributions of Cohen's class has been shown in Ref. 12. In more detail, tests with several distributions of Cohen's class proved the WVD to be the best compromise between computational cost and accuracy.

3. Document Skew Angle Estimation and Correction

As already mentioned, skew estimation and correction are performed in the first stage of our system by using the projection profile technique and the WVD. Specifically, the maximum intensity of the WVD of the horizontal histogram of a document image is used as the criterion for estimating its skew angle. Moreover, once the dominant skew angle of the entire document page has been detected and corrected, the areas with different skew angles are localized by the deep and wide valleys of the curve of maximum intensity, and the skew estimation procedure is repeated for each area. The term dominant skew angle is used here for the skew angle that is met most widely in the document image, as will be shown later.

[Fig. 3. Example of document page.]

The horizontal histogram of a properly oriented document page presents higher peaks and more intense alternations between peaks and dips than any other histogram-by-angle of the same page [21,24]. The WVD of a histogram represents its time-frequency distribution, where in this case time increases along the height of the page. Consequently, the WVD presents maximum intensity for the histograms at 0° and 180°, corresponding to the properly oriented page and the reversed one respectively, which contain the strongest alternations of peaks and dips.
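The definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the infinite sum is truncated to a finite window (a pseudo-WVD), and the function names, the `half_window` and `n_freq` parameters and the choice of frequency grid are all assumptions made for illustration. The profile is treated as real-valued, so no complex conjugate is needed.

```python
import numpy as np

def projection_profiles(img):
    """Return (row sums, column sums) of a binary image (1 = ink, 0 = background)."""
    img = np.asarray(img, dtype=float)
    return img.sum(axis=1), img.sum(axis=0)

def wvd(profile, n_freq=64, half_window=16):
    """Windowed discrete WVD of a real profile P.

    W[n, m] = (1/pi) * sum_{k=-h..h} P(n-k) * P(n+k) * exp(-2j * theta_m * k)
    The truncation to |k| <= half_window is an assumption of this sketch.
    """
    p = np.asarray(profile, dtype=float)
    n = len(p)
    ks = np.arange(-half_window, half_window + 1)
    thetas = np.linspace(0.0, np.pi / 2, n_freq, endpoint=False)
    padded = np.pad(p, half_window)              # zeros outside the profile
    kernel = np.exp(-2j * np.outer(ks, thetas))  # shape (len(ks), n_freq)
    W = np.empty((n, n_freq))
    for i in range(n):
        c = i + half_window                      # index of P(i) in the padded array
        prod = padded[c - ks] * padded[c + ks]   # P(i-k) * P(i+k), symmetric in k
        W[i] = (prod @ kernel).real / np.pi      # imaginary parts cancel by symmetry
    return W

def max_intensity_curve(profile, **kw):
    """Curve of maximum intensity: max over frequency at each position."""
    return wvd(profile, **kw).max(axis=1)
```

For a page image `img`, `max_intensity_curve(projection_profiles(img)[0])` would give one curve per rotation angle, which is the quantity compared across angles in the text.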

[Fig. 4. The horizontal histograms and the corresponding curves of maximum intensity for several skew angles, concerning the document image of Fig. 3.]

The closer the skew angle is to 0° and 180°, the larger the values of the maximum intensity. Hence our algorithm is able to detect skew angles between −89° and +89° with respect to the notional horizontal axis of a properly oriented document. Beyond this range, the resultant page would be the reversed one. In Fig. 4, the horizontal histograms and the corresponding curves of maximum intensity for several skew angles, concerning the document image of Fig. 3, are shown.

It has to be noted that calculating the histogram and applying the Wigner-Ville function for every skew angle is not required, since it would imply unnecessary computational complexity and cost. Instead, an initial step (big step) is used for a first estimation, and then a degree-by-degree rotation is applied within a smaller range. The big step should be such that the whole range between −89° and +89°, and only that, is covered. If the covered range is greater than ±89°, it is possible to end up with the page reversed. On the other hand, if the covered range is smaller and the skew angle lies outside it, the system will fail to correct the page. In order to cover the whole range between −89° and +89°, we calculated the total number of necessary rotations for various steps, including the rotations by the big step as well as the remaining rotations in the smaller range. The results are illustrated in Fig. 5. We note that for a big step of 12°, the number of required rotations is minimized.

[Fig. 5. The relation between big step and rotations (x-axis: big step; y-axis: amount of rotations). The required rotations are minimized for a big step equal to 12°.]

In order to make use of it, the WVD and its maximum intensity are calculated for each histogram. Thus, a curve of maximum intensity corresponds to each angle. The curve that presents the most maxima throughout the space domain is selected. Moreover, in order to focus on the peaks of the curves, we check only the curve values above a threshold. A tenth of the maximum peak proved to be a good threshold in our experiments over 2000 documents (see Sec. 6).

[Fig. 6. The document image of Fig. 1 after the skew angle correction.]
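The two ideas of this passage, counting the maxima above a tenth of the highest peak and searching first with a big step and then degree by degree, might be sketched as follows. The helper names are hypothetical, and `score` stands in for whatever per-angle criterion is used (for instance, the peak count of the curve of maximum intensity of the page rotated by that angle); the exact width of the fine-search range is an assumption of this sketch.

```python
import numpy as np

def count_peaks(curve, frac=0.1):
    """Count local maxima of a curve that exceed frac * (highest value)."""
    c = np.asarray(curve, dtype=float)
    if c.size < 3:
        return 0
    thr = frac * c.max()
    mid = c[1:-1]
    is_peak = (mid > c[:-2]) & (mid >= c[2:]) & (mid > thr)
    return int(is_peak.sum())

def coarse_to_fine(score, big_step=12, limit=89):
    """Return the integer angle in [-limit, limit] that maximizes score(angle).

    First pass: angles spaced by big_step. Second pass: every degree
    within one big step of the best coarse angle (assumed range).
    """
    coarse = range(-limit, limit + 1, big_step)
    best = max(coarse, key=score)
    lo = max(-limit, best - big_step + 1)
    hi = min(limit, best + big_step - 1)
    return max(range(lo, hi + 1), key=score)
```

With a synthetic criterion such as `score = lambda a: -abs(a - 23)`, the coarse pass settles on 19° and the fine pass then returns 23°.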

In handwritten documents, more than one skew angle is very often met. Most of the proposed approaches handle this case by estimating and correcting the skew according to the dominant angle, without taking the skew angle variations into consideration. Our approach deals with this problem by processing the page by areas. Initially, the dominant skew angle is calculated, as already mentioned, and the page is corrected accordingly. This correction is essential in order to proceed to the localization of the areas with different skew angles on the page. The segmentation into areas takes into account that a change in the angle of successive text lines causes wider valleys than usual in the histogram, which translate into dips in the curve of maximum intensity of its Wigner-Ville distribution. In order to localize these dips and use them as boundaries of areas of different skew angle, we follow this procedure:

(1) The WVDs and the corresponding curves of maximum intensity are calculated for an angle range of ±6° around the dominant skew angle. A step of 0.2° is used. A smaller step proved unnecessary, since such small differences cannot be detected in handwritten pages.
(2) The average of the curves of maximum intensity is calculated.
(3) A threshold depending on the average value is determined (3/4 of the average value of the average curve is a good threshold). [a]
(4) The dips of the average curve of maximum intensity below this threshold define the boundaries of the areas.
(5) The dominant curve of maximum intensity in each area provides the corresponding skew angle.

The range of 6° either side of the document angle was used here in order to cover the area omitted by the initial step. Moreover, it has been assumed that the variation in skew angles in handwritten text does not usually exceed that range. However, any greater angle can be used, if necessary.
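The procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name and data layout are hypothetical, and "dominant curve in each area" is interpreted here as the curve with the highest peak inside the area, which is an assumption.

```python
import numpy as np

def split_into_areas(curves, angles, frac=0.75):
    """Split a page into areas of different skew angle.

    curves: array of shape (n_angles, n_rows), one curve of maximum
            intensity per candidate angle.
    angles: the candidate angles, in the same order as the rows of curves.
    Returns a list of (start_row, end_row_exclusive, angle).
    """
    curves = np.asarray(curves, dtype=float)
    avg = curves.mean(axis=0)        # average curve over all angles
    thr = frac * avg.mean()          # 3/4 of the average value of that curve
    above = avg >= thr               # rows below thr are dips, i.e. boundaries

    areas, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            areas.append((start, i))
            start = None
    if start is not None:
        areas.append((start, len(above)))

    result = []
    for s, e in areas:
        # Assumed criterion: the angle whose curve peaks highest in this area.
        best = int(np.argmax(curves[:, s:e].max(axis=1)))
        result.append((s, e, angles[best]))
    return result
```

In practice `angles` would hold the 61 values from −6° to +6° in steps of 0.2° around the dominant angle; every run of rows below the threshold acts as a boundary, which is why a page with irregular line spacing can end up with more areas than strictly needed.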
If the text features multiple changes in the line spacing, the above threshold may divide the page into more areas than appropriate. Nevertheless, this is not an important shortcoming, since each zone will be corrected separately. In Fig. 6, the document image of Fig. 1 after the skew angle correction is shown.

[a] Detailed experiments over unconstrained handwritten text from 1000 documents of the GRUHD database derived this threshold. Greater values failed in some cases to localize all the areas with variation in skew angle, while smaller values tend to split the text into many areas without variations in skew angle.

4. Line and Word Segmentation

4.1. Line segmentation

The horizontal projection profile is most commonly employed for line segmentation. If the page is skew corrected and the lines are well separated, the histogram will present well-separated peaks and valleys. However, in handwritten documents this method can be problematic because of hill and dale writing, as well as the ascenders and descenders of characters, which very often begin in one line and end in another. Thus, the valleys of the histogram are not very clear, and a separation algorithm based strictly on this technique would cause the loss of important information (parts of lines) and the introduction of noise (portions of ascenders and descenders).

Although problematic, this method can give a good estimate of the number of lines included in a document page, and this is the information that we extract from the horizontal projection profile of the page. The valleys with minima less than a certain threshold (60 pixels was found to be a good threshold, Fig. 7) are considered to be likely starting points of line boundaries. Subsequently, the area is examined pixel by pixel until an entire white path is outlined. The area of the valley is examined starting from the left edge of the page. While white pixels are met, we continue towards the right edge of the page. When black pixels are met, we move up and down looking for alternative paths. The search zone is limited between the peaks of the histogram of the previous and next possible text line areas. In cases where two lines are merged because of long ascenders or descenders, the local histogram is calculated and the path continues through the minimum of the histogram. When we reach the right edge, a crooked line has been outlined, consisting of as many white pixels as possible. In Fig. 8, the document image of Fig. 1 after the line segmentation is shown.

[Fig. 7. The success rate of localization of line segments according to the threshold used, after experiments performed using the GRUHD database [13] (x-axis: threshold in pixels; y-axis: success rate in %).]
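As a rough illustration of the first step only (locating candidate valleys in the horizontal projection profile), a minimal Python sketch is given below. The function name is hypothetical, the white-path tracing described above is not included, and treating each run of rows below the threshold as one valley is an assumption.

```python
import numpy as np

def candidate_line_gaps(row_profile, threshold=60):
    """Return, for each valley of the row profile that falls below the
    threshold, the row index of its minimum.

    row_profile[i] is the number of ink pixels in row i. The page margins
    above the first line and below the last line also show up as valleys.
    """
    p = np.asarray(row_profile)
    below = p < threshold
    gaps = []
    i, n = 0, len(p)
    while i < n:
        if below[i]:
            j = i
            while j < n and below[j]:    # extend to the end of this valley
                j += 1
            gaps.append(i + int(np.argmin(p[i:j])))
            i = j
        else:
            i += 1
    return gaps
```

Each returned row would then serve as the starting height for tracing a white path from the left to the right edge of the page.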

[Fig. 8. The document image of Fig. 1 after the line segmentation.]

4.2. Word segmentation

The vertical projection profile of each text line is employed in order to perform word segmentation. In contrast to lines, words in handwriting are almost always well separated. However, the valleys of the histogram cannot be taken as certain word boundaries, since characters that do not touch each other can also cause dips. These two cases can be distinguished by examining the width of the valley. In normal handwriting, a gap between words is not normally shorter than the mean character width, while a gap within a word is unlikely to be longer than that. By mean character width, we mean the width of characters such as a, b, c, d, etc., excluding the characters i, l, j, m, w, which are either too narrow (i, l) or too wide (m, w). Thus, the width of a character is a good criterion for the estimation of the word boundaries.

Although the character width differs between characters and writers, a rough estimate of the mean width can be made by accepting that, excluding ascenders and descenders, the characters of mean width (as defined above) are as wide as they are high. Then, by locating the height of the main body of the words, we obtain an approximate but satisfactory value for the required threshold. In order to achieve that, we calculate the horizontal projection profile of a part of a text line. Although any part with more than several characters is appropriate, a tenth of the text line was used in our case. The upper and lower parts of the line, where the value of the histogram falls under 1/3 of its peak value (a threshold extracted experimentally), are excluded. The remaining height is the required threshold. The valleys of the vertical projection profile of a text line whose width is greater than this threshold are considered to be the boundaries between words.
If it is known that the whole document was written by one person, it is enough to carry out this estimation just once per page; otherwise, it is safer to repeat it several times.
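The two steps of this subsection, estimating the main body height and keeping only the gaps wider than it, can be sketched as below. The function names are hypothetical; a gap is taken to be a run of columns with no ink at all, which is an assumption, and the caller may pass either the whole line image or only a slice of it (such as the tenth of the line mentioned above) to `main_body_height`.

```python
import numpy as np

def main_body_height(line_img, frac=1/3):
    """Height of the rows whose ink count reaches frac of the peak row count.

    line_img is a binary image (1 = ink). Rows above and below the main body
    (ascenders, descenders) fall under the threshold and are excluded.
    """
    rows = np.asarray(line_img).sum(axis=1)
    keep = np.nonzero(rows >= frac * rows.max())[0]
    return int(keep[-1] - keep[0] + 1)

def word_gaps(line_img, min_gap):
    """Return (start, end_exclusive) column ranges of empty runs wider than min_gap."""
    cols = np.asarray(line_img).sum(axis=0)
    ink = np.nonzero(cols > 0)[0]
    gaps = []
    for a, b in zip(ink[:-1], ink[1:]):
        width = b - a - 1                # empty columns between two ink columns
        if width > min_gap:
            gaps.append((int(a) + 1, int(b)))
    return gaps
```

A typical call would be `word_gaps(line_img, main_body_height(line_img))`, so that only gaps wider than the estimated mean character width are treated as word boundaries.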

Table 1. Statistical values measured over 1000 documents of the GRUHD [13] database scanned at 300 dpi.

Measure                   Value
Mean character width      21.7 pixels
Mean character height     23.1 pixels
Mean character spacing     8.7 pixels
Mean word spacing         46.3 pixels

[Fig. 9. The document image of Fig. 1 after the word segmentation.]

In Fig. 9, the document image of Fig. 1 after the word segmentation is shown.

5. Slope and Slant Correction

5.1. Slope correction

The skew angle estimation and correction improves the whole document image and facilitates the further processing stages, especially if it handles the page by zones according to the different skew angles. However, the problem introduced by hill and dale writing still remains: although the text lines have been corrected by the dominant angle, skewed words may still exist. As already mentioned, our method deals with this problem by using a method similar to the document skew estimation (Sec. 3), based on the horizontal projection profile of the word and the WVD. The idea here is that the envelope of the histogram becomes smoother as the word approaches the vertical position (±90°) and presents the most peaks at the horizontal position (0° and 180°), so the corresponding WVD presents maximum intensity in the latter case. Thus, our algorithm consists of the following steps:

(1) The word image is rotated in the range of ±89° using a one-degree step.
(2) For each angle, the horizontal projection profile as well as the WVD are calculated.
(3) The curve of maximum intensity is extracted from each WVD.
(4) The acquired curves are compared and the one with the maximum peak is selected. The corresponding rotation angle is the estimated skew angle of the word.

It has to be noted that this algorithm is not able to handle very short words (i.e. fewer than four characters) that contain ascenders and/or descenders. In such cases, the application of the presented algorithm may cause undesirable slopes. For this reason, we apply the threshold used in the word segmentation module in order to filter out words that fall into that category. Thus, the slope correction algorithm is not applied to words that meet the following criteria:

word length < 3 × main body height AND word height > main body height.

For example, words such as The and by are likely to be filtered out. However, short words are not characterized by large slope angles. In the majority of cases, therefore, the skew correction of the entire text line will probably correct the slope of short words. In Fig. 10, the document image of Fig. 1 after the slope correction is shown.

[Fig. 10. The document image of Fig. 1 after the slope correction. The words, though independent word images by now, have been placed according to their original position in order to facilitate comparison.]
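The selection rule of step (4) and the short-word filter can be written compactly. The sketch below uses hypothetical function names and assumes the curves of maximum intensity have already been computed for each rotation angle; lengths and heights are in pixels.

```python
def estimate_slope(curves_by_angle):
    """Pick the rotation angle whose curve of maximum intensity has the
    greatest peak. curves_by_angle maps angle (degrees) -> sequence of values."""
    return max(curves_by_angle, key=lambda a: max(curves_by_angle[a]))

def skip_slope_correction(word_length, word_height, body_height):
    """True for short words with ascenders or descenders, which the text
    excludes from slope correction:
    word length < 3 * main body height AND word height > main body height."""
    return word_length < 3 * body_height and word_height > body_height
```

For instance, a word image 50 pixels long and 40 pixels high with a main body of 20 pixels satisfies both conditions and would be left uncorrected, while a 120-pixel-long word of the same height would be processed.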

5.2. Slant removal

Character segmentation, a prerequisite stage in some systems, is a very difficult task in handwriting recognition, and slanted characters complicate it even more. Even in handwriting recognition systems that do not perform character segmentation, slanted characters cause various problems. A handwriting recognition system that does not deal with slanted characters during the preprocessing stage requires much more training data in order to cover every case of inclined characters.

In our system, we cope with this problem by automatically detecting the slant of each word and removing it. The same method as before is used here, though with the vertical projection profile of the word image. A nonslanted word presents clear gaps between its characters, even if the characters are connected. These gaps introduce deep valleys in the vertical histogram of the word, while the characters draw more or less sharp peaks. In a slanted word, by contrast, the inclined characters cover the gaps between the characters, which results in shallower dips and smoother peaks in the vertical projection profile. The WVD is used here as well, to detect the projection profile with the most intense alternations. The procedure is the same as before, and the slant that corresponds to the curve of maximum intensity with the greatest peak is selected.

In order to slant a word image to the right or left, we follow the procedure below. The word image is segmented into equally wide horizontal zones. The lowest zone is considered to be the base. The zone above the base is shifted one pixel to the right or to the left. The next zone (if it exists) is shifted two pixels in the same direction, and so on. The more zones there are, the greater the slant angle. The maximum slant angle corresponds to one-pixel-high zones (i.e. when the number of zones equals the height h of the word image in pixels). In this case, the highest zone is shifted by h − 1 pixels, and the corresponding slant angle θ is the maximum one. It can be calculated as:

$$\tan\theta = \frac{h - 1}{h} \approx 1 \;\Rightarrow\; \theta \approx 45^{\circ}, \quad \text{since } h \gg 1.$$

In Fig. 11, the document image of Fig. 1 after the slant removal is shown.

[Fig. 11. The document image of Fig. 1 after the slant removal. The words, though independent word images by now, have been placed according to their original position in order to facilitate comparison.]

6. Experimental Results

The performance of the presented system has been tested on a collection of document images covering different scanning resolutions. In more detail, the test collection comprised roughly 100 unconstrained handwriting documents (mainly student notes), in both English and Modern Greek, [b] taken from corresponding writers. No special criteria regarding complexity or difficulty were applied during the selection of those documents. As a result, documents of any length and complexity were included.

Moreover, particular combinations of the presented algorithms have been used in several applications for automatic document image processing. Specifically, the skew angle correction, word segmentation, slope correction and slant removal algorithms were used in the system ACCeSS (European project LE), aiming at the automatic processing of application forms of insurance companies. The line segmentation, word segmentation, slope correction and slant removal algorithms were applied to the GRUHD [13] and IAM-DB [17] databases in order to automate the conversion of the data into a form appropriate for further processing.

The skew angle correction achieves a success rate of 99.4% with a confidence range of ±0.2°, dealing with documents of skew angle ranging from −89° to +89°.
The response time strongly depends on the document resolution as well as on the number of different skew angles that may be met in the document. However, since the accuracy of this algorithm is resolution-independent, as shown in Table 2, we can drastically reduce the time cost by converting each document image to 50 dpi. On the other hand, the proposed technique for estimating the areas with different skew angles in a single document page achieves a success rate of roughly 85%. In Table 2, some experimental data of skew angle correction are presented.

In general, the line and word segmentation algorithms cope with quite hard cases of high-density text, such as the example of Fig. 12, taken from the GRUHD database. However, parts of descenders and ascenders are occasionally attached to an adjacent line, especially when characters of successive text lines are connected.

[b] Both the English and Greek alphabets belong to the alphabets of Indo-European languages, and they do not require different processing as far as the preprocessing stages of OCR systems are concerned.

Table 2. Experimental results of skew angle correction. Columns: image number; resolution in dpi; rate of text in the page (%); border (yes/no); graphics (yes/no); skew angle in degrees; estimated skew angle. [Table body not recoverable.]

[Fig. 12. A difficult case of line segmentation as it is faced by our system.]

Moreover, the word segmentation algorithm may fail to distinguish consecutive words in the case of extremely slanted characters.

The performance of the slope correction and slant removal algorithms proved to be satisfactory as well. The former reaches a success rate of 97%, bearing in mind that the short words (as defined in Sec. 5.1) are filtered out, while the latter exceeds 98.5%. It is worth noting that even when the slant varies between characters of the same word, the slant removal algorithm generally improves the word image, producing natural-looking results and facilitating the task of either character segmentation or character recognition.

The computational cost depends heavily on the size of the page as well as on the length of the text on the page. In the experiments described above, the mean computational cost was around 25 s per page using a Pentium III at 1 GHz. However, we consider the relative computational cost, presented in Fig. 13, to be more interesting.

[Fig. 13. The relative computational cost of skew angle correction, line segmentation, word segmentation, slope correction and slant removal.]

7. Conclusions

In this paper we presented an integrated system for handwritten image processing, suitable as a preprocessing stage in any character segmentation or character recognition system. The proposed system consists of four main modules, namely skew angle estimation and correction, line and word segmentation, slope correction and slant removal, which implement both existing and novel algorithms. A well-known time-frequency distribution, the WVD, is used in combination with the projection profile technique in order to detect the skew angle, either of page or word images, as well as the slant of the characters. The idea behind these algorithms is that the alternations between peaks and dips of both the horizontal histogram of a properly oriented page or word image and the vertical histogram of a nonslanted word will be the most intense. According to the requirements of a certain application, either the entire system or specific combinations of its modules can be used.
Moreover, since there are no special restrictions regarding the form of the exchanged data, any module can be substituted by an existing module that performs the same or a similar preprocessing task. In the past, similar work has been performed [30] for form-filling applications.

References

1. A. Amiri, A. C. Downton, S. J. Hanlon, C. G. Leedham, S. M. Lucas and D. Monger, OSCAR: a visual programming toolkit for offline handwritten forms recognition, Proc. IWFHR4, 4th Int. Workshop on Frontiers in Handwriting Recognition, Taiwan, December 1994.
2. A. Bagdanov and S. Kanai, Evaluation of document image skew techniques, Proc. SPIE, 1996.
3. B. Boashash, B. Lovell and L. White, Time frequency analysis and pattern recognition using singular value decomposition of the Wigner-Ville distribution, Advanced Algorithms and Architecture for Signal Processing, Proc. SPIE 828 (1987).
4. P. Boles and B. Boashash, The cross Wigner-Ville distribution: a two dimensional analysis method for the processing of vibrosis seismic signals, Proc. IEEE ICASSP 87, 1988.
5. R. M. Bozinovic and S. N. Srihari, Off-line cursive script word recognition, IEEE Trans. PAMI 11, 1 (1989).
6. M. Y. Chen, A. Kundu, J. Zhou and S. N. Srihari, Off-line handwritten word recognition using HMM, U.S. Postal Service, 5th Adv. Technol. Conf., Washington, DC, 1992.
7. W. Chin, A. Harvey and A. Jennings, Skew detection in handwritten scripts, Proc. IEEE Speech and Image Technologies for Computing and Telecommunications, 1997.
8. T. A. Claasen and W. F. Mecklenbrauker, The Wigner distribution: tool for time-frequency signal analysis, Philips J. Res. 35, 1-3 (1980).
9. G. Cristobal, J. Bescos and J. Santamaria, Application of Wigner distribution for image representation and analysis, Proc. IEEE 8th Int. Conf. Patt. Recogn., 1986.
10. B. Gatos, N. Papamarkos and C. Chamzas, Skew detection and text line position determination in digitized documents, Patt. Recogn. 30, 9 (1997).
11. H. Jiang, C. Han and K. Fan, A fast approach to the detection and correction of skew documents, Patt. Recogn. Lett. 18 (1997).
12. E. Kavallieratou, N. Fakotakis and G. Kokkinakis, Skew angle estimation using Cohen's class distributions, Patt. Recogn. Lett. 20 (1999).
13. E. Kavallieratou, N. Liolios, E. Koutsogiorgos, N. Fakotakis and G. Kokkinakis, The GRUHD database of modern Greek unconstrained handwriting, LREC (1999).
14. O. Kenny and B. Boashash, An optical signal processing for time-frequency signal analysis using the Wigner-Ville distribution, J. Elec. Electron. Eng. (1998).
15. D. S. Le, G. R. Thoma and H. Wechsler, Automated page orientation and skew angle detection for binary document image, Patt. Recogn. 27, 10 (1994).
16. J. Liu, C. Lee and R. Shu, An efficient method for the skew normalization of a document image, Proc. 12th Int. Conf. Pattern Recognition, 1992.
17. U. Marti and H. Bunke, A full English sentence database for off-line handwriting recognition, Proc. 5th Int. Conf. Document Analysis and Recognition, ICDAR 99, Bangalore, 1999.
18. M. Mohamed and P. Gader, Handwritten word recognition using segmentation-free hidden Markov modeling and segmentation-based dynamic programming techniques, IEEE Trans. PAMI 18, 5 (1996).
19. L. O'Gorman, The document spectrum for page layout analysis, IEEE Trans. Patt. Anal. Mach. Intell. 15, 11 (1993).

18 634 E. Kavallieratou et al. 20. T. Pavlidis and J. Zhou, Page segmentation by white streams, Proc. 1st Int. Conf. Document Analysis and Recognition (ICDAR), International Association of Pattern Recognition, 1991, pp G. S. Peake and T. N. Tan, A general algorithm for document skew angle estimation, IEEE Int. Conf. Image Process. 2 (1997) H. Penz, I. Bajla, A. Vrabl, W. Krattenthaler and K. Mayer, Fast real-time recognition and quality inspection of printed characters via point-correlation, Proc. SPIE 4303 (2001) A. W. Senior and A. J. Robinson, An off-line cursive handwriting recognition system, IEEE Trans. Patt. Anal. Mach. Intell. 20, 3 (1998) M. Shridar and F. Kimura, Handwritten address interpretation using word recognition with and without lexicon, Proc. IEEE Int. Conf. Systems, Man and Cybernetics, Piscataway, NJ, USA, Vol. 3, 1995, pp Spitz, Analysis of compressed document images for document skews, multiple skew and logotype detection, CVIU 70, 3 (1998) Ville, Theorie et applicacions de la notion de signal analytique, Cable et Trasmission A2 (1948) J. Wang, M. K. H. Leung and S. C. Hui, Cursive word reference line detection, Patt. Recogn. 30, 3 (1997) K. B. Yu and S. Cheng, Signal synthesis from Wigner distribution, Proc. IEEE ICASSP 85, 1985, pp B. Yu and A. K. Jain, A robust and fast skew detection algorithm for generic documents, Patt. Recogn. 29, 10 (1996) B. Yu and A. K. Jain, A generic system for form dropout, IEEE Trans. Patt. Anal. Mach. Intell. 18, 11 (1996)

E. Kavallieratou received her Diploma in electrical and computer engineering in 1996 from the Polytechnic School of the University of Patras and her Ph.D. in handwritten optical character recognition and document image processing from the same department in 2000. For one academic year she was a member of the Signals, Systems and Radiocommunications Laboratory of the Department of Telecommunications Engineering of the Polytechnic School of Madrid, working on image processing. During 2000 and 2001, she was a guest researcher at the Institute of Communication Acoustics of Ruhr-Universität Bochum, Germany. Since 2002, she has been Assistant Professor of Audio Processing in the Department of Audio and Musical Instruments Technology at the Technological Educational Institute of Epirus, Greece. Her research interests include optical character recognition, document image analysis, signal processing, and audio and image processing.

N. Dromazou received her Diploma in electrical and computer engineering in 2000 from the Polytechnic School of the University of Patras. She completed her thesis project in the field of optical character recognition; during that period she worked as a research assistant in the Wire Communications Laboratory of the Electrical and Computer Engineering Department on the creation of an Integrated System for Handwritten Document Image Processing. She also worked as a research assistant on the creation of a Greek handwritten character database. Since 2000, she has been working in the design center of ATMEL HELLAS as a software engineer in the telecommunications department.

N. Fakotakis received the B.Sc. degree in electronics from the University of London (UK) in 1978, the M.Sc. degree in electronics from the University of Wales (UK), and the Ph.D. in speech processing from the University of Patras (Greece). From 1986 to 1992, he was a Lecturer in the Electrical and Computer Engineering Department of the University of Patras; from 1992 to 1999, he was an Assistant Professor; and since 1999, he has been an Associate Professor in the area of speech and natural language processing and Head of the Speech and Language Processing Group at the Wire Communications Laboratory. He is the author of over 100 publications in the area of speech and natural language engineering. Dr. Fakotakis is a member of the Executive Board of ELSNET (European Language and Speech Network of Excellence) and Editor-in-Chief of the European Student Journal on Language and Speech, WEB-SLS. He is also a member of IEEE, TEE, EURASIP and ESCA. His current research interests include speech modeling, speech recognition/understanding, speaker recognition, spoken dialogue processing, natural language processing and optical character recognition.

G. K. Kokkinakis received the Diploma in electrical engineering (Dipl.-Ing.) in 1961, the Doctor's degree in engineering (Dr.-Ing.) in 1966 and the Diploma in engineering economics (Dipl.-Wirt.-Ing.) in 1967, all from the Technical University of Munich (Technische Hochschule München). He served at the Ministry of Coordination in Athens. Since 1969 he has been with the Department of Electrical Engineering at the University of Patras, where he organized and directs the Wire Communications Laboratory (WCL). The WCL is a partner in several EU projects and in other R&D projects financed by the Greek Telecom Industry, the Greek General Secretariat for Research and Technology, etc. He has published several books on telecommunications and electrotechnology and over 250 technical papers, articles and reports on telecommunications and speech technology. Dr. Kokkinakis is a senior member of IEEE and a member of the Technical Chamber of Greece (TEE), the VDE (Verein Deutscher Elektrotechniker), ESCA (European Speech Communication Association), EURASIP (European Association for Signal Processing), the EEEE (Greek Operations Research Society), and the Linguistics Society of America (LSA). Since 1997, he has been a member of the board of ISCA. His current research and development activity, which coincides with that of the WCL, includes the analysis, synthesis, recognition and linguistic processing of the Greek language and the design and optimization of telecom networks.
