Implementation of Text to Speech Conversion

Chaw Su Thu Thu 1, Theingi Zin 2
1 Department of Electronic Engineering, Mandalay Technological University, Mandalay
2 Department of Electronic Engineering, Mandalay Technological University, Mandalay

Abstract- Text-to-speech (TTS) conversion is a computer-based system that can read any text aloud, whether the text was entered directly into the computer by an operator or scanned and submitted to an Optical Character Recognition (OCR) system. Many systems already convert normal language text into speech. The main aims of this paper are to study OCR together with speech synthesis technology and to develop a cost-effective, user-friendly image-to-speech conversion system using MATLAB. In this work, the OCR system is implemented for the recognition of the capital English letters A to Z and the digits 0 to 9. One character is recognized at a time, and the recognized character is saved as text in a Notepad file. The resulting system obtains text either from an image or from direct input to the computer and then converts that text into speech using MATLAB.

1. INTRODUCTION
Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer and can be implemented in software or hardware [2]. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations, such as phonetic transcriptions, into speech. TTS conversion transforms linguistic information stored as data or text into speech. It is now widely used in audio reading devices for blind people [6]. In the last few years, however, the use of TTS conversion technology has grown far beyond the disabled community to become a major adjunct to the rapidly growing use of digital voice storage for voice mail and voice response systems. Speech synthesis technology has also been developed for various languages. The Speech Application Programming Interface (SAPI) is an API developed by Microsoft to allow the use of speech recognition and speech synthesis within Windows applications.

2. PROPOSED ALGORITHM
This work has two main parts: an optical character recognition system for paper text, and text-to-speech conversion.

2.1. Optical character recognition system
This part has three portions: template file creation, creation of the neural network, and character recognition.

2.1.1. Template file creation. Images of the letters A to Z and the digits 0 to 9 are collected. Each image is converted into a 5 x 7 character representation stored as a single vector, using steps 1 to 5 described in the character recognition section. These vectors are saved as a data file for training the neural network.

2.1.2. Creating the neural network. A feedforward neural network with 25 hidden neurons is set up for pattern recognition. After the network is created, its weights and biases are initialized so that it is ready for training. The performance goal is set between 0.01 and 0.05. The network is trained with the data file and the target file: its weights and biases are adjusted until the performance reaches the goal.
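As a rough illustration of Section 2.1.2, the following Python sketch builds a feedforward network with 35 inputs (one per cell of the 5 x 7 representation), 25 hidden neurons and one output per class, and trains it until the mean squared error reaches a goal of 0.01. The paper's implementation is in MATLAB and its code is not given, so the sigmoid activations, learning rate, initial weight range, plain gradient-descent training and the three toy templates below are all assumptions made for this example.

```python
import math
import random

# Hypothetical 5 x 7 templates (1 = ink); the paper's real templates are not given.
GLYPHS = {
    "A": "01110" "10001" "10001" "11111" "10001" "10001" "10001",
    "T": "11111" "00100" "00100" "00100" "00100" "00100" "00100",
    "5": "11111" "10000" "11110" "00001" "00001" "10001" "01110",
}
LABELS = list(GLYPHS)
PATTERNS = [[int(c) for c in GLYPHS[k]] for k in LABELS]  # 35 inputs each
TARGETS = [[1.0 if i == j else 0.0 for j in range(len(LABELS))]
           for i in range(len(LABELS))]  # one-hot target per class


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """One row per neuron: n_in weights followed by a bias, all small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, inputs):
    """Sigmoid outputs of one layer; zip() stops before the trailing bias entry."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1]) for row in layer]


def train(patterns, targets, n_hidden=25, goal=0.01, lr=0.3, max_epochs=5000, seed=0):
    """Adjust weights and biases by backpropagation until the MSE reaches the goal."""
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0]), len(targets[0])
    hidden = init_layer(n_in, n_hidden, rng)
    output = init_layer(n_hidden, n_out, rng)
    mse, epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        sq_err = 0.0
        for x, t in zip(patterns, targets):
            h = forward(hidden, x)
            y = forward(output, h)
            # Deltas for squared error with sigmoid units (computed before any update).
            d_out = [(yk - tk) * yk * (1 - yk) for yk, tk in zip(y, t)]
            d_hid = [hj * (1 - hj) * sum(d_out[k] * output[k][j] for k in range(n_out))
                     for j, hj in enumerate(h)]
            for k in range(n_out):
                for j in range(n_hidden):
                    output[k][j] -= lr * d_out[k] * h[j]
                output[k][-1] -= lr * d_out[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    hidden[j][i] -= lr * d_hid[j] * x[i]
                hidden[j][-1] -= lr * d_hid[j]
            sq_err += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
        mse = sq_err / (len(patterns) * n_out)  # error accumulated during this epoch
        if mse <= goal:
            break
    return hidden, output, mse, epoch


def classify(hidden, output, vector):
    """Return the label whose output neuron responds most strongly."""
    scores = forward(output, forward(hidden, vector))
    return LABELS[scores.index(max(scores))]
```

Calling `train(PATTERNS, TARGETS)` returns the two weight layers, the final error and the number of epochs used; `classify` then maps a 35-element vector to the label with the largest output.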

2.1.3. Character recognition. Figure 1 shows the flowchart of the OCR system.

Figure 1. Flowchart of OCR system: Start; Image Acquiring and Reading; RGB to Gray Image; Gray to Binary Image; Segmentation; Feature Extraction; Classification (with templates trained in NN); Convert E-Text; Open text.txt as file for write; Write in the text file; End.

The following steps are implemented for character recognition.
1. Acquire the character image and read it.
2. Preprocess the image: convert it to gray scale, compute a threshold from the gray image, and use that threshold to convert the gray image into a black-and-white (binary) image.
3. Find the boundary of the character and crop the image to its edges.
4. Extract the character and resize it to the template size.
5. Convert the resized binary image into a 5 x 7 character representation stored as a single vector.
6. Load the templates so that the letters can be matched against them.
7. Open text.txt as a file for writing.
8. Write the recognized letters to the text file, concatenating them.

Feature extraction and classification are the heart of OCR. In the feature extraction phase, the character image is mapped to a higher level by extracting special characteristics and patterns of the image. The classifier is then trained with the extracted features for the classification task. The classification stage identifies each input character image by considering the detected features. Template matching and neural networks are used as classifiers.

2.2. Text to speech conversion
The character image is converted into text, and the text is then converted into speech. The algorithm is as follows. First, check whether Win32 SAPI is available on the computer. If it is not, an error is generated and the Win32 SAPI library should be installed. Get the voice object from Win32 SAPI. Compare the input string with the Win32 SAPI string. Select a voice from those available in the library. Choose the pace of the voice. Initialize the wave player that converts the text into speech. Finally, obtain the speech for the given image. Text-to-speech conversion for e-text typed directly into the computer follows the same steps.

3. SIMULATION RESULTS
In this work, the OCR system is implemented for the recognition of the capital English letters A to Z and the digits 0 to 9. One character is recognized at a time, and the recognized character is saved as text in a Notepad file. The program has two portions. The first produces text output from the input image and then converts that text into speech. In the second, e-text is entered directly into the computer and converted into speech. First, an input image of characters in Times New Roman, font size 12, bold, is taken and converted into text. As shown in Figure 2, character A is cropped from the image and its features are extracted. It is then converted to text, saved in a Notepad file, and converted to speech. The test result for character T is illustrated in Figure 3. The recognized character can be displayed in the command window and saved in a Notepad file, as shown in Figure 4.
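The preprocessing applied to each test character (gray-scale conversion, thresholding, cropping to the character boundary, resizing and flattening to a 5 x 7 vector, as listed in Section 2.1.3) might look like the Python sketch below. The paper's MATLAB code is not shown, so several choices here are assumptions: the mean gray level stands in for the unspecified threshold rule, nearest-neighbour sampling stands in for the unspecified resize method, and `render` is a hypothetical helper that only draws a toy test image.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to gray levels using BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def binarize(gray):
    """Threshold at the mean gray level (an assumed rule); dark ink becomes 1."""
    pixels = [p for row in gray for p in row]
    threshold = sum(pixels) / len(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def crop_to_character(binary):
    """Find the bounding box of the ink pixels and crop the image to it."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    if not rows:
        raise ValueError("no character pixels found")
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    return [row[cols[0]:cols[-1] + 1] for row in binary[rows[0]:rows[-1] + 1]]


def resize(binary, width=5, height=7):
    """Nearest-neighbour resize to the template size (5 columns by 7 rows)."""
    src_h, src_w = len(binary), len(binary[0])
    return [[binary[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)]


def to_vector(image):
    """Flatten the 5 x 7 grid row by row into a single 35-element vector."""
    return [p for row in image for p in row]


def image_to_vector(rgb):
    """Run the whole chain: gray, binary, crop, resize, flatten."""
    return to_vector(resize(crop_to_character(binarize(to_gray(rgb)))))


def render(glyph_rows, scale=2, margin=3):
    """Hypothetical helper: draw a black-on-white RGB image from rows of '0'/'1'."""
    white, black = (255, 255, 255), (0, 0, 0)
    width = len(glyph_rows[0]) * scale + 2 * margin
    image = [[white] * width for _ in range(margin)]
    for row in glyph_rows:
        line = [white] * margin
        for ch in row:
            line += [black if ch == "1" else white] * scale
        line += [white] * margin
        image += [list(line) for _ in range(scale)]
    image += [[white] * width for _ in range(margin)]
    return image


# Toy letter T on a 5 x 7 grid, used only to exercise the functions above.
T_ROWS = ["11111", "00100", "00100", "00100", "00100", "00100", "00100"]
```

For example, `image_to_vector(render(T_ROWS))` takes a 16 x 20 pixel image with a white margin and recovers the 35-element vector of the original 5 x 7 letter T, which could then be passed to a classifier.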

Figure 2. Character A converted into text; A sound wave
Figure 3. Character T converted into text; T sound wave
Figure 4. Output text in command window; saved text in Notepad (characters A and T)

Numerals are also successfully converted into text and then speech, as shown in Figure 5.

Figure 5. Number 5 converted into text; Number 5 sound wave

Characters in another font are also taken and successfully converted into text and then speech, as shown in Figures 6 and 7.

As illustrated in Figure 8, e-text typed directly into the computer from the keyboard is also successfully converted into speech.

Figure 6. Character M converted into text; Character M sound wave
Figure 7. Number 2 converted into text; Number 2 sound wave
Figure 8. E-text input ("Hello, How are you?"); sound wave

4. CONCLUSION
In this work, an image is converted into text and that text is then converted into speech using MATLAB. E-text is also successfully converted into speech. With this approach, text from a word document, web page or e-book can be read and rendered as synthesized speech through a computer's speakers. For image-to-text conversion, the image is first converted into a gray image. The gray image is converted into a binary image by thresholding and then into text by MATLAB. The Microsoft Win32 SAPI library is used to build speech-enabled applications, which retrieve the voice and audio output information available on the computer. In this work, one character is converted into text at a time. As a further extension, the OCR system can be developed to convert images of words or sentences into text.
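The text-to-speech steps of Section 2.2 (check that SAPI is available, get the voice object, select a voice, choose the pace, speak) can be sketched in Python through SAPI's COM automation interface. This is only an illustration under stated assumptions: the paper drives SAPI from MATLAB, whereas this sketch assumes a Windows machine with the third-party pywin32 package installed, and the function names `clamp_rate` and `speak` are hypothetical.

```python
def clamp_rate(rate):
    """Keep the speaking rate inside SAPI's supported range of -10 to 10."""
    return max(-10, min(10, int(rate)))


def speak(text, rate=0, voice_index=0):
    """Speak text through SAPI; raise RuntimeError if SAPI cannot be used."""
    try:
        # Imported here so the module still loads where pywin32 is missing.
        import win32com.client
    except ImportError as exc:
        raise RuntimeError("Win32 SAPI is not reachable: pywin32 is not installed") from exc

    # Get the voice object from SAPI.
    voice = win32com.client.Dispatch("SAPI.SpVoice")

    # Select one of the voices available in the library.
    voices = voice.GetVoices()
    if voices.Count == 0:
        raise RuntimeError("no SAPI voices are installed")
    voice.Voice = voices.Item(min(voice_index, voices.Count - 1))

    # Choose the pace of the voice, then synthesize the speech.
    voice.Rate = clamp_rate(rate)
    voice.Speak(text)
```

On a suitable Windows machine, `speak("Hello, How are you?", rate=-2)` would read the e-text of Figure 8 aloud slightly slower than the default pace; elsewhere the call raises `RuntimeError`, mirroring the error described in the first step.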

REFERENCES
1. Ainsworth, W., "A system for converting English text into speech," Audio and Electroacoustics, IEEE Transactions on, vol. 21, no. 3, pp. 288-290, Jun. 1973.
2. Fushikida, K.; Mitome, Y.; Inoue, Y., "A Text to Speech Synthesizer for the Personal Computer," Consumer Electronics, IEEE Transactions on, vol. CE-28, no. 3, pp. 250-256, Aug. 1982.
3. Hertz, S., "English text to speech conversion with delta," Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP '86, vol. 11, pp. 2427-2430, Apr. 1986.
4. Lynch, M.R.; Rayner, P.J., "Optical character recognition using a new connectionist model," Image Processing and its Applications, 1989, Third International Conference on, pp. 63-67, 18-20 Jul. 1989.
5. Furui, S., "Speaker independent isolated word recognition using dynamic features of speech spectrum," Acoustics, Speech, and Signal Processing, IEEE Transactions on, vol. 34, no. 1, pp. 52-59, Feb. 1986.
6. Leija, L.; Santiago, S.; Alvarado, C., "A system of text reading and translation to voice for blind persons," Engineering in Medicine and Biology Society, 1996. Bridging Disciplines for Biomedicine. Proceedings of the 18th Annual International Conference of the IEEE, vol. 1, pp. 405-406, 31 Oct.-3 Nov. 1996.
7. Tanprasert, C.; Koanantakool, T., "Thai OCR: a neural network application," TENCON '96. Proceedings. 1996 IEEE TENCON. Digital Signal Processing Applications, vol. 1, pp. 90-95, 26-29 Nov. 1996.
8. Breen, A.P., "The future role of text to speech synthesis in automated services," Advances in Interactive Voice Technologies for Telecommunication Services (Digest No: 1997/147), IEE Colloquium on, pp. 6/1-6/5, 12 Jun. 1997.