EXPLORING MULTIDIMENSIONAL LSTMS FOR LARGE VOCABULARY ASR
Jinyu Li, Abdelrahman Mohamed, Geoffrey Zweig, and Yifan Gong
Microsoft Corporation, One Microsoft Way, Redmond, WA
{jinyli, asamir, gzweig,

ABSTRACT

Long short-term memory (LSTM) recurrent neural networks (RNNs) have recently shown significant performance improvements over deep feed-forward neural networks. A key aspect of these models is the use of time recurrence, combined with a gating architecture that allows them to track the long-term dynamics of speech. Inspired by human spectrogram reading, we recently proposed the frequency LSTM (F-LSTM), which performs 1-D recurrence over the frequency axis and then performs 1-D recurrence over the time axis. In this study, we further improve the acoustic model by proposing a 2-D, time-frequency (TF) LSTM. The TF-LSTM jointly scans the input over the time and frequency axes to model spectro-temporal warping, and then uses the output activations as the input to a time LSTM (T-LSTM). The joint time-frequency modeling better normalizes the features for the upper layer T-LSTMs. Evaluated on a 375-hour short message dictation task, the proposed TF-LSTM obtained a 3.4% relative WER reduction over the best T-LSTM. The invariance property achieved by joint time-frequency analysis is demonstrated on a mismatched test set, where the TF-LSTM achieves a 14.2% relative WER reduction over the best T-LSTM.

Index Terms: LSTM, RNN, time and frequency, multidimensional

1. INTRODUCTION

Recently, significant progress has been made in automatic speech recognition (ASR) thanks to the application of deep neural networks (DNNs) [1][2][3][4][5][6]. DNNs, however, only consider information in a fixed-length sliding window of frames and thus cannot exploit long-range correlations in the signal. Recurrent neural networks (RNNs), on the other hand, can encode sequence history in their internal state, and thus have the potential to predict phonemes based on all the speech features observed up to the current frame.
Unfortunately, simple RNNs, depending on the largest eigenvalue of the state-update matrix, may have gradients which either increase or decrease exponentially over time. Thus, the basic RNN is difficult to train, and in practice can only model short-range effects. Long short-term memory (LSTM) RNNs [7][8] were developed to overcome these problems. LSTM-RNNs use input, output, and forget gates to achieve a network that can maintain state and propagate gradients in a stable fashion over long spans of time. These networks have been shown to outperform DNNs on a variety of ASR tasks [9][10][11][12][13][14]. All previously proposed LSTMs use a recurrence along the time axis to model the temporal patterns of speech signals, and we call them T-LSTMs in this paper.

In common practice, log-filter-bank features are often used as the input to the neural-network-based acoustic model [15]. In standard systems, the log-filter-bank features are independent of one another, i.e., switching the positions of two filter-banks will not affect the performance of the DNN or LSTM. However, this is not the case when a human reads a spectrogram: a human relies on patterns that evolve in both time and frequency to predict phonemes. Switching the positions of two filter-banks will destroy the frequency-wise patterns. Meanwhile, switching the positions of two frames will destroy the time-wise patterns. Inspired by the way people read spectrograms, we recently proposed the frequency LSTM (F-LSTM) in [16], which performs recurrence along the frequency axis to summarize the frequency-evolving patterns as the feature for the upper level T-LSTMs. All the LSTM operations in [16] are one-dimensional, either along the frequency axis or along the time axis. However, both time-wise and frequency-wise patterns are important to human spectrogram reading. Hence, it may be better to extract features that capture both kinds of patterns.
Further, the concept of multidimensional processing has proved very successful in handwriting recognition tasks [17][18] and computer vision tasks [19], and it outperformed the traditional handwriting systems that use convolutional neural networks (CNNs) [20][21] as the feature extractor.

The main contribution of this paper is the proposal to use a multidimensional LSTM to model both time and frequency dynamics for speech recognition. We further propose a method for doing this joint time-frequency analysis in a highly efficient way. We term the proposed method the time-frequency LSTM, or TF-LSTM. Evaluated on a 375-hour Microsoft short message dictation (SMD) task, the TF-LSTM consistently outperformed the F-LSTM and obtained a 3.4% relative word error rate (WER) reduction from the T-LSTM on the SMD test set, and a 14.2% relative WER reduction on a mismatched test set.

The rest of the paper is organized as follows. In Section 2, we briefly introduce LSTMs, and then we present the proposed time-frequency LSTM in Section 3. We differentiate the proposed method from the convolutional LSTM DNN (CLDNN) [14] and the multi-dimensional RNN [17][18] in Section 4. Experimental evaluation of the algorithm is provided in Section 5. We summarize our study and draw conclusions in Section 6.

2. THE LSTM-RNN

An RNN is fundamentally different from the feed-forward DNN in that the RNN does not operate on a fixed window of frames; instead, it maintains a hidden state vector, which is recursively updated after seeing each time frame. This allows RNNs to be resilient to arbitrary input warping along the recurrence dimension, leading to better generalization abilities. Stacking multiple layers of RNNs allows the network to discover relationships between frames on progressively higher levels of abstraction. During learning, the simple RNN suffers from the vanishing/exploding gradient problem [22]. This problem is well handled in LSTM-RNNs through the use of the following four components:
- Memory units: these store the temporal state of the network;
- Input gates: these modulate the input activations into the cells;
- Output gates: these modulate the output activations of the cells;
- Forget gates: these adaptively reset the cell's memory.

Taken together as in Figure 1 below, these four components are termed an LSTM cell.

Figure 1: Architecture of LSTM-RNNs with one recurrent layer. $Z^{-1}$ is a time-delay node.

Figure 1 depicts the architecture of an LSTM-RNN with one recurrent layer. In LSTM-RNNs, in addition to the past hidden-layer output $h_{t-1}^l$, the past memory activation $c_{t-1}^l$ is also an input to the LSTM cell. This model can be described as:

\begin{align}
i_t^l &= \sigma(W_{xi}^l x_t^l + W_{hi}^l h_{t-1}^l + W_{ci}^l c_{t-1}^l + b_i^l), \tag{1} \\
f_t^l &= \sigma(W_{xf}^l x_t^l + W_{hf}^l h_{t-1}^l + W_{cf}^l c_{t-1}^l + b_f^l), \tag{2} \\
c_t^l &= f_t^l \odot c_{t-1}^l + i_t^l \odot \tanh(W_{xc}^l x_t^l + W_{hc}^l h_{t-1}^l + b_c^l), \tag{3} \\
o_t^l &= \sigma(W_{xo}^l x_t^l + W_{ho}^l h_{t-1}^l + W_{co}^l c_t^l + b_o^l), \tag{4} \\
h_t^l &= o_t^l \odot \tanh(c_t^l), \tag{5}
\end{align}

where $i_t^l$, $o_t^l$, $f_t^l$, and $c_t^l$ denote the activation vectors of the input gate, output gate, forget gate, and memory cell at the $l$-th layer and time $t$, respectively. $h_t^l$ is the output of the LSTM cells at layer $l$ and time $t$. The $W$ terms denote different weight matrices. For example, $W_{xi}^l$ is the weight matrix from the cell input to the input gate at the $l$-th layer. The $b$ terms are the bias terms (e.g., $b_i^l$ is the bias of the input gate at layer $l$). $\odot$ denotes element-wise multiplication.

In [11], an LSTM with an additional projection layer prior to the output was proposed to reduce the computational complexity of the LSTM. A projection layer is applied to $h_t^l$ as

$$r_t^l = W_{hr}^l h_t^l,$$

and then $h_{t-1}^l$ in Eqs. (1)-(4) is replaced by $r_{t-1}^l$. In this study, we adopt this structure for T-LSTM modeling.

3. JOINT TIME-FREQUENCY ANALYSIS VIA MULTIDIMENSIONAL LSTM

In this section, we propose a time-frequency LSTM (TF-LSTM) as shown in Figure 2. In contrast to the frequency LSTM (F-LSTM) in our previous work [16], which scans the frequency bands so that frequency-evolving information is summarized by the output of the F-LSTM, the new method scans both the time and frequency axes jointly to perform the time-frequency analysis.

Figure 2: An example of a time-frequency LSTM-RNN which scans both the time and frequency axes at the bottom layer using a TF-LSTM, and then scans the time axis at the upper layers using T-LSTMs. Note that the outputs of all TF-LSTM cells are fed into the upper layer T-LSTM.

The formulation of the TF-LSTM is as follows.

\begin{align}
i_{k,t}^l &= \sigma(W_{xi}^l x_{k,t}^l + W_{hi1}^l h_{k,t-1}^l + W_{hi2}^l h_{k-1,t}^l + W_{ci}^l c_{k,t-1}^l + b_i^l), \tag{6} \\
f_{k,t}^l &= \sigma(W_{xf}^l x_{k,t}^l + W_{hf1}^l h_{k,t-1}^l + W_{hf2}^l h_{k-1,t}^l + W_{cf}^l c_{k,t-1}^l + b_f^l), \tag{7} \\
c_{k,t}^l &= f_{k,t}^l \odot c_{k,t-1}^l + i_{k,t}^l \odot \tanh(W_{xc}^l x_{k,t}^l + W_{hc1}^l h_{k,t-1}^l + W_{hc2}^l h_{k-1,t}^l + b_c^l), \tag{8} \\
o_{k,t}^l &= \sigma(W_{xo}^l x_{k,t}^l + W_{ho1}^l h_{k,t-1}^l + W_{ho2}^l h_{k-1,t}^l + W_{co}^l c_{k,t}^l + b_o^l), \tag{9} \\
h_{k,t}^l &= o_{k,t}^l \odot \tanh(c_{k,t}^l). \tag{10}
\end{align}

In this formulation, every gate now has three indices: layer $l$, frequency band $k$, and time $t$. For example, $f_{k,t}^l$ denotes the activation vector of the forget gate at layer $l$, frequency band $k$, and time $t$. Different from Eqs. (1)-(4), we now have both the time-delay input $h_{k,t-1}^l$ and the frequency-delay input $h_{k-1,t}^l$. The $W_{h\cdot 1}^l$ and $W_{h\cdot 2}^l$ matrices denote the weight matrices connecting $h_{k,t-1}^l$ and $h_{k-1,t}^l$, respectively. The structure of a TF-LSTM cell is plotted in Figure 3, where $\varphi$ denotes the tanh function.

Figure 3: A TF-LSTM cell at frequency band $k$ and time $t$.
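The recurrence in Eqs. (6)-(10) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names (`init_params`, `tf_lstm_step`, `tf_lstm_layer`), the uniform random initialization, and the use of zero vectors for the missing neighbours at the first frequency chunk and first time step are all assumptions made for illustration. Following the equations as written, the peephole terms use full weight matrices.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Matrix-vector product for a list-of-rows matrix.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(*vectors):
    # Element-wise sum of equally sized vectors.
    return [sum(t) for t in zip(*vectors)]


def mul(a, b):
    # Element-wise product.
    return [x * y for x, y in zip(a, b)]


def init_params(n_in, n_cell, seed=0):
    # Hypothetical initialization: small uniform weights, zero biases.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for g in "ifco":
        p["Wx" + g] = mat(n_cell, n_in)
        p["Wh" + g + "1"] = mat(n_cell, n_cell)  # time-delay connection
        p["Wh" + g + "2"] = mat(n_cell, n_cell)  # frequency-delay connection
        p["b" + g] = [0.0] * n_cell
    for g in "ifo":
        p["Wc" + g] = mat(n_cell, n_cell)  # peephole weights
    return p


def tf_lstm_step(p, x, h_time, h_freq, c_time):
    # h_time = h_{k,t-1}, h_freq = h_{k-1,t}, c_time = c_{k,t-1}.
    def pre(g):
        return add(matvec(p["Wx" + g], x),
                   matvec(p["Wh" + g + "1"], h_time),
                   matvec(p["Wh" + g + "2"], h_freq),
                   p["b" + g])

    i = [sigmoid(a) for a in add(pre("i"), matvec(p["Wci"], c_time))]   # Eq. (6)
    f = [sigmoid(a) for a in add(pre("f"), matvec(p["Wcf"], c_time))]   # Eq. (7)
    c = add(mul(f, c_time), mul(i, [math.tanh(a) for a in pre("c")]))   # Eq. (8)
    o = [sigmoid(a) for a in add(pre("o"), matvec(p["Wco"], c))]        # Eq. (9)
    h = mul(o, [math.tanh(a) for a in c])                               # Eq. (10)
    return h, c


def tf_lstm_layer(p, chunks_over_time, n_cell):
    # chunks_over_time[t][k] holds the input chunk x_{k,t}.
    n_chunks = len(chunks_over_time[0])
    zeros = [0.0] * n_cell
    h_prev = [zeros] * n_chunks  # assumed zero state before the first frame
    c_prev = [zeros] * n_chunks
    outputs = []
    for chunks in chunks_over_time:
        h_now, c_now = [], []
        h_freq = zeros  # assumed zero state below the first chunk
        for k, x in enumerate(chunks):
            h, c = tf_lstm_step(p, x, h_prev[k], h_freq, c_prev[k])
            h_now.append(h)
            c_now.append(c)
            h_freq = h
        # Merge all chunk outputs into one super-vector per frame.
        outputs.append([v for h in h_now for v in h])
        h_prev, c_prev = h_now, c_now
    return outputs
```

Dropping the `Wh*2` terms and the chunk loop gives back the time-only recurrence of Eqs. (1)-(5), which is one way to see why the TF-LSTM is described as a generalization of the T-LSTM.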
The proposed TF-LSTM in Eqs. (6)-(10) is a general case of the T-LSTM and the F-LSTM. When all the frequency bands are concatenated together as a single unit, the frequency index $k$ and all the terms associated with $W_{h\cdot 2}^l$ are removed, and the TF-LSTM reduces to the T-LSTM of Eqs. (1)-(5). In contrast, if all the terms associated with $W_{h\cdot 1}^l$ are removed, the TF-LSTM reduces to an F-LSTM, which can be viewed as removing the connections to $h_{k,t-1}^l$ in Figure 3.

The detailed TF-LSTM processing is as follows.

- At each time step, divide the $N$ log-filter-banks at the current time into $M$ overlapped chunks, shifting by $C$ log-filter-banks between adjacent chunks. They are denoted as $x_{k,t}^l$, $k = 1, \ldots, M$.
- Using the hidden activations at each frequency chunk from the previous time step, $h_{k,t-1}^l$, the hidden activations at each time step from the previous frequency chunk, $h_{k-1,t}^l$, and the input at the current frequency chunk and time step, $x_{k,t}^l$, go through Eqs. (6)-(10) to generate the output $h_{k,t}^l$, $k = 1, \ldots, M$. Note that we use log-filter-banks as the input, which means the time-frequency analysis is in the first layer, so $l$ is set to 1 in Eqs. (6)-(10).
- Merge $h_{k,t}^l$, $k = 1, \ldots, M$, into a super-vector $h_t^l$, which can be considered as a trajectory of time-frequency patterns. Then use $h_t^l$ as the input to the upper layer T-LSTM.

It is also worthwhile to investigate the stacking of multiple TF-LSTM layers. This can easily be done by replacing $x_{k,t}^l$ with the hidden activations from the previous layer, $h_{k,t}^{l-1}$, in Eqs. (6)-(9). Again, the output of the last TF-LSTM layer is merged into a super-vector as the input to the upper layer T-LSTM. An example of two stacked TF-LSTM layers is shown in Figure 4.

Figure 4: An example of stacked TF-LSTM layers.

4. RELATION TO PRIOR WORK

In this section, we first discuss the difference between our proposed TF-LSTM and the convolutional LSTM DNN (CLDNN) [14], which combines CNNs, LSTMs, and DNNs. The CLDNN first uses a CNN to reduce the spectral variation, and then the output of the CNN layer is fed into a multi-layer LSTM to learn the temporal patterns. Finally, the output of the last LSTM layer is fed into several fully connected DNN layers for the purpose of classification.

The key difference between the TF-LSTM and the CLDNN is that the TF-LSTM uses joint time-frequency recurrence, whereas the CLDNN uses a sliding convolutional window for pattern detection. While the sliding window achieves some local invariance, it is not the same as a joint two-dimensional recurrent network which scans the whole time and frequency axes. The two approaches both aim to achieve invariance to input distortions, but the pattern detectors in the CNN maintain a constant dimensionality, while the TF-LSTM can perform a general time-frequency warping.

The proposed method is similar to the multidimensional LSTM [17][18], which is used for handwriting recognition. Multidimensional LSTM has been used in [23] on a very small phone recognition task, TIMIT [24], using connectionist temporal classification (CTC) [25] as the training criterion. However, there is no accuracy comparison with the T-LSTM in [23]. In contrast, we will show the advantage of our proposed TF-LSTM over the T-LSTM with the cross-entropy training criterion on a large-scale speech recognition task in the next section.

Although it uses similar concepts, the proposed TF-LSTM has a different formulation from the multidimensional LSTM in [17][18]. The proposed TF-LSTM has only a single memory unit and a single forget gate, while the multidimensional LSTM in [17][18] has multiple forget gates, each handling the information of one dimension. Thus we achieve a significant reduction in complexity.

We are currently building a strong CLDNN baseline for comparison, and the results will be reported in the future. We will also implement the multidimensional LSTM with multiple forget gates [17][18] and compare it with our proposed method.

5. EXPERIMENTS AND DISCUSSIONS

The proposed methods are evaluated on a Microsoft Windows Phone short message dictation task. The transcribed training data contain 375 hours of US-English audio. The test set is from the same Windows Phone task, and has 25k words. A test set of this size supports the significance of the reported improvements. The 87-dimensional feature used in the DNN and T-LSTM experiments consists of the 29-dimensional static log-filter-bank outputs and their first- and second-order derivatives [26]. For the F-LSTM and TF-LSTM experiments, we only use the static log-filter-banks as the feature. All models evaluated in this study use 5976 tied-triphone states (senones), determined by a baseline CD-GMM-HMM system, and were trained to minimize the frame-level cross-entropy criterion. All experiments were conducted using the Computational Network Toolkit (CNTK) [27], which allows us to build and evaluate various network structures efficiently without deriving and implementing complicated training algorithms.

To build the baseline DNN, we augment the 87-dimensional feature vectors with 5 frames of context on either side (5-1-5). The DNN has 5 hidden layers, each with 2048 sigmoid units.

The baseline T-LSTM is modeled after that in [11]. Each T-LSTM layer has 1024 hidden units, and the output size of each T-LSTM layer is reduced to 512 using a linear projection layer. There is no frame stacking, and the output HMM state label is delayed by 5 frames as in [11]. When training the T-LSTM, the backpropagation through time (BPTT) [28] step is 20. We use a 4-layer T-LSTM as our baseline.
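As an illustration of the two input conventions described for the baselines, the sketch below splices each frame with 5 frames of context on either side (for the DNN) and delays the state labels by 5 frames (for the T-LSTM). The function names `splice` and `delay_labels` are hypothetical, and repeating the first or last frame (or label) at utterance boundaries is an assumption; the paper does not say how edges are handled.

```python
def splice(frames, left=5, right=5):
    """Concatenate every frame with its left/right context frames.

    Frames beyond the utterance boundary are replaced by the nearest
    edge frame (an assumed padding rule).
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for d in range(-left, right + 1):
            idx = min(max(t + d, 0), n - 1)
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


def delay_labels(labels, delay=5):
    """Target at time t is the label of frame t - delay.

    The first `delay` targets reuse the first label (an assumed rule).
    """
    return [labels[max(t - delay, 0)] for t in range(len(labels))]


# 20 dummy frames of 87 features each: 11 spliced frames give 957 inputs.
frames = [[float(t)] * 87 for t in range(20)]
dnn_inputs = splice(frames)
print(len(dnn_inputs[0]))  # 957
```

With 87-dimensional features and an 11-frame window, the DNN input has 11 x 87 = 957 dimensions, whereas the T-LSTM sees a single 87-dimensional frame per step and relies on the 5-frame label delay to look ahead.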
This has 15.35% WER. It outperforms the baseline DNN with a 10.39% relative WER reduction. This setup is better than the models with three or five T-LSTM layers, as shown in Table 1. There is a 4.3% relative WER reduction when adding one layer to go from a 3-layer T-LSTM to a 4-layer T-LSTM. However, a 5-layer T-LSTM does not outperform a 4-layer T-LSTM.

Table 1: WER and model size comparison of DNN and T-LSTM. M denotes million in the number-of-parameters column.

Model | WER (%) | Number of parameters
DNN | | M
3-layer T-LSTM | | M
4-layer T-LSTM | 15.35 | M
5-layer T-LSTM | | M

In Table 2, we compare the performance of the F-LSTM and TF-LSTM models. The F-LSTM model uses a single LSTM to scan the log-filter-banks, while the TF-LSTM uses a single LSTM to scan both the time axis and the log-filter-banks. The generated time-frequency-evolving summary or frequency-evolving summary is then passed into 3 or 4 layers of T-LSTMs.

At each time step, the 29 log-filter-bank channels are divided into 22 overlapped chunks, with each chunk containing 8 log-filter-banks, which means the frequency shift is 1 log-filter-bank. This log-filter-bank grouping strategy follows our earlier experience with CNNs [29]. Then these 22 chunks are fed into the F-LSTM. The input to the TF-LSTM cells includes not only the previous frequency chunks but also the output of this TF-LSTM cell in the previous time frame. Both the F-LSTM and TF-LSTM have 24 memory cells, introducing only a small computational cost. The upper layer T-LSTMs have the same structure as the baseline T-LSTMs, with 1024 hidden units in each layer, and the output size is reduced to 512 using a projection.

All the setups in Table 2 outperform the baseline 4-layer T-LSTM. With a 3-layer T-LSTM on top, the F-LSTM and TF-LSTM perform almost the same. However, with a 4-layer T-LSTM on top, the TF-LSTM is much better than the F-LSTM, and gets 14.83% WER, a 3.4% relative WER reduction from the baseline 4-layer T-LSTM. The joint time-frequency modeling provides a better feature for the upper layer T-LSTMs to consume. As shown in Table 1, simply increasing the number of T-LSTM layers from 4 to 5 does not give any gain.
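The chunking arithmetic above can be checked with a short sketch. The helper name `overlapped_chunks` is hypothetical; the sizes (29 channels, chunks of 8, shift of 1) are the ones stated in the text.

```python
def overlapped_chunks(frame, chunk_size=8, shift=1):
    """Split one frame of filter-bank values into overlapped chunks."""
    last_start = len(frame) - chunk_size
    return [frame[s:s + chunk_size] for s in range(0, last_start + 1, shift)]


# One frame with 29 log-filter-bank channels (dummy values 0..28).
frame = list(range(29))
chunks = overlapped_chunks(frame)
print(len(chunks))  # 22
```

The number of chunks is (29 - 8) / 1 + 1 = 22, matching the setup described above; adjacent chunks share 7 of their 8 channels.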
Table 2: Comparison of F-LSTM and TF-LSTM.

Model | WER (%) | Number of parameters
F-LSTM + 3-layer T-LSTM | | M
F-LSTM + 4-layer T-LSTM | | M
TF-LSTM + 3-layer T-LSTM | | M
TF-LSTM + 4-layer T-LSTM | 14.83 | M

We further investigate the performance of stacked F-LSTMs and TF-LSTMs in Table 3. To have the same number of layers as the TF-LSTM + 4-layer T-LSTM setup in Table 2, we tried either a 2-layer F-LSTM or a 2-layer TF-LSTM, followed by a 3-layer T-LSTM. Again, the setup using the TF-LSTM outperformed the setup with the F-LSTM. However, neither outperformed the TF-LSTM + 4-layer T-LSTM setup. Note that going from the TF-LSTM + 3-layer T-LSTM setup in Table 2 to the 2-layer F-LSTM + 3-layer T-LSTM setup in Table 3 only introduces 0.1M additional parameters, and this brings a very slight improvement. This is because the TF-LSTM itself has a very small number of parameters, as the cell size is only 24. In future work, a 2-layer TF-LSTM followed by a 4-layer T-LSTM may yield some further gains.

Table 3: The stacking of F-LSTM and TF-LSTM.

Model | WER (%) | Number of parameters
2-layer F-LSTM + 3-layer T-LSTM | | M
2-layer TF-LSTM + 3-layer T-LSTM | | M

In a final set of experiments, we evaluated the invariance properties of the TF-LSTM model by testing the models trained with Windows Phone data on the Aurora 4 [30] test sets. Two clean evaluation sets (A and C) are recorded with the Sennheiser microphone and the secondary microphone, respectively. The remaining two groups (B and D) are recorded with the two types of microphone, respectively, and 6 types of noise are added with randomly chosen SNRs between 5 and 15 dB for each of the microphone types. Therefore, these test sets have totally mismatched acoustic environments from the Windows Phone training set. We used the baseline 4-layer T-LSTM model in Table 1 and the TF-LSTM model in Table 2 for the evaluation. The language model is a bigram provided by Aurora 4. As shown in Table 4, the TF-LSTM performs much better than the T-LSTM in all test conditions, and reduced the average WER from 17.46% to 15.0%, a 14.2% relative WER reduction.
This confirms the robustness [31] of the joint time-frequency analysis of the TF-LSTM.

Table 4: The comparison of T-LSTM and TF-LSTM models on the mismatched Aurora 4 test sets. Models are trained with Windows Phone short message dictation data.

Model | A | B | C | D | Avg.
4-layer T-LSTM | | | | | 17.46
TF-LSTM + 4-layer T-LSTM | | | | | 15.0

6. CONCLUSIONS

In this paper, we have presented a two-dimensional TF-LSTM architecture that scans both the time and frequency axes to model the evolving patterns of the spectrogram. The TF-LSTM uses an LSTM to perform a joint time-frequency recurrence that summarizes spectro-temporal patterns. The summarized patterns are then fed into upper level T-LSTMs. The proposed TF-LSTM obtained a 3.4% relative WER reduction over the traditional T-LSTM on a 375-hour short message dictation task. We further investigated the effectiveness of stacking multiple TF-LSTM layers, and found that the additional accuracy gain is marginal. This suggests that a one-layer TF-LSTM is good enough to extract the patterns relevant to speech recognition. When evaluated on a totally mismatched Aurora 4 test set, the TF-LSTM demonstrates much better resistance to distortion, giving a 14.2% relative WER reduction over a T-LSTM.
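For readers who want to check the headline figure, relative WER reduction is the absolute WER drop divided by the baseline WER. The sketch below is only a worked example, taking the baseline 4-layer T-LSTM and the TF-LSTM + 4-layer T-LSTM WERs on the dictation test set as 15.35% and 14.83%; the function name `relative_wer_reduction` is hypothetical.

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Dictation test set: baseline T-LSTM vs. TF-LSTM + 4-layer T-LSTM.
print(round(relative_wer_reduction(15.35, 14.83), 1))  # 3.4
```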
REFERENCES

[1] F. Seide, G. Li, and D. Yu, "Conversational speech transcription using context-dependent deep neural networks," in Proc. Interspeech, 2011.
[2] N. Jaitly, P. Nguyen, A. Senior, and V. Vanhoucke, "An application of pretrained deep neural networks to large vocabulary conversational speech recognition," in Proc. Interspeech, 2012.
[3] T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak, and A.-R. Mohamed, "Making deep belief networks effective for large vocabulary continuous speech recognition," in Proc. ASRU, 2011.
[4] G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Large vocabulary continuous speech recognition with context-dependent DBN-HMMs," in Proc. ICASSP, 2011.
[5] A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Trans. Audio Speech and Language Process., vol. 20, no. 1, pp. 14-22, Jan.
[6] L. Deng, J. Li, J.-T. Huang, et al., "Recent advances in deep learning for speech research at Microsoft," in Proc. ICASSP, 2013.
[7] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, 1997.
[8] F. A. Gers, J. Schmidhuber, and F. Cummins, "Learning to forget: Continual prediction with LSTM," Neural Computation, vol. 12, no. 10.
[9] A. Graves, A. Mohamed, and G. Hinton, "Speech recognition with deep recurrent neural networks," in Proc. ICASSP, 2013.
[10] A. Graves, N. Jaitly, and A. Mohamed, "Hybrid speech recognition with deep bidirectional LSTM," in Proc. ASRU, 2013.
[11] H. Sak, A. Senior, and F. Beaufays, "Long short-term memory recurrent neural network architectures for large scale acoustic modeling," in Proc. Interspeech, 2014.
[12] H. Sak, O. Vinyals, G. Heigold, A. Senior, E. McDermott, R. Monga, and M. Mao, "Sequence discriminative distributed training of long short-term memory recurrent neural networks," in Proc. Interspeech, 2014.
[13] X. Li and X. Wu, "Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition," in Proc. ICASSP, 2015.
[14] T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, "Convolutional, long short-term memory, fully connected deep neural networks," in Proc. ICASSP, 2015.
[15] A. Mohamed, G. Hinton, and G. Penn, "Understanding how deep belief networks perform acoustic modelling," in Proc. ICASSP, 2012.
[16] J. Li, A. Mohamed, G. Zweig, and Y. Gong, "LSTM time and frequency recurrence for automatic speech recognition," in Proc. ASRU, 2015.
[17] A. Graves, S. Fernández, and J. Schmidhuber, "Multi-dimensional recurrent neural networks," in ICANN.
[18] A. Graves and J. Schmidhuber, "Offline handwriting recognition with multidimensional recurrent neural networks," Advances in Neural Information Processing Systems.
[19] W. Byeon, T. M. Breuel, F. Raue, and M. Liwicki, "Scene labeling with LSTM recurrent neural networks," in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[20] T. N. Sainath, A. Mohamed, B. Kingsbury, and B. Ramabhadran, "Deep convolutional neural networks for LVCSR," in Proc. ICASSP, 2013.
[21] O. Abdel-Hamid, A. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 10, 2014.
[22] Y. Bengio, P. Simard, and P. Frasconi, "Learning long-term dependencies with gradient descent is difficult," IEEE Transactions on Neural Networks, vol. 5, no. 2, 1994.
[23] A. Graves, "Practical variational inference for neural networks," in Advances in Neural Information Processing Systems, 2011.
[24] J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, and N. L. Dahlgren, "DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus," U.S. Dept. of Commerce, NIST, Gaithersburg, MD, February 1993.
[25] A. Graves, S. Fernandez, F. Gomez, and J. Schmidhuber, "Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks," in Proceedings of the 23rd International Conference on Machine Learning, ACM.
[26] J. Li, D. Yu, J.-T. Huang, and Y. Gong, "Improving wideband speech recognition using mixed-bandwidth training data in CD-DNN-HMM," in Proc. IEEE Spoken Language Technology Workshop, pp. 131-136, 2012.
[27] D. Yu, A. Eversole, M. Seltzer, et al., "An introduction to computational networks and the computational network toolkit," Microsoft Technical Report MSR-TR-2014-112, 2014.
[28] H. Jaeger, "Tutorial on training recurrent neural networks, covering BPPT, RTRL, EKF and the echo state network approach," GMD Report 159, GMD German National Research Institute for Computer Science.
[29] J.-T. Huang, J. Li, and Y. Gong, "An analysis of convolutional neural networks for speech recognition," in Proc. ICASSP, 2015.
[30] N. Parihar and J. Picone, "Aurora working group: DSR front end LVCSR evaluation AU/384/02," Tech. Rep., Institute for Signal and Information Processing, Mississippi State Univ.
[31] J. Li, L. Deng, Y. Gong, and R. Haeb-Umbach, Robust Automatic Speech Recognition: A Bridge to Practical Applications, Elsevier Press, 2015.
More informationJoint Beamforming and Power Optimization with Iterative User Clustering for MISO-NOMA Systems
This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Citation information: DOI 0.09/ACCESS.07.70008,
More informationIterative Transceiver Design for Opportunistic Interference Alignment in MIMO Interfering Multiple-Access Channels
Journa of Communications Vo. 0 No. February 0 Iterative Transceiver Design for Opportunistic Interference Aignment in MIMO Interfering Mutipe-Access Channes Weipeng Jiang ai Niu and Zhiqiang e Schoo of
More informationJoint Optimization of Scheduling and Power Control in Wireless Networks: Multi-Dimensional Modeling and Decomposition
This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Citation information: DOI 10.1109/TMC.2018.2861859,
More informationA Low Complexity VCS Method for PAPR Reduction in Multicarrier Code Division Multiple Access
0 JOURNAL OF ELECTRONIC SCIENCE AND TECHNOLOGY OF CHINA, VOL. 5, NO., JUNE 007 A Low Compexity VCS Method for PAPR Reduction in Muticarrier Code Division Mutipe Access Si-Si Liu, Yue iao, Qing-Song Wen,
More informationPROPORTIONAL FAIR SCHEDULING OF UPLINK SINGLE-CARRIER FDMA SYSTEMS
PROPORTIONAL FAIR SCHEDULING OF UPLINK SINGLE-CARRIER SYSTEMS Junsung Lim, Hyung G. Myung, Kyungjin Oh and David J. Goodman Dept. of Eectrica and Computer Engineering, Poytechnic University 5 Metrotech
More informationAnalysis, Analysis Practices, and Implications for Modeling and Simulation
, Practices, and Impications for Modeing and imuation Amy Henninger The Probem The act of identifying, enumerating, evauating, and mapping known technoogies to inferred program requirements is an important
More informationUniversity of Bristol - Explore Bristol Research. Peer reviewed version. Link to published version (if available): /GLOCOM.2003.
Coon, J., Siew, J., Beach, MA., Nix, AR., Armour, SMD., & McGeehan, JP. (3). A comparison of MIMO-OFDM and MIMO-SCFDE in WLAN environments. In Goba Teecommunications Conference, 3 (Gobecom 3) (Vo. 6, pp.
More informationTHE TRADEOFF BETWEEN DIVERSITY GAIN AND INTERFERENCE SUPPRESSION VIA BEAMFORMING IN
THE TRADEOFF BETWEEN DIVERSITY GAIN AND INTERFERENCE SUPPRESSION VIA BEAMFORMING IN A CDMA SYSTEM Yan Zhang, Laurence B. Mistein, and Pau H. Siege Department of ECE, University of Caifornia, San Diego
More informationDealing with Link Blockage in mmwave Networks: D2D Relaying or Multi-beam Reflection?
Deaing with Lin Bocage in mmwave etwors: DD Reaying or Muti-beam Refection? Mingjie Feng, Shiwen Mao Dept. Eectrica & Computer Engineering Auburn University, Auburn, AL 36849-5, U.S.A. Tao Jiang Schoo
More informationTIME-FREQUENCY CONVOLUTIONAL NETWORKS FOR ROBUST SPEECH RECOGNITION. Vikramjit Mitra, Horacio Franco
TIME-FREQUENCY CONVOLUTIONAL NETWORKS FOR ROBUST SPEECH RECOGNITION Vikramjit Mitra, Horacio Franco Speech Technology and Research Laboratory, SRI International, Menlo Park, CA {vikramjit.mitra, horacio.franco}@sri.com
More informationGeorgia Institute of Technology. simulating the performance of a 32-bit interconnect bus. referenced to non-ideal planes. A transient simulation
Power ntegrity/signa ntegrity Co-Simuation for Fast Design Cosure Krishna Srinivasan1, Rohan Mandrekar2, Ege Engin3 and Madhavan Swaminathan4 Georgia nstitute of Technoogy 85 5th St NW, Atanta GA 30308
More informationPerformance Comparison of Cyclo-stationary Detectors with Matched Filter and Energy Detector M. SAI SINDHURI 1, S. SRI GOWRI 2
ISSN 319-8885 Vo.3,Issue.39 November-14, Pages:7859-7863 www.ijsetr.com Performance Comparison of Cyco-stationary Detectors with Matched Fiter and Energy Detector M. SAI SINDHURI 1, S. SRI GOWRI 1 PG Schoar,
More informationDeep Neural Network Architectures for Modulation Classification
Deep Neural Network Architectures for Modulation Classification Xiaoyu Liu, Diyu Yang, and Aly El Gamal School of Electrical and Computer Engineering Purdue University Email: {liu1962, yang1467, elgamala}@purdue.edu
More informationCo-channel Interference Suppression Techniques for STBC OFDM System over Doubly Selective Channel
Co-channe Interference Suppression Techniques for STBC OFDM System over Douby Seective Channe Jyoti P. Patra Dept. of Eectronics and Communication Nationa Institute Of Technoogy Rourkea-769008, India E
More informationBlind Multiuser Detection in Asynchronous DS-CDMA Systems over Nakagami-m Fading Channels
Bind Mutiuser Detection in Asynchronous DS-CDMA Systems over akagami-m Fading Channes Vinay Kumar Pamua JU Kakinada, Andhra Pradesh, India 533 003 pamuavk@yahoo.com ABSRAC his paper presents a technique
More informationNew Image Restoration Method Based on Multiple Aperture Defocus Images for Microscopic Images
Sensors & Transducers, Vo. 79, Issue 9, September 204, pp. 62-67 Sensors & Transducers 204 by IFSA Pubishing, S. L. http://www.sensorsporta.com New Image Restoration Method Based on Mutipe Aperture Defocus
More informationSparse Channel Estimation Based on Compressed Sensing for Massive MIMO Systems
Sparse Channe Estimation Based on Compressed Sensing for Massive MIMO Systems Chenhao Qi, Yongming Huang, Shi Jin and Lenan Wu Schoo of Information Science and Engineering, Southeast University, Nanjing
More informationFusing Noisy Fingerprints with Distance Bounds for Indoor Localization
Fusing Noisy Fingerprints with Distance Bounds for Indoor Locaization Suining He 1 S.-H. Gary Chan 1 Lei Yu 2 Ning Liu 2 1 Department of CSE, The Hong Kong University of Science and Technoogy, Hong Kong,
More informationRecurrent neural networks Modelling sequential data. MLP Lecture 9 Recurrent Networks 1
Recurrent neural networks Modelling sequential data MLP Lecture 9 Recurrent Networks 1 Recurrent Networks Steve Renals Machine Learning Practical MLP Lecture 9 16 November 2016 MLP Lecture 9 Recurrent
More informationI D I A P. Hierarchical and Parallel Processing of Modulation Spectrum for ASR applications Fabio Valente a and Hynek Hermansky a
R E S E A R C H R E P O R T I D I A P Hierarchical and Parallel Processing of Modulation Spectrum for ASR applications Fabio Valente a and Hynek Hermansky a IDIAP RR 07-45 January 2008 published in ICASSP
More informationAudio Effects Emulation with Neural Networks
Escola Tècnica Superior d Enginyeria Informàtica Universitat Politècnica de València Audio Effects Emulation with Neural Networks Trabajo Fin de Grado Grado en Ingeniería Informática Autor: Omar del Tejo
More informationRecurrent neural networks Modelling sequential data. MLP Lecture 9 / 13 November 2018 Recurrent Neural Networks 1: Modelling sequential data 1
Recurrent neural networks Modelling sequential data MLP Lecture 9 / 13 November 2018 Recurrent Neural Networks 1: Modelling sequential data 1 Recurrent Neural Networks 1: Modelling sequential data Steve
More informationSCHEDULING the wireless links and controlling their
3738 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 13, NO. 7, JULY 2014 Minimum Length Scheduing With Packet Traffic Demands in Wireess Ad Hoc Networks Yacin Sadi, Member, IEEE, and Sinem Coeri Ergen,
More informationEvaluating robust features on Deep Neural Networks for speech recognition in noisy and channel mismatched conditions
INTERSPEECH 2014 Evaluating robust on Deep Neural Networks for speech recognition in noisy and channel mismatched conditions Vikramjit Mitra, Wen Wang, Horacio Franco, Yun Lei, Chris Bartels, Martin Graciarena
More informationTime-domain Techniques in EMI Measuring Receivers. Technical and Standardization Requirements
Time-domain Techniques in EMI Measuring Receivers Technica and Standardization Requirements CISPR = Huge, Sow, Compex, CISPR = Internationa Specia Committee on Radio Interference Technica committee within
More informationA Comparative Analysis of Image Fusion Techniques for Remote Sensed Images
roceedings of the Word Congress on Engineering 27 Vo I WCE 27, Juy 2-4, 27, London, U.K. Comparative naysis of Image Fusion Techniques for emote Sensed Images sha Das 1 and K.evathy 2 Department of Computer
More informationAn Investigation on the Use of i-vectors for Robust ASR
An Investigation on the Use of i-vectors for Robust ASR Dimitrios Dimitriadis, Samuel Thomas IBM T.J. Watson Research Center Yorktown Heights, NY 1598 [dbdimitr, sthomas]@us.ibm.com Sriram Ganapathy Department
More informationAcoustic modelling from the signal domain using CNNs
Acoustic modelling from the signal domain using CNNs Pegah Ghahremani 1, Vimal Manohar 1, Daniel Povey 1,2, Sanjeev Khudanpur 1,2 1 Center of Language and Speech Processing 2 Human Language Technology
More informationEffect of Estimation Error on Adaptive L-MRC Receiver over Nakagami-m Fading Channels
Internationa Journa of Appied Engineering Research ISSN 973-456 Voume 3, Number 5 (8) pp. 77-83 Research India Pubications. http://www.ripubication.com Effect of Estimation Error on Adaptive -MRC Receiver
More informationRecurrent neural networks Modelling sequential data. MLP Lecture 9 Recurrent Neural Networks 1: Modelling sequential data 1
Recurrent neural networks Modelling sequential data MLP Lecture 9 Recurrent Neural Networks 1: Modelling sequential data 1 Recurrent Neural Networks 1: Modelling sequential data Steve Renals Machine Learning
More informationUtility-Proportional Fairness in Wireless Networks
IEEE rd Internationa Symposium on Persona, Indoor and Mobie Radio Communications - (PIMRC) Utiity-Proportiona Fairness in Wireess Networks G. Tychogiorgos, A. Gkeias and K. K. Leung Eectrica and Eectronic
More informationSatellite Link Layer Performance Using Two Copy SR-ARQ and Its Impact on TCP Traffic
Sateite Link Layer Performance Using Two Copy SR-ARQ and Its Impact on TCP Traffic Jing Zhu and Sumit Roy Department of Eectrica Engineering, University of Washington Box 352500, Seatte, WA 98195, USA
More informationA Neural Attention Model for Urban Air Quality Inference: Learning the Weights of Monitoring Stations
The Thirty-Second AAAI Conference on Artificia Inteigence (AAAI-18) A Neura Attention Mode for Urban Air Quaity Inference: Learning the Weights of Monitoring Stations Weiyu Cheng, Yanyan Shen, Yanmin Zhu,
More informationarxiv: v2 [cs.cl] 20 Feb 2018
IMPROVED TDNNS USING DEEP KERNELS AND FREQUENCY DEPENDENT GRID-RNNS F. L. Kreyssig, C. Zhang, P. C. Woodland Cambridge University Engineering Dept., Trumpington St., Cambridge, CB2 1PZ U.K. {flk24,cz277,pcw}@eng.cam.ac.uk
More informationSINGING-VOICE SEPARATION FROM MONAURAL RECORDINGS USING DEEP RECURRENT NEURAL NETWORKS
SINGING-VOICE SEPARATION FROM MONAURAL RECORDINGS USING DEEP RECURRENT NEURAL NETWORKS Po-Sen Huang, Minje Kim, Mark Hasegawa-Johnson, Paris Smaragdis Department of Electrical and Computer Engineering,
More informationSpeech Enhancement In Multiple-Noise Conditions using Deep Neural Networks
Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks Anurag Kumar 1, Dinei Florencio 2 1 Carnegie Mellon University, Pittsburgh, PA, USA - 1217 2 Microsoft Research, Redmond, WA USA
More informationFBMC/OQAM for the Asynchronous Multi-User MIMO Uplink
FBMC/OQAM for the Asynchronous Muti-User MIMO Upin Yao Cheng, Peng Li, and Martin Haardt Communications Research Laboratory, Imenau University of Technoogy P. O. Box 100565, D-98694 Imenau, Germany {y.cheng,
More informationTheoretical Profile of Ring-Spun Slub Yarn and its Experimental Validation
Chong-Qi Ma, Bao-Ming Zhou, Yong Liu, Chuan-Sheng Hu Schoo of Texties, Tianjin Poytechnic University, 399 West Binshui Road, Xiqing District, Tianjin, 300387, China E-mai: iuyong@tjpu.edu.cn Theoretica
More informationPerformance of Single User vs. Multiuser Modulation in Wireless Multicarrier (MC) Communications
erformance of Singe User vs. Mutiuser Moduation in Wireess Muticarrier (MC) Communications Anwaru Azim, ecturer, East West University Bangadesh Abstract-- he main objective of this paper is to compare
More informationCOMPARATIVE ANALYSIS OF ULTRA WIDEBAND (UWB) IEEE A CHANNEL MODELS FOR nlos PROPAGATION ENVIRONMENTS
COMPARATIVE ANALYSIS OF ULTRA WIDEBAND (UWB) IEEE80.15.3A CHANNEL MODELS FOR nlos PROPAGATION ENVIRONMENTS Ms. Jina H. She PG Student C.C.E.T, Wadhwan, Gujarat, Jina_hshet@yahoo.com Dr. K. H. Wandra Director
More informationJoint Optimal Power Allocation and Relay Selection with Spatial Diversity in Wireless Relay Networks
Proceedings of SDR'11-WInnComm-Europe, 22-24 Jun 2011 Joint Optima Power Aocation and Reay Seection with Spatia Diversity in Wireess Reay Networks Md Habibu Isam 1, Zbigniew Dziong 1, Kazem Sohraby 2,
More informationRadar Signal Demixing via Convex Optimization
Radar Signa Demixing via Convex Optimization Youye Xie Shuang Li Gongguo Tang and Michae B. Wain Department of Eectrica Engineering Coorado Schoo of Mines Goden CO USA Emai: {youyexie shuangi gtang mwain@mines.edu
More informationSTUDY ON AOTF-BASED NEAR-INFRARED SPECTROSCOPY ANALYSIS SYSTEM OF FARM PRODUCE QUALITY
STUDY ON AOTF-BASED NEAR-INFRARED SPECTROSCOPY ANALYSIS SYSTEM OF FARM PRODUCE QUALITY Xiaochao Zhang *, Xiaoan Hu, Yinqiao Zhang, Hui Wang, Hui Zhang 1 Institute of Mechatronics Technoogy and Appication,
More informationModel of Neuro-Fuzzy Prediction of Confirmation Timeout in a Mobile Ad Hoc Network
Mode of Neuro-Fuzzy Prediction of Confirmation Timeout in a Mobie Ad Hoc Network Igor Konstantinov, Kostiantyn Poshchykov, Sergej Lazarev, and Oha Poshchykova Begorod State University, Pobeda Street 85,
More informationOptimal and Suboptimal Finger Selection Algorithms for MMSE Rake Receivers in Impulse Radio Ultra-Wideband Systems 1
Optima and Suboptima Finger Seection Agorithms for MMSE Rake Receivers in Impuse Radio Utra-Wideband Systems Sinan Gezici, Mung Chiang, H. Vincent Poor and Hisashi Kobayashi Department of Eectrica Engineering
More informationRelays that Cooperate to Compute
Reays that Cooperate to Compute Matthew Nokeby Rice University nokeby@rice.edu Bobak Nazer Boston University bobak@bu.edu Behnaam Aazhang Rice University aaz@rice.edu Natasha evroye University of Iinois
More informationFREQUENCY-DOMAIN TURBO EQUALIZATION FOR SINGLE CARRIER MOBILE BROADBAND SYSTEMS. Liang Dong and Yao Zhao
FREQUENCY-DOMAIN TURBO EQUALIZATION FOR SINGLE CARRIER MOBILE BROADBAND SYSTEMS Liang Dong and Yao Zhao Department of Eectrica and Computer Engineering Western Michigan University Kaamazoo, MI 49008 ABSTRACT
More informationSMOOTHED DOPPLER PROFILE IN MST RADAR DATA- THE MODIFIED CEPSTRUM APPROACH
SMOOTHED DOPPLER PROFILE IN MST RADAR DATA- THE MODIFIED CEPSTRUM APPROACH M. Venatanarayana 1 and T. Jayachandra Prasad 1 Department of ECE, KSRE, Kadapa, India RGET, Nandya, India E-Mai: narayanamoram@gmai.com
More informationCommunication Systems
Communication Systems 1. A basic communication system consists of (1) receiver () information source (3) user of information (4) transmitter (5) channe Choose the correct sequence in which these are arranged
More informationCross-layer queuing analysis on multihop relaying networks with adaptive modulation and coding K. Zheng 1 Y. Wang 1 L. Lei 2 W.
www.ietd.org Pubished in IET Communications Received on 18th June 2009 Revised on 30th Juy 2009 ISSN 1751-8628 Cross-ayer queuing anaysis on mutihop reaying networks with adaptive moduation and coding
More informationDISTANT speech recognition (DSR) [1] is a challenging
1 Convolutional Neural Networks for Distant Speech Recognition Pawel Swietojanski, Student Member, IEEE, Arnab Ghoshal, Member, IEEE, and Steve Renals, Fellow, IEEE Abstract We investigate convolutional
More informationDESIGN OF SHIP CONTROLLER AND SHIP MODEL BASED ON NEURAL NETWORK IDENTIFICATION STRUCTURES
DESIGN OF SHIP CONROLLER AND SHIP MODEL BASED ON NEURAL NEWORK IDENIFICAION SRUCURES JASMIN VELAGIC, FACULY OF ELECRICAL ENGINEERING SARAJEVO, BOSNIA AND HERZEGOVINA, asmin.veagic@etf.unsa.ba ABSRAC his
More informationSpace-Time Focusing Transmission in Ultra-wideband Cooperative Relay Networks
ICUWB 2009 (September 9-11, 2009) 1 Space-Time Focusing Transmission in Utra-wideband Cooperative Reay Networks Yafei Tian and Chenyang Yang Schoo of Eectronics and Information Engineering, Beihang University
More informationCopyright 2000 IEEE. IEEE Global Communications Conference (Globecom 2000), November 27 - December 1, 2000, San Francisco, California, USA
Copyright 2000 EEE. EEE Goba Communications Conference (Gobecom 2000), November 27 - December 1, 2000, San Francisco, Caifornia, USA Persona use of this materia is permitted. owever, permission to reprint/repubish
More informationA CPW-Fed Printed Monopole Ultra-Wideband Antenna with E-Shaped Notched Band Slot
Iraqi Journa of Appied Physics Emad S. Ahmed Department of Eectrica and Eectronic Engineering University of Technoogy, Baghdad, Iraq A CPW-Fed Printed Monopoe Utra-Wideband Antenna with E-Shaped Notched
More informationJOINT NOISE AND MASK AWARE TRAINING FOR DNN-BASED SPEECH ENHANCEMENT WITH SUB-BAND FEATURES
JOINT NOISE AND MASK AWARE TRAINING FOR DNN-BASED SPEECH ENHANCEMENT WITH SUB-BAND FEATURES Qing Wang 1, Jun Du 1, Li-Rong Dai 1, Chin-Hui Lee 2 1 University of Science and Technology of China, P. R. China
More informationA Novel Method for Doppler and DOD- DOA Jointly Estimation Based on FRFT in Bistatic MIMO Radar System
7 Asia-Pacific Engineering and Technoogy Conference (APETC 7) ISBN: 978--6595-443- A Nove Method for Dopper and DOD- DOA Jointy Estimation Based on FRFT in Bistatic MIMO Radar System Derui Song, Li Li,
More informationarxiv: v1 [cs.it] 22 Jul 2014
MODULATION FORMATS AND WAVEFORMS FOR THE PHYSICAL LAYER OF 5G WIRELESS NETWORKS: WHO WILL BE THE HEIR OF OFDM? Paoo Banei, Stefano Buzzi, Giuio Coavope, Andrea Modenini, Fredrik Rusek, and Aessandro Ugoini
More informationAn Optimization Framework for XOR-Assisted Cooperative Relaying in Cellular Networks
n Optimization Framework for XOR-ssisted Cooperative Reaying in Ceuar Networks Hong Xu, Student Member, IEEE, Baochun Li, Senior Member, IEEE bstract This work seeks to address two questions in cooperative
More informationAutomatic Speech Recognition (CS753)
Automatic Speech Recognition (CS753) Lecture 9: Brief Introduction to Neural Networks Instructor: Preethi Jyothi Feb 2, 2017 Final Project Landscape Tabla bol transcription Music Genre Classification Audio
More informationA New Framework for Supervised Speech Enhancement in the Time Domain
Interspeech 2018 2-6 September 2018, Hyderabad A New Framework for Supervised Speech Enhancement in the Time Domain Ashutosh Pandey 1 and Deliang Wang 1,2 1 Department of Computer Science and Engineering,
More informationDiscriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks
Discriminative Enhancement for Single Channel Audio Source Separation using Deep Neural Networks Emad M. Grais, Gerard Roma, Andrew J.R. Simpson, and Mark D. Plumbley Centre for Vision, Speech and Signal
More informationPath Delay Estimation using Power Supply Transient Signals: A Comparative Study using Fourier and Wavelet Analysis
Path Deay Estimation using Power Suppy Transient Signas: A Comparative Study using Fourier and Waveet Anaysis Abhishek Singh, Jitin Tharian and Jim Pusqueic VLSI Research Laboratory Department of Computer
More informationCAPACITY OF UNDERWATER WIRELESS COMMUNICATION CHANNEL WITH DIFFERENT ACOUSTIC PROPAGATION LOSS MODELS
CAPACITY OF UNDERWATER WIRELESS COMMUNICATION CHANNEL WITH DIFFERENT ACOUSTIC PROPAGATION LOSS MODELS Susan Joshy and A.V. Babu, Department of Eectronics & Communication Engineering, Nationa Institute
More informationGRAY CODE FOR GENERATING TREE OF PERMUTATION WITH THREE CYCLES
VO. 10, NO. 18, OCTOBER 2015 ISSN 1819-6608 GRAY CODE FOR GENERATING TREE OF PERMUTATION WITH THREE CYCES Henny Widowati 1, Suistyo Puspitodjati 2 and Djati Kerami 1 Department of System Information, Facuty
More informationCNMF-BASED ACOUSTIC FEATURES FOR NOISE-ROBUST ASR
CNMF-BASED ACOUSTIC FEATURES FOR NOISE-ROBUST ASR Colin Vaz 1, Dimitrios Dimitriadis 2, Samuel Thomas 2, and Shrikanth Narayanan 1 1 Signal Analysis and Interpretation Lab, University of Southern California,
More informationAn Efficient Adaptive Filtering for CFA Demosaicking
Dev.. Newin et. a. / (IJCSE) Internationa Journa on Computer Science and Engineering An Efficient Adaptive Fitering for CFA Demosaicking Dev.. Newin*, Ewin Chandra Monie** * Vice Principa & Head Dept.
More information