Communications and Network, 2013, 5, 448-454
http://dx.doi.org/10.4236/cn.2013.53b2083 Published Online September 2013 (http://www.scirp.org/journal/cn)

Radio Link Parameters Based QoE Measurement of Voice Service in GSM Network *

Wenzhi Li 1, Jing Wang 1, Zesong Fei 1, Yuqiao Ren 1, Xiao Yang 2, Xiaoqi Wang 2
1 School of Information and Electronics, Beijing Institute of Technology, Beijing, China
2 The Research Institution of China Mobile, Beijing, China
Email: wenzhi306@163.com, wangjing@bit.edu.cn, feizesong@bit.edu.cn, ryq8884291@126.com, yangxiao@chinamobile.com, wangxiaoqiyf@chinamobile.com
Received May 2013

ABSTRACT

Recently, the Quality of Experience (QoE) of voice service has received increasing attention because it represents the performance of the voice service as subjectively perceived by end users, and speech quality is commonly used to measure it. In this paper, a speech quality assessment algorithm is proposed for GSM networks, aiming to let operators predict and monitor the QoE of voice service from radio link parameters with low complexity. Multiple Linear Regression (MLR) and Principal Component Analysis (PCA) are combined to establish the mapping model from radio link parameters to speech quality. The data set for model training and testing is obtained from the real commercial network of China Mobile. The experimental results show that, with sufficient training data, the algorithm can predict radio speech quality with high accuracy and could be used to monitor the speech quality of a mobile network in real time.

Keywords: QoE; Speech Quality Assessment; Voice Service; Regression Analysis; PCA; GSM

1. Introduction

Voice service has been and will continue to be the most fundamental and significant service in cellular mobile communication systems, and speech delivered over Global System for Mobile Communications (GSM) networks accounts for much of the voice traffic. Therefore, it is important for operators that the Quality of Experience (QoE) [1] of voice service can be monitored in real time, which guides network optimization as well as network maintenance directly and effectively.
Speech quality is considered the most comprehensive metric characterizing the QoE of the end subscriber. From the perspective of operators, a QoE measurement algorithm that reflects the radio link condition and can be integrated into the signaling monitoring system is preferred. Such an algorithm should be real-time and accurate, and it should also have low complexity.

Subjective Mean Opinion Score (MOS) [2] assessment best reflects the listener's actual perception of voice, but the procedure is time-consuming and laborious. Thus, objective assessment algorithms have been developed to approximate the subjective MOS. They can be divided into voice-based algorithms and radio link parameter based algorithms, depending on whether the voice signal is needed. Perceptual Evaluation of Speech Quality (PESQ) [3], proposed by ITU-T, is a typical voice-based algorithm and a commonly used method of voice quality testing in wireless networks because of its high correlation with subjective MOS. However, PESQ is not suited to long-term, large-scale network monitoring because of its high implementation cost.

The assessment method based on network parameters [4] is more suitable for real-time assessment of voice quality in mobile networks, because most of its input parameters can be measured from the network in real time. The Speech Quality Indicator (SQI) [5] algorithm developed by Ericsson Corporation and the Voice Quality Index (VQI) [6] developed by Huawei Corporation specifically for Time Division Synchronous Code Division Multiple Access (TD-SCDMA) networks are two typical algorithms based on network parameters. SQI expresses the degree of voice distortion caused by radio link transmission. It is calculated by weighting a number of radio link parameters including Bit Error Rate (BER), Frame Error Rate (FER), handover, Discontinuous Transmission (DTX) and speech coding mode (speech codec). VQI follows a similar approach. Its input parameters are speech coding mode, FER, BER, handover and frame loss.

* This work was supported by China National S&T Major Project (2012ZX03001034).

Copyright 2013 SciRes.
Although those two algorithms have been implemented by equipment manufacturers and achieve high accuracy when enough network parameters are collected, their index values are not well suited to monitoring the QoE of voice service and network quality. The major parameters that have a great impact on QoE, such as FER, BER and frame loss, cannot be acquired in real time by operators on the GSM signaling monitoring platform. Besides, the speech index values of SQI and VQI cannot be compared with each other in network monitoring and optimization, because the manufacturers use private interfaces.

The purpose of this paper is to solve these problems by proposing a novel QoE measuring algorithm specifically for GSM networks. The algorithm inputs are specific network parameters collected on the signaling monitoring platform of the commercial GSM network of China Mobile. Multiple Linear Regression (MLR) based on least squares is adopted to investigate the relationship between network parameters and the QoE of voice service. Together, these features yield a real-time, low-complexity algorithm that is suitable for operators monitoring the QoE of voice service.

2. Measurement of QoE of Voice Service Based on GSM Network Parameters

2.1. Idea of the Measurement Algorithm

The purpose of the algorithm is to measure the QoE of voice service in real time from GSM network parameters. Therefore, two conditions should be satisfied: the network parameters must be obtainable in real time, and a mapping model from network parameters to speech quality must be established.

In a GSM network, the Measurement Report (MR) is one of the main foundations for assessing the quality of the radio environment. MR signaling is transmitted every 480 ms on a traffic channel (470 ms on a signaling channel) and includes Received Signal Quality (RxQual), Received Signal Level (RxLev), handover, hopping, speech coding mode, etc. Therefore, selecting the MR as the source of network parameters not only expresses the quality of the current radio link, but also requires little cost to adapt the current network.
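To make the content of one MR concrete, the following is a minimal sketch of a per-report record. The class and field names are hypothetical and do not come from the paper; the value ranges checked here (RxQual 0 to 7, RxLev 0 to 63) are the usual GSM reporting ranges and are an assumption added for illustration, while the 480 ms period is the one stated above.

```python
from dataclasses import dataclass

# MR reporting period on a traffic channel, as stated in the text.
MR_PERIOD_MS = 480


@dataclass(frozen=True)
class MeasurementReport:
    """One Measurement Report as used by the algorithm (hypothetical layout)."""
    timestamp_ms: int  # absolute time at which the MR was captured
    rxqual: int        # Received Signal Quality, assumed range 0..7
    rxlev: int         # Received Signal Level, assumed range 0..63
    handover: bool     # handover happened or not
    hopping: bool      # frequency hopping used or not
    dtx: bool          # discontinuous transmission used or not
    codec: str         # speech coding mode, e.g. "EFR", "FR" or "HR"

    def __post_init__(self):
        # Reject values outside the assumed reporting ranges.
        if not 0 <= self.rxqual <= 7:
            raise ValueError("rxqual must be in 0..7")
        if not 0 <= self.rxlev <= 63:
            raise ValueError("rxlev must be in 0..63")
```

A record of this kind could be built for each MR captured on the signaling monitoring platform, for example `MeasurementReport(0, 3, 40, False, True, False, "EFR")`.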
The time granularity of the measuring algorithm used in this paper is set to 4.8 s. This choice weighs the following considerations: the time the human ear needs to perceive voice; the PESQ recommendation that the assessed object include at least 3.2 s of speech [7]; the number of MRs required by the measuring algorithm; and the efficiency of data collection in the commercial network.

The next step is to obtain the speech quality values, corresponding to the network parameters, that are used for data modeling. The specific approach is to record the voice sample that corresponds in time to a set of network parameters, and then assess its speech quality with the PESQ algorithm. The model mapping network parameters to voice quality adopts the Multiple Linear Regression method, which has the advantages of low complexity and high accuracy.

2.2. Obtaining Network Parameters

To reflect the status of the current network more realistically, both model training and testing use data collected from the commercial network. In order to accurately measure the influence of radio link parameters on speech quality, we captured the network parameters and speech data with calls from a cell phone to a landline. A communication circuit includes wired links and wireless links. The wireless links are the key factor affecting speech quality, while the wired parts have little effect on speech quality and can be neglected. Meanwhile, the algorithm assesses the voice quality of one single side of the radio link, because most of the parameters reflecting network quality cannot be transmitted to the other side of the core network (MSC). Accordingly, the method of obtaining network parameters and speech samples is shown in Figure 1.

The transmissions of uplink speech and downlink speech are relatively independent processes. Therefore, the speech quality of the uplink is affected by the uplink parameters and the speech quality of the downlink by the downlink parameters.
The uplink parameters are measured by the Base Transceiver Station (BTS) of the network, while the downlink parameters are measured by the user terminal and then reported to the network by Measurement Report signaling on the Um interface. In summary, both the uplink and downlink parameters are collected by the signaling monitoring platform of the Base Station Controller (BSC).

Figure 1. Model of data collection.
In theory, the distorted uplink speech should be obtained from the BTS or BSC, but this is not supported in the real network and would be costly to implement. Furthermore, considering that wired transmission causes little loss of speech quality, we let the MS call a landline and collect the distorted speech on the landline side.

The PESQ algorithm, with the mapping recommended in ITU-T P.862.1, is used to calculate the MOS value of every single speech sample, because PESQ is currently the most widely used algorithm for assessing speech quality in mobile network testing and correlates very highly with subjective MOS.

For every MR the absolute time was recorded accurately, and for every voice sample the recording start time was recorded, so that voice samples could be matched with MRs conveniently. Specifically, every distorted voice sample corresponds to 10 MRs (480 ms each), and these pairs were used for training and verifying the algorithm.

2.3. The Specific Structure of the Algorithm

The structure of the algorithm is shown in Figure 2. Each part is described in detail below.

2.3.1. Preprocessing the Data

The speech quality level over a certain period is related not only to the average level of the network parameters but also to their fluctuation. To reflect this fluctuation, we calculated the mean, variance, extreme value and some other statistics of the 10 observations within 4.8 s. Specifically, we assume that the observation matrices are given by Equations (1) and (2):

$$\begin{bmatrix} Rxq_{1} & RxL_{1} \\ \vdots & \vdots \\ Rxq_{10} & RxL_{10} \end{bmatrix}_{i} \tag{1}$$

$$\begin{bmatrix} codec & HO & HOP & DTX \end{bmatrix}_{i} \tag{2}$$

where $i$, ranging from 1 to $n$, is the speech sample index, $n$ denotes the total number of observations, $Rxq_j$ and $RxL_j$ stand for the RxQual and RxLevel of the $j$-th MR respectively, and codec, HO, HOP and DTX stand for the speech coding mode, whether a handover happened, whether hopping is used, and whether discontinuous transmission is used. The first matrix is preprocessed, and the output, combined with Equation (2), is Equation (3), where $X_j$ ($j = 1, \dots, m$) are the statistics of RxQual and RxLevel:

$$\begin{bmatrix} X_{1} & \cdots & X_{m} & codec & HO & HOP & DTX \end{bmatrix}_{i} \tag{3}$$

2.3.2. Data Classification

The collected data should be classified according to coding mode, because the network parameters influence the quality of speech transmission through different mechanisms under different coding modes. Specifically, the total data set was divided according to codec. Assuming that the number of samples collected under a certain coding mode (e.g. EFR) is $n$, the observed data matrix of this mode is:

$$\begin{bmatrix} X_{11} & \cdots & X_{1m} & codec_{1} & HO_{1} & HOP_{1} & DTX_{1} \\ \vdots & & \vdots & \vdots & \vdots & \vdots & \vdots \\ X_{n1} & \cdots & X_{nm} & codec_{n} & HO_{n} & HOP_{n} & DTX_{n} \end{bmatrix} \tag{4}$$

2.3.3. Principal Components Extraction

After preprocessing, the data have a larger dimension, which makes it difficult to analyze the relationships among the preprocessed variables in multidimensional space. Besides, the parameters are strongly correlated with each other, which leads to cross impacts on speech quality that are difficult to analyze and present. Principal Component Analysis (PCA) was introduced to solve this problem. Specifically, we analyzed the correlation of the first $m$ columns of the matrix in Equation (4) using PCA to calculate the principal components, and then took the first $p$ principal components with the largest variance as the input vectors of the regression analysis, that is:

$$\begin{bmatrix} Y_{11} & \cdots & Y_{1p} \\ \vdots & & \vdots \\ Y_{n1} & \cdots & Y_{np} \end{bmatrix} \tag{5}$$

in which every column is a selected principal component.

Figure 2. Basic structure of the algorithm. (Blocks shown: RxLevel and RxQual inputs; data preprocessing (mean, variance, extreme value, etc.); data classification; PCA; HO, HOP, DTX and codec inputs; 4.8 s speech quality prediction.)
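The preprocessing and principal component steps described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the statistics are limited to mean, variance, minimum and maximum (the paper also mentions "some other statistics" without listing them), and PCA is computed from the correlation matrix of the standardized statistics, which the paper does not specify.

```python
import numpy as np


def window_statistics(rxqual, rxlev):
    """Summarize each 4.8 s window of 10 MRs by per-window statistics.

    rxqual, rxlev: array-like of shape (n, 10), one row per speech sample.
    Returns an (n, 8) matrix: mean, variance, min, max of RxQual, then the
    same four statistics of RxLev (an assumed choice of statistics).
    """
    columns = []
    for values in (np.asarray(rxqual, dtype=float), np.asarray(rxlev, dtype=float)):
        columns += [values.mean(axis=1), values.var(axis=1),
                    values.min(axis=1), values.max(axis=1)]
    return np.column_stack(columns)


def pca_scores(X, p):
    """Project the rows of X onto its first p principal components.

    Assumption: components are taken from the correlation matrix of X.
    Returns (scores of shape (n, p), variances of the p components).
    """
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # keep constant columns finite
    Z = (X - X.mean(axis=0)) / std           # standardized statistics
    corr = (Z.T @ Z) / len(Z)                # correlation matrix
    eigval, eigvec = np.linalg.eigh(corr)    # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:p]     # indices of the p largest
    return Z @ eigvec[:, order], eigval[order]
```

For one coding mode, `window_statistics` would produce the first m columns of the data matrix and `pca_scores(X, p)` the n-by-p matrix of selected components; the choice of p is left to the analyst.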
It should be noted that not all of the $p$ selected principal components will necessarily remain in the final measuring formula, because principal components with little impact on speech quality are excluded according to the hypothesis testing results in the fitting procedure of the regression equation.

2.3.4. Quality Measuring of 4.8 s Speech

Under a certain coding mode (e.g. uplink EFR), the basic form of the measuring formula is:

$$MOS_{upEFR} = a_{0} + a_{1} Y_{1} + \cdots + a_{k} Y_{k} + a_{k+1}\, HO + a_{k+2}\, HOP + a_{k+3}\, DTX \tag{6}$$

where $Y_{1}, \dots, Y_{k}$ are the extracted principal components, HO, HOP and DTX (like codec) are limited to specific discrete values, and $a_{0}, \dots, a_{k+3}$ are the fitted coefficients. The final form of the measuring formula and the fitted coefficients are obtained through multiple regression analysis [8].

For each coding mode, a preliminary least-squares fit of the input data to the speech quality values is performed, and then tests of significance (the F-test and T-test) are applied to the resulting regression equation. An F-test ($\alpha = 0.05$) is used to determine whether the linear relationship of the equation is significant, while a T-test is used to determine whether the impact of each variable is significant, so that some variables are excluded according to the result. After the hypothesis testing, the regression equation undergoes a residual test and an outlier test to determine whether a nonlinear transformation or some other kind of processing should be applied to the data. A normality test is used in the residual analysis.

3. Performance Analysis

3.1. Condition of Data Collection

The data collection for algorithm training and testing was based on three typical codec modes of the GSM network (EFR, FR and HR), with the network parameters and speech samples recorded separately for uplink and downlink. The distribution of valid data used in the algorithm is shown in Table 1. For each case, three quarters of the data are used for algorithm training to produce the measuring formula of speech quality. The remaining quarter is used to test the performance of the algorithm.

Table 1. Distribution of data used in algorithm.

| Actual PESQ value range | Uplink EFR | Uplink FR | Uplink HR | Downlink EFR | Downlink FR | Downlink HR |
|---|---|---|---|---|---|---|
| (1, 2] | 82 | 20 | 4 | 13 | 20 | 2 |
| (2, 3] | 62 | 63 | 18 | 110 | 63 | 10 |
| (3, 4.5] | 413 | 251 | 185 | 326 | 251 | 198 |

3.2. Evaluation Index of the Algorithm

Here, we call the QoE value of voice service predicted from radio link parameters in the mobile network the RSQ (Radio Speech Quality). For each set of network parameters (corresponding to 10 MRs), we predict an RSQ value using the algorithm and compare it with the actual PESQ value of the speech, computing the following indicators to measure the prediction accuracy of the algorithm. Aiming at monitoring speech quality in an actual network, this paper proposes a stricter segmented relative error indicator.

Indicator 1: Segmented relative error, as shown in Table 2.

Table 2. Relative error indicators in segments.

| Actual RSQ | Output of algorithm | Segmented indicators of accuracy |
|---|---|---|
| (1.02, 2] | RSQ value and low value alarm (a low value alarm is given when the predicted speech quality is in (1.02, 2]) | Accuracy of low value alarm |
| (2, 3] | RSQ value | Percentage of data whose relative error is less than 10% |
| (3, 4.56] | RSQ value | Percentage of data whose relative error is less than 10% |

In order to eliminate the influence of the interval endpoint values on the statistical result, the normalized relative error in Equation (7) is used, based on the fact that PESQ (MOS-LQO) [9] has a working range of (1.02, 4.56]:

$$\text{relative error} = \frac{\left| \text{Actual MOS} - \text{Predicted MOS} \right|}{\Delta} \times 100\% \tag{7}$$

where $\Delta = MOS_{H} - MOS_{L} = 4.56 - 1.02$, and $MOS_{H}$ and $MOS_{L}$ stand for the upper and lower limits of the PESQ value.

In the MOS range (1.02, 2], the accuracy of the low value alarm is taken to indicate the accuracy of the algorithm. The reason is that the reference PESQ algorithm itself has low measuring accuracy in the low value interval, and speech quality becomes intolerable when the MOS value is lower than 2, where giving a specific MOS value is of little use. Instead, an alarm should be given when the actual network quality is very low. The accuracy of the low value alarm is calculated by Equation (8):

$$\text{Accuracy of low value alarm} = \frac{M}{N} \tag{8}$$

where $M$ denotes the number of samples with predicted MOS in $(1.02, 2 + 0.2]$ and actual MOS in $(1.02, 2.0]$, and $N$ is the total number of samples with actual MOS in $(1.02, 2.0]$.

Indicator 2: Pearson's correlation coefficient $R$, calculated by Equation (9):

$$R = \frac{\sum_{i=1}^{N} (q_{i} - \bar{q})(y_{i} - \bar{y})}{\sqrt{\sum_{i=1}^{N} (q_{i} - \bar{q})^{2} \sum_{i=1}^{N} (y_{i} - \bar{y})^{2}}} \tag{9}$$

where $q_{i}$ and $\bar{q}$ stand for the value and mean of the actual MOS, $y_{i}$ and $\bar{y}$ stand for the value and mean of the MOS predicted by the algorithm, and $N$ is the number of samples.

Indicator 3: Root Mean Square Error (RMSE), shown in Equation (10), where $q_{i}$ and $y_{i}$ stand for the actual MOS value and the predicted MOS value respectively:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (q_{i} - y_{i})^{2}} \tag{10}$$

The correlation coefficient and the Root Mean Square Error are performance metrics commonly used for international objective quality assessment algorithms. They measure not only the correlation between the predicted value and the real value but also the degree of dispersion.

3.3. Test Performance of the Algorithm

The valid data collected were divided into training data (three quarters) and testing data (one quarter). The accuracy in terms of indicator 1 (Table 2) is shown in Table 3, where "-" indicates that the amount of data in the interval is too small to compute an accuracy.

Owing to network conditions, the data collected from the commercial network could not cover all conditions, and most networks in the collection area were configured in EFR mode to achieve relatively better performance, so fewer FR and HR data were available. Accordingly, the algorithm performs better under EFR mode because of the larger amount of training data. Table 4 shows the results for indicators R and RMSE under EFR mode.
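As an illustration of how the least-squares fit of Section 2.3.4 and the three indicators of Section 3.2 fit together, a minimal NumPy sketch follows. All function names are hypothetical, the significance tests (F-test, T-test) and residual analysis described in the paper are deliberately left out, and the relative error is taken as an absolute value, which is an assumption about the garbled Equation (7).

```python
import numpy as np

MOS_L, MOS_H = 1.02, 4.56  # working range of PESQ (MOS-LQO) given in the text


def fit_mlr(features, mos):
    """Least-squares fit of MOS = a0 + a1*x1 + ...; returns [a0, a1, ...]."""
    A = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(A, np.asarray(mos, dtype=float), rcond=None)
    return coef


def predict_mlr(coef, features):
    """Predicted MOS (RSQ) for each row of features."""
    return coef[0] + np.asarray(features, dtype=float) @ coef[1:]


def relative_error(actual, predicted):
    """Normalized relative error in percent, as in Equation (7)."""
    diff = np.abs(np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float))
    return diff / (MOS_H - MOS_L) * 100.0


def low_value_alarm_accuracy(actual, predicted):
    """M / N of Equation (8); returns None when no sample has actual MOS <= 2."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    low = (actual > MOS_L) & (actual <= 2.0)
    if not low.any():
        return None
    hit = low & (predicted > MOS_L) & (predicted <= 2.0 + 0.2)
    return hit.sum() / low.sum()


def pearson_r(actual, predicted):
    """Pearson correlation coefficient, as in Equation (9)."""
    q = np.asarray(actual, dtype=float) - np.mean(actual)
    y = np.asarray(predicted, dtype=float) - np.mean(predicted)
    return (q @ y) / np.sqrt((q @ q) * (y @ y))


def rmse(actual, predicted):
    """Root mean square error, as in Equation (10)."""
    diff = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean(diff ** 2))
```

In use, `fit_mlr` would be called on the training three quarters of one coding mode (principal components plus HO, HOP and DTX as feature columns), and the indicator functions would then be applied to `predict_mlr` outputs on both the training and the testing parts, segment by segment.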
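To show the performance of the algorithm intuitively, the scatter plot of actual RSQ against predicted RSQ under uplink EFR mode is given as an example in Figure 3.

Table 3. Accuracy of algorithm in indicator 1.

| Mode | Data | (1, 2] | (2, 3] | (3, 4.5] |
|---|---|---|---|---|
| Uplink EFR | training effect | 93% | 60% | 96% |
| Uplink EFR | testing effect | 95% | 58% | 94% |
| Downlink EFR | training effect | 67% | 65% | 92% |
| Downlink EFR | testing effect | 100% | 67% | 93% |
| Uplink FR | training effect | 71% | 54% | 88% |
| Uplink FR | testing effect | 100% | 56% | 88% |
| Downlink FR | training effect | - | 83% | 99% |
| Downlink FR | testing effect | - | 100% | 97% |
| Uplink HR | training effect | 67% | 73% | 98% |
| Uplink HR | testing effect | - | - | 98% |
| Downlink HR | training effect | - | 85% | 97% |
| Downlink HR | testing effect | - | 67% | 95% |

Columns give the actual PESQ value range.

Table 4. R and RMSE in EFR mode.

| Mode and data | Indicator | (1, 2] | (2, 3] | (3, 4.5] | Overall |
|---|---|---|---|---|---|
| Uplink EFR training results | R | 86% | 91% | 84% | 97% |
| Uplink EFR training results | RMSE | 0.19 | 0.27 | 0.17 | 0.19 |
| Uplink EFR testing results | R | 86% | 82% | 83% | 97% |
| Uplink EFR testing results | RMSE | 0.20 | 0.25 | 0.17 | 0.19 |
| Downlink EFR training results | R | 66% | 52% | 89% | 92% |
| Downlink EFR training results | RMSE | 0.50 | 0.28 | 0.19 | 0.23 |
| Downlink EFR testing results | R | 82% | 56% | 89% | 91% |
| Downlink EFR testing results | RMSE | 0.50 | 0.29 | 0.21 | 0.25 |

Columns give the actual PESQ value range.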
Figure 3. Distribution of Actual MOS and Predicted MOS for UP-EFR Mode.

In Figure 3, the abscissa indicates the RSQ predicted by the algorithm from radio link parameters, and the ordinate indicates the speech quality assessed by PESQ. The middle line is the 45° isoline, on which the predicted values and the actual values are equal. Two further lines, below and above the isoline, mark an absolute estimation error of 0.5.

For uplink EFR, the amount of data is adequate and the MOS values are distributed relatively evenly, that is, both the amount of data and its coverage of the MOS range (ergodicity) are better. Consequently, the overall correlation is greater than 90% and the RMSE is about 0.2, indicating that the measuring algorithm performs well. In addition, the training and testing accuracies are basically equal, showing good stability of the proposed algorithm. Meanwhile, the amount of data is much larger in the MOS interval (3, 4.5] than in (2, 3], and accordingly the relative error of the former is significantly smaller than that of the latter, from which we can see that the amount of training data has an important impact on the accuracy of the algorithm.

Under downlink EFR mode, the accuracy difference between training and testing is larger in the MOS interval (1, 2], because the amount of available data in this interval is smaller, leading to local instability of the algorithm. For FR and HR modes, because of less data and poor coverage of the MOS range, the data are still not sufficient to support effective training of the algorithm. In conclusion, when the amount of training data is adequate and the MOS values are evenly distributed, the algorithm provides high measuring accuracy.

4. Conclusion

This paper proposed a QoE measuring algorithm for voice service in GSM networks, taking as inputs radio link parameters that can be obtained from the mobile network in real time. Multiple regression and principal component analysis are combined in the modeling approach of QoE assessment. The method is especially convenient to integrate into the signaling monitoring platform of wireless networks.
Both the training and testing procedures of the algorithm use data collected from commercial GSM networks, and the results show that, with adequate valid data, the algorithm achieves high accuracy. Furthermore, the proposed QoE prediction method based on GSM networks can also be extended to other wireless networks such as Universal Mobile Telecommunications System (UMTS) and Long-Term Evolution (LTE).

REFERENCES

[1] ITU-T P.10/G.100, "Vocabulary and Effects of Transmission Parameters on Customer Opinion of Transmission Quality," 2008.
[2] ITU-T Recommendation P.800, "Methods for Subjective Determination of Transmission Quality," 1996.
[3] ITU-T Recommendation P.862, "Perceptual Evaluation of Speech Quality (PESQ): An Objective Method for End-to-End Speech Quality Assessment of Narrow-band Telephone Networks and Speech Codecs," 2001.
[4] Huawei Technologies Co., Ltd., "The Methods and Devices for the Estimation of Speech Quality," China Patent No. 200710172408.7, 2009.
[5] Ericsson Telefon AB-LM, "Speech Quality Measurement in Mobile Telecommunication Networks Based on Radio Link Parameters," US Patent No. 19970861563, 2000.
[6] Y. J. Zuo, "Perception of Voice, Win by Method Solutions for Voice Quality Assessment in TD-SCDMA by HUAWEI: VQI," Mobile Communications, Vol. 34, No. 3, 2010, pp. 30-31.
[7] ITU-T Recommendation P.862.3, "Application Guide for Objective Quality Measurement Based on Recommendations P.862, P.862.1 and P.862.2," 2007.
[8] M. Kantardzic (translated by S. Q. Shan, Y. Chen and Y. Cheng), "Data Mining," Tsinghua University Press, Beijing, 2003.
[9] ITU-T Recommendation P.862.1, "Mapping Function for Transforming P.862 Raw Result Scores to MOS-LQO," 2003.