Quarterly Progress and Status Report. Acoustic properties of the Rothenberg mask


Dept. for Speech, Music and Hearing
Quarterly Progress and Status Report
Acoustic properties of the Rothenberg mask
Hertegård, S. and Gauffin, J.
journal: STL-QPSR
volume: 33
number: 2-3
year: 1992
pages:

STL-QPSR 2-3/1992

ACOUSTIC PROPERTIES OF THE ROTHENBERG MASK

Stellan Hertegård* & Jan Gauffin

Abstract

The flow response and possible distortion from the Rothenberg mask system on radiated speech were studied by means of sweep-tone measurements. The flow frequency response was flat within 3 dB up to 1.6 kHz. This frequency range for the speech signal radiated through the mask normally includes the lowest two formants for open vowels and is probably sufficient to describe most aspects of the glottal flow waveform for untrained normal and pathological voices. A pronounced zero between 1.8 and 2 kHz was found. This restricts the use of the mask systems tested here for airflow measurements at higher frequencies. It was shown that this zero could be moved up in frequency, increasing the useful frequency response to nearly 3 kHz. We suggest that the zero is caused by acoustic shunting of the nasal part of the mask. A modified mask design, with wire screen placed in the nasal part as well, might substantially improve the frequency response.

INTRODUCTION

Mean airflow at the lips is often measured in order to assess vocal function. If the subglottal pressure during phonation is also measured, the glottal resistance during phonation can be calculated as the quotient between pressure and flow (Isshiki, 1964; Schutte, 1980). This gives an estimate of the mean "closing" ability of the glottis. Pneumotachograph masks are frequently used for measuring mean airflow during phonation of a vowel. The disadvantage of calculating glottal resistance from mean airflow is that it is impossible to study each individual glottal cycle and measure the absolute airflow during the closed phase. Fig. 1 illustrates that a certain mean airflow can be produced by means of a high modulated transglottal airflow and zero airflow during the closed phase, indicating complete glottal closure.
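As a minimal sketch of the quotient mentioned above (pressure divided by flow), the following Python function computes a mean glottal resistance. The function name, the units and the numerical values are illustrative assumptions and are not taken from this report.

```python
def glottal_resistance(subglottal_pressure_cmh2o, mean_flow_l_per_s):
    """Return mean glottal resistance in cm H2O/(l/s) as pressure / flow."""
    if mean_flow_l_per_s <= 0:
        raise ValueError("mean airflow must be positive")
    return subglottal_pressure_cmh2o / mean_flow_l_per_s


# Hypothetical example values (assumptions, not measurements from the report).
resistance = glottal_resistance(8.0, 0.2)
print(f"{resistance:.1f} cm H2O/(l/s)")
```

Because only mean values enter the quotient, the result carries no information about the flow within an individual glottal cycle.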
The same mean airflow could also result from a smaller modulated airflow and a constant airflow (waveform offset) even during the presumed closed phase, indicating insufficient glottal closure**. These two alternatives would not result in the same voice quality, but they cannot be distinguished from measurements of the mean airflow alone.

The circumferentially vented wire screen mask designed by Rothenberg (Fig. 2) has some special properties that distinguish it from many other pneumotachograph masks used for measuring mean airflow at the lips (Rothenberg, 1973, 1977). In the Rothenberg mask a number of holes have been drilled around the circumference close to the mouth. These are covered with a fine-meshed single- or double-layered steel wire screen that produces an acoustic resistance. A wide-frequency differential pressure transducer is used as a microphone. It has one input on the inside of the wire screen and another on the outside, so it measures the pressure drop across the screen and thereby provides an estimate of the flow. The purpose of the design is to reduce the effect of the mask on the acoustic resonances of the vocal tract and also to increase the frequency response of the mask system. If the airflow at the lips is properly inverse filtered, each glottal pulse during phonation can be studied in more detail and the absolute transglottal airflow during the open and closed phases can be measured after calibration. The response time and acoustic properties of the mask system have been studied by Rothenberg (1973; 1977).

* Dept. of Logopedics and Phoniatrics, Huddinge University Hospital, Karolinska Institute, Stockholm.
** A small amount of waveform offset during the closed phase could be caused by vertical movements of the vocal folds, as previously discussed in Hertegård, Gauffin, & Karlsson, 1992.

Fig. 1. Mean airflow and transglottal airflow (plotted against time) for two types of phonation: A, with a high modulated flow amplitude and zero airflow during the closed phase; B, with a small amount of modulated airflow and a constant waveform offset.

Results from studies of the upper frequency limit of the flow response of the mask system, and an assessment of the possible distortion of the radiated speech sound caused by the mask, were presented in a previous paper (Badin, Hertegård, & Karlsson, 1990). In the present study complementary experiments are described and their implications for clinical use of the Rothenberg mask system are discussed.

Fig. 2. Schematic drawing of the Rothenberg mask, showing the metallic wire screen covering the holes.

METHODS

The flow response of the mask system was tested by sweep-tone analysis. Fig. 3 shows the experimental set-up. The mask was connected via a ceramic adapter provided by Glottal Enterprises, Syracuse, USA, and placed on a wooden plate with a central 5 cm hole, on top of a small loudspeaker driven by a Hewlett Packard Dynamic Signal Analyser, type 3562A. The Dynamic Signal Analyser produced a constant-amplitude frequency sweep. The loudspeaker and adapter simulated a short vocal tract with its first formant at approximately 3 kHz. The output of the mask

electronics was recorded by the Dynamic Signal Analyser. The microphone in the mask system tested was a differential pressure transducer (type MTW). The mask electronics in the experiment were of type MSIF-2, which has a built-in 3 kHz low-pass filter. The masks tested were a larger double layer mask (type MA-2N) with a resistance of approximately 0.5 cm H2O/(l/s), a smaller double layer mask (type MA-2S), and a single layer wire screen mask (type MA-1N) with a resistance of about 0.25 cm H2O/(l/s). All these items were manufactured by Glottal Enterprises, Syracuse, USA, the maker of the Rothenberg mask systems.

Fig. 3. The experimental set-up for the sweep-tone measurements. (Labelled parts: Rothenberg mask on ceramic adapter, mask preamplifier, wooden plate, loudspeaker, sweep input and Dynamic Signal Analyser.)

The frequency sweep provided by the loudspeaker was first tested with a free field sweep. This was measured with a Brüel & Kjær (BK) 1/2" microphone, whose frequency response is flat to well above 5 kHz, and the Dynamic Signal Analyser. Fig. 4 shows that the sweep provided by the loudspeaker was linear within 3 dB from 200 Hz to approximately 3 kHz. Thus, it can be concluded that the loudspeaker provided a sinusoidal volume flow with an amplitude that decreased with a constant slope of 6 dB/octave from 200 Hz to close to 3 kHz.

RESULTS

Sweep-tone analysis was carried out for each of the three masks as described above. Fig. 5 shows the result of the sweep for the larger double layer mask, type MA-2N (with the MTW transducer and the MSIF-2 preamplifier). A -6 dB/octave line was drawn in the figure. The response of this mask system was linear within 3 dB up to 1.6 kHz. At around 1.9 kHz there was a pronounced dip of about 15 dB below the -6 dB/octave line. Slightly below 3 kHz there was a peak. The level fell for frequencies above 3 kHz, probably due to the low-pass filter of the mask electronics. There was also a small dip around 400 to 500 Hz.
Vibrations could be felt in the measuring equipment during the sweep around these frequencies, indicating that this dip was caused by a mechanical resonance in the measuring equipment itself. This small dip was also observed during the sweep-tone measurements for the other masks tested. Fig. 6 shows the results of the sweep-tone measurements for all three masks described above. All three mask systems had a linear frequency response within 3 dB up to kHz. They all had a zero around 2 kHz and a peak near 3 kHz.
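The comparison against a -6 dB/octave line described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data points are invented to mimic the reported behaviour (a dip of about 15 dB near 1.9 kHz), the reference line is anchored at the first measured point, and all function names are hypothetical. It is not the analysis procedure of the Dynamic Signal Analyser.

```python
import math


def reference_level_db(freq_hz, anchor_freq_hz, anchor_level_db,
                       slope_db_per_octave=-6.0):
    """Level of a straight line (in dB versus log frequency) through the anchor."""
    octaves = math.log2(freq_hz / anchor_freq_hz)
    return anchor_level_db + slope_db_per_octave * octaves


def deviations_from_line(freqs_hz, levels_db, slope_db_per_octave=-6.0):
    """Deviation of each measured level from a line anchored at the first point."""
    anchor_f, anchor_l = freqs_hz[0], levels_db[0]
    return [level - reference_level_db(f, anchor_f, anchor_l, slope_db_per_octave)
            for f, level in zip(freqs_hz, levels_db)]


def highest_frequency_within(freqs_hz, deviations_db, tolerance_db=3.0):
    """Highest frequency reached before the deviation first exceeds the tolerance."""
    limit = None
    for f, dev in zip(freqs_hz, deviations_db):
        if abs(dev) > tolerance_db:
            break
        limit = f
    return limit


def deepest_dip(freqs_hz, deviations_db):
    """Frequency and depth of the most negative deviation from the line."""
    index = min(range(len(deviations_db)), key=lambda i: deviations_db[i])
    return freqs_hz[index], deviations_db[index]


# Hypothetical sweep data (assumed values, not read from Fig. 5 or Fig. 6).
freqs = [200.0, 400.0, 800.0, 1600.0, 1900.0, 3000.0]
assumed_deviations = [0.0, -1.0, 0.5, -2.0, -15.0, 4.0]
levels = [reference_level_db(f, 200.0, 0.0) + d
          for f, d in zip(freqs, assumed_deviations)]

devs = deviations_from_line(freqs, levels)
limit = highest_frequency_within(freqs, devs)
dip_freq, dip_depth = deepest_dip(freqs, devs)
print(limit, dip_freq, round(dip_depth, 1))
```

Anchoring the line at the lowest frequency is a simplification; a fitted line over the low-frequency range would be a reasonable alternative.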

Fig. 4. A free field sweep registered with a Brüel & Kjær 1/2" microphone held over the loudspeaker with no mask present. The ripple in the curve is due to noise in the experimental room. (Horizontal axis: Log Hz, from 200 Hz to 5 kHz.)

Fig. 5. The frequency response from a sweep with the equipment as described in Fig. 3. A larger double layer mask (type MA-2N) was used. A -6 dB/octave line was superimposed for comparison.

Fig. 7 shows a sweep with the MTW transducer without a mask in free field, approximately 2 cm above the ceramic adapter and the loudspeaker, and another sweep with the transducer held 2 cm above the larger double layer MA-2N mask (which was held firmly to the ceramic adapter). The sound level was generally lower with the mask present. The free field recording without the mask did not show any dip near 2 kHz, whereas a dip was present near 2 kHz with the mask. There was a peak near 3 kHz in both recordings, probably caused by the first "formant" in the acoustic system, as mentioned above. For Fig. 8, similar sweeps were made with a 1/2" BK microphone in free field above the loudspeaker and adapter, without the mask and with the mask held firmly to the adapter. A dip was present near 2 kHz with the mask, but not without it.

Fig. 8. The frequency response from a free field sweep registered with a 1/2" Brüel & Kjær microphone held above the MA-2N mask (solid line) and without the mask (dashed line). (Horizontal axis: Log Hz, from 200 Hz to 5 kHz.)

Further analysis of the zero near 2 kHz

In order to investigate whether the zero near 2 kHz was caused by a mechanical resonance in the mask or microphone, we excited the mask (type MA-2N) and microphone with a mechanical pulse and measured the mechanical vibrations at different points. The pulse was produced by a small hammer fitted with an accelerometer (PCB Impulse force hammer type 086M37) connected to the HP Dynamic Signal Analyser. This revealed a mechanical resonance in the connector wire holder of the microphone at approximately 1.9 kHz. This resonance was effectively dampened by a piece of plasticine. By repeating the sweep-tone measurements we could conclude that this mechanical resonance had only a marginal effect on the zero. Since no mechanical resonance was causing the dip, we suspected a cross resonance or a shunt in the mask itself. By filling out the nasal part of the mask with

plasticine we could move the zero at 2 kHz upwards in frequency to over 3 kHz (Fig. 9).

Fig. 9. The frequency response from sweep-tone measurements using the equipment as described in Fig. 3, including the MA-2N mask (solid line) and with plasticine dampening in the nasal part of the mask (dotted line).

DISCUSSION

The acoustic properties of the mask system have been previously studied by Rothenberg (1973; 1977). He reported the mask to be linear for a static airflow from zero up to well above 1 l/s, which is satisfactory for most clinical conditions. He also found that for open vowels, such as /a/ and /a/, the lowest two formants (which are most important for the shape of the waveform during inverse filtering) were lowered by Hz due to the increase in effective vocal tract length from the mask. The transmission characteristics of the mask were found to be linear within 6 dB for speech range frequencies, except for a pronounced dip at kHz. This dip probably corresponds to the dip found around 1.7 kHz in our previous study (Badin, et al., 1990) and between 1.8 and 2 kHz in the present report.

Badin, et al. (1990) studied the effect of the mask on radiated speech sound pressure by means of LTAS analysis of speech samples with and without the mask. This analysis showed a lower sound pressure level around 2 kHz in recordings with the mask than in those without. This indicates that the zero near 2 kHz exists both in the speech measurements and in the sweep-tone measurements with the mask. None of the free field measurements without the mask showed any zero. This indicates that the zero was caused by a cross resonance or a shunt in the mask itself. In our previous study it was shown that the frequency of the zero varied somewhat for different dampening foam settings in the mask microphone and the mask (Badin, et

al., 1990), but the dip could not be eliminated. Pressing the mask with varying force to the adapter or face did not significantly affect the dip. As described in this paper, a marginal dampening of the zero resulted from dampening a small cavity in the transparent plastic microphone adapter. From the present experiments it is also apparent that the measurement equipment itself was not responsible for the dip. We have also made some tests of a prototype of a new pressure transducer provided by Glottal Enterprises (type PTW). Those sweep-tone measurements did not give a better frequency response, and the zero mentioned above was also present with that transducer. We conclude that the zero near 2 kHz, which limits the response of the mask system tested here, is due to acoustic shunting in the mask. By filling out the nasal part of the mask the dip was moved up in frequency, resulting in a substantially improved frequency response.

Implications for clinical use of the mask system tested

In our previous study (Badin, et al., 1990) we concluded that the frequency response of the mask system tested was essentially flat up to 1 kHz. However, after additional testing of different masks it seems that the response of the mask systems tested here was flat within 3 dB up to 1.6 kHz. This includes harmonics for a male speaker with a normal fundamental frequency around 120 Hz. For females it would include 7 harmonics in the normal speaking range. The frequency range also includes the first and second formants for open vowels, such as /a/, used in speech samples by most researchers who have published results from studies with the Rothenberg mask (Gauffin & Sundberg, 1989; Hertegård & Gauffin, 1991; Hertegård, Gauffin, & Karlsson, 1992; Holmberg, Hillman, & Perkell, 1988; Karlsson, 1992; Lofquist, 1992; Rothenberg, 1973; 1977). This frequency range is probably sufficient to describe the most important aspects of the glottal waveform.
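The number of harmonics that fall inside the flat range follows from simple arithmetic: it is the largest integer multiple of the fundamental frequency that does not exceed the upper limit. A minimal Python sketch is given below. The 1.6 kHz limit and the 120 Hz male fundamental are taken from the text; the female values of 200 Hz and 220 Hz are illustrative assumptions, and the function name is hypothetical.

```python
def harmonics_within(f0_hz, upper_limit_hz=1600.0):
    """Count harmonics (fundamental included) at or below the upper limit."""
    if f0_hz <= 0:
        raise ValueError("fundamental frequency must be positive")
    return int(upper_limit_hz // f0_hz)


# 120 Hz is quoted in the text; 200 Hz and 220 Hz are assumed examples.
for f0 in (120.0, 200.0, 220.0):
    print(f"F0 = {f0:.0f} Hz: {harmonics_within(f0)} harmonics up to 1.6 kHz")
```

The count is sensitive to the exact fundamental frequency, so figures quoted for a speaking range should be read as approximate.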
Normally, closed vowels (such as /i/) and nasalized vowels are avoided in speech samples because of difficulties in performing a proper inverse filtering, regardless of whether a mask recording or an ordinary pressure microphone recording is used. The level of the higher harmonics is usually lower for pathological voices, which often have insufficient vocal fold closure. In these cases the transglottal waveform seldom seems to be affected by higher harmonics.

Fig. 10. The Rothenberg mask. The arrow is pointing to the shunting cavity.

On the other hand, for trained voices, such as that of a singer with a prominent so-called singer's formant, harmonics near 3 kHz may influence the shape of the glottal velocity waveform. In these cases results from studies using the mask systems tested here must be evaluated with caution if details of the waveform are described. Glottal waveform parameters used in voice synthesis, such as in the so-called Liljencrants-Fant (LF) glottal model, are dependent on a correct frequency response for higher frequencies (Fant, Liljencrants, & Lin, 1985).

If data are collected for such a model, an ordinary pressure microphone recording or special mask systems (Rothenberg, 1987) are probably preferable. The improved frequency response with the nasal part of the mask filled out by plasticine indicates that some modifications in mask design might substantially improve the response. The placement of holes with wire screen in the nasal part of the mask as well might have the same effect (Fig. 10).

CONCLUSION

The present experiments indicate that the tested mask systems are linear within 3 dB from zero airflow to around 1.6 kHz. For a male voice with a fundamental frequency of 120 Hz during phonation this includes harmonics, and for an open vowel like /a/ it also includes both the first and second formants, which are the most important for the waveform. The same will be true for female voices with fundamental frequencies around 200 Hz. For pathological voices the level of the higher harmonics is often lowered due to insufficient vocal fold closure. This means that the frequency response of the mask seems sufficient for measurements on most patients with voice problems. However, if details of the waveform are studied from mask recordings made on subjects with trained voices and with more prominent higher harmonics, the results must be evaluated with caution. A zero in the frequency response near 2 kHz seems to be caused mainly by the shunting of the nasal part of the mask. This dip could be moved up in frequency by filling out the nasal part of the mask, extending the response to around 3 kHz. A modification in the mask design with wire screen placement in the nasal part might have a similar effect.

ACKNOWLEDGEMENT

We would like to express our thanks to Erik Jansson for assistance and valuable advice during the experiments.

REFERENCES

Badin, P., Hertegård, S., & Karlsson, I. (1990): "Notes on the Rothenberg mask," STL-QPSR No. 1, pp 1-7.

Fant, G., Liljencrants, J., & Lin, Q. (1985): "A four parameter model of glottal flow," STL-QPSR No. 4, pp

Gauffin, J. & Sundberg, J. (1989): "Spectral correlates of glottal voice source waveform characteristics," J.Speech & Hear.Res. 32, pp

Hertegård, S. & Gauffin, J. (1991): "Insufficient vocal fold closure as studied by inverse filtering," pp in (J. Gauffin & B. Hammarberg, eds.), Vocal Fold Physiology: Acoustic, Perceptual and Physiological Aspects of Voice Mechanism, Singular Publ. Group, Inc., San Diego, CA.

Hertegård, S., Gauffin, J., & Karlsson, I. (1992): "Physiological correlates of the inverse filtered flow waveform," J. Voice 6:3, pp

Holmberg, E., Hillman, R., & Perkell, J. (1988): "Glottal airflow and transglottal air pressure measurements for male and female speakers in soft, normal and loud voice," J.Acoust.Soc.Am. 84, pp

Isshiki, N. (1964): "Regulatory mechanism of voice intensity variation," J.Speech & Hear.Res. 7, pp

Karlsson, I. (1992): Analysis and Synthesis of Different Voices with Emphasis on Female Speech, Diss., Dept. of Speech Communication and Music Acoustics, KTH, Stockholm.

Lofquist, A. (1991): "Inverse filtering as a tool in voice research and therapy," Scand. J. Logopedics and Phoniatrics 16, pp

Rothenberg, M. (1973): "A new inverse-filtering technique for deriving the glottal airflow waveform during voicing," J.Acoust.Soc.Am. 53, pp

Rothenberg, M. (1977): "Measurement of airflow in speech," J.Speech & Hear.Res. 20, pp

Rothenberg, M. (1987): "Cosi fan tutte and what it means or Nonlinear source-tract acoustic interaction in the soprano voice and some implications for the definition of vocal efficiency," pp in (T.B. Baer, C. Sasaki, & K.S. Harris, eds.), Laryngeal Function in Phonation and Respiration (Proc. Vocal Fold Physiology Conf. 1985), Singular Publ. Group, Inc., San Diego, CA.

Schutte, H. (1980): "The efficiency of voice production," Groningen (issued by the author).
