Philips J. Res. 39, 94-102, 1984 R 1084

APPLICATIONS OF A DIGITAL AUDIO-SIGNAL PROCESSOR IN T.V. SETS

by W. J. W. KITZEN and P. M. BOERS
Philips Research Laboratories, 5600 JA Eindhoven, The Netherlands

Abstract

Digital audio-signal processors are already being used in professional applications (mixing consoles, artificial reverberation etc.). In this paper some applications of such a device in a consumer product, e.g. a television set, are discussed. Apart from well-known functions such as tone, volume and balance control, the following rather new functions can also be implemented using a digital processor:
- a pseudo-stereo circuit that can be used if the television set is equipped to reproduce stereophonic signals, but the program is transmitted in mono,
- a stereo base expanding circuit to be used when the two loudspeakers are closely spaced,
- a circuit for enhancing the impression of spaciousness,
- a volume control with combined loudness correction.

PACS number: 4388

1. Introduction

Until now, most of the applications of digital signal processors in the audio field have been professional applications. Because of their rather complex structures and the many components required, those processors were too expensive for consumer products. However, VLSI now makes it possible for consumer products to take advantage of digital processing. The advantages are:
- There is no signal distortion and no processing noise is added if the number of bits that represents the signals in the processor is large enough.
- A perfect delay line can be implemented digitally, which is impossible with analog means. For many applications a delay line is needed, e.g. a spaciousness-enhancing circuit.
- Quite complex processing can be implemented rather easily.

The implementation of a function in a digital processor is in fact a program that consists of instructions, coefficients and (data) RAM addresses.
By changing this program, a different function can be implemented, so many functions can be covered with a single processor. In this paper, after a short discussion of the structure of a digital processor, we discuss four applications that can be implemented and used in television sets.

Philips Journal of Research Vol. 39 No. 3 1984
2. Structure of a digital processor

The basic structure of the digital processor used for our experiments is depicted in fig. 1. Via the I/O ports, data from an external device such as an AD converter can be stored in the data RAM. The multiplier can multiply these data by coefficients from the coefficient RAM. The results of several multiplications can be added in the accumulator, which is in turn connected to the data RAM. Processed data are available for external devices such as a DA converter via the I/O ports.

Fig. 1. Basic structure of digital processor.

The processor is controlled by a program in the control, address and coefficient RAMs. Each instruction consists of a control word and, if necessary, a RAM address and a coefficient. A multiplication, for example, involves data at a particular RAM address, a coefficient in the coefficient RAM and a control word in the control RAM. Because we have used random access memories for coefficients as well as for addresses and control words, which can all be updated by a microprocessor via a control interface, the processor is very flexible. This has turned out to be very useful for experiments with different kinds of algorithms.

Of course, a digital processor has limitations. One of them is the size of the data RAM, which fixes the maximum delay time that can be implemented. The word size of the coefficients determines the accuracy of the filters implemented, and the word size of the data RAM determines the processing noise. The complexity of the applications depends, among other things, on the speed of the processor, i.e. the speed of the RAMs, ROMs and the multiplier. By using pipelining techniques the processor is speeded up.

3. Applications

Apart from well-known functions such as tone, volume and balance control, some rather new functions can also be implemented with a digital processor. In this paper we discuss four of them. The first function is a pseudo-stereo circuit.
At present few television programs are transmitted with stereophonic
sound. With a pseudo-stereo circuit it is possible to convert a monophonic input signal into pseudo-stereophonic output signals. Another function that can be used in a television receiver is a stereo base expanding circuit. In most television sets equipped to reproduce stereophonic signals the two loudspeakers are closely spaced, so the physical stereo base is rather small. By using a stereo base expanding circuit, the apparent stereo base can be widened. We also discuss a spaciousness-enhancing circuit with which it is possible to influence the listener's sense of spaciousness. Finally we describe a volume control with combined loudness correction.

3.1. A pseudo-stereo circuit

With monophonic reproduction the sound is localized in one direction. The objective of a pseudo-stereo circuit is that the sound should be perceived as if it arrives from many directions. Several psycho-acoustical phenomena related to directional hearing can be explained if we assume that our hearing system performs a short-time cross-correlation process on the two ear signals. A normalized binaural short-time cross-correlation function as defined by Blauert 1) is:

$$\Psi(t,\tau) = \frac{\int_{-\infty}^{t} l(\vartheta)\, r(\vartheta+\tau)\, G(t-\vartheta)\, d\vartheta}{\sqrt{\int_{-\infty}^{t} l^{2}(\vartheta)\, G(t-\vartheta)\, d\vartheta \int_{-\infty}^{t} r^{2}(\vartheta)\, G(t-\vartheta)\, d\vartheta}}$$

where $l(t)$ is the left signal, $r(t)$ is the right signal and $G(t)$ is a weighting function: $G(t) = \exp(-t/\tau_c)$. The time constant $\tau_c$ is, according to Blauert, certainly smaller than a few milliseconds. By determining the time $\tau$ for which the cross-correlation function is maximum, our hearing is able to localize the sound source. If the cross-correlation function is very small for all values of $\tau$, then, according to this theory, the sound cannot be localized and the sound image is broadened or diffusely localized. So in order to realize pseudo-stereo the binaural cross-correlation function should be small for all values of $\tau$.
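As a rough illustration of this criterion, the following Python sketch evaluates a discrete-time approximation of the normalized short-time cross-correlation for sampled signals. The function name, the sampling rate of 32 kHz, the default time constant of 1 ms and the truncation of the exponential window after ten time constants are all assumptions made for the example; none of them is taken from the paper.

```python
import math
import random


def short_time_xcorr(left, right, n, lag, fs=32000.0, tau_c=1e-3):
    """Discrete approximation of the normalized short-time cross-correlation.

    left, right : sequences of samples (the two signals)
    n           : sample index corresponding to the current time t
    lag         : lag tau, in samples (may be negative)
    fs, tau_c   : assumed sampling rate (Hz) and window time constant (s)
    """
    t_const = tau_c * fs                      # time constant in samples
    span = int(10 * t_const)                  # assumed truncation of the window
    num = 0.0
    energy_l = 0.0
    energy_r = 0.0
    for k in range(max(0, n - span), n + 1):
        w = math.exp(-(n - k) / t_const)      # weighting exp(-(t - theta)/tau_c)
        energy_l += left[k] ** 2 * w
        energy_r += right[k] ** 2 * w
        j = k + lag
        if 0 <= j < len(right):
            num += left[k] * right[j] * w
    den = math.sqrt(energy_l * energy_r)
    return num / den if den > 0.0 else 0.0


# Illustrative input: two independent noise signals (assumed test data).
rng = random.Random(0)
noise_l = [rng.gauss(0.0, 1.0) for _ in range(2000)]
noise_r = [rng.gauss(0.0, 1.0) for _ in range(2000)]

same = short_time_xcorr(noise_l, noise_l, n=1500, lag=0)
indep = short_time_xcorr(noise_l, noise_r, n=1500, lag=0, tau_c=5e-3)
```

For identical signals at zero lag the value is 1, which corresponds to a sharply localized source. For independent signals the magnitude stays well below 1, which is the condition a pseudo-stereo circuit tries to approach for every lag.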
If a set-up with two loudspeakers is used, a prerequisite for a small binaural cross-correlation is a small cross-correlation between the two loudspeaker signals. This can be achieved by realizing two impulse responses (from input to the left loudspeaker and from input to the right loudspeaker) having little correlation. There are many practical ways of doing this. Because the localization of sounds is determined by
about the first 5 ms of the impulse responses, the actual impulse responses need not be longer than 5 ms. Figure 2 shows a solution proposed by Lauridsen 2) using two complementary comb filters. The left and right output signals in fig. 2 have little correlation since both the phase and the amplitude characteristics of the left and right impulse responses are different. However, because the amplitude-frequency responses are not flat, these filters may introduce colouration.

Fig. 2. Pseudo-stereo circuit (see ref. 2).

A solution given by Schroeder 3), where the filters in the left and right channels are both allpass filters but have different phase characteristics, is shown in fig. 3.

Fig. 3. Pseudo-stereo circuit (see ref. 3).

Because of the different phase characteristics, a frequency-dependent envelope delay difference is introduced between the left and right channels. As explained by Schroeder, this results in localization of some frequency
components at the position of the left loudspeaker and other components at the right loudspeaker. However, if the time delay $\tau$ is chosen too large, allpass filters may also introduce audible colouration. If we consider a short-time Fourier transform (integration over a finite time interval) of the impulse response, such a filter is not allpass at all, and, as we have seen before, the integration time of our hearing is only a few milliseconds.

3.2. Stereo base expanding

Stereo sound reproduction in a stereo set-up may create virtual sources at every position between the loudspeakers, i.e. on the stereo base. If the loudspeakers are closely spaced, the small stereo base can be widened by applying delayed crosstalk in antiphase between the two channels, as has been explained recently 4).

Fig. 4. Stereo base expanding circuit.

Suppose we apply a signal only at the right input (fig. 4). If no crosstalk is applied, a listener in the plane of symmetry localizes the source at the right loudspeaker position. If undelayed crosstalk in antiphase ($\tau = 0$) is applied, the virtual source will shift to the right. This is caused by the increased interaural phase and group delays 4). Thus, the right loudspeaker in fig. 4 is replaced by a virtual one shifted to the right and the left loudspeaker by a virtual one shifted
to the left, i.e. the stereo base has been widened. Adding a small delay of about 0.1 ms in the crosstalk circuit will make the effect more pronounced. This is brought about by interaural level differences which, for a listener in the plane of symmetry, reinforce the effect of the interaural time delay differences.

A disadvantage of the crosstalk in fig. 4 is that, in the low-frequency range, the input signals $R_i$ and $L_i$ will be attenuated when they are completely correlated, whereas they will be amplified when they are uncorrelated. With crosstalk factor $a$ and crosstalk delay $\tau$, in the case of correlated signals ($L_i = R_i$) the output signals are:

$$|L_o|^2 = |R_o|^2 \approx (1 - a)^2\, |L_i|^2 \quad \text{for } \omega\tau \ll \pi/2.$$

When $L_i$ and $R_i$ are uncorrelated and $|L_i|^2 = |R_i|^2$:

$$|L_o|^2 = |R_o|^2 = (1 + a^2)\, |L_i|^2.$$

In the case of normal stereo signals, the left and right signals are highly correlated in the low-frequency range and uncorrelated at high frequencies. The attenuation of the low frequencies relative to the high frequencies is therefore:

$$10 \log_{10} \frac{1 + a^2}{(1 - a)^2} \ \text{dB}.$$

For $a = 0.7$ this is about 12 dB. This can be corrected by using a bass-boost filter before the crosstalk processing.

3.3. A spaciousness-enhancing circuit

In a real room, e.g. a concert hall, the impression of spaciousness is caused by the early reflections via the side walls that arrive within 100 ms 5). Because of these reflections, signals with little correlation reach the ears of a listener. To enhance the impression of spaciousness in an artificial way, therefore, uncorrelated delayed signals should be added to the original signals. The signals should be delayed because we do not want to influence the localization of the sounds. Here too, then, as in the pseudo-stereo circuit, we must realize two impulse responses that are uncorrelated, but in this case the impulse responses should be longer (about 50 ms). A circuit with which this is possible is shown in fig. 5a. In order to produce many reflections, a recursive filter is used. Each channel consists of a delay line with a feedback loop.
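A minimal Python sketch of one such channel follows, assuming the simplest form of a delay line with a feedback loop, y[n] = x[n - D] + g·y[n - D]. This is one common arrangement and an assumption, not necessarily the exact topology of fig. 5a. The function name and the delay and feedback values are hypothetical and chosen only to keep the impulse response easy to read.

```python
def feedback_delay(x, delay, feedback):
    """Delay line with a feedback loop (assumed form):

        y[n] = x[n - delay] + feedback * y[n - delay]

    x        : input samples
    delay    : delay D in samples (at least 1)
    feedback : feedback factor g, |g| < 1 so that the reflections decay
    """
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    if abs(feedback) >= 1.0:
        raise ValueError("feedback magnitude must be below 1 for stability")
    y = [0.0] * len(x)
    for n in range(delay, len(x)):
        y[n] = x[n - delay] + feedback * y[n - delay]
    return y


# Impulse response for an illustrative delay of 3 samples and feedback 0.5:
impulse = [1.0] + [0.0] * 11
response = feedback_delay(impulse, delay=3, feedback=0.5)
# Reflections appear at n = 3, 6, 9 with amplitudes 1.0, 0.5, 0.25.
```

Each pass around the loop produces one further reflection, scaled by the feedback factor, so a single delay line yields a long series of decaying reflections. At a realistic sampling rate the delay would be many hundreds of samples rather than 3.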
The
output of that structure is added to the original input signal. So the impulse responses from the input to the outputs consist of an impulse at $t = 0$, which represents the direct sound, and a large number of reflections. By choosing different delays and feedback factors in the left and right channels, the correlation of the impulse response of the left channel with the impulse response of the right channel is made small. By changing the amplitude of the uncorrelated signals to be added to the original signals (variable $\beta$), the impression of spaciousness can be adjusted.

Fig. 5a. Spaciousness-enhancing circuit.

The comb filter effect in the circuit may produce colouration. This effect can be reduced by making the feedback factor $\alpha$ frequency dependent (fig. 5b).

Fig. 5b. Frequency dependency of feedback factor $\alpha$.

As a consequence the length of the impulse responses for frequencies higher than about 1500 Hz is reduced. However this
does not affect the impression of spaciousness because this impression, according to Barron, is mainly determined by the low frequencies 5). Because the localization of the sources depends only on the first arriving sound, i.e. the first 5 ms of the impulse responses, it is unaffected by the circuit when $\tau_{l,r} > 5$ ms. If the input signal is to be reproduced by two loudspeakers that are closely spaced, the decorrelated signals can also be reproduced via an expanded stereo circuit.

3.4. Volume control combined with loudness-correction

Loudness is the subjective quantity by which we measure the perceived sound pressure level. The way loudness increases with sound pressure level strongly depends on frequency for frequencies below about 400 Hz 6). As follows from the equal-loudness contours (fig. 6), the reproduction of music at a level below the original level will lead to a decrease in loudness of the low frequencies relative to the high frequencies, i.e. a change in timbre.

Fig. 6. Equal-loudness contours as a function of frequency (Hz) (see ref. 6).

To compensate for this effect a simple first-order filter can be implemented which enhances the low frequencies more the lower the setting of the volume control (fig. 7). Here the output level of 0 dB corresponds to the level in the studio mixing room. We assume that the timbre at this level is optimal without any correction.
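The idea of a volume control with a built-in first-order bass boost can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not specify the filter, so the one-pole lowpass structure, the 400 Hz corner, the 32 kHz sampling rate and the rule of 0.5 dB of bass boost per dB of attenuation are all hypothetical choices, as is the function name.

```python
import math


def loudness_volume(x, volume_db, fs=32000.0, corner_hz=400.0, slope=0.5):
    """Volume control with a volume-dependent first-order bass boost.

    x         : input samples
    volume_db : volume setting in dB (0 dB = reference level, no correction)
    fs        : assumed sampling rate in Hz
    corner_hz : assumed corner frequency of the first-order lowpass
    slope     : assumed dB of bass boost per dB of attenuation
    """
    gain = 10.0 ** (volume_db / 20.0)              # broadband volume gain
    boost_db = max(0.0, -slope * volume_db)        # more boost at lower volume
    extra = 10.0 ** (boost_db / 20.0) - 1.0        # added low-frequency share
    a = math.exp(-2.0 * math.pi * corner_hz / fs)  # one-pole lowpass coefficient
    lowpassed = 0.0
    out = []
    for sample in x:
        lowpassed = (1.0 - a) * sample + a * lowpassed
        out.append(gain * (sample + extra * lowpassed))
    return out


# At the reference setting the signal passes unchanged.
unchanged = loudness_volume([0.5, -0.25, 1.0], volume_db=0.0)

# At -20 dB a constant (very low frequency) input settles near -10 dB,
# because the assumed rule adds 10 dB of bass boost at this setting.
low_freq = loudness_volume([1.0] * 2000, volume_db=-20.0)
```

With these assumed values, high frequencies follow the volume setting almost exactly while low frequencies are attenuated less, which is the qualitative behaviour the text describes for fig. 7.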
Fig. 7. Volume control combined with loudness-correction (output level in dB versus frequency, for volume settings at maximum, -10 dB and -20 dB).

4. Conclusions

We have discussed some applications of a digital signal processor in a T.V. set. One of the advantages of our digital processor, because of its flexibility, is the ease with which one can experiment with different algorithms and compare their qualities. If a single digital processor is to do all the processing for tone and balance control as well as for pseudo-stereo, stereo base expanding, spaciousness and volume control with combined loudness correction, then its speed should be about 100 instructions per sample period. The word length of the data depends on the desired S/N ratio, the word length of the coefficients on the desired accuracy of the filters. With the VLSI technology of today it is possible to integrate such a processor on a single chip. Such a chip will constitute a powerful building block which performs all kinds of processing, not only for applications in a television set but also for other sound reproducing systems.

REFERENCES

1) J. Blauert, Räumliches Hören, S. Hirzel Verlag, Stuttgart, 1974; English edition: Spatial Hearing, MIT Press, Boston, Mass., 1983; cf. 278.
2) H. Lauridsen and F. Schleger, Gravesaner Blätter, H. 5, 27 (1956).
3) M. R. Schroeder, J. Acoust. Soc. Am. 33, 1061 (1961).
4) P. M. Boers, AES 73rd Conv., Eindhoven, The Netherlands, No. 1967 (A5) (March 1983).
5) M. F. E. Barron, The effects of early reflections on the subjective quality in concert halls, Doctor's thesis, University of Southampton (1974).
6) D. W. Robinson and R. S. Dadson, Brit. J. Appl. Phys. 7, 156 (1956). ISO Recommendation R226-1961.