Auditory Theory: Acoustics
Lecture 016 Instruments II
Reading Assignment for Lecture 017
Before next lecture please read Sections
- 4.5 The speaking and singing voice
pages 198 to 208 of Acoustics and Psychoacoustics. We may have a brief quiz on these sections at the beginning of the next class.
Brain Bullets 
Sound source in singing
- The sound source in singing is the acoustic result of the vocal folds vibrating in the larynx, which is sustained by air flowing from the lungs. The sound modifiers in singing are the spaces between the larynx and the lips and nostrils, known as the 'vocal tract', which can be changed in shape and size by moving the 'articulators', for example the jaw, tongue and lips.
- As we sing or speak, the shape of the vocal tract is continually changing to produce different sounds. The soft palate acts as a valve to shut off and open the nasal cavity (nose) from the airstream.
- Vocal fold vibration in a healthy larynx is a cyclic sequence in which the vocal folds close and open regularly when a note is being sung. Thus the vocal folds of a soprano singing A4 (f0 = 440.0 Hz) will complete this closing and opening sequence 440 times a second. Singers have two methods by which they can change the f0 of vocal fold vibration: altering the stiffness of the folds themselves by changing the tension of the fold muscle tissue, or altering the vibrating mass by supporting an equal portion of each fold in an immobile position. Adjusting the physical properties of the folds themselves allows many trained singers to sing over a pitch range of well over two octaves.
- The frequency spectrum of the regular pressure pulses generated by the vibrating vocal folds during speech and singing consists of all harmonics, with an amplitude change on average of -12 dB per octave rise in frequency (see the illustration on the right in Figure 4.32). Thus for every doubling in frequency, equivalent to an increase of one octave, the amplitude reduces by 12 dB. The amplitudes of the first, second, fourth and eighth harmonics (which are separated by octaves) in the figure illustrate this effect.
- The shape of the acoustic excitation spectrum remains essentially constant while singing, although the amplitude change of -12 dB per octave is varied for artistic effect, singing style and to aid voice projection by professional singers (e.g. Sundberg, 1987). The spacing between the harmonics will change as different notes are sung, and Figure 4.34 shows three input spectra for sung notes an octave apart. Trained singers, particularly those with Western operatic voices, exhibit an effect known as 'vibrato' in which their f0 is varied at a rate of approximately 5.5-7.5 Hz with a range of between ±0.5 and ± semitones.
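The -12 dB per octave source slope and the vibrato extent described above can be put into numbers with a short sketch. The function names and the choice of eight harmonics are illustrative assumptions, and the slope is treated as exactly -12 dB per octave even though, as noted, singers vary it in practice.

```python
import math


def harmonic_level_db(n, slope_db_per_octave=-12.0):
    """Level of the n-th harmonic relative to the first harmonic, in dB.

    Harmonic n lies log2(n) octaves above the fundamental, so an idealised
    constant slope gives slope * log2(n).
    """
    return slope_db_per_octave * math.log2(n)


def source_spectrum(f0_hz, n_harmonics=8):
    """List of (frequency in Hz, relative level in dB) for the first harmonics."""
    return [(n * f0_hz, harmonic_level_db(n)) for n in range(1, n_harmonics + 1)]


def semitone_shift_hz(f0_hz, semitones):
    """Frequency reached when f0 is shifted by a number of equal-tempered semitones."""
    return f0_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # Soprano A4 from the text: f0 = 440 Hz, one vocal fold cycle every 1/440 s.
    f0 = 440.0
    print(f"cycle period: {1000.0 / f0:.3f} ms")
    for freq, level in source_spectrum(f0):
        print(f"{freq:7.1f} Hz  {level:6.1f} dB")
    # Vibrato extent at the lower end of the quoted range (+/- 0.5 semitone).
    print(f"vibrato span: {semitone_shift_hz(f0, -0.5):.1f} "
          f"to {semitone_shift_hz(f0, 0.5):.1f} Hz")
```

The first, second, fourth and eighth harmonics come out 12 dB apart, which is the octave spacing the text points to in Figure 4.32.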
Sound modifiers in singing
- The regular series of pulses from the vibrating vocal folds is modified by the acoustic properties of the vocal tract (see Figure 4.21). In acoustic terms, the vocal tract can be considered as a stopped tube (closed at the larynx, which operates as a flow-controlled reed, and open at the lips) which is approximately 17.5 cm in length for an adult male. When the vowel at the end of "announcer" is produced, the vocal tract is set to what is referred to as a neutral position, in which the articulators are relaxed and the soft palate (see Figure 4.31) is raised to cut off the nose; the vowel is termed 'non-nasalised'. The neutral vocal tract approximates quite closely to a tube of constant diameter throughout its length, and therefore the equation governing modal frequencies in a cylindrical stopped pipe can be used to find the vocal tract standing wave mode frequencies for this vowel.
- The stopped-pipe equation gives the first three standing wave mode frequencies for the neutral vowel, and these are often rounded to 500 Hz, 1500 Hz and 2500 Hz for convenience. When considering the acoustics of speech and singing, the standing wave modes are generally referred to as 'formants'. Idealised frequency response curves for a vocal tract set to produce the vowels in the words fast, feed and food are shown in Figure 4.33, and the centre frequency of each formant is labelled starting with 'F1' or 'first formant' for the peak that is lowest in frequency, continuing with 'F2' (second formant) and 'F3' (third formant) as shown in the figure. The formants are acoustic resonances of the vocal tract itself resulting from the various dimensions of the vocal tract spaces. These are modified during speech and singing by movements of the articulators. When considering the different sounds produced during speech, usually just the first, second and third formants are considered since these are the only formants whose frequencies tend to vary. Six or seven formants can often be identified in the laboratory, and the higher formants are thought to contribute to the individual identity of a speaking or singing voice. However, in singing, important contributions to the overall projection of sound are believed to be made by formants higher than the third. In order to produce different sounds, the shape of the vocal tract is altered by means of the articulators to change its acoustic properties. The perturbation theory principles explored in the context of woodwind reed instruments (see Figure 4.23) can be employed here also (Kent and Read, 1992). Figure 4.35 shows the displacement node and antinode positions for the first three formants of the vocal tract during a neutral non-nasalised vowel.
- The representation of the formant structure in the output spectrum is important if the listener is to identify different vowels. Figure 4.37 suggests that somewhere between the G above middle C and the G an octave above, vowel identification will become increasingly difficult. This is readily tested by asking a soprano to sing different vowels on mid and top G as shown in the figure and listening to the result. In fact, when singing these higher notes, professional sopranos adopt vocal tract shapes which place the lower formants over individual harmonics of the excitation so that they are transmitted via the vocal tract with the greatest amplitude. In this way, sopranos can produce sounds of high intensity which will project well. This effect is used from approximately the C above middle C where the vocal tract is, in effect, being 'tuned-in' to each individual note sung, but at the expense of vowel clarity.
- This tuning-in effect is not something that tenors need to do, since the ratio between the formant frequencies and the f0 of the tenor's range is higher than that for sopranos. However, all singers who do not use amplification need to project above accompaniment, particularly when this is a full orchestra and the performance is in a large auditorium.
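The neutral-vowel formants and the loss of vowel clarity at high pitch can both be checked with a small calculation. This is a sketch under stated assumptions: the stopped-pipe modes are taken as odd multiples of c/(4L), the speed of sound is assumed to be 344 m/s, the note frequencies assume equal temperament with A4 = 440 Hz, and the 3 kHz limit used to count harmonics is an illustrative cut-off covering the region of the first three formants, not a value from the lecture. All function names are hypothetical.

```python
def neutral_formants(length_m=0.175, speed_of_sound=344.0, count=3):
    """Mode frequencies (Hz) of a cylindrical pipe stopped at one end.

    Only odd multiples of the quarter-wavelength resonance are supported:
    f_n = (2n - 1) * c / (4 * L) for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


def note_frequency(semitones_from_a4, a4_hz=440.0):
    """Equal-tempered frequency a given number of semitones away from A4."""
    return a4_hz * 2.0 ** (semitones_from_a4 / 12.0)


def harmonics_below(f0_hz, limit_hz=3000.0):
    """Number of harmonics of f0 at or below limit_hz."""
    return int(limit_hz // f0_hz)


if __name__ == "__main__":
    # 17.5 cm adult male tract in the neutral position.
    print([round(f) for f in neutral_formants()])

    # G above middle C (G4) and the G an octave higher (G5).
    for name, offset in (("G4", -2), ("G5", 10)):
        f0 = note_frequency(offset)
        print(f"{name}: f0 = {f0:.1f} Hz, "
              f"{harmonics_below(f0)} harmonics below 3 kHz")
```

With these assumptions the modes fall close to the rounded 500, 1500 and 2500 Hz values, and the higher G leaves far fewer harmonics to sample the formant peaks, which is consistent with vowel identification becoming harder over that octave.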