Today I would like for us to accomplish the following.
Instructor's Background
1970 to 1982: leader of a traveling rock and roll band. Played electromechanical and electronic keyboards (Hammond B3, Fender Rhodes, Mini Moogs, ARP2600, etc.)
1982 joined a California-based synthesizer company called Sequential Circuits that had a huge success with a synthesizer called the Prophet 5, the world’s first programmable synthesizer which could store and recall its programs. Worked on the development of the Prophet VS and the Prophet 2000 before the company was bought by Yamaha Corporation of Japan.
1984 to 2000 worked with Yamaha as the Manager of Sound Design Office. Sound Design Office was an integral part of Yamaha’s product development cycle and SDO teams made the waves, wave ROMs and voicing for 27 Yamaha synthesizers during this period. We had ongoing developments in sampling, FM synthesis, analog synthesis, physical modeling and formant synthesis.
2000-2003 President of Sound and Light Productions - an internet design company specializing in internet audio applications and commercial database driven websites
When I call your name would you be kind enough to answer and tell us something about your major, and what you hope to get from a course in Auditory Theory.
2003 to present day: relocated to Santa Fe in order to teach at the College of Santa Fe.
NOTE: Pass out the syllabus for the course and discuss the schedule. Point out that the first two weeks represent a money refundable drop out period and that we will be covering an overview of the subject terminating in a quiz. Cover disciplinary matters like attendance and lateness.
In order to describe the major goals some preamble is necessary.
Most of your musical education is about learning to play notes. However, after you play a note a large number of events occur before that note is received by the ears of a listener. These events, if not understood by the musician, can significantly change the intended result, possibly to the great detriment of your reputation. It doesn't matter how well you play if you get booked into a room with poor acoustics where the amplification of your performance is under the control of another individual who lacks the understanding necessary to get a good sounding result. People will not say "Oh well, the room acoustics were not good and the audio reproduction was not well managed". They will instead say "well that wasn't very good" and they won't come and hear you play again. Unfortunately success as a musician is defined by getting people to come and hear you again.
NOTE: hand out the following two diagrams
In the 20th Century the music making chain looked like this

A minimum of ten people were involved in the process and each one of them had University degree level skills which rarely overlapped.
By contrast the 21st century recording chain is more likely to look like this:

The computer has put infinitely more power into the hands of the individual and skills which were optional in the last century will be required for everyone in this century. In order to survive this new music making chain you have to know a great deal more than the musicians of the last century. You now have to be the composer, the arranger, the copyist, the musician, the recording engineer, the producer and the mixing engineer. This is a lot to know and the primary goal of this course is going to be:
Most music education is about teaching you what note to play, when to play it and how hard to play it. However, acoustics deals with what happens immediately after a note is played and the perilous journey sound makes into the brain of the listener.
Music making demands predictable results yet your ears are constantly changing. It is the second goal of this course to enable you to understand and interpret correctly what it is that you are hearing and arrive at correct diagnosis of problems.
NOTE: Compare my goals with the goals they have outlined on the board earlier.
Discuss why some suggestions will not fall into the purview of this class.
(10 minutes)
Except Chapter 7 which is covered by other courses.
This is an extremely modern and high quality text.
There are however two difficulties to it:
In order to help you with this I have created a metric converter which is available online at:
http://www.santafevisions.com/csf/html/math/000_metric_converters.htm
You may save this converter to your computer
No one will fail this class due to their inability to perform advanced mathematics. However you will be required to understand the principles which the mathematics are trying to prove.
For this purpose I have made Javascript calculators of all the essential calculations in this course
http://www.santafevisions.com/csf/html/math/001_acoustic_math.htm
It is rare (but not impossible) that in your career as a 21st Century musician you will be required to calculate anything, however as 5.1 sound systems become more popular it is essential that you should deeply understand how sound streams propagate, how they interact and how they can destroy each other. It is not just a stereo world any more and you had better not try a 5.1 sound project without a good understanding of acoustics.
Although we will try to stay roughly parallel to the reading we will not adhere slavishly to it. Our main efforts are going to be to demonstrate the relevance of the knowledge we acquire to real careers in the real world.
In addition to lectures/discussions we will be using the video of Dr Richard E Berg, a professor of acoustics at the University of Maryland. Dr Berg is one of the most charismatically challenged humans ever to commit his image to video but his science is good and his videos will prevent me from having to bring a lot of bulky antique apparatus into the classroom.
These are available for viewing on the website
A.J.M. Houtsma
Institute for Perception Research (IPO)
Eindhoven, The Netherlands
T.D.Rossing
Northern Illinois University
DeKalb, IL U.S.A.
W.M. Wagenaars
Institute for Perception Research (IPO)
Eindhoven, The Netherlands
These are state of the art demonstrations of some subtle aspects of human hearing.
These are also available on the website.
We will discuss ideas for these individually. However you should probably expect to make a sound-making device of some sort. Hardware or software are both acceptable as long as they demonstrate a relevant principle of acoustics.
Idea Links:
You may use any model you choose but it must be capable of doing base 10 logarithms, antilogs, natural logarithms, exponential calculations and square roots.
This course is available online at
http://www.santafevisions.com/csf/html/index.htm
Before you can access the site you need to give me your preferred username and password and I will create an authorization for you.
Benefits:
About the Videos: It will help if you have a broadband connection to view these, and even then they will be slow to load because of their size. Tip: If you view these using Microsoft Internet Explorer rather than Safari or Netscape they will remain in the cache and start very quickly next time you view them. Enlarge your cache to several hundred megabytes in Explorer preferences.
The website will be closed during exams.
Within two weeks you are free to drop out of this course and get your money back from the college if you do not think it is right for you. Accordingly I will be giving a quiz at the end of the two weeks in order that you and I can evaluate together your affinity for the subject.
This course will concern itself with four basic subjects
What it is, how it behaves, how we measure it. This part of the course will lay the groundwork for us to discuss sound meaningfully and highlight some commonly misunderstood pitfalls.
How we hear and what distortions are introduced into the hearing process by the hearing mechanism. How to interpret what you hear and how to avoid psycho acoustic illusions.
What they are, how they work and what is important in their synthesis. The more you understand acoustic instruments the better you will be able to write for them, synthesize them and record them.
The environment in which you listen to sound has an enormous effect on what you hear. Understanding and coping with your listening environment is an essential survival tool if you want to avoid surprises.
Within those headings are the following details
The terminology which will be used in the software and hardware that you use to make music will be the terminology of acoustic engineering...not of the street. In addition sound systems are becoming more complex with more speakers involved. Those who can calculate the results of using multi speaker systems will do better than those who cannot.
Probably the greatest contribution of psychoacoustics to music in recent years would be the MP3 file format and its descendants (MP4 etc) which are in the process of transforming the face of the music industry. It is likely that the next generation of audio breakthroughs will also come from psychoacoustics.
First of all there is no "MP3" specification. Never was and never will be. There is only MPEG-1 Layer 3, which got abbreviated to "MP3".
Anybody here know how MP3 works?
MP3 is a "perceptual encoder" which reduces data by not recording what it knows you cannot hear. Sound masks sound in predictable ways due to the way our ears work.
• MPEG-1: 1.5 Mbits/sec for audio and video
About 1.2 Mbits/sec for video, 0.3 Mbits/sec for audio
(Uncompressed CD audio is 44,100 samples/sec * 16 bits/sample * 2 channels > 1.4 Mbits/sec)
• Compression factor ranging from 2.7 to 24.
• With a compression rate of 6:1 (16-bit stereo sampled at 48 kHz is reduced to 256 kbits/sec) and optimal listening conditions, expert listeners could not distinguish between coded and original audio clips.
• MPEG audio supports sampling frequencies of 32, 44.1 and 48 kHz.
• Supports one or two audio channels in one of the four modes:
1. Monophonic -- single audio channel
2. Dual-monophonic -- two independent channels, e.g., English and French
3. Stereo -- for stereo channels that share bits, but not using Joint-stereo coding
4. Joint-stereo -- takes advantage of the correlations between stereo channels
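The bit-rate figures quoted in the list above can be checked with a little arithmetic. The Python sketch below is only an illustration; the function name `pcm_bit_rate` is made up for this example and does not come from any MPEG tool.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Uncompressed CD audio: 44,100 samples/sec, 16 bits, 2 channels.
cd_rate = pcm_bit_rate(44_100, 16, 2)

# 16-bit stereo sampled at 48 kHz, the case quoted for the 6:1 listening test.
stereo_48k_rate = pcm_bit_rate(48_000, 16, 2)

# Compression ratio when that stream is reduced to 256 kbits/sec.
compression_ratio = stereo_48k_rate / 256_000

print(cd_rate)            # 1411200 bits/sec, a little over 1.4 Mbits/sec
print(stereo_48k_rate)    # 1536000 bits/sec
print(compression_ratio)  # 6.0
```

Note that the CD stream on its own would not fit into the 0.3 Mbits/sec that MPEG-1 leaves for audio, which is why the audio has to be compressed at all.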
Steps in algorithm:
1. Use convolution filters to divide the audio signal (e.g., 48 kHz sound) into 32 frequency subbands --> subband filtering.
2. Determine amount of masking for each band caused by nearby band using the psychoacoustic model shown above.
3. If the power in a band is below the masking threshold, don't encode it.
4. Otherwise, determine number of bits needed to represent the coefficient such that noise introduced by quantization is below the masking effect (Recall that one fewer bit of quantization introduces about 6 dB of noise).
5. Format bitstream
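The rule of thumb in step 4, that each bit of quantization is worth about 6 dB, can be checked numerically. The sketch below is a simplified experiment and not part of any MPEG encoder: it quantizes a full-scale sine wave with a plain uniform quantizer (clipping is ignored) and compares the signal-to-noise ratio at 8 bits and at 7 bits. The function names and the test frequency are arbitrary choices made for this illustration.

```python
import math


def quantize(x, bits):
    """Round x (expected in the range -1..1) to a grid of 2**bits steps."""
    step = 2.0 / (2 ** bits)
    return round(x / step) * step


def snr_db(bits, n_samples=20000):
    """Signal-to-noise ratio, in dB, of a full-scale sine after quantization."""
    signal_power = 0.0
    noise_power = 0.0
    for n in range(n_samples):
        # 0.0123457 cycles per sample: an arbitrary frequency chosen so the
        # samples land on many different phases of the sine wave.
        x = math.sin(2 * math.pi * 0.0123457 * n)
        error = quantize(x, bits) - x
        signal_power += x * x
        noise_power += error * error
    return 10 * math.log10(signal_power / noise_power)


# Theory: halving the step size halves the error amplitude.
db_per_bit_theory = 20 * math.log10(2)

# Experiment: how much SNR is lost by dropping from 8 bits to 7 bits.
db_per_bit_measured = snr_db(8) - snr_db(7)

print(round(db_per_bit_theory, 2))  # 6.02
print(db_per_bit_measured)          # should come out close to 6 dB
```

This is why the encoder can think of bits and decibels interchangeably: every bit it removes from a band raises the noise in that band by roughly 6 dB.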

Example:
• After analysis, the first levels of 16 of the 32 bands are these:
| Band | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Level (dB) | 0 | 8 | 12 | 10 | 6 | 2 | 10 | 60 | 35 | 20 | 15 | 2 | 3 | 6 | 3 | 1 |
• If the level of the 8th band is 60dB,
it gives a masking of 12 dB in the 7th band, 15dB in the 9th.
Level in 7th band is 10 dB ( < 12 dB ), so ignore it.
Level in 9th band is 35 dB ( > 15 dB ), so send it.
[ Only the amount above the masking level needs to be sent, so instead of using 6 bits to encode it, we can use 4 bits -- a saving of 2 bits (= 12 dB). ]
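The worked example above can be written out as a few lines of Python. This is a sketch of the decision only, not of a real encoder: the masking values for bands 7 and 9 are the ones given in the example, the 6 dB per bit figure is the rule of thumb from step 4, and the names `levels_db`, `masking_db`, `bits_for` and `encode_decision` are invented for this illustration.

```python
import math

# Levels taken from the example (bands 7, 8 and 9 only).
levels_db = {7: 10, 8: 60, 9: 35}

# Masking that the loud 8th band causes in its neighbours, as given above.
masking_db = {7: 12, 9: 15}

DB_PER_BIT = 6  # rule of thumb: one bit is worth about 6 dB


def bits_for(range_db):
    """Number of bits needed to cover a given range in dB."""
    return math.ceil(range_db / DB_PER_BIT)


def encode_decision(band):
    """Return None if the band is masked, else the bits needed to send it."""
    level = levels_db[band]
    mask = masking_db.get(band, 0)
    if level < mask:
        return None  # below the masking threshold: do not encode
    return bits_for(level - mask)  # only the part above the mask is sent


print(encode_decision(7))  # None: 10 dB is below the 12 dB mask
print(encode_decision(9))  # 4: only 35 - 15 = 20 dB needs to be sent
print(bits_for(35))        # 6: what band 9 would cost with no masking
```

The difference between the last two results is the 2 bit saving mentioned in the example.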
MPEG Layers
• MPEG defines 3 layers for audio. Basic model is same, but codec complexity increases with each layer.
• Divides data into frames, each of them contains 384 samples, 12 samples from each of the 32 filtered subbands as shown below.

Figure: Grouping of Subband Samples for Layer 1, 2, and 3
• Layer 1: DCT type filter with one frame and equal frequency spread per band. Psychoacoustic model only uses frequency masking.
• Layer 2: Use three frames in filter (before, current, next, a total of 1152 samples). This models a little bit of the temporal masking.
• Layer 3: Better critical band filter is used (non-equal frequencies), psychoacoustic model includes temporal masking effects, takes into account stereo redundancy, and uses Huffman coder.
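To get a feel for how short these frames are, the sample counts above can be converted into time. The sketch below simply multiplies out the figures given in the text (32 subbands, 12 samples per subband, one or three groups); the function names are made up for this illustration.

```python
SUBBANDS = 32
SAMPLES_PER_SUBBAND = 12


def frame_samples(groups=1):
    """Total samples when `groups` blocks of 12 samples per subband are used."""
    return SUBBANDS * SAMPLES_PER_SUBBAND * groups


def duration_ms(samples, sample_rate_hz):
    """Length of that many samples in milliseconds."""
    return 1000.0 * samples / sample_rate_hz


one_group = frame_samples(1)     # 384 samples
three_groups = frame_samples(3)  # 1152 samples

print(one_group, duration_ms(one_group, 48_000))        # 384 8.0
print(three_groups, duration_ms(three_groups, 48_000))  # 1152 24.0
```

At a 48 kHz sampling rate the encoder is therefore making its masking decisions on slices of sound only 8 to 24 milliseconds long.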
Stereo Redundancy Coding:
Intensity stereo coding -- at upper-frequency subbands, encode summed signals instead of independent signals from left and right channels.
Middle/Side (MS) stereo coding -- encode middle (sum of left and right) and side (difference of left and right) channels.
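Middle/Side coding is easy to try out on a handful of sample values. The sketch below is a minimal illustration and not the MP3 implementation: it assumes the sum and difference are each halved so that the values stay in the original range, which is a common convention but is an assumption here. The function names are invented for this example.

```python
def ms_encode(left, right):
    """Convert left/right samples into middle (sum) and side (difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover the original left/right samples from middle and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# A nearly mono signal: the two channels differ only in the last sample.
left = [0.5, 0.25, -0.5]
right = [0.5, 0.25, -0.25]

mid, side = ms_encode(left, right)
print(mid)   # [0.5, 0.25, -0.375]
print(side)  # [0.0, 0.0, -0.125]
print(ms_decode(mid, side) == (left, right))  # True
```

When the two channels are similar the side signal is small or zero, so it can be sent with very few bits, and nothing is lost because the original channels can be rebuilt exactly.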
At the moment most instruments are sampled because sampling is quick, easy and cheap. However there will come a time in the future when processing power will be cheaper than memory and the pendulum will swing back to physical modeling. In 1999 I went to Japan to lead the team that voiced the VL1, Yamaha's first modeling synthesizer.

Introduce the VL1
This is the age of the project studio. The Music Industry has put more and more power into your desktop computer which means that you no longer have to leave your bedroom to record a song. This is attractive to most musicians I have met. However it means that they will be making music in the bedroom which is not always an ideal listening environment.
If you understand how environment affects how we hear you will not be taken by surprise when a mix sounds bad.
Please read the first Chapter of Acoustics and Psychoacoustics and study deeply the first two sections:
Be prepared to answer a brief quiz on the material at the beginning of class.
There are two rules always observed by successful professional musicians:
You may find me relentless in my expectation of these.