Sound for Documentary, Part 2: What is it and why is it so important?

In Part 1 of this series we covered eight fundamental concepts of sound for documentary. In our next post we will move closer to the practical, nitty-gritty matters of sound for documentary. Before we do, however, I want to briefly get into what sound is. We are going to focus on just some of the basics we need to understand as documentary media makers in order to become better sound recordists, editors, and designers. If you want to go deeper into this question, future posts will include more links to resources to get you going with that.

To put it in one sentence, sound is vibrations in air that touch us. Like the proverbial tree that falls in the forest, if nobody is around to hear it, did it really make a sound? The vibrations caused by the falling tree don’t become sound until they touch us. And these waves literally touch us, according to developmental psychologist Ann Fernald, who describes sound as “touch at a distance” in the Musical Language episode of Radiolab. I believe the intimacy of touch Fernald is describing has something to do with why there’s such a strong connection between our emotions and sound.

In their landmark book, Spaces Speak, Are You Listening?: Experiencing Aural Architecture, Barry Blesser and Linda-Ruth Salter suggest that social relationships are strongly influenced by the way that space changes sound. Every environment has an aural architecture, whose attributes contribute to the fabric of human culture. We can sense spatial configurations through hearing. For example, with our eyes closed we can find an open doorway or know if we are in a room with low ceilings. The aural imprint of Tokyo is quite different from that of New York, and once we have experienced one of these cities we are likely to recognize its aural imprint in much the same manner we recognize images. Sound recording on a documentary need not just be about getting clear dialog; it can also be about capturing the aural imprint of the spaces we are observing. Whenever possible, when I am out observing with my camera, I try to record stereo ambience as well. Given that a good stereo audio recorder can be had these days for about $200, there’s no reason a separate audio recorder can’t be capturing the aural landscape when you are out capturing video with your camera. In fact, given where things are heading technologically and aesthetically, perhaps we should go the distance and record surround ambience. It is all well and good that Hollywood features create amazing soundscapes during the post-production process, but as documentarians, we should try when we can to capture the actual aural landscape of the locations we observe.

A particularly impressive example of the aural landscape as an integral component of a documentary is Habana – Arte nuevo de hacer ruinas (2006, Florian Borchmeyer, English title: Havana – The new art of making ruins). For Borchmeyer the beauty of Havana lies in the poetry of its ruins, and through this documentary he presents a portrait of the inhabited ruins of Havana in their final moments before the majestic buildings of pre-Castro Cuba either crumble or are renovated. When I spoke with Borchmeyer at the Rio International Film Festival the year the film was released, I told him I was impressed by the dimensionality of the aural landscape in the film. He explained that when he shot the interviews, he also recorded surround sound of the environment. As a viewer I was transported into the buildings and the stories the interviewees were telling, but there was something more going on: I felt as if I was enveloped by the landscape in a manner I have rarely experienced in a documentary. All too often capturing the aural landscape is the farthest thing from a documentarian’s mind. However, listening to and recording the aural landscape can create a sensory experience that will engage viewers in a deep, primal manner.

From a documentary filmmaker’s perspective, sound offers a way of capturing the auditory sense of space in a manner complementary to the camera’s visual recording of that space. Blesser and Salter remind us that from prehistoric multimedia cave paintings, to classical Greek open-air theaters, through the time of Gothic cathedrals and French villages, auditory spatial awareness has been a prism through which cultural attitudes and a sense of space are revealed to us. Their book really opened up my ears to the importance and potential of sound as an integral component expanding the sensory dimension of documentary.

What is the physical nature of these vibrations that touch us the way they do? A movement, for example a hammer striking a table or the motion of human vocal cords, creates waves in the air, much like when you throw a rock, or pour a stream of water, into still water. You see waves that move out in all directions away from the point of impact. It’s important to appreciate that these waves go in all directions, bounce off walls, and travel around corners. It’s very messy. In addition, the intensity of these waves falls off quickly the farther you are from the source of the vibration.


Sound is not like light, which travels in a straight line. Recording sound is not like taking pictures with a camera; there is no equivalent to the edge of the frame. Sound travels around corners. Hard surfaces add unwanted reflections that smear the sound. Other sounds from all over the place mix in with the sound you want to record. Sound waves are promiscuous little devils compared to the stoic light rays. With your camera the frame is the frame. You don’t see what’s out of the frame, but not so with sound: we hear everything around us and so does the microphone we use for recording. And while our sensory apparatus is good at separating a salient sound out of a collection of sounds (a phenomenon known as the cocktail party effect), the microphone and the recording systems we use capture everything with annoying equality.

The process of recording sound involves capturing an analogue of the sound waves and converting the signal as faithfully as possible into a stream of digital data. This is usually accomplished with a microphone, an electromechanical device with a delicate diaphragm that vibrates in response to the waves that fall upon it. These movements are translated into an electrical signal, which is amplified, then sampled (converted into a stream of digital data), and then stored in a data file. Two common audio file formats used in production are WAV (Waveform Audio) and AIFF (Audio Interchange File Format). Both formats typically store uncompressed audio and can reproduce on playback the exact same signal that was initially stored in them. How this process occurs through the signal chain from microphone to data file has an effect on the quality of the sound. Microphone technologies, the designs typically used in documentary work, along with various recording options, will be discussed in a future installment, as will microphone placement strategies and noise management techniques. In later posts we will also get into more details about audio files and other technical details.
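To make the sample-and-store step a little more concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of any particular recorder: the 48 kHz sample rate, the 16-bit depth, the 440 Hz test tone, and the helper names (sample_tone, write_wav, read_wav, round_trip) are all made up for this example. It samples an idealized sine wave, stores it as uncompressed audio in a WAV file, and reads it back.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 48000  # samples per second; an assumed, commonly used rate
SAMPLE_WIDTH = 2     # bytes per sample (16-bit); also an assumption


def sample_tone(freq_hz, duration_s, amplitude=0.5):
    """Sample an idealized sine-wave signal into 16-bit integers."""
    count = int(round(SAMPLE_RATE * duration_s))
    peak = 32767  # largest positive 16-bit value
    return [
        int(round(amplitude * peak
                  * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)))
        for i in range(count)
    ]


def write_wav(path, samples):
    """Store mono 16-bit samples as uncompressed PCM in a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack("<%dh" % len(samples), *samples))


def read_wav(path):
    """Read the 16-bit samples back out of a mono WAV file."""
    with wave.open(path, "rb") as wav_file:
        raw = wav_file.readframes(wav_file.getnframes())
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def round_trip(samples):
    """Write samples to a temporary WAV file and read them back."""
    path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_wav(path, samples)
    return read_wav(path)


tone = sample_tone(440.0, 0.01)   # 10 ms of a 440 Hz tone
print(round_trip(tone) == tone)   # the stored samples come back unchanged
```

The round trip is the point here: what goes into an uncompressed file is what comes out. Any loss of fidelity happens earlier in the chain, at the microphone, the amplifier, and the sampling stage.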

What are we recording?

For now, what’s most important to remember is that when you’re out in the field with your microphone and recording device, your microphone is always picking up a mix of three things:

  1. the direct sound from the source you want to record,
  2. reflected sound from surfaces close to the source (including troublesome reverberation from walls in a space), and
  3. noise, common sources being the natural background sounds in an environment and wind noise.

In addition to these three, you also get, as a bonus, noise caused by electrical circuits in the signal chain, and if you’re unlucky, you might pick up some interference from nearby sources of electromagnetic radiation or perhaps crackling sounds from a defective cable.

You always want to maximize the relative level of #1 while reducing #2 and #3 as much as possible. Move away from reflective surfaces and get the microphone as close in as possible. But not too close: there’s the proximity effect, a bass exaggeration when sources are very close, which works well for radio announcers but is not always what you want for dialog recording. There are also issues with breath pops (wind screens help with that).
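As a toy model of the three-part mix described above, the following Python sketch builds a “microphone signal” by adding a direct sound, a single delayed and attenuated reflection, and random noise. Everything in it is a simplifying assumption made for illustration: real rooms produce many reflections, real noise is not uniform random values, and the function name mic_signal and its parameters are hypothetical.

```python
import random


def mic_signal(direct, reflection_gain=0.4, reflection_delay=5,
               noise_level=0.05, seed=0):
    """Mix direct sound, one reflection, and noise, sample by sample.

    direct           -- list of sample values for the sound we want (#1)
    reflection_gain  -- how strong the single reflection is (#2)
    reflection_delay -- how many samples later the reflection arrives
    noise_level      -- peak size of the random background noise (#3)
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    mixed = []
    for i, wanted in enumerate(direct):
        # The reflection is a quieter copy of the direct sound, arriving late.
        if i >= reflection_delay:
            reflected = reflection_gain * direct[i - reflection_delay]
        else:
            reflected = 0.0
        noise = noise_level * rng.uniform(-1.0, 1.0)
        # The microphone cannot tell these apart; it just records the sum.
        mixed.append(wanted + reflected + noise)
    return mixed


click = [1.0, 0.0, 0.0, 0.0, 0.0]  # a single sharp click
print(mic_signal(click, reflection_gain=0.5, reflection_delay=2,
                 noise_level=0.0))
```

With the noise turned off, the click shows up twice in the output: once at full strength and once, two samples later, at half strength. Once the three parts are summed into one recording they are hard to pull apart again, which is why it pays to get the balance right at the microphone.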

Not only does sound radiate in all directions, its intensity also falls off quickly the farther you are from the source. From a physics perspective, sound radiation follows the Inverse Square Law, which is a fancy way of saying that sound intensity from a point source drops off in inverse proportion to the square of the distance. Or simpler yet, double the distance from the source and you get a quarter of the original sound energy. That’s why it’s so important to get that microphone close to the source you are recording.
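The arithmetic can be checked with a few lines of Python. This is a sketch for an idealized point source in open space with no reflections, which real locations only approximate. The function names are made up for this example, and the decibel conversion uses the standard formula of 10 times the base-10 logarithm of an intensity ratio.

```python
import math


def relative_intensity(reference_distance, distance):
    """Intensity at `distance` relative to intensity at `reference_distance`.

    Assumes an ideal point source with no reflections (inverse square law).
    """
    return (reference_distance / distance) ** 2


def level_change_db(reference_distance, distance):
    """The same change expressed in decibels."""
    return 10 * math.log10(relative_intensity(reference_distance, distance))


# Moving a microphone from 1 unit away to 2, 4, and 8 units away.
for distance in (2, 4, 8):
    print(distance,
          relative_intensity(1, distance),
          round(level_change_db(1, distance), 1))
```

Each doubling of distance cuts the intensity to a quarter, a drop of about 6 decibels. Going the other way, halving the distance to the source gives you four times the intensity of the direct sound.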

When you play back the sound you recorded, the reverse process takes place. The sound you recorded, stored as a digital representation (created through sampling and encoding), is converted back to an analog signal, which is amplified and in turn causes the movement of the speaker diaphragm, creating vibration in the air. The sound waves that result are never exactly like the original; there’s always something that changes in the process. There’s always noise and distortion added along the way. We try to minimize these problems using good techniques and good gear.

Some terms that will come up a lot when working with sound include the decibel, frequency, and more. An overview of human hearing, intensity, signal levels, frequency, along with some other fundamental terms and concepts, will be the topic of our next post. Hopefully these will provide a foundation that helps us better understand what we’re doing when we work with sound. After that we will dive into tools and techniques. Until then, good night and good sound.


Image credits: 1. “Water Spire” by likeablerodent; 2. “Three things” by the author.

Note: Minor edits were made to this post on 11/17/2015 to remove a broken YouTube link to a clip from Havana – The new art of making ruins.



  1. Alice Apley says

    I love this: “With your camera the frame is the frame. You don’t see what’s out of the frame, but not so with sound, we hear everything around us and so does the microphone we use for recording.” It reminds me why it is so hard to have one person doing both camera and sound. So much to pay attention to, and the parameters are so different.

  2. kathryn Mora says

    Thank you for your brilliance and sharing it with us. You possess the heart of a teacher and share everything you know about your subject with your students. Your mastery of the subject goes far beyond the surface. You not only inspire me, but remind me how much I love documentary filmmaking and the joy and passion I feel to learn more. Somehow, I’ve allowed myself to fall off the path. Do I think I’ll live forever and have enough time to do what’s really important and rewarding to me? I know that’s not true, although I act as if it’s true. Also, you remind me that I want to take a class with you because you not only have mastered your subject on a deeper level, but your nature is to share what you know with others to help them grow, too. Thank you.