Storytelling and Sound Illusions


We humans are easily fooled by sound. It sneaks into the side door to the brain, and in almost all cases, outside of music theory classes, we tend to resist analyzing it. We just feel it, and this is the case with sound effects too, not just “music.”

This fact makes sound for film/video/games storytelling very powerful. Sad to say, that power doesn’t translate into big budgets, partly because movie and game producers are no more analytical about the creative power of sound than they are about the sounds themselves. They feel it, but they can’t rationalize funding it, because they don’t understand it.

One story involving a sound illusion that I like to tell happened during the mix of the second Star Wars film “The Empire Strikes Back.” The Director was sitting in the back of the mix room and noticed a moment where he thought there should be a change in the mix. He mentioned it to the mixers, who added it to their list of changes to be done. The Director got distracted by a phone call for maybe thirty minutes and wasn’t paying attention to what was going on at the mixing console. When he returned, the mixers happened to be playing through the area where he had asked for the change. He complimented them on having addressed his concern, and said that it sounded much better now. One problem … they had not gotten to that item on the list yet. We are highly suggestible when it comes to sound. If we have some reason to think a sound has changed, then it will sound to us like it has changed.

... when you see a ceiling fan and hear a helicopter...


Situations like the one I just described can be interpersonally delicate. Should the mixers tell the Director that in fact no change was made, and thereby risk embarrassing him? Or perhaps let him think the change has happened, then surreptitiously actually make the change when he isn’t around? Or not make the change at all? In this case they were honest with him about what had happened. The best response is usually the one tailored to the personality of the Director. There are lots of those tricky decisions in mixing.

It would be wrong for me to end this piece without talking about the great storytelling advantages of sound’s ambiguity, especially in its relation to moving visual images. As audience members our tendency is to accept what is presented in terms of sound, sometimes having done quite a bit of unconscious rationalizing to justify the acceptance. A classic example is Captain Willard in “Apocalypse Now” seeing a ceiling fan spinning, but hearing a helicopter instead.

The ambiguities of sound are actually what make it such a strong tool in my opinion. When a sound feels ambiguous or even slightly out of place in a given dramatic situation, it forces us in the audience to bring our own history and experience into the process of interpretation. That is the ideal response, because it means the listener/viewer is fully engaged and actually helping to tell the story themselves. Obviously, it’s possible to go completely off the rails with creative ambiguity, resulting in complete confusion in the audience. Best to avoid that!

Enhancing Intelligibility in Sound Mixing


One of the most powerful but subtle techniques in sound mixing is micro-adjusting the timing of words, sound effects, and/or music to improve clarity and intelligibility. When multiple sounds compete for attention in a mix, whether it’s dialogue buried under a sound effect or an especially important musical note masked by a word in a lyric, clarity can be significantly improved by shifting one of the sounds just slightly in time. This technique, while seemingly small, can have a big impact on how well the audience perceives critical elements in a mix. I’ve used this trick and seen it used in the mixes of literally hundreds of feature films. It works!

This is a discussion of the Art of Micro-Timing Adjustments.

The Challenge of Masking in Audio

Masking occurs when two or more sounds overlap in a way that prevents the listener from distinguishing one from the other. This is particularly problematic in film, television, and music, where dialogue or key sound effects must cut through background layers. In many cases, simply turning up the volume of a buried sound isn’t the best solution. Increasing volume can lead to unnatural dynamics, distortion, or a lopsided balance in the mix. A more elegant and effective approach is sometimes to manipulate timing, ensuring that crucial syllables or transient sounds land at a moment of lower competition.

Why Micro-Timing Adjustments Work

 
Human perception of sound is heavily dependent on transient information: sharp attacks and high-energy consonants, such as the “t” in “time” or the “k” in “quick.” If these transients are masked by competing sounds, the intelligibility of an entire word can be compromised. However, by slightly shifting the word or sound effect earlier or later, the critical transient can emerge more clearly.
 
For example, if an actor says “spectacular” while a loud explosion occurs at the same time, the “sp” transient at the beginning of the word may be lost. But if the word is moved slightly earlier or later, the transient may fall into a clearer space, allowing the audience to hear the word properly without needing to raise the volume excessively.
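In practice a mixer finds that clearer space by ear, but the idea can be sketched as a toy search. The example below is only an illustration under stated assumptions: the function name `quietest_offset`, the made-up loudness values for the explosion, and the 10 ms step size are all hypothetical and do not come from any real mixing tool.

```python
def quietest_offset(masker_envelope, transient_index, max_shift):
    """Return the shift (in envelope steps) that lands the transient where the
    masking sound is weakest. Smaller moves win ties."""
    best_shift = 0
    best_level = masker_envelope[transient_index]
    # Try shifts in order of increasing size, earlier first, then later.
    for size in range(1, max_shift + 1):
        for shift in (-size, size):
            position = transient_index + shift
            if 0 <= position < len(masker_envelope):
                level = masker_envelope[position]
                if level < best_level:
                    best_level = level
                    best_shift = shift
    return best_shift


# Hypothetical loudness of an explosion, one value per 10 ms step.
explosion = [0.1, 0.2, 0.9, 1.0, 0.8, 0.5, 0.3, 0.2]

# The "sp" of "spectacular" currently lands on step 3, the explosion's peak.
shift = quietest_offset(explosion, transient_index=3, max_shift=2)
print(shift)  # -2, i.e. move the word two steps (20 ms) earlier
```

The `max_shift` limit stands in for the mixer's judgment about how far a word can move before the sync starts to look wrong.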

Applications in Different Audio Contexts

 
This micro-adjustment technique is widely used in various sound mixing environments, including:

1.  Dialogue Mixing in Film and Television

 
In dialogue mixing, words must be clear even when layered over music, ambiance, and sound effects. Sometimes, the problem is not that dialogue is too quiet, but that a single syllable is masked by another sound. A skilled mixer can slightly shift the timing of a sentence or even just one syllable so that the key transient no longer competes with other elements.

2.  Music Production

 
In music, lyrics often struggle to remain clear over dense instrumentals. Singers naturally emphasize certain syllables, and if these moments coincide with loud drums, guitars, or synth hits, intelligibility suffers. Moving the vocal track slightly to avoid overlap with snare drum hits or cymbal crashes can allow key words to stand out without affecting rhythm or groove.

3.  Sound Effects in Video Games and Animation

 
In video game sound design, where multiple elements are constantly competing for attention, precise placement of sound effects is critical. If a player character’s voice line overlaps with an explosion or a gunshot, the mixer can shift the line slightly to make sure it lands in a pocket of clarity.
 
For animated and live-action films, the timing of sound effects relative to dialogue is just as important. Suppose a character drops a glass while speaking. If the sound of shattering glass occurs at the same time as a key syllable, it can mask the dialogue. Moving the glass shatter forward by a fraction of a second can ensure the audience hears both elements clearly, and often the perceived sync of the event will still be within acceptable limits.

Practical Techniques for Micro-Timing Adjustments


    Using Clip or Region-Based Nudging – Most DAWs provide tools to nudge audio clips in increments as small as a few milliseconds, allowing for precise adjustments without disrupting sync.

    Time-Stretching for Natural Adjustments – If moving a word slightly disrupts sync, small-scale time-stretching can be used to subtly extend or compress the word without noticeable artifacts.

    Automated Ducking vs. Manual Shifting – While dynamic EQ and sidechain compression can help reduce masking, manually adjusting timing often provides a more natural and precise solution.

    Experimenting with Placement – Since every mix is unique, a mixer may need to experiment with different placements to find the best position where the crucial sound remains intelligible.
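To make the nudging idea concrete, here is a minimal sketch of what moving a clip by a few milliseconds means in terms of samples. It assumes the clip is a plain list of sample values in a fixed-length buffer; the function name `nudge` and the 48 kHz default are assumptions for illustration, not any DAW's actual API.

```python
def nudge(samples, offset_ms, sample_rate=48000):
    """Shift a clip later (positive ms) or earlier (negative ms), padding with
    silence so the result keeps the original length."""
    shift = round(offset_ms * sample_rate / 1000)
    n = len(samples)
    if shift >= 0:
        # Later: silence goes in front, the tail falls off the end.
        shifted = [0.0] * shift + list(samples)
    else:
        # Earlier: the head is dropped, silence fills the end.
        shifted = list(samples[-shift:]) + [0.0] * (-shift)
    return shifted[:n]


clip = [1.0, 2.0, 3.0, 4.0]

# A tiny sample rate keeps the numbers readable: 1 ms = 1 sample here.
print(nudge(clip, 2, sample_rate=1000))   # [0.0, 0.0, 1.0, 2.0]
print(nudge(clip, -1, sample_rate=1000))  # [2.0, 3.0, 4.0, 0.0]
```

At 48 kHz, a 5 ms nudge works out to 240 samples. A real DAW moves the clip on the timeline rather than truncating it, so the trimming here is only a simplification.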

The Subtle Power of Micro-Timing in Mixing

 
One of the fascinating aspects of this technique is that it operates at the subconscious level for the listener. A well-mixed scene or song feels clear and natural, without the audience realizing that words or effects have been subtly adjusted.

Sonic Analogies and Synonyms


Aspiring sound designers often ask how important it is to get an academic degree of some kind before pursuing a career in media sound. A degree itself isn’t important at all in my opinion. No potential employer has ever cared whether I had a degree. On the other hand, the more we know about human culture and the physical world around us, the better equipped we are to handle challenges in sound design, both the creative kind and the interpersonal, political, diplomatic kind.

On the practical, creative side, the better we are at language, the easier time we’ll have searching sound effects libraries. That’s partly because when we search for possible sounds for a given moment, action, event, or environment, it’s almost always a good idea to include in our search-terms some words that are not necessarily connected in an obvious way to the thing we are being asked to address. Having an imagination and a vocabulary for the non-obvious can help a lot.

Example: Our project includes an above-water shot of a thirty-foot wooden boat hitting submerged rocks.

The most literal search may be with words like “aground” or “grounded.” A less literal one will include the words “boat,” “wood,” and “rock.” Notice that I used “rock” instead of “rocks.” A search for the singular will automatically give me results for the plural too, but a search for the plural will not always give me results for the singular. So, it’s usually a good idea to use the simplest version of a word, which may sometimes be a partial word. If you are looking for “rake,” it might be good to try “rak,” because that is more likely to give you search results that include “raking” in addition to “rake.”
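The singular-versus-plural and partial-word points can be shown with a small sketch. It assumes the library search is a plain case-insensitive substring match, which is a guess about how a given tool behaves; some search tools match whole words or do their own stemming. The library entries and the `search` function are invented for illustration.

```python
def search(library, term):
    """Return every entry whose description contains the term, ignoring case."""
    needle = term.lower()
    return [name for name in library if needle in name.lower()]


# Hypothetical sound effects library descriptions.
library = [
    "Rock slide down hillside",
    "Rocks tumbling in barrel",
    "Garden rake on gravel",
    "Raking dry leaves",
    "Wood boat hull creak",
]

print(search(library, "rocks"))  # ['Rocks tumbling in barrel']
print(search(library, "rock"))   # ['Rock slide down hillside', 'Rocks tumbling in barrel']
print(search(library, "rake"))   # ['Garden rake on gravel']
print(search(library, "rak"))    # ['Garden rake on gravel', 'Raking dry leaves']
```

The shorter term is a substring of every longer form, so it can only widen the results, never narrow them.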

For our boat shot, a search using any of those fairly obvious keywords will probably bring up some useful sounds in a big, diverse library. But how about the non-obvious?

My process is often to try to reduce what is going on sonically in a shot to the most fundamental level. In this case, a more-or-less hollow wooden object is impacting and/or scraping against something hard that’s under water. Does a potentially useful sound HAVE to be made by a boat? Nope. Does a potentially useful sound HAVE to be made by a rock or rocks? Nope. Does it need to have anything to do with water? Nope. So now I’m free to search a much wider variety of sounds than my literal “boat, wood, rock” searches would have yielded.
 
In fact, the term “wood” may limit the search unnecessarily. We like to think we know what wood sounds like, and sometimes we do, but sound is powerful at fooling us. A hollow, plastic object can make sounds very similar to wood. So, you may not want to limit your search to wood objects. Same with the word “hollow.” Something flat, not hollow, can make sounds similar to those made by hollow objects if it resonates in certain ways.

It occurs to me to look for “scrape” (or maybe “scrap” because that will give me results that include “scraping”). Just the term “impact” could give me some appropriate sounds, or just “hollow” or “rub.” “Shudder” by itself might help, because it’s possible that I’ll find something shuddering that isn’t wood, but sounds like it could be wood. “Resonate” or “resonat” could lead to something nice.

Some of the results, almost certainly most of the results from a search this oblique, won’t be useful. But some will be amazingly, wildly useful elements to supplement or even replace the more on-the-nose sounds you’ll find with your “boat,” “wood,” “rock” search. Sounds that have nearly nothing to do with a boat impacting rocks can add interest and character to your sound design for that moment, making it feel believable but unique.

An old, unrepairable acoustic guitar (hollow wooden object), banged into a door or scraped against rough concrete, then the recording pitched down an octave or two or three to make it feel bigger, could be just the thing for our boat scene. Good luck with THAT search!
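The arithmetic behind pitching a recording down is simple if the pitch change is done by slowing playback, which is an assumption here; pitch shifters that keep the original duration also exist. Each octave down halves the playback rate, halves every frequency, and doubles the length. The function name and the example values below are hypothetical.

```python
def varispeed(octaves_down, duration_s, frequency_hz):
    """Return (playback rate, new duration, new frequency) after slowing a
    recording so that it plays the given number of octaves lower."""
    rate = 0.5 ** octaves_down
    return rate, duration_s / rate, frequency_hz * rate


# A hypothetical half-second guitar-body hit with a 200 Hz resonance,
# pitched down two octaves.
print(varispeed(2, 0.5, 200.0))  # (0.25, 2.0, 50.0)
```

The longer duration is part of why a slowed-down sound can feel bigger, not just lower.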

What To Leave Out


In sound design for film, video games, podcasts, or ambient art installations, the principle of “less-is-more” is worth thinking about. Sound designers often discover that the strategic absence of certain sonic details can heighten tension, invite curiosity, and encourage emotional involvement. For instance, consider a horror film. The scariest scenes may not be those that feature detailed, bloodcurdling sound effects of monsters in the foreground, but rather those that allow the mind to conjure horrors that remain mostly unheard or faintly implied. Obviously, if we are commercial artists, we do what the client wants us to do, but in cases where I have some latitude, I often try the less-is-more approach because I know how effective it can be.

Leave space in your soundtrack for imagination and fantasy...

The sound design for the film “No Country for Old Men” by Skip Lievsay is a master class in subtlety and only using sounds that are absolutely necessary. Alfred Hitchcock famously said, “There is no terror in the bang, only in the anticipation of it.” Much of that anticipation stems from incomplete information. If a scene includes footsteps echoing in a hallway but the source remains unseen and the sound itself is distant or muffled, the viewer’s imagination rushes in to fill the gap. The result is often far more menacing than if the director had provided a full, high-fidelity recording of a recognizable creature snarling. Our internal fears, shaped by personal memories and cultural myths, can surpass anything an artist can depict in concrete detail.

The phenomenon of imaginative engagement is partly rooted in how our brains function. Neuroscientists have explored how the brain’s mirror neuron system is activated not just by literal perception of events, but also by suggestion and implication. When art leaves room for the viewer or listener to participate, these neural circuits fire in ways similar to firsthand experience. The act of completing an image or inferring a sound can evoke empathy and identification, forging a powerful connection with the artwork and its characters or themes.

While championing the power of omission, it’s also important to recognize that not all forms of ambiguity are created equal. Too much vagueness can result in confusion or frustration rather than intrigue. The key is balance: offering just enough detail to ground the audience, while withholding enough to spark curiosity. In painting, this might mean having a focal point rendered with some clarity, while the peripheral elements remain suggestive. In sound design, the main narrative cues might be present and understandable, but certain layers or background elements remain ambiguous, inviting the imagination to fill in the blanks.

Effective omission hinges on context and intention. A meticulously structured piece can carefully carve out negative space, ensuring that the “missing” details serve the story or emotional arc. Artists must make deliberate choices about what to show and what to hide. This intentional selectivity transforms the artwork from a mere display of skill into a dialogue with the audience.

Leaving out certain details in a piece of sound design can indeed trigger the audience’s imagination in profound ways. By resisting the urge to depict every nuance or audible cue, artists tap into a universal human inclination to seek meaning and resolve mysteries. Painters who leave elements obscured encourage viewers to project their own emotions and narratives onto the canvas, forging a personal connection that a more literal painting might not achieve. Sound designers who omit certain sonic details invite listeners to become co-creators, layering individual memories and imaginations onto the incomplete soundscape.

In a world saturated with high-definition images, immersive audio and constant stimulation, the subtlety of suggestion becomes all the more valuable. It offers a vital counterbalance—a space for quiet wonder and personal storytelling. Ultimately, the choice to omit details is not about negligence or laziness. It can be a deliberate artistic strategy that harnesses the creative power of the human mind, reminding us that some of the most enchanting and memorable experiences in art occur when we are invited to do a bit of the creating ourselves.

Enhancing Sound Effects by Slightly Offsetting Diverse Sound Elements

Offsetting Diverse Sound Elements

By slightly offsetting the diverse elements of a sound effect, sound designers can improve the emotional impact of short-duration events while maintaining believable visual synchronization.

This technique involves layering multiple sound elements and slightly staggering their onset times by a few frames or even less. Instead of having all components of a sound effect occur simultaneously—which can result in a flat or less dynamic auditory experience—the elements are introduced in quick succession. This creates a composite sound that unfolds over a brief period, adding complexity and depth. The staggered timing can be as minimal as a few milliseconds, but the effect on the listener's perception can be significant.

Sounds that evolve are more engaging than static ones...

Our perception is highly sensitive to variations in sound over time. Sounds that evolve, even over a short duration, are generally more engaging than static ones. By crafting sound effects with two or three syllables (essentially, sounds that have distinct beginning, middle, and end phases), designers can capture the listener's attention more effectively. This multi-syllabic structure introduces a rhythmic quality, making the sound more memorable and emotionally resonant.

Enhancing Emotional Impact

A gunshot composed of an initial explosive attack, maybe with lots of mid-range, followed one to three frames later by another explosive attack featuring a different part of the spectrum, maybe a beefier sound with more low frequencies, followed by natural or artificial reverb, is likely to feel more powerful and interesting than a single sound with all elements hitting at exactly the same time. Even more syllables, with longer intervals between them, can sometimes add to the effect, especially in a sound like an explosion that can be drawn out over a longer period of time. When using this technique for anything from gunshots to door slams, it is useful to fade down the end of each syllable just before the onset of the next one so that the first sound doesn’t mask the next one’s attack.
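The staggering and fade-down described above can be sketched in a few lines. This is only one possible reading of the technique, under stated assumptions: each syllable is a plain list of samples, onsets are given in frames, and "fading down" is taken to mean a linear ramp to a reduced level (`tail_gain`) that is then held. All names and default values are hypothetical.

```python
def frames_to_samples(frames, fps=24, sample_rate=48000):
    """Convert a picture-frame offset into a sample offset."""
    return round(frames * sample_rate / fps)


def layer_syllables(layers, fps=24, sample_rate=48000,
                    fade_samples=240, tail_gain=0.25):
    """Mix (samples, onset_in_frames) layers, sorted by onset, into one list.
    Every layer except the last is faded down just before the next onset."""
    onsets = [frames_to_samples(f, fps, sample_rate) for _, f in layers]
    length = max(on + len(s) for (s, _), on in zip(layers, onsets))
    mix = [0.0] * length
    for i, ((samples, _), onset) in enumerate(zip(layers, onsets)):
        next_onset = onsets[i + 1] if i + 1 < len(layers) else None
        for n, value in enumerate(samples):
            gain = 1.0
            if next_onset is not None:
                position = onset + n
                fade_start = next_onset - fade_samples
                if position >= next_onset:
                    gain = tail_gain  # hold the reduced level under the next attack
                elif position > fade_start:
                    progress = (position - fade_start) / fade_samples
                    gain = 1.0 - progress * (1.0 - tail_gain)
            mix[onset + n] += value * gain
    return mix


# Tiny numbers for readability: at 48 samples/s and 24 fps, 1 frame = 2 samples.
attack = [1.0] * 6   # stand-in for the mid-range first syllable
body = [1.0] * 4     # stand-in for the low-frequency second syllable
mix = layer_syllables([(attack, 0), (body, 2)], fps=24, sample_rate=48,
                      fade_samples=2, tail_gain=0.5)
print(mix)  # [1.0, 1.0, 1.0, 0.75, 1.5, 1.5, 1.0, 1.0]
```

In the output, the first syllable has already dropped to half level by the time the second one arrives two frames later, so the second attack stands out against a quieter bed.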

Maintaining Believability and Visual Synchronization

You might worry that introducing timing offsets could disrupt the synchronization between sound and visual cues, breaking the illusion of reality. But, when executed with precision, these offsets are often imperceptible in terms of visual sync but noticeable in terms of auditory richness. The human brain is adept at fusing sounds that occur within a close temporal window, perceiving them as a single event. By keeping the offsets within a few frames (each frame being approximately 1/24th or 1/30th of a second), the sound remains tightly linked to the visual action, preserving believability.
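For a sense of scale, the frame durations mentioned above convert to milliseconds as follows. This is plain arithmetic; the function name is just for illustration.

```python
def frames_to_ms(frames, fps):
    """Duration of a number of picture frames in milliseconds."""
    return frames * 1000.0 / fps


for fps in (24, 30):
    one = frames_to_ms(1, fps)
    three = frames_to_ms(3, fps)
    print(f"{fps} fps: 1 frame = {one:.1f} ms, 3 frames = {three:.1f} ms")
# 24 fps: 1 frame = 41.7 ms, 3 frames = 125.0 ms
# 30 fps: 1 frame = 33.3 ms, 3 frames = 100.0 ms
```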

All the sound design greats, including Ben Burtt, Walter Murch, Gary Rydstrom, and Richard King, have used this technique to great effect.