Looping Music in Unity

Unity’s default audio capabilities have never been a particular strength. Some much needed functionality has been added over its lifespan, like audio mixers, sends, and insert effects, but it’s still extremely limited compared to the feature sets found in widely used audio middleware and even other game engines.

In this post, I’m going to talk about three approaches of gradually increasing complexity for looping music with native Unity audio. Hopefully, there will be something useful here for a variety of experience levels.

First, we’ll use Unity’s default loop functionality. Second, we’ll use a bit of audio editing and AudioSource.PlayScheduled() to create a seamless loop. Lastly, we’ll calculate a loop point given the beats per minute (BPM), the beats per measure (Time Signature), and the total number of measures in the track, and use it to create a simple custom looping system, again with PlayScheduled().

Before starting, it should be noted that the MP3 format is ill-suited for this application for technical reasons beyond the scope of this post and should be avoided. Ogg and WAV are good options that handle seamless looping well in Unity.

1. Default Loop

This is the simplest option, requiring no scripting at all, but it’s also the most limited. For music with no reverb or tail to speak of, or music that doesn’t need to restart exactly on a measure, it can be serviceable. A quick fade at the end of the last bar can work for less ideal music tracks, but it will result in an obvious and unnatural loop point.

Create a new object in your scene, or use one that already exists. Whatever is appropriate.

Add an AudioSource component to it and set the AudioClip to your music file, either from the menu to the right of the field or by dragging and dropping it from the project browser.

Make sure that “Play On Awake” and “Loop” are enabled. Your music will play when you start the scene and loop at the end of the audio file.

2. Manual Tail/Release Overlap

This method requires some work outside of Unity with an audio editor or Digital Audio Workstation (DAW). Here we’ll still use Unity’s default looping functionality, after playing an introductory variation of the looped track.

Before doing anything in Unity you need two separate versions of the music track in question, one with the tail cut at the exact end time of the last bar/measure, and another with that tail transposed to the beginning of the track, so that it overlaps with the start.

Ensure that the start and end of these tracks are at a zero crossing, to avoid any discontinuities (audible pops) during playback. This can be accomplished with extremely short fades at the start and end points. This second track will transition seamlessly from the introductory track and loop seamlessly as well.

Add an AudioSource to an object as in the previous section and set the second edit of the track (with the tail overlapped with the start) as the default AudioClip. This time, “Play On Awake” should NOT be enabled.

This is where a bit of scripting is required. Create a C# script and add it to the same game object as your AudioSource.

Open it in your IDE of choice. This will only require a few lines of code. First, declare two public variables: an AudioSource and an AudioClip.
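A minimal sketch of those declarations (the class and variable names here are my own, chosen to match the Inspector labels mentioned below):

    using UnityEngine;

    public class MusicIntroLoop : MonoBehaviour
    {
        // The AudioSource on this object, with the looping edit as its clip
        public AudioSource musicSource;

        // The intro edit of the track (no tail overlapped at the start)
        public AudioClip musicStart;
    }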

Save this and switch back to the Unity editor. There will be two new fields for the C# Script component in the Inspector: “Music Source” and “Music Start.”

Click and drag the AudioSource you added to your game object earlier into the “Music Source” field on your script. Do the same with “Music Start,” using the intro edit of the clip (without a tail at the start or end).

This is where the code that makes noise comes in.

When the scene starts, the first clip will play once and the second clip will be scheduled to play as soon as the first has ended. This start time is determined simply by adding the length in seconds of the first clip to dspTime (the current time of the audio system in seconds, based on the actual number of samples the audio system has processed).
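A sketch of that Start() method, added to the MusicIntroLoop class above (one assumption here: the intro is played with PlayOneShot(), which uses a separate voice, so the source’s main voice stays free for the scheduled loop clip):

    void Start()
    {
        // Current time of the audio system, in seconds
        double startTime = AudioSettings.dspTime;

        // Length of the intro clip in seconds, computed from the sample
        // count, which is more precise than the float AudioClip.length
        double introLength = (double)musicStart.samples / musicStart.frequency;

        // Play the intro edit immediately
        musicSource.PlayOneShot(musicStart);

        // Schedule the looping clip to start exactly when the intro ends
        musicSource.PlayScheduled(startTime + introLength);
    }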

From that point, the track will loop normally with Unity’s default loop functionality.

3. Calculating the Loop Point and Looping Manually

The last approach requires more scripting work and some extra information about the music itself, but no editing of the audio file. We’ll create a simple custom looping solution, using two AudioSources and AudioSource.PlayScheduled(), that calculates the end of the last bar or measure from data entered in the Inspector and uses that to determine the loop interval.

Add two AudioSources to your game object and set the default AudioClip for both to the music track you’re going to loop. This will allow each repeat to overlap with the tail of the previous one as it plays out.

Add a new script to your game object and open it in your IDE. First, we need some public variables that we can set in the Inspector: an array of AudioSources and three integer values which correspond to simple properties of the musical composition itself.
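Sketched out (again, the class and variable names are my own, chosen to match the Inspector labels used below):

    using UnityEngine;

    public class MusicLooper : MonoBehaviour
    {
        // Two AudioSources, both with the music track set as their clip
        public AudioSource[] musicSources;

        // Tempo of the track in beats per minute
        public int musicBPM;

        // Beats per bar/measure
        public int timeSignature;

        // Total number of bars/measures in the track
        public int barsLength;
    }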

In the Inspector, set the Size of the Music Sources array to 2 and drag the two AudioSources you’ve created to the Element 0 and Element 1 fields.

Then enter a few music properties. Music BPM is the tempo of the music track in beats per minute, Time Signature is the number of beats per bar/measure, and Bars Length is the number of bars/measures in the track. You need to know these values for this calculation to work.
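For example, a 32-bar track in 4/4 time at 120 BPM loops every 32 × 4 × (60 ÷ 120) = 64 seconds.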

Next, we need some private variables for some values we will be calculating in the script itself.

The loopPoint values will be used to store the loop interval once it has been calculated. Time will hold the value of dspTime at the start of the scene and be incremented by loopPointSeconds for each PlayScheduled() call. And nextSource will keep track of which AudioSource needs to be scheduled next.
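Sketched out, with names matching the description above:

    // Loop interval: total beats divided by BPM gives minutes,
    // multiplied by 60 gives seconds
    private double loopPointMinutes;
    private double loopPointSeconds;

    // dspTime at which the most recent play was scheduled
    private double time;

    // Index (0 or 1) of the AudioSource to schedule next
    private int nextSource;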

Now, in the Start() method we need the script to calculate the loop interval, play the first AudioSource, and initialize the time and nextSource values.
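A sketch of that Start() method, using the fields above:

    void Start()
    {
        // Total beats in the track (bars x beats per bar) divided by the
        // tempo gives the loop interval in minutes; multiply by 60 for seconds
        loopPointMinutes = (double)(barsLength * timeSignature) / musicBPM;
        loopPointSeconds = loopPointMinutes * 60.0;

        // Start the first source immediately and remember when it started
        time = AudioSettings.dspTime;
        musicSources[0].Play();
        nextSource = 1;
    }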

The custom loop functionality itself is defined in the Update() method, which is called every frame.

First, we check whether the nextSource is still playing. Then, if it is NOT (see the sketch after this list):

  1. Increment the time by the loop interval (loopPointSeconds).
  2. Schedule the nextSource AudioSource to play at that time.
  3. Toggle the value of nextSource (from 1 to 0 or from 0 to 1), so the script will check and schedule the other audio source.
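
Put together, the Update() logic might look like this:

    void Update()
    {
        // Once this source has finished playing (or, on the first check,
        // has never been played), schedule its next play
        if (!musicSources[nextSource].isPlaying)
        {
            // Advance the schedule by one loop interval
            time = time + loopPointSeconds;

            // Queue this source to start exactly at the loop point
            musicSources[nextSource].PlayScheduled(time);

            // Flip between 0 and 1 so the two sources alternate
            nextSource = 1 - nextSource;
        }
    }

Note that this relies on each scheduled clip finishing (tail included) before its source is needed again, so the full clip must be shorter than two loop intervals.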

And that’s it. The music track should begin playing at the start of the scene and continue to repeat at the loop point until the object is destroyed.

Rhetorical Sound Design

Hearing is weird. It’s abstract in a way that sight isn’t. A picture can clearly communicate a sense of size and space. A series of pictures can communicate speed and distance. Sound is only movement. Almost any movement. It’s the vibrations people and things make when they pass through the air and come into contact with each other.

Hearing is also different from sight in part because we have less control over what we hear. We don’t open and close our ears, though we can try to block them. We don’t really focus our ears in the way that we do our eyes. We’re always hearing (as long as we are able to), and so we often become so used to sound that we don’t actively notice it unless we make the effort. We learn to tune a lot of sounds out, but instinctually notice when they are absent.

I think this is why sound design often goes unnoticed unless it is so incongruent that it breaks the audience’s immersion. Effective sound design sells the argument that what they are seeing with their eyes is real. It reinforces all of the concrete information about size, space, and action that they see on a screen. It is simply expected to be there and to sound “right.”

For that reason, it can be useful to have certain heuristics to apply to this problem; the problem of making things sound “right.” I’m loosely adapting Aristotle’s main rhetorical appeals—logos, ethos, and pathos—as a framework for thinking about effective sound design, with a particular focus on game audio. There is overlap between these appeals, because they are all fundamentally related (emotion and logic are never truly separate) and because each sound effect is essentially its own argument that should ideally succeed on multiple levels.

Pathos: The Emotional Appeal

An important function of any synchronized or reactive audio is to reinforce the emotional experience of the scene. This is where the role and function of sound design overlaps most with that of the musical score. Does an impact feel big? Does the gun the player is firing feel powerful? Does the giant monster they’re fighting feel enormous and deadly? Does that abandoned mansion feel haunted? This is the visceral, “game feel” component of game sound effects.

This has important implications for game design. Emotionally satisfying audio cues influence player behavior in a variety of ways.

  • A feeling of constant or impending danger can make players play slower and more cautiously.
  • A powerful-sounding weapon can inspire confidence and encourage players to play more aggressively.
  • A weak-sounding weapon might be used less often, regardless of its practical functionality.

Zander Hulme told a relevant story along these lines, about multiplayer weapon sound effects in a Wolfenstein game, at a PAX Aus 2016 panel.

The players with the weaker-sounding weapon believed they were at a disadvantage and performed worse, despite both teams having functionally identical weapons. Replacing the weaker sound effects with something more satisfying fixed the perceived weapon imbalance. Game audio doesn’t simply play a passive support role in game design.

Logos: The Logical Appeal

Another important function, particularly in game audio, is communicating factual information to the audience. What exactly is making the sound? What direction is the sound coming from? From how far away? In what kind of space? Is the audience in that space or a different one? Can your audience discern all of these things, and are they intended to? A lack of clarity and focus should be an intentional choice, not the result of carelessness or oversight.

Much like the emotional appeal, this too is a practical game design consideration. Audio information provided to the player can directly influence their decision-making and behavior in the game space, in a wide variety of contexts.

  • The recognizable sound of an enemy charging a powerful attack helps the player discern when to evade.
  • The distinct sound of a sniper rifle being fired makes them reconsider peeking around a corner.
  • The suddenly loud crack of their footsteps on a tile floor tells them that sneaking will be difficult and may require them to slow down.
  • The clarity, volume, and propagation of sounds in competitive multiplayer games can significantly impact what kind of information players have about strategies of their opponents, even without line of sight.

In Counter-Strike, for example, players have to be mindful of moving at full speed, because running footsteps and jump landings can give away valuable information to opponents within earshot and inform counter-strategies. At the same time, being aware of this fact allows players to intentionally make noise to create misinformation.

Below is a clip of a CS:GO streamer, DaZeD, faking a drop by jumping on the ledge above. The opposing players throw a flash grenade and attempt to retake the room, expecting him to be below and blinded, but they don’t predict his superior positioning and lose the fight.

This only works because both teams are aware of the landing sounds and because these sounds are audible from positions outside of the room.

A subsequent update added unique landing sounds per surface, which complicates this scenario. In this clip, he actually jumps on a wood surface at the end of the upper tunnel. Now, an observant player could note that this surface sound effect is not what they would hear when opposing players drop onto the stone floor below. If he instead faked further to the left, the sounds would match as they did in older versions of the game.

Sound effects can provide extremely valuable information to players beyond the limitations of line of sight. It’s important to keep this in mind, even for members of the development team who don’t deal directly with audio. If footstep propagation distance determines when and where players can afford to move at full speed, this can influence how major routes through the map are designed. If this isn’t accounted for, it can have unintended consequences on player behavior and map flow. This applies in many other seemingly non-audio design contexts as well.

Ethos: The Appeal to Character

In the context of sound design, it’s useful to think of ethos as authenticity. Does the audience accept that this sound belongs in the space? Does it fit the art direction of the game? What stylistic considerations must be made to ensure that is the case? If the game is heavily stylized, there is plenty of room for stylized sound effects. If the game strives for pseudo-realism and photo-realistic graphics, it is probably appropriate to keep the sound effects relatively grounded. Often, however, what the audience expects is very different from reality. Authenticity is what it seems like something should sound like, rather than necessarily what it actually sounds like.

Practically, this has a large degree of overlap with Pathos, the emotional appeal, in that the most emotionally resonant sounds should also be authentic, but they are distinct. An ambience could be suitably unsettling, but not feel authentic in the wrong space. Creaking wood and howling wind might suit a creepy, old house, but be very much out of place in an abandoned space station, even though both evoke a lonely, isolated atmosphere. An impact could be distinct and punchy, but not fit the style of the game or the source object or actor.

A very common example of all of these elements in action is effective gunshot sound effects, particularly for real-world weapons. Firearm field recordings on their own are rarely very interesting or particularly distinct. This is in part because of the difficulty of capturing the character and impact of sounds at extreme volume levels. Raw field recordings of firearms tend to sound similar. To account for this, sound designers need to create hyper-realistic gunshot sounds, layering and processing a variety of explosive, mechanical, and environmental elements to achieve the explosive, powerful sounds that audiences expect. This is both more authentic than a simple gunshot field recording, and more emotionally impactful. A core goal of satisfying weapon sounds is to recreate the visceral, explosive impact of firing them.

Given that, in situations where a large variety of weapons is called for, the sound designer will need to differentiate each of them. This is especially true of games with a large selection of realistic weapons. It is important both to establish a unique character for each and to communicate that distinction to the player, who should ideally be able to tell what weapon is being fired at them from the sound alone. A sniper rifle might have an exaggerated, long reverb tail to really sell its firepower. Pistols and submachine guns might emphasize the mechanical elements over the explosive punch and the reverb tail to make them feel smaller. An assault rifle might lie somewhere in between.

Establishing these rhetorical choices and applying them consistently provides emotional satisfaction, authenticity, and clarity to the player.