Music in Video Games

Updated: Mar 30, 2020

What is the aim of adaptive music?

It could be argued that many modern game studios are striving to achieve the level of emotional sophistication usually ascribed to Hollywood movies. Many of today’s games score and orchestrate their set pieces in much the same way as one would expect of an epic blockbuster movie. "There are many ways in which music can evoke or induce emotions, but there is clear evidence that strong or “peak” emotions in response to music (such as chills, lump in the throat, etc.) are associated with the creation of, and confirmation or violation of, expectancy (Sloboda 1991)" (R. Stevens 2014). The music composed for interactive games, however, is subject to a number of technical constraints and additional considerations that composers of linear music, such as film scores, do not have to contend with.

In a film, the composer is able to write a form and structure which follows the scene’s aesthetic and emotional content exactly, and which will be played back identically every time the film is watched. This is not the case in an interactive environment such as a video game, where the player is able to choose their own actions at will. In many games, for example, the same area can be played either stealthily or as open combat. In a film these two scenarios would clearly be scored in two separate ways, and the same is true in video games. The two events will not necessarily follow one another in a linear path, however, as the player may opt for one play style and then switch to the other part way through.

This is part of the problem. As Collins (2008, p. 3) states: “While they are still, in a sense, the receiver of the end sound signal, they are also partly the transmitter of that signal, playing an active role in the triggering and timing of those audio events.” This means that the player is an integral part of the way in which the music will be played, and the musical score must allow for many different eventualities and many different player types. This can have a knock-on effect on both the musical structure and the balance between musical repetition and the player’s own autonomy (R. Stevens 2017). Players will choose a genre of game that best matches their intrinsic needs (Madigan 2012) and they will also adopt different gameplay strategies according to their personality type (Bartle 1996). A player’s desire for relatedness or fellowship (Hunicke, LeBlanc, and Zubek 2004) might be met through music that rewards cooperative play (Kristian and Girard 2011) or that allows them to perform music with others (R. Stevens 2014).

Within adaptive music there are a number of techniques which can be employed to help mitigate the issues that arise from player autonomy. The two main methods are vertical remixing (or layering) and horizontal resequencing, each of which can be broken down further into sub-techniques, discussed in the sections that follow. As well as these two primary techniques, I will also discuss more directly adaptive techniques, as employed in some rhythm action games and in the recent Tetris Effect, in which the player’s actions directly affect the beat of the game.

Parallel Forms

Vertical remixing, or parallel forms, is the adaptive technique whereby composers break up a music cue into multiple musical layers. These layers can be divided by instrument family, musical function or some other means. The composer will often work with the programmers to decide which game events will trigger the layers to enter and exit the score. The system is then able to represent multiple variables through different layers, such as enemy proximity, player health, a special mode (such as stealth), time-based effects, the time of day, battle intensity or the number of enemies (R. Stevens 2017). The more layers the music has, the more control inputs are required from the game. Many games employ just two layers, one for exploring and another that is added for combat, which makes things easier for the programming team (Sweet 2016). Many open world adventure or action games, such as Splinter Cell, Fallout, Red Dead Redemption and Portal 2, employ this methodology. The advantages of this type of adaptive music are that changes can be made to the music immediately in response to an in-game event, and that the change from one cue to another can be a subtler, smoother transition that is less jarring to the listener. This is especially true in a parallel form version of the system, where multiple layers of the overall composition play continuously and the game states switch individual layers on and off depending on the in-game scenario.
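To make the idea of layers driven by game variables concrete, here is a minimal sketch in Python. It is not taken from any of the games or sources mentioned; the layer names, the `GameState` fields and the thresholds (four enemies for full percussion, half health for the pulse to start) are all assumptions chosen for illustration. All layers are assumed to play in sync, and only their volume changes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GameState:
    """Hypothetical snapshot of the variables the game reports to the music system."""
    enemy_count: int = 0
    player_health: float = 1.0  # 0.0 = dead, 1.0 = full health
    stealth: bool = False


def clamp01(x: float) -> float:
    """Limit a value to the range 0.0 to 1.0."""
    return max(0.0, min(1.0, x))


def target_gains(state: GameState) -> Dict[str, float]:
    """Map game variables to a target volume (0.0 to 1.0) for each layer."""
    return {
        # The bed always plays; the other layers are mixed in on top of it.
        "exploration_bed": 1.0,
        # Percussion grows with the number of enemies, full at 4 or more (assumed).
        "combat_percussion": clamp01(state.enemy_count / 4),
        # Pulse fades in below half health and is full at zero health (assumed).
        "low_health_pulse": clamp01(1.0 - 2.0 * state.player_health),
        # Stealth layer is simply on or off.
        "stealth_strings": 1.0 if state.stealth else 0.0,
    }


def step_gain(current: float, target: float, max_step: float) -> float:
    """Move a layer's gain towards its target by at most max_step per update,
    so that layers fade in and out rather than switching abruptly."""
    delta = target - current
    return current + max(-max_step, min(max_step, delta))
```

Calling `target_gains` once per game update and then moving each layer towards its target with `step_gain` gives the gradual layer changes described above. Each extra game variable means one more control input the programmers have to supply.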

This system is not without its drawbacks. Musical phrases are often easily interrupted by player actions, causing layers to fade in and out in a non-musical fashion, and the system is often musically inflexible, as the score cannot easily change key or tempo based on game events. This can lead to odd and jarring scenarios. Music may play at seemingly inappropriate times, for example when the player comes within range of a nearby enemy and triggers the battle cue even though that enemy is hidden from the player by a building or other in-game object. Alternatively, the player may leave combat while the cue continues and then fades to the next, but by the time the next cue has been triggered the player is already in another state, causing the system to switch constantly from one state to another.

Transitional Forms

Another common method of composition for interactive games is known as horizontal resequencing. Also known as transitional forms, this method of adaptive composition allows the development team to piece the music together dynamically based on the actions of the player. Music cues are broken down into constituent chunks or segments, such as an intro, an exploration loop (with multiple variations), an outro and ambient sections, often with pickup and outro lines joining each segment to the next. These segments are played back according to game events driven by the player’s decisions. In its simplest form this is done by cross fading between states as the player moves from one action in the game to the next. The main issue with this type of system, as shown in the video below, is that the transitions between one state and another are not always well timed and can sound clunky or interrupted.
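As an illustration of the simple cross fade described above, the following Python sketch computes the volume of the outgoing and incoming cues at a given point in the fade. The equal-power (sine/cosine) curve is my own choice for the example, made because it keeps the perceived loudness roughly constant through the fade; the sources cited here do not specify a curve, and the function name is hypothetical.

```python
import math
from typing import Tuple


def crossfade_gains(progress: float) -> Tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) for a fade that is `progress`
    of the way through, where 0.0 is the start and 1.0 is the end."""
    p = max(0.0, min(1.0, progress))  # ignore values outside the fade
    angle = p * math.pi / 2
    # Equal-power curve: the squares of the two gains always sum to 1.
    return math.cos(angle), math.sin(angle)
```

Note that nothing in this calculation knows about bars or phrases. The fade starts whenever the game state changes, which is exactly why this approach can sound badly timed or interrupted.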

The disadvantages of horizontal resequencing can be mitigated by variations on the simple cross-fading technique. Stingers have the advantage of playing into the structural expectations that the player has of the music, by coinciding with major events within the game. The player can, for example, hear a satisfying fanfare when they have destroyed their enemies, adding to the positive feedback loop between the game and the player. Such a system is easy to compose for and to implement in the game, allowing the composer to spend more time writing music (Sweet 2016). Another advantage is that the system can be instructed to wait until the next logical musical juncture, i.e. the next bar, before changing cue. This means that the music flows well from one state to the next. Horizontal resequencing done in this manner is sometimes referred to as phrase branching and can be considered one of the most musical variations.
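The idea of waiting for the next bar can be sketched with a little arithmetic. The Python below assumes a fixed tempo and time signature and that the cue started at time zero; the function names are hypothetical and not part of any particular audio engine.

```python
import math


def seconds_per_bar(bpm: float, beats_per_bar: int = 4) -> float:
    """Length of one bar in seconds at a fixed tempo."""
    return beats_per_bar * 60.0 / bpm


def next_bar_time(now: float, bpm: float, beats_per_bar: int = 4) -> float:
    """Time (in seconds since the cue started) of the next bar line at or
    after `now`. A requested cue change is held until this moment."""
    bar = seconds_per_bar(bpm, beats_per_bar)
    return math.ceil(now / bar) * bar
```

At 120 beats per minute in 4/4 a bar lasts 2 seconds, so a change requested 3.1 seconds into the cue would be held until the 4 second mark. The trade-off is a short delay between the game event and the musical response.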

Horizontal resequencing in one form or another is often employed in action adventure games such as World of Warcraft, Prince of Persia and SSX Tricky, and in beat ’em ups such as Killer Instinct. Sometimes the juxtaposition between one musical cue and another is too jarring to simply jump from one state to the next. In these cases a subtle variation on the technique is added in the form of a bridge transition. This smooths over the link between two disparate music cues while still allowing an immediate change of cue in sync with the player state. It allows for punctuation at the beginning and end of musical cues, and is more musical than cross fading but less musical than phrase branching. It also makes it possible to change tempo, harmony, instrumentation or melody based on a game event.
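A bridge transition can be thought of as a lookup table keyed on the state being left and the state being entered. The Python sketch below is an assumption-laden illustration: the state names, cue names and the `_loop` naming convention are all invented for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical table of bridge cues, keyed by (from_state, to_state).
# Pairs that are not listed jump straight to the new cue.
BRIDGES: Dict[Tuple[str, str], str] = {
    ("explore", "combat"): "bridge_explore_to_combat",
    ("combat", "explore"): "bridge_combat_to_explore",
}


def transition_plan(current: str, target: str) -> List[str]:
    """Return the cues to play, in order, when moving between two states."""
    if current == target:
        return []  # no change of state, so keep playing the current cue
    plan = []
    bridge = BRIDGES.get((current, target))
    if bridge is not None:
        plan.append(bridge)  # short linking passage written for this pair
    plan.append(target + "_loop")  # then settle into the new state's loop
    return plan
```

Because each bridge is written for one specific pair of cues, it can carry a change of tempo or key, but the number of bridges to compose grows quickly as more states are added.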

Ornamental Forms

Many development teams opt for another version of horizontal resequencing known as stinger-based composition or ornamental forms. In this system an ambient bed of music plays for each area of the game and serves as the main backdrop for the stingers to play over. The stingers are assigned to different in-game events which can be triggered by the player at any time, and in some cases stingers are allowed to overlap. Generally, they do not have a connecting musical framework and are instead made up of accents and crescendos which serve to punctuate different game events (Sweet 2016). Such events include the striking of an enemy, falling into a precipice, enemy or player death or near death, entering or exiting an area, a hint that the player should take a certain course of action in a mystery game, or any other of the myriad micro-events which can be assigned a stinger to provide the player with an auditory reward or sonic feedback. This means that stingers can often serve a ludic function in the game as well as being part of the overall musical backdrop. "The ludic or metonymic is not separable from the metaphoric (that which relates to the game as a story or world; Whalen 2004). A piece of music may confirm that an action has been successful (defeat of the enemy) and thus provide the positive reinforcement important to flow" (R. Stevens 2014).
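The relationship between the ambient bed, the game events and the stingers can be sketched as follows. This Python example is only an illustration of the structure described above; the class, the event names and the stinger durations are hypothetical, and no real audio is played.

```python
from typing import Dict, List, Optional, Tuple


class StingerPlayer:
    """Tracks which stingers are sounding over an ambient bed (no real audio)."""

    def __init__(self, bed: str, stingers: Dict[str, Tuple[str, float]]):
        self.bed = bed
        # Maps a game event to (stinger name, length in seconds).
        self.stingers = stingers
        self.active: List[Tuple[str, float]] = []  # (name, end time)

    def on_event(self, event: str, now: float) -> Optional[str]:
        """Trigger the stinger assigned to an event, if there is one."""
        cue = self.stingers.get(event)
        if cue is None:
            return None  # this event has no stinger assigned
        name, length = cue
        # Stingers may overlap, so this is added without stopping the others.
        self.active.append((name, now + length))
        return name

    def playing(self, now: float) -> List[str]:
        """Everything sounding at time `now`: the bed plus unfinished stingers."""
        self.active = [(n, end) for n, end in self.active if end > now]
        return [self.bed] + [n for n, _ in self.active]
```

For example, a player created with `{"enemy_hit": ("brass_accent", 1.5)}` would report both the bed and `brass_accent` as playing for 1.5 seconds after an `enemy_hit` event, and only the bed after that.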

The downside to this system is that the phrases typically need to be non-melodic in order to fit well, and that, if not done well, the result can feel like Mickey Mousing, i.e. mimicking the player’s actions too closely. This can make scoring difficult, as the elements often have no rhythmic framework linking them and phrase lengths rely heavily on the way in which the game is scripted (Sweet 2016).

Adaptive Music

As well as these common methodologies for adaptive music, in some games the music itself either generates the gameplay, as in a rhythm action game, or adapts directly to the player’s inputs in real time, as in Tetris Effect. In this contemporary take on the classic Tetris game, the developers chose to let the player’s matching and moving of shapes on the board directly influence the game music. It does this in a variety of ways depending upon the theme of the level. The longer the player is able to keep the game going by creating lines, the further the music track develops. This also affects the instrumentation, which the player is responsible for adding to the musical mix by way of their shape-matching actions.

The choice to make the music and player input feel as though they are one adds to the frenetic feel of the game whilst serving as auditory feedback and reward at the same time. This is especially the case when the player enters the special game state by building up their zone meter. This allows the player to hit a trigger combo which slows down time, allowing them to clear up to 16 lines (not possible in previous versions). Once the player has successfully completed a large block of lines in this mode, or fails a line, filtering effects are triggered and the music escalates once more in a satisfying crescendo as the mode is exited. In some of the levels the player making a connection on the board coincides directly with a percussive hit or a musical stinger being triggered. In other levels moving the shapes around the board has a direct effect on the musical lines in the game. Surprisingly few games have this direct relationship between player input and music outside of rhythm action games. It stands to reason that this type of adaptive music would be well suited to a puzzle game, where fast action input and a constantly changing board allow for ‘Mickey Mousing’ of player action in a way which will not become annoying or break player immersion. A direct relationship between the music and real-time player inputs in an action adventure game could easily become ridiculous and counter-productive to the overall experience. It is nonetheless interesting to see that this type of adaptive music has its place in the right environment.


