Audio Implementation in Godot for Hauma
First I'd like to mention that the setup described below is by no means production ready. It's more of a rough prototype of an idea.
Every game has different needs and scope. The Hauma Prologue is a relatively short, 25-minute 2D prototype that takes place in about six different locations and has a handful of special audio events.
Since it's a visual-novel-style game, it's very dialogue heavy, and the ambiances mainly act as background beds for the voice overs.
The music is just loops that we cross fade between. If you're looking for a way to dynamically change music tracks in a synced manner, this ain't it :).
But for this scenario I felt that using full-blown audio middleware solutions like FMOD/Wwise would be overkill. I'm also a fan of working within native engine audio features as long as they allow for what the project needs without making it too difficult.
For Hauma we needed a way to transition between the ambiances of different scene locations, for instance from in front of a club to inside the club, then on to a different room, and so on. The same goes for music and special sequences.
Our first idea was to just use Tweens and fade audio streams in and out. But while that was working fine, it wasn't very comfortable to work with. Tuning the transition durations and setting the correct easing for the tweens was cumbersome. Ideally I wanted to have a visual way to author these transitions and also play the occasional oneshot sound, for instance if you enter a building and want a door sound to play at the right time during the transition.
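For comparison, here is a minimal sketch of what such a Tween-based crossfade could look like in Godot 3.x. The node names (`Tween`, `AmbOutside`, `AmbInside`) and the function name are assumptions for illustration, not taken from the Hauma project.

```gdscript
extends Node

# Assumed scene layout: a Tween child and two AudioStreamPlayer children.
onready var _tween = $Tween
onready var _outside = $AmbOutside
onready var _inside = $AmbInside

func crossfade(from_player, to_player, duration = 2.0):
    # Start the incoming stream silent, then fade both players in parallel.
    to_player.volume_db = -80.0
    to_player.play()
    _tween.interpolate_property(from_player, "volume_db",
            from_player.volume_db, -80.0, duration,
            Tween.TRANS_CUBIC, Tween.EASE_IN)
    _tween.interpolate_property(to_player, "volume_db",
            -80.0, 0.0, duration,
            Tween.TRANS_CUBIC, Tween.EASE_OUT)
    _tween.start()
    # Stop the outgoing stream once both fades are done.
    yield(_tween, "tween_all_completed")
    from_player.stop()
```

Calling `crossfade(_outside, _inside)` works, but every offset, duration and oneshot has to be expressed in code, which is the part that gets cumbersome.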
I've set up a small demo project where you can check out the very core of the transition implementation.
You can find it on my gitlab:
To host all our audio streams and the related scripts we created a dedicated AutoLoad Scene (Singleton) named AudioEngine (should've named it Manager instead :)).
Regarding how to set up an Animation Tree to work as a state machine, please refer to the official documentation. The play mode of the Animation Trees should be set to "Travel".
Our AudioEngine scene contains separate Animation Trees and Animation Players for the music and ambiances, a dedicated audio stream player to handle oneshot sounds during transitions, and some stuff related to voice overs, which I will get to later.
Both animation players also contain an empty looping animation named "init" which acts as the root node of the state machine as seen in the picture above. More on that later.
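As a rough sketch, the playback objects that drive the two state machines could be fetched once in the AudioEngine script like this. The node names `AmbianceTree` and `MusicTree` are assumptions; only `_amb` appears in the code further down.

```gdscript
extends Node

var _amb    # AnimationNodeStateMachinePlayback of the ambiance tree
var _music  # AnimationNodeStateMachinePlayback of the music tree

func _ready():
    # The trees need to be active before the playback objects do anything.
    $AmbianceTree.active = true
    $MusicTree.active = true
    # "parameters/playback" is where a state machine root exposes its playback.
    _amb = $AmbianceTree.get("parameters/playback")
    _music = $MusicTree.get("parameters/playback")
```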
The actual ambiance states are looping animations. These can either be empty, so the state machine just "waits" until it needs to transition to another state, or they can be used to play random oneshot sounds to add some more variance to a static loop.
We used this for the exterior ambiance to play random traffic sounds, dogs barking and the like. This part of the system is still in a "whatever works" state for now and definitely needs to be improved.
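One possible way to drive such random oneshots, sketched under assumptions: a looping ambiance animation calls a method through a Call Method track, and the method decides whether and what to play. The `traffic_sounds` array, the `AmbOneshots` node and the `chance` parameter are all hypothetical.

```gdscript
extends Node

# Hypothetical list of oneshot streams, filled in the inspector.
export(Array, AudioStream) var traffic_sounds = []

func _ready():
    randomize()

# Meant to be called from a Call Method track in a looping ambiance animation.
func play_random_oneshot(chance = 0.5):
    if traffic_sounds.empty() or randf() > chance:
        return
    var player = $AmbOneshots  # assumed AudioStreamPlayer node
    player.stream = traffic_sounds[randi() % traffic_sounds.size()]
    player.play()
```

Placing a few of these calls at irregular points in a long loop keeps the pattern from feeling too regular.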
Here's what a typical transition animation looks like, transitioning from an outside ambiance to an inside ambiance while also playing a door open sound.
Three things are happening here, we:
- fade out the front of club ambiance stream player and stop it after the fade out is finished
- play a one shot door open sound via the oneshot player
- start the in-the-club ambiance and fade it in
For the fade in/out animations I had the best results using a cubic curve setting with easing set to 0.35 for fade ins and 1.35 for fade outs. Also, notice how the fade in/out are slightly offset from one another? That's the fine control you gain using this setup. Setting that up with simple tweens would get a bit involved quickly.
Now this is important: when connecting the Animation Nodes in the Animation Tree, we need to pay attention to the switch mode of the transitions. So let's say we are currently in the "outside" state and want to travel to "inside":
# outside -> to_inside -> inside
We want the second transition to have the switch mode "AtEnd", meaning that when the Animation Tree reaches it, it will play out the "to_inside" animation containing our fade logic completely before entering the looping "inside" animation.
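The switch mode is normally set in the Animation Tree editor. If you prefer to enforce it from code, a sketch along these lines should work in Godot 3.x; the helper name `set_at_end` is made up for this example.

```gdscript
extends Node

# Hypothetical helper: makes sure the transition from -> to uses "AtEnd".
func set_at_end(tree, from, to):
    var state_machine = tree.tree_root  # expected: AnimationNodeStateMachine
    # Reuse an existing transition if the two states are already connected.
    for i in range(state_machine.get_transition_count()):
        if state_machine.get_transition_from(i) == from \
                and state_machine.get_transition_to(i) == to:
            state_machine.get_transition(i).switch_mode = \
                    AnimationNodeStateMachineTransition.SWITCH_MODE_AT_END
            return
    # Otherwise create the connection with the desired switch mode.
    var transition = AnimationNodeStateMachineTransition.new()
    transition.switch_mode = AnimationNodeStateMachineTransition.SWITCH_MODE_AT_END
    state_machine.add_transition(from, to, transition)
```

For the example above this would be called as `set_at_end($AmbianceTree, "to_inside", "inside")`, where `AmbianceTree` is an assumed node name.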
Transitions are triggered from code. Hauma uses Ink to manage all the dialogue and game progression with custom commands that trigger scene transitions and subsequently the audio transitions too.
Here's the code snippet to trigger an ambiance transition. It works the same for the music.
# _amb
# is a reference to the ambiance animation tree player playback
#
# amb_to_play
# is the target animation (state)
# the animation tree player should travel to
func play_amb(amb_to_play):
    # the state machine takes a couple of frames to start playing
    if not _amb.is_playing():
        Logger.debug("AE: ambiance statemachine not playing yet")
        _amb.start("init")
        yield(get_tree().create_timer(0.5), "timeout")

    if amb_to_play != $AmbianceAnims.get_current_animation():
        Logger.debug("AE: amb to play is %s" % amb_to_play)
        _amb.travel(amb_to_play)
As you can see we ran into one caveat. The Animation Tree (state machine) takes a couple of frames to start before it's playing.
If a transition is triggered for the first time and the animation tree player is not playing yet, we start it from the aforementioned "init" animation, then give it a bit of time to "get going" before finally transitioning to where we want to go (if you know a better way to do this please let me know).
Also, it's important to know where the game can transition to when launching, for example from a save game, and connect the init state to all possible target states.
Again, as I mentioned earlier this is all in very rough prototype stage. This is one of the areas that I feel can be solved in a better way.
For this prototype phase we didn't want to overthink things too much and just ran with what seemed to be working.
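One alternative to the fixed 0.5 second timer, sketched here and not tested against the Hauma project: poll the playback once per frame until it reports that it is playing, with an upper bound so the loop can't hang. The limit of 30 frames is an arbitrary assumption, and the fetch in `_ready()` uses the assumed node name `AmbianceTree`.

```gdscript
extends Node

var _amb  # AnimationNodeStateMachinePlayback, fetched in _ready()

func _ready():
    _amb = $AmbianceTree.get("parameters/playback")  # assumed node name

func play_amb(amb_to_play):
    if not _amb.is_playing():
        _amb.start("init")
        # Wait frame by frame instead of a fixed delay.
        var frames_left = 30
        while not _amb.is_playing() and frames_left > 0:
            yield(get_tree(), "idle_frame")
            frames_left -= 1

    if amb_to_play != $AmbianceAnims.get_current_animation():
        _amb.travel(amb_to_play)
```

If the Animation Tree is set to process in physics mode, waiting on `"physics_frame"` instead of `"idle_frame"` would be the matching choice.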
Automating Audio Bus Effect Parameters
Another idea I tried to implement using this system was dynamically changing effect parameters on the Music Audio Bus. We have a room next to the main room of the club, and when entering it I wanted to apply a low pass filter to the music bus and pan the club ambiance/music a bit to make it appear to come from the next room.
To do this I've exposed the needed parameters in the AudioEngine.gd script with setters/getters. So whenever the parameters get changed within an animation the settings are applied.
export(float, EXP, -1.0, 1.0, 0.1) var fx_music_pan setget set_fx_music_pan, get_fx_music_pan
export(float, EXP, 20.0, 20000.0, 2.0) var fx_music_lpf setget set_fx_music_lpf, get_fx_music_lpf
export(float, EXP, -80, 6.0, 0.5) var fx_music_vol setget set_fx_music_vol, get_fx_music_vol

# [...]

func set_fx_music_pan(value):
    # set via bezier curves as -100 - 100, easier to edit
    # (anything outside -1..1 is treated as a percentage)
    if value > 1 or value < -1:
        fx_music_pan = value * 0.01
    # set in editor (probably never gonna use that but anyway)
    else:
        fx_music_pan = value
    # effect slot 1 on the Music bus is the panner
    AudioServer.get_bus_effect(AudioServer.get_bus_index("Music"), 1).pan = fx_music_pan

func get_fx_music_pan():
    return fx_music_pan

func set_fx_music_lpf(value):
    fx_music_lpf = value
    # effect slot 0 on the Music bus is the low pass filter
    AudioServer.get_bus_effect(AudioServer.get_bus_index("Music"), 0).cutoff_hz = fx_music_lpf

func get_fx_music_lpf():
    return fx_music_lpf

func set_fx_music_vol(value):
    fx_music_vol = value
    AudioServer.set_bus_volume_db(AudioServer.get_bus_index("Music"), value)

func get_fx_music_vol():
    return fx_music_vol
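The setters above rely on a fixed effect order on the "Music" bus: a low pass filter in slot 0 and a panner in slot 1. These would normally be added in the Audio bus layout editor. As a sketch, assuming the bus exists but has no effects yet, the same layout could also be created from code:

```gdscript
extends Node

func _ready():
    var bus = AudioServer.get_bus_index("Music")
    # Only add the effects if the bus layout doesn't provide them already.
    if AudioServer.get_bus_effect_count(bus) == 0:
        AudioServer.add_bus_effect(bus, AudioEffectLowPassFilter.new(), 0)
        AudioServer.add_bus_effect(bus, AudioEffectPanner.new(), 1)
```

Either way, the slot indices used in the setters have to match the actual order of the effects on the bus.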
Without going into too much detail, all the voice overs in the game are triggered via Ink and a custom dialogue system. Each voice over line is assigned a unique ID in the Ink script. The function that plays the voice over checks whether a corresponding wav file exists, loads the stream, stops any currently playing voice overs and then dynamically creates a Stream Player and starts playback.
The created Stream Players are attached to a node "CurrentVoiceOvers". They are either disposed of automatically after they finish playing, by making use of the Stream Player "finished" signal, or, as mentioned before, stopped and disposed of when the system tries to play another voice over. This basically allows for skipping voice overs.
Here's just the function that plays the voice overs for reference.
func play_voice_over(speaker, correlation_id, delay = 0):
    var voice_over_path = "res://Audio/VoiceOver/%s/en_%s_%s.wav" \
            % [speaker, correlation_id, speaker]
    var stream = null

    if ResourceLoader.exists(voice_over_path):
        stream = load(voice_over_path)
    else:
        Logger.warning(("Could not find voice over audio file for speaker %s" + \
                " and correlation id %s.") % [speaker, correlation_id])
        return

    # see if any other voice over is playing and stop it (fade-out)
    stop_voice_over()

    var vo = AudioStreamPlayer.new()
    vo.set_stream(stream)
    vo.set_bus("VoiceOver")
    vo.connect("finished", self, "_on_vo_stream_player_finished")
    $VoiceOver/CurrentVoiceovers.add_child(vo)

    if delay > 0:
        yield(get_tree().create_timer(delay), "timeout")

    vo.play()
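The function above calls `stop_voice_over()` and connects to `_on_vo_stream_player_finished`, neither of which is shown in this post. The following is only a guess at what simple versions of them could look like. It stops playback immediately and leaves out the fade-out mentioned in the comment.

```gdscript
extends Node

# Hypothetical helpers; the originals are not part of the post.
func stop_voice_over():
    for vo in $VoiceOver/CurrentVoiceovers.get_children():
        vo.stop()
        vo.queue_free()

func _on_vo_stream_player_finished():
    # "finished" carries no arguments, so look for players that are done.
    for vo in $VoiceOver/CurrentVoiceovers.get_children():
        if not vo.playing:
            vo.queue_free()
```

With helpers like these, a player that is still waiting out its `delay` could be freed before it starts, so a guard such as `if is_instance_valid(vo):` before `vo.play()` would be a sensible addition.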
The main limitation of this setup becomes apparent when it comes to mixing and volumes. All the crossfades go from 0 dB to -80 dB or vice versa. Tweaking the individual volume of a stream player after you've set up all the transitions is not really feasible. Instead I've done all the volume adjustments of individual assets outside of the engine, something that is perfectly doable for a project of this limited scope.
Of course this is not really desirable for bigger projects, and it's the first thing I would try to address when Hauma enters the next project phase. One way to do this could be to attach an audio stream player script to each of the stream players, with a method that scales the stream player volume relative to the volume it is set to.
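A sketch of how such a script might look, as one possible reading of the idea: the animations key a `fade_db` property instead of `volume_db`, and a separate `trim_db` offset holds the per-asset mix level. Both property names are hypothetical.

```gdscript
extends AudioStreamPlayer

# Per-asset mix offset, set once in the inspector.
export(float, -24.0, 24.0, 0.5) var trim_db = 0.0 setget set_trim_db
# Fade level keyed by the transition animations (0 dB = fully faded in).
export(float, -80.0, 0.0, 0.5) var fade_db = 0.0 setget set_fade_db

func set_trim_db(value):
    trim_db = value
    _apply_volume()

func set_fade_db(value):
    fade_db = value
    _apply_volume()

func _apply_volume():
    # Combine both values, but never go below the -80 dB floor.
    volume_db = max(fade_db + trim_db, -80.0)
```

The transitions would stay untouched when a single ambiance needs to be a few dB quieter, since only `trim_db` changes.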
Another thing that's not ideal is that the ambiance streams always start from the beginning. This could be randomized as well.
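A minimal sketch of a randomized start, assuming the stream reports its length (the helper name is made up):

```gdscript
extends Node

func _ready():
    randomize()

# Hypothetical helper: starts a looping ambiance at a random position.
func play_from_random_position(player):
    var length = player.stream.get_length()
    # Fall back to the beginning if the stream reports no length.
    player.play(rand_range(0.0, length) if length > 0.0 else 0.0)
```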
There's also currently no way to detect if a transition has finished playing to see if it's safe to trigger the next one!
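One workaround that might be worth trying is polling the state machine playback until it reports the target state as its current node. The sketch below assumes the `_amb` playback object from the ambiance code and the assumed node name `AmbianceTree`; the signal name and the 600 frame limit are made up for illustration.

```gdscript
extends Node

signal amb_transition_finished(state)

var _amb  # AnimationNodeStateMachinePlayback, fetched in _ready()

func _ready():
    _amb = $AmbianceTree.get("parameters/playback")  # assumed node name

func travel_and_notify(target):
    _amb.travel(target)
    # Poll once per frame until the target state is reached or we give up.
    var frames_left = 600
    while _amb.get_current_node() != target and frames_left > 0:
        yield(get_tree(), "idle_frame")
        frames_left -= 1
    emit_signal("amb_transition_finished", _amb.get_current_node())
```

Code that wants to queue the next transition could connect to `amb_transition_finished` and check that the reported state matches the one it asked for.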
And a more workflow-related issue I found is that you can't zoom the Animation Tree window. There's also no way to select multiple animations to easily re-order things. You also can't vertically zoom the parameter lanes in the animation editor, and the curves are always displayed as lines regardless of the curve settings (except if you use bezier curves). Seeing which interpolation mode the current parameter line is in is also rather difficult. For my older eyes at least :).
In the end I got used to it. Mouse over animation keys is your friend :).
Again, I want to stress that all of this is the prototype of an idea: using Godot's built-in tools to solve our audio needs for the Hauma prototype.
For what it is, it's working very well for us, but of course there's plenty of room for improvement. Some of it, like the engine UX-related parts, probably won't be solved quickly.
Still, Godot's built-in audio features and the Animation Tree players are already pretty powerful tools, and this setup probably only scratches the surface of what's possible when more thought and time are put into it.
If you have any questions feel free to reach out any time via email@example.com.
Thanks for reading <3