Random deadlocks playing WAV sounds (Windows, MSVC)
Bruce Pascoe

Not sure if this was fixed in 5.1.9 since there are no pre-built binaries available, although I don't see anything in the release notes, so here goes:

Playing a WAV sound via the audio stream API often causes Allegro to deadlock (I can't even close the display window). This never happened in 5.0.10, so it's apparently a regression in 5.1. Other types of sounds (Ogg, MOD) seem to be fine; only WAV is affected. Unfortunately I can't track the bug down, as it doesn't appear to happen when running my engine in debug mode. Seems like a race condition of some sort.


Could you test to see if it happens with the exact same file but in a different format (try keeping the depth (e.g. keep it 16 bit) the same as well)? I recall some changes in the streaming code, but almost none in the actual wav loading code (I see one change that seems innocuous at a first glance). The streaming code changes, however, could have screwed something up for particular file lengths.

Bruce Pascoe

I will do some testing later anyway, but the deadlocks seem to have stopped since upgrading my engine to use Allegro 5.1.9.


There were some changes made in 5.1.9 to looping... so it's not impossible that this was fixed that way.

Bruce Pascoe

Okay, it looks like 5.1.9 made things better, but the deadlocks still happen sometimes. And you're right, it's not just with WAVs but with any supported stream format--I've now seen my engine lock up twice trying to play background music in Ogg format.

From my testing, it appears to happen more often on slower machines; my i3 and i7 laptops hardly ever experience the issue, but I can reproduce it much more easily on my cruddy AMD E2 desktop. Which, again, points to a race condition. Now that I have Allegro building on Windows, I might try to look into it myself, but then there's the issue of my not being able to reproduce it under the debugger... :-/

Edit: Well, I found the source of the deadlocks anyway: al_destroy_audio_stream. I manually attached the debugger to a release build .exe (I had the .pdb for it) while it was locked up and it's stuck in an al_cond_wait.

Here's the function in my minisphere engine causing the lockup:

    bool
    reload_sound(sound_t* sound)
    {
        ALLEGRO_AUDIO_STREAM* new_stream;

        if (!(new_stream = al_load_audio_stream(sound->path, 4, 1024)))
            return false;
        if (sound->stream != NULL)
            al_destroy_audio_stream(sound->stream);  // <-- deadlock here
        sound->stream = new_stream;
        al_set_audio_stream_gain(sound->stream, 1.0);
        al_attach_audio_stream_to_mixer(sound->stream, al_get_default_mixer());
        al_set_audio_stream_playing(sound->stream, false);
        return true;
    }

This function exists because, if you don't reload an audio stream after it plays through to completion, you can't play it again even if you manually seek to the start, as the feed thread has already terminated. So to get around this, I manually recreate the stream each time my engine's play_sound() function is called. This technique worked fine in 5.0, but 5.1 apparently doesn't like it and deadlocks.


Hmm, that doesn't seem very nice (that the thread exits), and probably should be changed. I'll investigate the deadlock anyway though.

Bruce Pascoe

Yeah, that was the result of my study of the stream code: for streams created from files (al_load_audio_stream), the thread that feeds the stream exits permanently if it runs out of data, with no way to restart it without another al_load_audio_stream() call.

Edit: Two observations I've made: 1) The deadlock is more likely to happen the sooner al_destroy_audio_stream() is called after creation, and 2) Calling al_drain_audio_stream first appears to fix the deadlocks.

I'm beginning to wonder if it's not a race condition with the al_wait_for_event() call in _al_kcm_feed_stream. From what I could tell from the documentation, the only event a stream can emit is ALLEGRO_EVENT_AUDIO_STREAM_FRAGMENT. This is just a theory, but if the stream happens to be destroyed while the feeder is waiting for a fragment event, could it get stuck waiting forever?

Edit 2: Well, I was close. Here's where the deadlock originates (helper.c:14):

    quit_event.type = _KCM_STREAM_FEEDER_QUIT_EVENT_TYPE;
    al_emit_user_event(al_get_audio_stream_event_source(stream), &quit_event, NULL);
    al_join_thread(stream->feed_thread, NULL);  // <-- here (line 14)
    al_destroy_thread(stream->feed_thread);

I then switched over to _al_kcm_feed_stream: it never gets the _KCM_STREAM_FEEDER_QUIT_EVENT_TYPE event, it just gets stuck responding to fragment events forever.

It's exceedingly difficult to reproduce in a debug build; I only managed to do it once, and that was without the debugger attached--I had to attach it afterwards.


Yeah, I tried to reproduce this, but I have not yet succeeded :/. Naturally I examined the code and nothing appeared wrong... I think what might happen here is that perhaps we can change it to keep that thread alive so you wouldn't have to re-create it to begin with (hopefully this won't introduce 10 bugs on its own).

Bruce Pascoe

I just found a reliable way to reproduce the deadlock, even in a debug build:
al_destroy_audio_stream(al_load_audio_stream("test.ogg", 4, 1024));

That's the cause: If you destroy a stream too quickly after it's created, it locks up. I am attempting to diagnose it now.

Edit: AH HA! I was right! If the stream is destroyed too soon after it's created, _al_kcm_feed_stream deadlocks on the al_wait_for_event call at kcm_stream.c:682. I confirmed this by breaking into the debugger while it was locked up and then attempting "run to cursor" on the next line after the event wait. It never returns to the debugger.

This may not actually be a bug in the audio code but instead in the event queue system...

Edit 2: Fixed with the following bit of shameless hackery:

    // replaces the al_wait_for_event call at kcm_stream.c:682
    while (!al_wait_for_event_timed(queue, &event, 0.05)) {
        if (al_get_thread_should_stop(self)) {
            event.type = _KCM_STREAM_FEEDER_QUIT_EVENT_TYPE;
            break;
        }
    }

:P You should look into that event queue deadlock though.
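
For anyone following along, the hack above is an instance of a general pattern: swap an unbounded wait for a short timed wait and check a stop flag each time it expires, so a missed wake-up costs at most one timeout instead of hanging forever. Here is a rough sketch of that pattern in plain C with POSIX threads instead of Allegro, so it compiles on its own. The names feeder_t, wait_for_event_timed, feeder_main and run_lost_quit_demo are made up for illustration and are not Allegro APIs; this is a model of the idea, not the actual feeder code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the feeder's state; not an Allegro type. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            has_event;    /* models "an event is queued" */
    atomic_bool     should_stop;  /* models the thread's should-stop flag */
    int             timeouts;     /* how many timed waits expired */
} feeder_t;

/* Wait up to `ms` milliseconds for an event; true if one arrived. */
static bool wait_for_event_timed(feeder_t *f, long ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += ms * 1000000L;
    deadline.tv_sec  += deadline.tv_nsec / 1000000000L;
    deadline.tv_nsec %= 1000000000L;

    pthread_mutex_lock(&f->lock);
    int rc = 0;
    while (!f->has_event && rc == 0)   /* loop guards against spurious wake-ups */
        rc = pthread_cond_timedwait(&f->cond, &f->lock, &deadline);
    bool got = f->has_event;
    f->has_event = false;
    pthread_mutex_unlock(&f->lock);
    return got;
}

/* Feeder thread: wakes every 50 ms to check whether it was asked to stop. */
static void *feeder_main(void *arg)
{
    feeder_t *f = arg;
    assert(f != NULL);
    while (!wait_for_event_timed(f, 50)) {
        f->timeouts++;
        if (atomic_load(&f->should_stop))
            break;                     /* treat as if a quit event arrived */
    }
    return NULL;
}

/* Requests a stop WITHOUT ever queueing an event (the "lost quit event"
 * case) and joins the feeder. Returns the number of expired waits,
 * or -1 if the thread could not be started. */
int run_lost_quit_demo(void)
{
    feeder_t f;
    pthread_t thread;

    pthread_mutex_init(&f.lock, NULL);
    pthread_cond_init(&f.cond, NULL);
    f.has_event = false;
    f.timeouts = 0;
    atomic_init(&f.should_stop, false);

    if (pthread_create(&thread, NULL, feeder_main, &f) != 0)
        return -1;
    atomic_store(&f.should_stop, true);  /* no signal sent: the event is "lost" */
    pthread_join(thread, NULL);          /* still returns, thanks to the timeout */

    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.lock);
    return f.timeouts;
}
```

With an untimed wait in feeder_main, the join in run_lost_quit_demo would block forever, which is the shape of the lockup in this thread. The price of the timed version is a periodic wake-up and up to one timeout of extra latency on shutdown, which is why it is a workaround rather than a fix for the lost event itself.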

Edit 3: Any joy on diagnosing this? It's definitely a race condition in the event queue system. Best I can tell, if you emit an event too soon after the event source is created, it gets lost. Thus in this case, the stream feeder gets stuck in a waitlock because the quit event has been "lost in the mail" so to speak and no other events are being emitted to satisfy the condition. And since now the main thread is waiting on the feeder to quit... deadlock.
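
To make the "lost in the mail" idea concrete, here is a tiny single-threaded model of one way an event can vanish: if a source delivers only to queues that are registered at the moment of emission, then an event emitted before the consumer has registered its queue is simply dropped. Everything here (toy_queue_t, toy_source_t, toy_emit, toy_register, quit_is_seen) is a hypothetical toy, not Allegro's internals, and whether this is exactly what happens inside Allegro is an assumption rather than something established here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 8

enum { TOY_EVENT_QUIT = 1 };

/* Toy event queue: a fixed-size FIFO of event type codes. */
typedef struct {
    int    events[QUEUE_CAP];
    size_t count;
} toy_queue_t;

/* Toy event source: delivers only to the queue registered at emit time. */
typedef struct {
    toy_queue_t *registered;   /* NULL until the consumer registers */
} toy_source_t;

/* Attach a queue to a source; only later emissions are delivered. */
void toy_register(toy_queue_t *q, toy_source_t *src)
{
    assert(q != NULL && src != NULL);
    src->registered = q;
}

/* Emit an event. Returns true if delivered, false if it was dropped
 * because no queue was registered (or the queue was full). */
bool toy_emit(toy_source_t *src, int type)
{
    toy_queue_t *q = src->registered;
    if (q == NULL || q->count == QUEUE_CAP)
        return false;
    q->events[q->count++] = type;
    return true;
}

/* Returns true if a consumer checking its queue would find the quit
 * event, i.e. would not be left waiting for something that never comes. */
bool quit_is_seen(bool register_before_emit)
{
    toy_queue_t  q   = { {0}, 0 };
    toy_source_t src = { NULL };

    if (register_before_emit)
        toy_register(&q, &src);
    toy_emit(&src, TOY_EVENT_QUIT);      /* "destroy" sends the quit event */
    if (!register_before_emit)
        toy_register(&q, &src);          /* consumer shows up too late */

    return q.count > 0 && q.events[0] == TOY_EVENT_QUIT;
}
```

In this model quit_is_seen(true) finds the event and quit_is_seen(false) does not. The second ordering corresponds to destroying the stream before the feeder is ready to listen: the sender then waits on a thread that is itself waiting on an event that was never queued.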


Destroying the stream right away actually seems to deadlock for me too! Seems like we're in business... I'll see what I can do.

EDIT: Alright. Please try the attached patch, and see if it helps. It fixes the deadlock that I managed to reproduce, but I'm specifically interested if it helps your actual use case in your game.

Bruce Pascoe

So at first I thought it wasn't fixed, but it turns out I accidentally linked against an older Allegro lib I built. Oops! I applied the patch, recompiled Allegro for both x86 and x64 and the deadlocks went away. I even tried having it destroy and reload the stream a few times in succession; this obviously caused a sizable delay in loading sounds, but no deadlock! :P

Thanks for the fix. This was a pretty annoying issue, and it seems I'm not the first to discover it:

What's odd is that this never happened to me with 5.0.10. What's changed since then?


Not sure, maybe something, somewhere got a little slower.

Anyway, I committed that patch, so it all should be good now. Thanks for your help!

Thread #615266. Printed from Allegro.cc