Everybody says to use the stream functions to do this sort of thing.
As far as I can tell, you create the stream with al_create_audio_stream() and keep latency down through its parameters: ask for just two fragments and make each fragment small, since a shorter fragment takes less time to play.
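To put rough numbers on that trade-off, here is a small sketch of the arithmetic. stream_latency_ms() is a made-up helper, not part of Allegro; its three arguments mirror the fragment count, samples per fragment, and frequency you would pass to al_create_audio_stream(). It only estimates how much audio sits buffered in the stream and ignores whatever the mixer and driver add on top.

```c
/* Hypothetical helper (not an Allegro function): estimate how many
 * milliseconds of audio are buffered in a stream, given the values
 * you would pass to al_create_audio_stream(). */
double stream_latency_ms(unsigned fragment_count,
                         unsigned frag_samples,
                         unsigned freq)
{
    /* total buffered samples divided by samples per second, in ms */
    return 1000.0 * fragment_count * frag_samples / freq;
}
```

With two fragments of 1024 samples at 44100 Hz this comes out to about 46 ms, while two fragments of 441 samples gives 20 ms, so halving the fragment size roughly halves the buffered latency.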
Then, as far as I can tell, you load the audio data with al_load_sample(). However, what I get back isn't in any format I can recognize: it's mostly zeros, which makes no sense as either floats or ints.
Supposedly, each time an ALLEGRO_EVENT_AUDIO_STREAM_FRAGMENT event occurs, you call al_get_audio_stream_fragment(), which returns the buffer to fill (or NULL if no fragment is ready yet). You'd copy in the audio data from the first sample that's supposed to be playing (which overwrites whatever was left from the last time the buffer was played), then add the data from any other samples playing at the same time to mix them, and hand the buffer back with al_set_audio_stream_fragment(). You'd have to check for clipping as you go, I suppose. SSE functions would work admirably for the clamping, and would speed up the mixing as well.
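A minimal sketch of that mixing step, in plain scalar C with no Allegro calls. mix_fragment() and its parameters are names I made up for illustration. It assumes every voice is already 32-bit float in the range [-1, 1] (what a stream created with ALLEGRO_AUDIO_DEPTH_FLOAT32 would expect) and that each voice has at least n samples available. It sums first and clamps once at the end, which is one way to handle clipping, not necessarily the best one.

```c
#include <stddef.h>

/* Hypothetical helper: fill one stream fragment by summing the
 * currently playing voices, sample by sample, and clamping the
 * result to [-1, 1]. dst is overwritten, so stale data from the
 * previous use of the buffer does not leak through. */
void mix_fragment(float *dst,
                  const float *const *voices,
                  size_t voice_count,
                  size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float sum = 0.0f;
        for (size_t v = 0; v < voice_count; v++)
            sum += voices[v][i];

        /* hard clip: anything outside [-1, 1] is pinned to the limit */
        if (sum > 1.0f)
            sum = 1.0f;
        else if (sum < -1.0f)
            sum = -1.0f;

        dst[i] = sum;
    }
}
```

In the event loop, dst would be the pointer returned by al_get_audio_stream_fragment(), and n would be the fragment's sample count times the channel count. Hard clipping like this distorts audibly when it kicks in, so scaling each voice down before mixing may sound better.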
Take all this with a very large grain of salt, as
"The way to get a correct answer on the internet is to post the wrong answer"
Hopefully somebody more knowledgeable will post the correct info.
Apparently I need to use al_get_sample_data() to get at the actual audio data.
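What that pointer refers to depends on the sample's depth, which al_get_sample_depth() reports. If it turns out to be signed 16-bit (ALLEGRO_AUDIO_DEPTH_INT16) and you want floats for mixing, the conversion might look like the sketch below. This is an assumption about the loaded format, not something I've confirmed, and int16_to_float() is a made-up helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert signed 16-bit samples to floats in
 * [-1, 1). Assumes the source data really is int16; check the
 * sample's depth before using this. n counts individual values,
 * i.e. frames times channels. */
void int16_to_float(const int16_t *src, float *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (float)src[i] / 32768.0f;
}
```

Here src would be the pointer from al_get_sample_data() cast to const int16_t *, and n would come from al_get_sample_length() multiplied by the number of channels.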