Hello all,
I'm trying to get all the frequencies of a sound sample (22 kHz mono).
That is to say, if I break the sample down into pieces of (let's say) 0.01 s, I want the sound frequency of each piece.
(This has nothing to do with al_get_sample_frequency, which returns the sample rate, 22050 Hz.)
So to do this, I was looking at void *al_get_sample_data(const ALLEGRO_SAMPLE *spl)
ALLEGRO_SAMPLE* pSample = /* ...the loaded wav file */;
int nDataLength = al_get_sample_length(pSample);
int16_t* ptr = (int16_t*)al_get_sample_data(pSample);
In my case, nDataLength is 47936 (my sample is 1.780 s long).
However, if I look closely at the memory behind ptr, the int16_t values are as follows:
ptr[ 0] = 3
ptr[ 1] = 0
ptr[ 2] = -1
ptr[ 3] = 1
ptr[ 4] = 0
ptr[ 5] = 0
ptr[ 6] = 0
ptr[ 7] = 2
ptr[ 8] = 0
ptr[ 9] = 1
ptr[10] = -1
ptr[11] = 0
ptr[12] = 2
ptr[13] = 2
ptr[14] = 1
ptr[15] = 1
ptr[16] = -1
ptr[17] = -2
ptr[18] = 0
ptr[19] = -1
ptr[20] = 0
...
ptr[10000] = -201
...
ptr[20000] = -346
...
ptr[30000] = -226
...
ptr[40000] = -91
...
ptr[47935] = 34
...These do not seem to be the raw frequencies of the sample pieces.
So what are these values? (especially the negative ones?)
Is there a way to get the frequencies of small parts of the sample?
al_get_sample_data documentation is somewhat cryptic here... It says "Return a pointer to the raw sample data."
Thank you
I believe you want to use Fast Fourier Transform on the sample data.
Thanks for your answer.
It seems that FFT is the right way to go, but is it possible to do an FFT with Allegro 5 or C++?
Another question is: Why nDataLength = 47936 since my sample is 1.780 sec long? 1.780 * 22050 = 39249 (not 47936). What does ptr[] contain?
Thanks
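The arithmetic in the question above can be checked in both directions. This is a minimal sketch, assuming al_get_sample_length reports a count of samples (not bytes) and that the rate really is 22050 Hz; the helper names durationSeconds and expectedSamples are made up for illustration and are not part of Allegro.

```cpp
#include <cassert>

// Hypothetical helper: duration in seconds of a buffer holding
// `sampleCount` samples (per channel) recorded at `sampleRateHz`.
double durationSeconds(unsigned int sampleCount, unsigned int sampleRateHz) {
    assert(sampleRateHz > 0);
    return static_cast<double>(sampleCount) / sampleRateHz;
}

// Hypothetical inverse: how many samples a clip of `seconds` should
// occupy at `sampleRateHz`, rounded to the nearest whole sample.
unsigned int expectedSamples(double seconds, unsigned int sampleRateHz) {
    return static_cast<unsigned int>(seconds * sampleRateHz + 0.5);
}
```

With the numbers from this thread, durationSeconds(47936, 22050) is a little over 2.17 s, while expectedSamples(1.780, 22050) is 39249. So under these assumptions the buffer holds more audio than 1.780 s; the sketch does not say why.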
Why nDataLength = 47936 since my sample is 1.780 sec long?
I don't know.
The ptr[] evidently contains int16_t values, which are signed short ints.
I was fiddling with a guitar tuner last summer, but I've reinstalled the OS to a bigger hard drive and haven't reinstalled Allegro. Give me an hour and I'll see if I can give you an example program that allows you to input sounds through the microphone and see how the "main frequency" and the sample distributions change (also known as timbre).
[EDIT] Well, the last announcement for a WIP in Allegro Development tried to give me allegro-5.0.11 something, which I already had from 2014, so I tried allegro-5.1.8. I can't get this stuff to compile, haven't used it so long I'm a n00b again too.
I think this zip file has the source for some sort of FFT thing, IIRC it tried to print the nearest note of the predominant frequency and how much it was off by. It uses OpenGL, sorry. You may be able to get some relevant bits out of it.
As for how FFT works, I have some totally wild guesses.
<total n00bishness follows>
The FFT thing works by making a sine wave of some given frequency and comparing it to the samples to see how well it follows the curve. A good fit results in a large value, so the "bin" for that frequency gets the high value. It then tries higher and higher frequencies to see how well they fit in turn and stores those values in their respective bins. And lower frequencies too, obviously.
However, since the generated sine wave starts from a zero point, and the sample waveforms start from a random point, it would most likely guess wrong unless the sample could be forced to start from the zero point of a waveform. That is about as difficult in itself, so what the FFT code does is make a "window" to copy the samples to, and reduce the amplitudes toward both ends so the chunk "looks" like it starts from a zero point. OTOH, this "window" necessarily averages quite a few sample points to fit one bin, so eventually you have to compromise between accuracy and speed.
Also, I got stuck for a while on the fact that the display of the "bins" seemed to produce a useless S-shaped curve, but what I didn't realize was that it was symmetrical and I only needed to examine the very first few "bins" at the beginning to get something reasonable. At least, that's my memory of what I thought last summer.
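The "compare against a wave of each frequency" idea can be sketched as a naive discrete Fourier transform. This is not the code from the zip file and not an Allegro API; dftMagnitudes is a hypothetical name, and the loop is the slow O(N^2) form that an FFT computes faster. Each bin is compared against both a cosine and a sine, which is why the starting phase of the input does not change the magnitude. Only the first N/2 + 1 bins are returned, since for real input the rest mirror them.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Naive DFT magnitude spectrum of real 16-bit samples.
// Returns bins 0 .. N/2; the remaining bins would be a mirror image.
std::vector<double> dftMagnitudes(const std::vector<int16_t>& samples) {
    const std::size_t n = samples.size();
    if (n == 0) return {};
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> mags(n / 2 + 1, 0.0);
    for (std::size_t k = 0; k < mags.size(); ++k) {
        double re = 0.0;  // how well the samples match a cosine of bin k
        double im = 0.0;  // how well they match a sine of bin k
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = twoPi * static_cast<double>(k) *
                                 static_cast<double>(t) / static_cast<double>(n);
            re += samples[t] * std::cos(angle);
            im -= samples[t] * std::sin(angle);
        }
        // Combining both parts removes the dependence on starting phase.
        mags[k] = std::sqrt(re * re + im * im) / static_cast<double>(n);
    }
    return mags;
}
```

Feeding it a pure tone that completes a whole number of cycles in the buffer should give one dominant bin at that cycle count, whatever the phase of the tone.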
Note: edited the above again
Thank you for your time and trouble...
(Well, I can't compile 5.1 either; I'm sticking to 5.0.10 for now...)
If I understand your example correctly, it seems that you calculate the volume (the magnitudes[i] variable) using root mean square (the square root of the average of v[i]*v[i]).
It seems to be like in this other thread https://www.allegro.cc/forums/thread/610890 (see SiegeLord's message)
However, I don't see the relation between the array of instantaneous volumes and the array of frequencies I'm looking for...
Thanks again.
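For comparison, a plain root-mean-square volume of one chunk would look roughly like the sketch below. The function name rms is hypothetical, and nothing in this thread confirms that the example program computes its magnitudes this way. The point is that RMS gives a single loudness number per chunk and carries no frequency information, whereas per-bin magnitudes from a Fourier transform give one number per frequency band.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Root mean square of `count` samples: sqrt(mean(v[i] * v[i])).
// Returns 0 for an empty chunk.
double rms(const int16_t* v, std::size_t count) {
    if (count == 0) return 0.0;
    double sumSquares = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        const double x = v[i];
        sumSquares += x * x;
    }
    return std::sqrt(sumSquares / static_cast<double>(count));
}
```

A chunk that alternates between +3 and -3 has an RMS of 3, the same as a constant 3, which shows that the value says how loud the chunk is but not how fast it oscillates.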
Yes I know what FFT is.
(Transforming "Time domain" overlapping cosine-like waveforms into a set of "Frequency domain" values.)
But, to do this, I think I must first know what I get from al_get_sample_data.
If I can't figure out what it is (it seems that the length of the array does not match...), I won't be able to perform any math.
But, to do this, I think I must first know what I get from al_get_sample_data.
Most likely it's the amplitude. Essentially it relates to the electrical signal being sent to the speaker, but more basically, it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves.
Why the array is longer than the sound length, I can't say. Are you sure the sample is exactly 1.780 seconds long, and not 1.780 seconds of sound followed by a bit of silence?
it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves.
This ^^
If you were to grab those signed 16 bit ints and draw the values on the screen in the x direction with increasing index values (scaling appropriately so they'll fit) then you'd see an oscilloscope-like pattern on the screen. This would be the time domain mentioned in the video.
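The oscilloscope idea above comes down to mapping each sample index to an x coordinate and each amplitude to a y coordinate. This is a minimal sketch, assuming the data really is signed 16-bit mono; Point and waveformPoints are hypothetical names. The actual drawing is left out, but consecutive points could be joined with something like al_draw_line from Allegro's primitives addon.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Point {
    float x;
    float y;
};

// Map signed 16-bit samples onto a width x height area.
// Sample value 0 lands on the vertical middle; positive values go up.
std::vector<Point> waveformPoints(const int16_t* samples, std::size_t count,
                                  float width, float height) {
    std::vector<Point> pts;
    pts.reserve(count);
    const float mid = height / 2.0f;
    for (std::size_t i = 0; i < count; ++i) {
        // Spread the indices evenly from x = 0 to x = width.
        const float x = count > 1
            ? width * static_cast<float>(i) / static_cast<float>(count - 1)
            : 0.0f;
        // Scale -32768..32767 to roughly -1..1, then to pixels.
        const float y = mid - (static_cast<float>(samples[i]) / 32768.0f) * mid;
        pts.push_back(Point{x, y});
    }
    return pts;
}
```

For a long sample on a narrow screen, many samples will share one pixel column, so drawing only every n-th sample or the min/max per column is a common simplification.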
Thank you for the further explanation!
Now I still need to figure out how to FFT the small chunks of my waveform into sets of frequency bins, each chunk giving one set of bins (an amplitude value for each frequency window).
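Cutting the waveform into chunks and tapering each one before the transform could be sketched as follows. The names chunkCount and windowedChunk are hypothetical, and the Hann window is just one common choice of taper, not necessarily what the example program uses. At 22050 Hz a 0.01 s chunk is about 220 samples; many FFT routines want a power of two, so 256 may be the practical size.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of chunks needed to cover `total` samples (the last may be partial).
std::size_t chunkCount(std::size_t total, std::size_t chunkSize) {
    return chunkSize ? (total + chunkSize - 1) / chunkSize : 0;
}

// Copy `chunkSize` samples starting at `start` and apply a Hann window,
// which fades the chunk to zero at both ends. Positions past the end of
// the buffer are left at zero (zero padding).
std::vector<double> windowedChunk(const int16_t* samples, std::size_t total,
                                  std::size_t start, std::size_t chunkSize) {
    std::vector<double> out(chunkSize, 0.0);
    const double twoPi = 2.0 * std::acos(-1.0);
    for (std::size_t i = 0; i < chunkSize && start + i < total; ++i) {
        const double w = chunkSize > 1
            ? 0.5 * (1.0 - std::cos(twoPi * static_cast<double>(i) /
                                    static_cast<double>(chunkSize - 1)))
            : 1.0;
        out[i] = w * samples[start + i];
    }
    return out;
}
```

Each vector returned by windowedChunk would then go through the FFT on its own, giving one set of bins per chunk.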
[Edit]
I have solved my problem with the following code, thanks to Arthur's example.
I'm leaving all the commented-out code in so you can see the differences from the original code.
At the end I get the frequency magnitudes in magnitudes[i].
The frequencies are output on screen (1024x800).
The only thing I don't know is the frequency scale, that is to say, what is the frequency window of "magnitudes[i]"... but I'll try to find out
Thanks again.
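On the open question of the frequency scale: for a standard N-point FFT over samples taken at rate fs, bin i is centered on i * fs / N Hz, and only bins up to N/2 (that is, up to fs/2) are meaningful for real input. A minimal sketch, assuming magnitudes[i] comes from such a transform; the FFT size used by the code in this thread is not known, and binFrequencyHz is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Center frequency in Hz of FFT bin `bin`, for an `fftSize`-point
// transform of data sampled at `sampleRateHz`.
double binFrequencyHz(std::size_t bin, std::size_t fftSize, double sampleRateHz) {
    assert(fftSize > 0);
    return static_cast<double>(bin) * sampleRateHz / static_cast<double>(fftSize);
}
```

For example, with a 1024-point transform at 22050 Hz each bin is about 21.5 Hz wide, and bin 512 sits at 11025 Hz, the highest frequency the sample can represent.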