|
How to get raw sample frequencies |
anto80
Member #3,230
February 2003
|
Hello all, I'm trying to get all the frequencies of a sound sample (22 kHz mono). To do this, I was looking at
void *al_get_sample_data(const ALLEGRO_SAMPLE *spl)
ALLEGRO_SAMPLE* pSample = /* ...the loaded wav file*/ ;
int nDataLength = al_get_sample_length(pSample);
int16_t* ptr = (int16_t*)al_get_sample_data(pSample);
In my case, nDataLength returns 47936 (my sample is 1.780 sec long):
ptr[ 0] = 3
ptr[ 1] = 0
ptr[ 2] = -1
ptr[ 3] = 1
ptr[ 4] = 0
ptr[ 5] = 0
ptr[ 6] = 0
ptr[ 7] = 2
ptr[ 8] = 0
ptr[ 9] = 1
ptr[10] = -1
ptr[11] = 0
ptr[12] = 2
ptr[13] = 2
ptr[14] = 1
ptr[15] = 1
ptr[16] = -1
ptr[17] = -2
ptr[18] = 0
ptr[19] = -1
ptr[20] = 0
...
ptr[10000] = -201
...
ptr[20000] = -346
...
ptr[30000] = -226
...
ptr[40000] = -91
...
ptr[47935] = 34
...These do not seem to be the raw frequencies of the sample pieces. The al_get_sample_data documentation is somewhat cryptic here; it only says "Return a pointer to the raw sample data." Thank you ___________ |
Arthur Kalliokoski
Second in Command
February 2005
|
I believe you want to use Fast Fourier Transform on the sample data. They all watch too much MSNBC... they get ideas. |
anto80
Member #3,230
February 2003
|
Thanks for your answer. Another question: why is nDataLength 47936 when my sample is 1.780 sec long? 1.780 * 22050 = 39249 (not 47936). What does ptr[] contain? Thanks ___________ |
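The arithmetic in this question can be checked both ways: how long 47936 sample frames last at 22050 Hz, and what sample rate would make 47936 frames last exactly 1.780 seconds. The sketch below is only an illustration; the helper names duration_seconds and implied_rate_hz are made up here and are not part of Allegro. It also assumes that al_get_sample_length reports sample frames rather than bytes.

```c
#include <assert.h>

/* Duration in seconds of `frames` sample frames played back at `rate_hz`. */
double duration_seconds(unsigned int frames, unsigned int rate_hz)
{
    assert(rate_hz > 0);
    return (double)frames / (double)rate_hz;
}

/* Sample rate (Hz) that would make `frames` sample frames last `seconds`. */
double implied_rate_hz(unsigned int frames, double seconds)
{
    assert(seconds > 0.0);
    return (double)frames / seconds;
}
```

With the numbers from the post, duration_seconds(47936, 22050) comes out at roughly 2.17 seconds, and implied_rate_hz(47936, 1.780) at roughly 26930 Hz, which is not a common sample rate. In an Allegro program the two inputs would presumably come from al_get_sample_length(pSample) and al_get_sample_frequency(pSample), so printing both is a quick way to see whether the file is simply longer than 1.780 seconds.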
Arthur Kalliokoski
Second in Command
February 2005
|
anto80 said: Why nDataLength = 47936 since my sample is 1.780 sec long?
I don't know. The ptr[] evidently contains int16_t values, which are signed short ints. I was fiddling with a guitar tuner last summer, but I've since reinstalled the OS on a bigger hard drive and haven't reinstalled Allegro. Give me an hour and I'll see if I can give you an example program that lets you input sounds through the microphone and see how the "main frequency" and the sample distributions change (also known as timbre).
[EDIT] Well, the last announcement for a WIP in Allegro Development tried to give me allegro-5.0.11 something, which I already had from 2014, so I tried allegro-5.1.8. I can't get this stuff to compile; I haven't used it in so long that I'm a n00b again too. I think this zip file has the source for some sort of FFT thing. IIRC it tried to print the nearest note of the predominant frequency and how much it was off by. It uses OpenGL, sorry. You may be able to get some relevant bits out of it. As for how FFT works, I have some totally wild guesses. <total n00bishness follows> Note: edited the above again |
anto80
Member #3,230
February 2003
|
Thank you for your time and trouble... (Well, I can't compile 5.1 either; I'm sticking to 5.0.10 for now...) If I understand your example correctly, it seems that you calculate volume ( However, I don't see the relation between the array of instant volumes and the array of frequencies I'm looking for... ___________ |
Arthur Kalliokoski
Second in Command
February 2005
|
anto80
Member #3,230
February 2003
|
Thanks again. Yes I know what FFT is. But, to do this, I think I must first know what I get from al_get_sample_data. ___________ |
Kitty Cat
Member #2,815
October 2002
|
anto80 said: But, to do this, I think I must first know what I get from al_get_sample_data. Most likely it's the amplitude. Essentially it relates to the electrical signal being sent to the speaker, but more basically, it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves. Why the array is longer than the sound length, I can't say. Are you sure the sample is exactly 1.780 seconds long, and not 1.780 seconds of sound followed by a bit of silence? -- |
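To make the "amplitude" reading concrete, here is a minimal sketch that treats the int16_t values as speaker positions and rescales them to floats between -1.0 and 1.0, with 0 meaning "at rest". It assumes the sample really is 16-bit signed mono, as the cast in the original post implies; pcm16_to_float and peak_amplitude are hypothetical helper names, not Allegro functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert signed 16-bit PCM values to floats in [-1.0, 1.0).
 * 0 stays 0 (speaker at rest); negative pulls in, positive pushes out. */
void pcm16_to_float(const int16_t *in, float *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = (float)in[i] / 32768.0f;
    }
}

/* Largest absolute amplitude in the buffer: a rough "how loud" figure. */
float peak_amplitude(const float *samples, size_t count)
{
    float peak = 0.0f;
    for (size_t i = 0; i < count; i++) {
        float a = samples[i] < 0.0f ? -samples[i] : samples[i];
        if (a > peak) {
            peak = a;
        }
    }
    return peak;
}
```

Values like 3, 0, -1 at the start of the array would then be almost exactly 0.0, which fits a recording that starts in near silence, while -346 around index 20000 is still only about 1% of full scale.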
Arthur Kalliokoski
Second in Command
February 2005
|
Kitty Cat said: it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves.
This ^^ If you were to grab those signed 16-bit ints and draw them on the screen, with the index increasing along the x axis and the value plotted on the y axis (scaled appropriately so they fit), then you'd see an oscilloscope-like pattern on the screen. This would be the time domain mentioned in the video. |
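The scaling for such an oscilloscope view could look roughly like the sketch below. It only computes coordinates and leaves the actual drawing (for example with al_draw_line) to the caller. The helper names sample_to_y and column_to_index are invented for this illustration, and 16-bit signed samples are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a signed 16-bit sample to a pixel row in [0, height - 1].
 * The largest positive value lands on row 0 (top of the screen),
 * the most negative value on the bottom row. */
int sample_to_y(int16_t sample, int height)
{
    assert(height > 0);
    /* Shift -32768..32767 to 0..65535, then flip so positive goes up. */
    long shifted = (long)sample + 32768L;
    return (int)(((65535L - shifted) * (long)(height - 1)) / 65535L);
}

/* Index of the sample to plot at pixel column x when `count` samples
 * are stretched or squeezed across `width` columns. */
size_t column_to_index(int x, int width, size_t count)
{
    assert(width > 0 && x >= 0 && x < width);
    return ((size_t)x * count) / (size_t)width;
}
```

For each column x, a caller would look up ptr[column_to_index(x, width, length)] and connect sample_to_y of consecutive columns with a line. This picks one sample per column rather than averaging, which is the simplest choice but can miss short peaks.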
anto80
Member #3,230
February 2003
|
Thank you for the further explanation! Now I still need to figure out how to FFT the small chunks of my waveform into sets of frequency bins, each chunk giving one set of bins (an amplitude value for each frequency window).
[Edit] In the end I get the frequency magnitudes in magnitudes[i].
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <allegro5/allegro.h>
#include <allegro5/allegro_audio.h>
#include <allegro5/allegro_primitives.h>
#include <allegro5/allegro_acodec.h>
#include "kiss_fft.h"
#include "kiss_fftr.h"

#define ASSERT(a) assert(a)
/* Fully parenthesized so the macros are safe inside larger expressions */
#define MIN(a,b) (((a) < (b)) ? (a) : (b))
#define MAX(a,b) (((a) > (b)) ? (a) : (b))

/* Comment out the following line to use 16-bit audio */
#define WANT_8_BIT_DEPTH

#define SCRW 1024
#define SCRH 800

/* Number of samples fed to each FFT */
#define BUFFSIZE (SCRW*8)

/* Number of samples the analysis window advances per iteration */
#define CHUNK_SIZE 512

const ALLEGRO_AUDIO_DEPTH audio_depth = ALLEGRO_AUDIO_DEPTH_UINT8;
//typedef uint8_t* audio_buffer_t;
//const uint8_t sample_center = 128;
const short min_sample_val = -32768; //const int8_t min_sample_val = 0x80;
const short max_sample_val = 32767;//const int8_t max_sample_val = 0x7f;
const int sample_range = 0xFFFF; //const int sample_range = 0xff;
//const int sample_size = 1;

const unsigned int samples_per_fragment = BUFFSIZE;

const unsigned int frequency = 44100;
const unsigned int max_seconds_to_record = 60 * 5;

const unsigned int playback_fragment_count = 4;
const unsigned int playback_samples_per_fragment = 4096;

double cardfreq = 44100;

float unbiased[BUFFSIZE];
float array[BUFFSIZE];
float magnitudes[BUFFSIZE];
//float buf[BUFFSIZE]; //used to rebuild time from frequency

/* Copy nframe real values into a newly allocated complex buffer with the
 * imaginary parts set to zero. The caller must free the returned buffer. */
kiss_fft_cpx* copycpx(float *mat, int nframe)
{
    int i;
    kiss_fft_cpx *mat2;
    mat2=(kiss_fft_cpx*)KISS_FFT_MALLOC(sizeof(kiss_fft_cpx)*nframe);
    kiss_fft_scalar zero;
    memset(&zero,0,sizeof(zero) );
    for(i=0; i<nframe ; i++)
    {
        mat2[i].r = mat[i];
        mat2[i].i = zero;
    }
    return mat2;
}

int main(int argc, char **argv)
{
    //ALLEGRO_AUDIO_RECORDER *r;
    //ALLEGRO_AUDIO_STREAM *s;

    ALLEGRO_EVENT_QUEUE *q;
    ALLEGRO_DISPLAY *d;
    float maxmag;
    float ultimatemag = 0.0;
    float sum;

    int prev = 0;

    if (!al_init())
    {
        fprintf(stderr,"Unable to initialize Allegro\n");
        return 1;
    }

    if (!al_init_primitives_addon())
    {
        fprintf(stderr,"Unable to initialize primitives addon\n");
    }

    if (!al_install_keyboard())
    {
        fprintf(stderr,"Unable to install keyboard\n");
    }

    if (!al_install_audio())
    {
        fprintf(stderr,"Unable to initialize audio addon\n");
    }

    if (!al_init_acodec_addon())
    {
        fprintf(stderr,"Unable to initialize acodec addon\n");
    }

    /* Note: increasing the number of channels will break this demo. Other
     * settings can be changed by modifying the constants at the top of the
     * file.
     */
    // r = al_create_audio_recorder(1000, samples_per_fragment, frequency,
    // audio_depth, ALLEGRO_CHANNEL_CONF_1);
    //if (!r)
    //{
    // fprintf(stderr,"Unable to create audio recorder\n");
    //}

    // s = al_create_audio_stream(playback_fragment_count,
    // playback_samples_per_fragment, frequency, audio_depth,
    // ALLEGRO_CHANNEL_CONF_1);
    //if (!s)
    //{
    // fprintf(stderr,"Unable to create audio stream\n");
    //}

    al_reserve_samples(0);
    //al_set_audio_stream_playing(s, false);
    //al_attach_audio_stream_to_mixer(s, al_get_default_mixer());

    ALLEGRO_SAMPLE* pSample = al_load_sample("E:\\devt\\mywork\\FFT\\FFT\\test.wav");
    if (!pSample)
    {
        fprintf(stderr,"Unable to load test.wav\n");
        return 1;
    }
    int nSampleLength = al_get_sample_length(pSample);
    /* Assumes the file is 16-bit signed mono; other depths need a different cast */
    int16_t* ptr = (int16_t*)al_get_sample_data(pSample);

    ALLEGRO_SAMPLE_INSTANCE* pSampleInstance = al_create_sample_instance(pSample);
    ASSERT(pSampleInstance != NULL);

    bool bRet = al_attach_sample_instance_to_mixer(pSampleInstance, al_get_default_mixer());

    q = al_create_event_queue();

    /* Note: the following two options are referring to pixel samples, and have
     * nothing to do with audio samples. */
    al_set_new_display_flags(ALLEGRO_WINDOWED|ALLEGRO_RESIZABLE);

    al_set_new_display_option(ALLEGRO_SAMPLE_BUFFERS, 1, ALLEGRO_SUGGEST);
    al_set_new_display_option(ALLEGRO_SAMPLES, 8, ALLEGRO_SUGGEST);

    d = al_create_display(SCRW, SCRH);

    //al_register_event_source(q, al_get_audio_recorder_event_source(r));
    //al_register_event_source(q, al_get_audio_stream_event_source(s));
    al_register_event_source(q, al_get_display_event_source(d));
    al_register_event_source(q, al_get_keyboard_event_source());

    //create necessary fft buffers
    //kiss_fftr writes nfft/2 + 1 bins; nfft is BUFFSIZE*2 here, so out_cpx needs BUFFSIZE + 1 entries
    kiss_fft_cpx out_cpx[BUFFSIZE + 1],*cpx_buf;
    kiss_fftr_cfg fft = kiss_fftr_alloc(BUFFSIZE*2 ,0 ,0,0);
    //kiss_fftr_cfg ifft = kiss_fftr_alloc(BUFFSIZE*2,isinverse,0,0);

    //al_start_audio_recorder(r);

    //while (true)
    //{
    // ALLEGRO_EVENT e;

    // al_wait_for_event(q, &e);

    //if (e.type == ALLEGRO_EVENT_AUDIO_RECORDER_FRAGMENT)
    //{

    int nChunkCursor;

    for (nChunkCursor=0; nChunkCursor < nSampleLength; nChunkCursor+=CHUNK_SIZE)
    {

        //ALLEGRO_AUDIO_RECORDER_EVENT *re = al_get_audio_recorder_event(&e);
        //audio_buffer_t input = (audio_buffer_t) re->buffer;
        int sample_count = MIN(CHUNK_SIZE, nSampleLength-nChunkCursor); //re->samples;
        //const int R = sample_count / BUFFSIZE;
        int i;

        al_clear_to_color(al_map_rgb(0,0,0));

        //draw the original waveform (time domain)
        //for (i = 1; i < SCRW; ++i)
        //{
        // int j, c = 0;

        // /* Take the average of R samples so it fits on the screen */
        // for (j = i * R; j < i * R + R && j < sample_count; ++j)
        // {
        // c += ptr[nChunkCursor+j]; // input[j] - sample_center;
        // }
        // c /= R;

        // /* Draws a line from the previous sample point to the next */
        // al_draw_line(i - 1,
        // SCRH/4 + ( ( (prev - min_sample_val) / (float) sample_range) * SCRH/2 - 256),
        // i,
        // SCRH/4 + ( ( (c - min_sample_val) / (float) sample_range) * SCRH/2 - 256),
        // al_map_rgb(255,255,255), 1.2);

        // prev = c;
        //}

        //copy to another buffer and remove any bias
        //(zero-pad past the end of the sample instead of reading out of bounds)
        sum = 0.0;
        for(i=0;i<BUFFSIZE;i++)
        {
            unbiased[i] = (nChunkCursor+i < nSampleLength) ? ptr[nChunkCursor+i] : 0.0f; // input[i];
            sum += unbiased[i];
        }

        sum /= (float)BUFFSIZE;

        for(i=0;i<BUFFSIZE;i++)
        {
            unbiased[i] -= sum;
        }

        maxmag = 0.0;
        //filter out "low input" like noise gate
        for(i=0;i<BUFFSIZE;i++)
        {
            if(maxmag < unbiased[i])
                maxmag = unbiased[i];
        }
        if(maxmag < 2.0)
            goto noise_gate;

        //copy to another buffer while clamping to a triangular "window"
        //Wild Ass Guess: Pinching off the ends to zero simulates having the sinusoids
        //begin and end at the beginning and end of the buffer to avoid distorting the fft
        //while still keeping the buffer size at a power of two.
        for(i=0;i<BUFFSIZE/2;i++)
        {
            float ttmp = (float)i/(BUFFSIZE/2);
            array[i] = unbiased[i] * ttmp;
            //mirror it on the second half (read from unbiased, not from stale array contents)
            array[BUFFSIZE - i - 1] = unbiased[BUFFSIZE - i - 1] * ttmp;
        }

        cpx_buf = copycpx(array,BUFFSIZE);
        kiss_fftr(fft,(kiss_fft_scalar*)cpx_buf, out_cpx);
        //copycpx allocates a new buffer on every pass, so release it here
        free(cpx_buf);

        //kiss_fftri(ifft,out_cpx,(kiss_fft_scalar*)out ); //the inverse, which we're not using

        //convert the complex numbers to scalar magnitude via Pythagoras
        //(stored negated so that larger magnitudes are drawn higher on the screen)
        maxmag = 0.0;
        for(i=1;i<BUFFSIZE/4;i++)
        {
            magnitudes[i] = 0.005 * -sqrt(out_cpx[i].i*out_cpx[i].i + out_cpx[i].r*out_cpx[i].r); //*(float)1.0/i );
            if(magnitudes[i] < maxmag)
                maxmag = magnitudes[i];
        }

        //scale so the tallest peak is 200 pixels high; skip scaling if every bin is zero
        maxmag = (maxmag < 0.0) ? -200.0/maxmag : 0.0;

        for(i=0;i<BUFFSIZE/4;i++)
        {
            magnitudes[i] *= maxmag;
        }

        //draw the frequency domain
        /* Changed this to only include the first 1/4 of the bins because of the max expected frequency
         * compared to the Nyquist frequency
        for(i=1;i<BUFFSIZE/2 - 1;i++)
        {
            al_draw_line( (i - 1) * 2,
            SCRH/2 + magnitudes[i-1],
            i * 2,
            SCRH/2 + magnitudes[i],
            al_map_rgb(255,0,0), 1.2);
        }
        */

        for(i=1;i<SCRW;i++)
        {
            al_draw_line( (i - 1) * 8,
                SCRH/2 + magnitudes[i-1],
                i * 8,
                SCRH/2 + magnitudes[i],
                al_map_rgb(255,0,0), 1.2);
        }

noise_gate:
        al_flip_display();
        //printf("%f\n",maxmag);
        if(ultimatemag < maxmag)
            ultimatemag = maxmag;

        //}

        //else if (e.type == ALLEGRO_EVENT_DISPLAY_CLOSE)
        //{
        // break;
        //}
        //else if (e.type == ALLEGRO_EVENT_KEY_CHAR)
        //{
        // if (e.keyboard.unichar == 27)
        // {
        // /* pressed ESC */
        // break;
        // }
        //}
        //}

    }

    /* clean up */
    // al_destroy_audio_recorder(r);
    // al_destroy_audio_stream(s);
    kiss_fft_cleanup();
    free(fft);

    al_stop_sample_instance(pSampleInstance);
    al_destroy_sample_instance(pSampleInstance);
    al_destroy_sample(pSample);
    printf("biggest mag was %f\n",ultimatemag);
    return 0;
}
___________ |
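Once a chunk has been turned into bin magnitudes, the remaining step is relating a bin index to an actual frequency. The sketch below assumes a plain N-point transform of N consecutive real samples, where bin k sits at k * rate / N Hz, and it assumes the magnitudes are stored as positive numbers (the listing above stores them negated for drawing). bin_frequency_hz and peak_bin are hypothetical helper names, not part of kiss_fft or Allegro.

```c
#include <assert.h>
#include <stddef.h>

/* Centre frequency in Hz of bin `k` for an `nfft`-point FFT of audio
 * sampled at `rate_hz`. Bins above nfft/2 mirror the lower half for
 * real input, so only k <= nfft/2 is meaningful. */
double bin_frequency_hz(size_t k, size_t nfft, double rate_hz)
{
    assert(nfft > 0);
    return (double)k * rate_hz / (double)nfft;
}

/* Index of the strongest bin in [first, count). Passing first = 1 skips
 * bin 0, which only holds the DC offset of the chunk. */
size_t peak_bin(const float *magnitudes, size_t first, size_t count)
{
    assert(magnitudes != NULL && first < count);
    size_t best = first;
    for (size_t k = first + 1; k < count; k++) {
        if (magnitudes[k] > magnitudes[best]) {
            best = k;
        }
    }
    return best;
}
```

For a 512-sample chunk at 22050 Hz each bin is about 43 Hz wide, so bin_frequency_hz(peak_bin(mags, 1, 257), 512, 22050.0) would give the "main frequency" of that chunk to within about one bin. A longer chunk narrows the bins at the cost of time resolution.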
|