How to get raw sample frequencies

anto80

Hello all,

I'm trying to get all the frequencies in a sound sample (22 kHz, mono).
That is to say, if I break the sample down into pieces of (let's say) 0.01 s each, I want the sound frequency of each piece.
(This has nothing to do with al_get_sample_frequency, which returns the 22 kHz sample rate.)

So to do this, I was looking at void *al_get_sample_data(const ALLEGRO_SAMPLE *spl)

ALLEGRO_SAMPLE* pSample = /* ...the loaded wav file*/ ;
int nDataLength = al_get_sample_length(pSample);
int16_t* ptr = (int16_t*)al_get_sample_data(pSample);

In my case, nDataLength is 47936 (my sample is 1.780 s long).
However, if I look closely at the memory ptr points to, the int16_t values are as follows:

ptr[ 0] =  3
ptr[ 1] =  0
ptr[ 2] = -1
ptr[ 3] =  1
ptr[ 4] =  0
ptr[ 5] =  0
ptr[ 6] =  0
ptr[ 7] =  2
ptr[ 8] =  0
ptr[ 9] =  1
ptr[10] = -1
ptr[11] =  0
ptr[12] =  2
ptr[13] =  2
ptr[14] =  1
ptr[15] =  1
ptr[16] = -1
ptr[17] = -2
ptr[18] =  0
ptr[19] = -1
ptr[20] =  0
 ... ptr[10000] = -201
 ... ptr[20000] = -346
 ... ptr[30000] = -226
 ... ptr[40000] = -91
 ... ptr[47935] = 34

...These do not seem to be the raw frequencies of the sample pieces.
So what are these values? (Especially the negative ones?)
Is there a way to get the frequencies of small parts of the sample?

al_get_sample_data documentation is somewhat cryptic here... It says "Return a pointer to the raw sample data."

Thank you

Arthur Kalliokoski

I believe you want to use Fast Fourier Transform on the sample data.

anto80

Thanks for your answer.
It seems that FFT is the right way to go, but is it possible to do an FFT with Allegro 5 or C++?

Another question is: Why nDataLength = 47936 since my sample is 1.780 sec long? 1.780 * 22050 = 39249 (not 47936). What does ptr[] contain?

Thanks

Arthur Kalliokoski

anto80 said:

Why nDataLength = 47936 since my sample is 1.780 sec long?

I don't know.

The ptr[] evidently contains int16_t values, which are signed short ints.

I was fiddling with a guitar tuner last summer, but I've reinstalled the OS to a bigger hard drive and haven't reinstalled Allegro. Give me an hour and I'll see if I can give you an example program that allows you to input sounds through the microphone and see how the "main frequency" and the sample distributions change (also known as timbre).

[EDIT] Well, the last announcement for a WIP in Allegro Development tried to give me allegro-5.0.11 something, which I already had from 2014, so I tried allegro-5.1.8. I can't get this stuff to compile, haven't used it so long I'm a n00b again too.

I think this zip file has the source for some sort of FFT thing, IIRC it tried to print the nearest note of the predominant frequency and how much it was off by. It uses OpenGL, sorry. You may be able to get some relevant bits out of it.

As for how FFT works, I have some totally wild guesses.

<total n00bishness follows>
The FFT thing works by making a sine wave of some given frequency and comparing it to the samples to see how well it follows the curve. A good fit results in a large value, so the "bin" for that frequency gets the high value. It then tries higher and higher frequencies to see how well they fit in turn and stores those values in their respective bins. And for low values too, obviously.

However, since the generated sine wave starts from a zero point, and the sample waveforms start from a random point, it would most likely guess wrong unless the sample could be forced to start from the zero point of a waveform. This is about as difficult in itself, so what you do before the FFT is make a "window" to copy the sample to, and reduce the amplitudes toward both ends so it "looks" like it starts from a zero point. OTOH, this "window" necessarily averages quite a few sample points to fit one bin, so eventually you have to compromise between accuracy and speed.

Also, I got stuck for a while on the fact that the display of the "bins" seemed to produce a useless S-shaped curve, but what I didn't realize was that it was symmetrical and I only needed to examine the very first few "bins" at the beginning to get something reasonable. At least, that's my memory of what I thought last summer.

Note: edited the above again

anto80

Thank you for your time and trouble...

(Well, I can't compile 5.1 either; I'm sticking to 5.0.10 for now...)

If I understand your example correctly, it seems that you calculate the volume (the magnitudes[i] variable) using a root mean square (average of sqrt(v[i]*v[i])).
It looks similar to this other thread: https://www.allegro.cc/forums/thread/610890 (see SiegeLord's message)

However, I don't see the relation between the array of instantaneous volumes and the array of frequencies I'm looking for...

Arthur Kalliokoski

The vertical bouncing green bars on his graph each represent a small range of frequencies, the "bins" I speak of above. It's like the histogram thing you might see on an audio player.

anto80

Thanks again.

Yes I know what FFT is.
(Transforming "Time domain" overlapping cosine-like waveforms into a set of "Frequency domain" values.)

But, to do this, I think I must first know what I get from al_get_sample_data.
If I can't figure out what it is (the length of the array does not seem to match...), I won't be able to perform any math.

Kitty Cat

anto80 said:

But, to do this, I think I must first know what I get from al_get_sample_data.

Most likely it's the amplitude. Essentially it relates to the electrical signal being sent to the speaker, but more basically, it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves.

Why the array is longer than the sound length, I can't say. Are you sure the sample is exactly 1.780 seconds long, and not 1.780 seconds of sound followed by a bit of silence?
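
One way to test the trailing-silence idea is to compute the duration implied by the array length and then look for the last sample that rises above a small threshold. The sketch below is plain C with hypothetical helper names (`duration_seconds`, `audible_length`); it assumes one int16_t value per sample frame (mono) and that the threshold is chosen by the caller. With the figures quoted in this thread, 47936 values at 22050 Hz come to about 2.17 s, so roughly 0.39 s would have to be silence for the 1.780 s figure to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Duration in seconds of `frames` sample frames played at `rate_hz`. */
double duration_seconds(size_t frames, unsigned rate_hz)
{
    assert(rate_hz > 0);
    return (double)frames / (double)rate_hz;
}

/* Number of samples up to and including the last one whose absolute
 * value exceeds `threshold`; returns 0 if no sample does. */
size_t audible_length(const int16_t *data, size_t count, int threshold)
{
    size_t i = count;

    /* Walk backwards over the quiet tail. */
    while (i > 0 && abs((int)data[i - 1]) <= threshold)
        i--;
    return i;
}
```

Passing the result of `audible_length` to `duration_seconds` gives the length of the audible part, which can then be compared with the duration reported by an audio editor.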

Arthur Kalliokoski

Kitty Cat said:

it determines the physical position the speaker should have at that point in time; 0 is centered ("at rest"), negative values pull in, and positive values push out. By constantly oscillating in and out, the speaker compresses the air and creates sound waves.

This ^^

If you were to grab those signed 16 bit ints and draw the values on the screen in the x direction with increasing index values (scaling appropriately so they'll fit) then you'd see an oscilloscope-like pattern on the screen. This would be the time domain mentioned in the video.
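
The scaling step for such an oscilloscope view can be sketched as follows. `sample_to_y` is a hypothetical helper, not an Allegro function: it maps a signed 16-bit sample to a pixel row so that the largest positive value lands on the top row and the most negative value on the bottom row.

```c
#include <assert.h>
#include <stdint.h>

/* Map a signed 16-bit sample to a pixel row in [0, height-1].
 * +32767 maps to row 0 (top), -32768 maps to row height-1 (bottom). */
int sample_to_y(int16_t s, int height)
{
    long shifted;

    assert(height > 0);
    shifted = 32767L - (long)s;   /* 0 for the largest sample, 65535 for the smallest */
    return (int)(shifted * (height - 1) / 65535L);
}
```

Drawing a line between the rows of consecutive samples (for example with al_draw_line, as the listing in this thread does for the spectrum) then gives the time-domain picture.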

anto80

Thank you for the further explanation!

Now I still need to figure out how to FFT the small chunks of my waveform into sets of frequency "bins", with each chunk giving one set of bins (an amplitude value for each frequency window).
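
The chunking step itself, before any FFT is applied, might look like the sketch below. It is plain C with hypothetical names (`windowed_chunk`, `chunk_count`), and it assumes a triangular window and zero-padding for the last, incomplete chunk; other window shapes are equally possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy the n samples starting at `start` into out[0..n-1].
 * Samples past the end of the data are treated as zero, and a
 * triangular window tapers both ends of the chunk to zero. */
void windowed_chunk(const int16_t *data, size_t total, size_t start,
                    float *out, size_t n)
{
    size_t i;

    assert(n >= 2);
    for (i = 0; i < n; i++) {
        /* Distance to the nearer end of the chunk. */
        size_t edge = (i < n - 1 - i) ? i : n - 1 - i;
        /* Weight is 0 at both ends and rises towards 1 in the middle. */
        float w = (float)edge / (float)(n / 2);
        float s = (start + i < total) ? (float)data[start + i] : 0.0f;
        out[i] = s * w;
    }
}

/* Number of chunks needed to cover `total` samples when advancing by `hop`. */
size_t chunk_count(size_t total, size_t hop)
{
    assert(hop > 0);
    return (total + hop - 1) / hop;
}
```

Each buffer produced this way would then be handed to the FFT routine, giving one set of bins per chunk.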

[Edit]
I have solved my problem with the following code, thanks to Arthur's example.
I'm leaving all the commented-out code in so that the differences from the original code are visible.

At the end I get the frequency magnitudes in magnitudes[i].
The frequencies are drawn on screen (1024x800).
The only thing I don't know is the frequency scale, that is to say, which frequency window "magnitudes[i]" corresponds to... but I'll try to find out.
Thanks again.

#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <allegro5/allegro.h>
#include <allegro5/allegro_audio.h>
#include <allegro5/allegro_primitives.h>
#include <allegro5/allegro_acodec.h>
#include "kiss_fft.h"
#include "kiss_fftr.h"

#define ASSERT(a)          assert(a)
/* Fully parenthesised so the macros are safe inside larger expressions */
#define MIN(a,b)          (((a)<(b))?(a):(b))
#define MAX(a,b)          (((a)>(b))?(a):(b))

/* Comment out the following line to use 16-bit audio */
/* (left over from the recorder example; not used below) */
#define WANT_8_BIT_DEPTH

#define SCRW 1024
#define SCRH 800

#define BUFFSIZE (SCRW*8)

#define CHUNK_SIZE    512

/* Only used by the commented-out recorder code */
const ALLEGRO_AUDIO_DEPTH audio_depth = ALLEGRO_AUDIO_DEPTH_UINT8;
//typedef uint8_t* audio_buffer_t;
//const uint8_t sample_center = 128;
const short min_sample_val = -32768; //const int8_t min_sample_val = 0x80;
const short max_sample_val = 32767;//const int8_t max_sample_val = 0x7f;
const int sample_range = 0xFFFF; //const int sample_range = 0xff;
//const int sample_size = 1;

const unsigned int samples_per_fragment = BUFFSIZE;

const unsigned int frequency = 44100;
const unsigned int max_seconds_to_record = 60 * 5;

const unsigned int playback_fragment_count = 4;
const unsigned int playback_samples_per_fragment = 4096;

double cardfreq = 44100;

float unbiased[BUFFSIZE];
float array[BUFFSIZE];
float magnitudes[BUFFSIZE];
//float buf[BUFFSIZE];  //used to rebuild time from frequency

/* Copy nframe real values into a newly allocated complex buffer
 * (imaginary parts set to zero). The caller must free the result. */
kiss_fft_cpx* copycpx(float *mat, int nframe)
{
  int i;
  kiss_fft_cpx *mat2;
  mat2=(kiss_fft_cpx*)KISS_FFT_MALLOC(sizeof(kiss_fft_cpx)*nframe);
  kiss_fft_scalar zero;
  memset(&zero,0,sizeof(zero) );
  for(i=0; i<nframe ; i++)
  {
    mat2[i].r = mat[i];
    mat2[i].i = zero;
  }
  return mat2;
}

int main(int argc, char **argv)
{
  //ALLEGRO_AUDIO_RECORDER *r;
  //ALLEGRO_AUDIO_STREAM *s;

  ALLEGRO_EVENT_QUEUE *q;
  ALLEGRO_DISPLAY *d;
  float maxmag;
  float ultimatemag = 0.0;
  float sum;

  int prev = 0;

  if (!al_init())
  {
    fprintf(stderr,"Unable to initialize Allegro\n");
    return 1;
  }

  if (!al_init_primitives_addon())
  {
    fprintf(stderr,"Unable to initialize primitives addon\n");
    return 1;
  }

  if (!al_install_keyboard())
  {
    fprintf(stderr,"Unable to install keyboard\n");
    return 1;
  }

  if (!al_install_audio())
  {
    fprintf(stderr,"Unable to initialize audio addon\n");
    return 1;
  }

  if (!al_init_acodec_addon())
  {
    fprintf(stderr,"Unable to initialize acodec addon\n");
    return 1;
  }

  /* Note: increasing the number of channels will break this demo. Other
  * settings can be changed by modifying the constants at the top of the
  * file.
  */
  // r = al_create_audio_recorder(1000, samples_per_fragment, frequency,
  //   audio_depth, ALLEGRO_CHANNEL_CONF_1);
  //if (!r)
  //{
  //  fprintf(stderr,"Unable to create audio recorder\n");
  //}

  // s = al_create_audio_stream(playback_fragment_count,
  //    playback_samples_per_fragment, frequency, audio_depth,
  //    ALLEGRO_CHANNEL_CONF_1);
  //if (!s)
  //{
  //  fprintf(stderr,"Unable to create audio stream\n");
  //}

  al_reserve_samples(0);
  //al_set_audio_stream_playing(s, false);
  //al_attach_audio_stream_to_mixer(s, al_get_default_mixer());

  ALLEGRO_SAMPLE* pSample = al_load_sample("E:\\devt\\mywork\\FFT\\FFT\\test.wav");
  if (!pSample)
  {
    fprintf(stderr,"Unable to load sample\n");
    return 1;
  }

  /* The code below reads the data as mono int16_t values, so check that first */
  if (al_get_sample_depth(pSample) != ALLEGRO_AUDIO_DEPTH_INT16 ||
      al_get_sample_channels(pSample) != ALLEGRO_CHANNEL_CONF_1)
  {
    fprintf(stderr,"Expected a 16-bit mono sample\n");
    return 1;
  }

  int nSampleLength = al_get_sample_length(pSample);
  int16_t* ptr = (int16_t*)al_get_sample_data(pSample);

  ALLEGRO_SAMPLE_INSTANCE* pSampleInstance = al_create_sample_instance(pSample);
  ASSERT(pSampleInstance != NULL);

  bool bRet = al_attach_sample_instance_to_mixer(pSampleInstance, al_get_default_mixer());
  ASSERT(bRet);

  q = al_create_event_queue();

  /* Note: the following two options are referring to pixel samples, and have
  * nothing to do with audio samples. */
  al_set_new_display_flags(ALLEGRO_WINDOWED|ALLEGRO_RESIZABLE);

  al_set_new_display_option(ALLEGRO_SAMPLE_BUFFERS, 1, ALLEGRO_SUGGEST);
  al_set_new_display_option(ALLEGRO_SAMPLES, 8, ALLEGRO_SUGGEST);

  d = al_create_display(SCRW, SCRH);

  //al_register_event_source(q, al_get_audio_recorder_event_source(r));
  //al_register_event_source(q, al_get_audio_stream_event_source(s));
  al_register_event_source(q, al_get_display_event_source(d));
  al_register_event_source(q, al_get_keyboard_event_source());

  //create necessary fft buffers
  //kiss_fftr() writes nfft/2+1 output bins, which is BUFFSIZE+1 for nfft = BUFFSIZE*2
  kiss_fft_cpx out_cpx[BUFFSIZE + 1],*cpx_buf;
  kiss_fftr_cfg fft = kiss_fftr_alloc(BUFFSIZE*2 ,0 ,0,0);
  ASSERT(fft != NULL);
  //kiss_fftr_cfg ifft = kiss_fftr_alloc(BUFFSIZE*2,isinverse,0,0);

  //al_start_audio_recorder(r);

  //while (true)
  //{
  //  ALLEGRO_EVENT e;

  //  al_wait_for_event(q, &e);

    //if (e.type == ALLEGRO_EVENT_AUDIO_RECORDER_FRAGMENT)
    //{

  int nChunkCursor;

  //step through the sample CHUNK_SIZE values at a time
  for (nChunkCursor=0; nChunkCursor < nSampleLength; nChunkCursor+=CHUNK_SIZE)
  {

      //ALLEGRO_AUDIO_RECORDER_EVENT *re = al_get_audio_recorder_event(&e);
      //audio_buffer_t input = (audio_buffer_t) re->buffer;
      int sample_count = MIN(CHUNK_SIZE, nSampleLength-nChunkCursor);  //re->samples;
      //const int R = sample_count / BUFFSIZE;
      int i;

      al_clear_to_color(al_map_rgb(0,0,0));

      //draw the original waveform (time domain)
      //for (i = 1; i < SCRW; ++i)
      //{
      //  int j, c = 0;

      //  /* Take the average of R samples so it fits on the screen */
      //  for (j = i * R; j < i * R + R && j < sample_count; ++j)
      //  {
      //    c += ptr[nChunkCursor+j]; // input[j] - sample_center;
      //  }
      //  c /= R;

      //  /* Draws a line from the previous sample point to the next */
      //  al_draw_line(i - 1,
      //         SCRH/4 + ( ( (prev - min_sample_val) / (float) sample_range) * SCRH/2 - 256),
      //         i,
      //         SCRH/4 + ( ( (c - min_sample_val) / (float) sample_range) * SCRH/2 - 256),
      //         al_map_rgb(255,255,255), 1.2);

      //  prev = c;
      //}

      //copy to another buffer and remove any bias
      //only nSampleLength-nChunkCursor values remain in the sample, so never
      //read past them; the rest of the buffer is padded with zeros
      int valid = nSampleLength - nChunkCursor;
      if(valid > BUFFSIZE)
        valid = BUFFSIZE;

      sum = 0.0;
      for(i=0;i<valid;i++)
      {
        unbiased[i] = ptr[nChunkCursor+i]; // input[i];
        sum += unbiased[i];
      }

      sum /= (float)valid;

      for(i=0;i<valid;i++)
      {
        unbiased[i] -= sum;
      }
      for(i=valid;i<BUFFSIZE;i++)
      {
        unbiased[i] = 0.0;
      }

      maxmag = 0.0;
      //filter out "low input" like noise gate
      for(i=0;i<BUFFSIZE;i++)
      {
        if(maxmag < unbiased[i])
          maxmag = unbiased[i];
      }
      if(maxmag < 2.0)
        goto noise_gate;

      //copy to another buffer while clamping to a triangular "window"
      //Wild Ass Guess:  Pinching off the ends to zero simulates having the sinusoids
      //begin and end at the beginning and end of the buffer to avoid distorting the fft
      //while still keeping the buffer size at a power of two.
      for(i=0;i<BUFFSIZE/2;i++)
      {
        float ttmp = (float)i/(BUFFSIZE/2);
        array[i] = unbiased[i] * ttmp;
        //mirror the weight on the second half (previously this scaled stale data in array[])
        array[BUFFSIZE - i - 1] = unbiased[BUFFSIZE - i - 1] * ttmp;
      }

      //kiss_fftr() reads nfft real values; the complex copy is passed as
      //2*BUFFSIZE reals (each sample followed by a zero)
      cpx_buf = copycpx(array,BUFFSIZE);
      kiss_fftr(fft,(kiss_fft_scalar*)cpx_buf, out_cpx);
      KISS_FFT_FREE(cpx_buf);  //copycpx allocates a new buffer on every pass

      //kiss_fftri(ifft,out_cpx,(kiss_fft_scalar*)out );  //the inverse, which we're not using

      //convert the complex numbers to scalar magnitude via Pythagoras
      //(negated so that larger magnitudes are drawn higher up the screen)
      maxmag = 0.0;
      for(i=1;i<BUFFSIZE/4;i++)
      {
        magnitudes[i] =  0.005 * (-sqrt(out_cpx[i].i*out_cpx[i].i + out_cpx[i].r*out_cpx[i].r)); //*(float)1.0/i );
        if(magnitudes[i] < maxmag)
          maxmag = magnitudes[i];
      }

      //nothing to scale or draw if every bin is zero
      if(maxmag == 0.0)
        goto noise_gate;

      maxmag = -200.0/maxmag;

      for(i=0;i<BUFFSIZE/4;i++)
      {
        magnitudes[i] *= maxmag;
      }

      //draw the frequency domain
      /* Changed this to only include the first 1/4 of the bins because of the max expected frequency
       * compared to the Nyquist frequency
      for(i=1;i<BUFFSIZE/2 - 1;i++)
      {
        al_draw_line( (i - 1) * 2,
                SCRH/2 + magnitudes[i-1],
               i * 2,
          SCRH/2 + magnitudes[i],
          al_map_rgb(255,0,0), 1.2);
      }
      */

      //each bin is 8 pixels wide, so only the first SCRW/8 bins land on screen
      for(i=1;i<SCRW;i++)
      {
        al_draw_line( (i - 1) * 8,
            SCRH/2 + magnitudes[i-1],
            i * 8,
            SCRH/2 + magnitudes[i],
            al_map_rgb(255,0,0), 1.2);
      }

noise_gate:
      al_flip_display();
      //printf("%f\n",maxmag);
      if(ultimatemag < maxmag)
        ultimatemag = maxmag;

    //}

    //else if (e.type == ALLEGRO_EVENT_DISPLAY_CLOSE)
    //{
    //  break;
    //}
    //else if (e.type == ALLEGRO_EVENT_KEY_CHAR)
    //{
    //  if (e.keyboard.unichar == 27)
    //  {
    //    /* pressed ESC */
    //    break;
    //  }
    //}
  //}

  }

  /* clean up */
  // al_destroy_audio_recorder(r);
  // al_destroy_audio_stream(s);
  free(fft);
  kiss_fft_cleanup();

  al_stop_sample_instance(pSampleInstance);
  al_destroy_sample_instance(pSampleInstance);
  al_destroy_sample(pSample);
  al_destroy_event_queue(q);
  al_destroy_display(d);
  printf("biggest mag was %f\n",ultimatemag);
  return 0;
}
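
On the open question of the frequency scale: for an N-point transform of data sampled at rate fs, bin i is centred on i * fs / N Hz, and for real input only the bins up to N/2 (the Nyquist frequency, fs/2) carry independent information. The helper below is a minimal sketch of that mapping; `bin_to_hz` is a hypothetical name, and the values N = 8192 and fs = 22050 Hz used in the note after it are illustrative assumptions rather than something confirmed for the listing above.

```c
#include <assert.h>
#include <stddef.h>

/* Centre frequency in Hz of FFT bin `bin`, for an fft_size-point
 * transform of data sampled at rate_hz. */
double bin_to_hz(size_t bin, size_t fft_size, double rate_hz)
{
    assert(fft_size > 0);
    return (double)bin * rate_hz / (double)fft_size;
}
```

Under those assumed values each bin would be about 2.69 Hz wide, and bin 4096 would sit at 11025 Hz, the Nyquist frequency.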

Thread #615596. Printed from Allegro.cc