I see two main issues: one is the collision and the other is the drawing. Let's tackle the drawing first, since that's the slightly easier fix.
You're drawing 10,000 pixels to the screen 60 times a second.
Drawing individual pixels is surprisingly slow this way. You're making 10,000 calls to whatever is drawing your screen in under 1/60th of a second.
First: use al_put_pixel() instead of al_draw_pixel(). The difference is that al_draw_pixel() applies the current blender and transformation settings, while al_put_pixel() ignores both. If you're just trying to set pixels, use al_put_pixel().
Second: call al_lock_bitmap() on your target bitmap before you start drawing the pixels, then al_unlock_bitmap() when you're finished. That holds off updating the bitmap until it's unlocked, so the changes can go through in one batch (theoretically anyway).
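Here's a rough sketch of those two changes together. The Grain struct and the draw_sand() name are made up for illustration, so swap in whatever your program actually uses. I'm locking with ALLEGRO_LOCK_READWRITE on the assumption that the bitmap already holds content you want to keep around the sand.

```c
#include <allegro5/allegro.h>

/* Hypothetical grain record; your real layout may differ. */
typedef struct {
    int x;
    int y;
} Grain;

/* Draws every grain onto `target` with one lock/unlock pair around the
 * whole loop, instead of paying the per-pixel cost 10,000 times. */
void draw_sand(ALLEGRO_BITMAP *target, const Grain *grains, int count,
               ALLEGRO_COLOR sand)
{
    /* Remember the current target so it can be restored afterwards. */
    ALLEGRO_BITMAP *previous = al_get_target_bitmap();
    al_set_target_bitmap(target);

    /* READWRITE keeps the pixels we don't touch intact. */
    if (al_lock_bitmap(target, ALLEGRO_PIXEL_FORMAT_ANY,
                       ALLEGRO_LOCK_READWRITE)) {
        for (int i = 0; i < count; ++i) {
            /* al_put_pixel ignores blenders and transformations. */
            al_put_pixel(grains[i].x, grains[i].y, sand);
        }
        al_unlock_bitmap(target);
    }

    al_set_target_bitmap(previous);
}
```

You'd call it once per frame, something like draw_sand(buffer, grains, 10000, al_map_rgb(194, 178, 128)), where buffer, grains and the color are whatever you already have.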
Now, the collision is much more complicated: you're testing 10,000 objects against 9,999 other objects. That's 99,990,000 tests. If all you're doing is stopping the sand from falling when it encounters more sand, consider instead just testing the next pixel a grain might move to, to see if it's sand colored. You can use al_get_pixel() for that, but be sure to lock the bitmap before you start this process, for the same reason as with the drawing.
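To show the idea of only looking at the next cell, here's a minimal sketch that uses a plain array instead of reading pixels back from a bitmap. That's a swap on my part to keep the example free of Allegro; with a bitmap, the grid[y + 1][x] lookup would become an al_get_pixel() color comparison. The grid size and the function names are hypothetical.

```c
#include <stdbool.h>

#define GRID_W 8
#define GRID_H 6

/* Hypothetical occupancy grid: true means the cell holds a sand grain. */
static bool grid[GRID_H][GRID_W];

/* Moves the grain at (x, y) down one cell if the cell below is inside
 * the grid and empty. Returns true if the grain moved. */
static bool step_grain(int x, int y)
{
    if (!grid[y][x]) return false;      /* no grain here */
    if (y + 1 >= GRID_H) return false;  /* resting on the floor */
    if (grid[y + 1][x]) return false;   /* resting on other sand */
    grid[y][x] = false;
    grid[y + 1][x] = true;
    return true;
}

/* Advances the whole grid one frame and returns how many grains moved.
 * Scanning from the bottom row up means each grain moves at most once
 * per frame. Each grain costs one lookup, not 9,999 comparisons. */
static int step_all(void)
{
    int moved = 0;
    for (int y = GRID_H - 2; y >= 0; --y) {
        for (int x = 0; x < GRID_W; ++x) {
            if (step_grain(x, y)) ++moved;
        }
    }
    return moved;
}
```

Call step_all() once per frame. The work grows with the size of the grid rather than with the square of the number of grains.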
If you're going to reuse that color for anything else, I would instead draw all the sand to a separate "sand bitmap" that is clear apart from the sand, then draw that to the buffer.
This is from my experience, so maybe try that and see if anyone comes along to tell you otherwise.