![]() |
|
Fast loops? |
gnolam
Member #2,030
March 2002
![]() |
What's the fastest way of doing a loop like this:
With a loop like the example, without doing anything besides just running the loop, I get around 10 FPS! (with optimization enabled (-O2) in gcc, around 30). This is far too slow to be useful... and this is just with MAX_DIST set to 30 (not very far). The reason I'm asking is because I have just resurrected my old voxel project, and I have to raytrace each pixel of the screen each frame (That is, unless I magically come up with how to do wave surfing with 5 degrees of freedom instead of 4 overnight). I have already planned to use lookup tables for all that nasty trigonometry, but I can't find a way to make the loops themselves run any faster... does anyone have any tricks to speed them up? -- |
Gabhonga
Member #1,247
February 2001
![]() |
try to make it a single loop, if possible, in example: for(i;i<(640*480*MAX_DIST);i++) {...}
other ways of making the loop faster would depend very much on the actual work done inside...have you tried enabling you compiler's optimizer, btw? -------------------------------------------------------- |
Bob
Free Market Evangelist
September 2000
![]() |
Consider that you're doing 30*640*480 itterations each time; that's 9216000 iterations. If you have a 4 cycle average loop overhead (which is typical), you'll end up with 36864000 cycles of wasted time. You're better off trying to do less work. For example, don't update the whole screen, but only the parts that have changed. Don't loop 30 times per pixel if you can help it; see if you can find a way to do it in 1 or 2 iterations of that inner loop. -- |
LEDominant
Member #2,218
April 2002
![]() |
Sounds like you are trying to do a true 3D raytracing engine. Programs like 3D Studios and POV-Ray use that. IT CAN TAKE HOURS TO RENDER A FRAME Try doing Height-map Voxel(Delta Force), Sector Voxel(Duke Nukem 3D) or BSP Voxel(Doom).
-------------------------------------------------------- |
Plucky
Member #1,346
May 2001
![]() |
Hmmm. For my own voxel engine project I get ~60-100+ fps (depending on the number steps with matching step size) on my Athlon 1.2GHz, GeForce2 computer. However I'm not quite as ambitious as you are with 640x480. What I'm doing:
</li> /* essentially: */ //blah, blah for (column=0; column<360; column++) { // blah, blah for (step = 0; step<MAXSTEP; step++) { // Get Terrain height while (ray < terrain height) { // check if we're still in viewport // putpixel // move ray up and other blahs } // move ray and other blah, blahs } // other blah, blahs }
Perhaps your voxel algorithm is not particularly efficient. Other things: Do you really need 640x480 viewport? Fixed point works well. Bilinear filtering makes things look better but requires work to prevent it from bogging down the engine too much. After some distance away, step size could get larger since detail gets blurry at a distance anyways (I've experimented with this but haven't implemented it.) BTW I'm not raycasting clouds... it's just a simple moving perspective texture with some dithering to hide some of the blockiness. If the view haven't moved, there's no need to perform a voxel rendering loop... at those 'idle' times my fps at least doubles (200+). Try reducing the rate of user input so you have more idle frames where no voxel rendering is required. Holler if you want more ideas. |
spellcaster
Member #1,493
September 2001
![]() |
The problem is that he wants a voxel engine and you're describing a heightmapper. Which is a similar concept, but still a different thing. A true voxel engine really uses "3d pixels" == VOlumetric piXEL. So you could create tables, chairs, etc. in true 3d with it. -- |
Plucky
Member #1,346
May 2001
![]() |
Usually when someone talks about a voxel engine, they refer to a terrain mapping engine. It's still uncommon to use voxels for texturing / rendering a 3D object. Furthermore, he mentions using the wave-surfing technique, which is exactly what I'm doing. And voxels for terrain is still voxels... it's still a 3d 'pixel' where the view is only from above. |
gnolam
Member #2,030
March 2002
![]() |
Plucky: I couldn't find a way to use wave surfing for more than 4 DOF, so I had to resort to raytracing... besides, I had done a true voxel renderer (like Spellcaster describes) a while back, which I unfortunately broke some time ago and never botherered to restore, and that used the same technique that I am using now. That one was too slow to use for more than rendering static images though... so this time around I was trying to make it work in realtime. Guess my options are:
-- |
Plucky
Member #1,346
May 2001
![]() |
Here's my take: As we all know, the standard 4 DOF voxel terrain renderer use the first 4. You're looking for the 5th one. For the wave surfing technique, you need to calculate the initial ray vector and an algorithm to move the ray up one pixel row. For a 4 DOF renderer, this can be precalculated. For a 5 DOF renderer, this needs to be calculated for each frame. I think this is the only real difference. In other words, 5 DOF voxel terrain renderer is done similarly to a 4 DOF one. (This is definitely true for limited up/down pitch... haven't confirmed that this would work all 360 deg. I'll reason it out aloud below.) Let's try an example. I think the key point is that the row of pixels to start the wave surfing would not be the bottom row of the viewport. This is only true if you cannot see the ground directly below you (the standard case with 4 DOF.) Otherwise, you need to calculate which two pixel rows that straddle the ground directly below. Then you need to wave surf two rays in opposite directions (as described in the previous paragraph) filling the pixel column up and down. Hope this helps. |
Mr Moose
Member #1,245
April 2001
|
Well for what your doing i call that extremely fast heh.. If you want that much freedom in your voxel engine then Yea its probably gonna be slowish, but you could consider making it more blocky, or blocky but bilinear'ed to make smooth, eg skip every 2 pixels (x and y) and you should get 4x the speed. Or you could be lazy and make it just a hightmapper instead heh soon as how there best for games i supose. But anyways hope you find a solution |
uruloki
Member #1,836
January 2002
![]() |
you probably already know, but in such loops be sure to access your data in the order they are stored in memory, to avoid cache misses. Or let the hardware do it for you, by using opengl, setup your voxels using display list and don't access directly to the frame buffer. __ |
Thomas Harte
Member #33
April 2000
![]() |
With all due respect, if the question is only one of taking a heightfield input and producing a 3d output with 5/6DOF, and you have hardware 3d support of any description, then the ROAM algorithm is a much better choice than explicitly using voxels. I'm not the best person to explain, but I'll summarise it briefly. The algorithm works in terms of a recursive structure, the basic building block of which is an isoceles triangle. If you split an isoceles triangle halfway along its longest side and join that point to the opposed point, you get two more isoceles triangles. With ROAM you typically do a coverage calculation that determines how inaccurately a given triangle models the heightfield region it covers based on factors including distance from the camera, although some slightly less mathematically accurate methods better tailored to quick computation are possible, and if the inaccuracy is above a certain threshold you divide and recurse. Obviously you pick the threshold and tweak it based on the polygon complexity your card is happy with and the frame rate you are getting. There are many considerable optimisations based on considering coherences from one frame to the next, but basically that is it. Its a quite simple but very effective means of adaptive meshing and you've probably seen it in a couple of games if you've played any recently, because its quite popular. Also : although this is not true for voxels, and therefore strictly irrelevant to the discussion, I wish to remind anyone who hasn't spotted it yet that using compiled vertex arrays is faster than display lists for most purposes [My site] [Tetrominoes] |
|