|
Value Noise optimisation |
SiegeLord
Member #7,827
October 2006
|
I am making a game where the users play in a randomly generated galaxy. I make it using a collection of brushes which I then perturb using a value noise function that I salvaged from this site. Normally I wouldn't ask a question as broad as this, but I am reaching the limits of my abilities. I wish to have any comments on possible optimizations that can be done with the functions associated with the actual perturbation process. I call the PerturbGalaxy function and it calls the rest of the functions as appropriate. It returns usually after ~40 seconds (all compiler optimizations turned on) on my 1.58 Ghz AMD.
And for the curious here's what the end result looks like: {"name":"590913","src":"\/\/djungxnpq2nug.cloudfront.net\/image\/cache\/9\/6\/96ad169bcd47a12eaf2f683fdc8b935b.jpg","w":817,"h":635,"tn":"\/\/djungxnpq2nug.cloudfront.net\/image\/cache\/9\/6\/96ad169bcd47a12eaf2f683fdc8b935b"} Also, a related question, would you think that 40secs is too much to wait for a map to be created? It would be done before every match... Thanks. "For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18 |
Bruce Perry
Member #270
April 2000
|
OK, this is going to be painful ... Keep the current copy in case you decide to go back to it. I'd say start by inlining as many functions as you can. At the moment, the code is written very modularly with the lower levels quite generic; you can probably flatten it out and replace parameters with hardcoded values, unroll loops, and then remove unnecessary work. For example, I haven't looked closely but if you unroll that octave iteration, then the highest frequency in the Perlin noise will not need interpolating if the point spacing is one pixel. Having done that, you'll find you have those pseudo-random numbers being calculated for every pixel and then interpolated. First, arrange to calculate them on demand and then keep them around until you've finished with them. There will be runs of pixels within which every pixel is calculated using the same four points with increasing x-interpolation coefficients and constant y-interpolation coefficients. You should be able to get the four values once, precalculate the y-interpolation, and then just do the x-interpolation per pixel, using the same two values for every pixel in that run. This will enormously reduce the number of times you calculate those values. It could get tricky if you try to do all these in one loop, so you might want to precalculate the noise value per pixel and do it in several passes. You could run through an array of floats and put the high-frequency noise in, then run through again and add in the mid-frequency noise using the above interpolation trick, then run through again and do the low-frequency noise - with as many passes as necessary. Having done all that, you may not even have to unroll the octave loop - except perhaps for the highest frequency if it does indeed not need interpolating! Finally, when you do parabolic interpolation like what you've got there (parabolic means A*x^n + B*x^(n-1) + C*x^(n-2) + ... + D*x^2 + E*x + F), you can usually calculate some delta values (like differentiation but with discrete steps) and get rid of the multiplication. The trick is to work out an equation for how much the value of the interpolation function will change per pixel, so df = f(x+1) - f(x). If that is variable, work out how much THAT will change per pixel (ddf = df(x+1) - df(x)). Eventually you'll end up with a constant, and you'll have some code inside the loop for updating the value as follows: f+=df; df+=ddf; ddf+=dddf;//this would be the last line if dddf turned out to be constant You then just need to set f, df, etc. to their initial values before entering the loop, and bam, your per-pixel multiplications have gone. Let us know if you want any steps clarifying or fleshing out. Depending on the game, another trick you could do (though it might clash with the above optimisations) is to calculate a coarse image, set the game going, then refine the image over multiple steps - so the game can start even though it isn't fully formed. Looks good by the way -- |
SiegeLord
Member #7,827
October 2006
|
The inlining did not increase the speed much, only about 1-2 seconds of speedup. Your suggestion to remove the redundancies in calculation did move me to reexamine my pseudo-random table generation. I have modified it such that it would generate a smoothed noise from the start, and since the table has less entries than the galaxy bitmap has pixels, the smoothing convolution is done less number of times. The speed increase was substantial: the generation time is now ~16 seconds. Thank you for your suggestions. "For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18 |
Jakub Wasilewski
Member #3,653
June 2003
|
I have an optimised implementation for something like that. It's not Perlin noise, but it is very similar - specifically, the final output is a sum of 2D random functions with varying frequency and amplitude. The only difference is that I don't use specific pseudo-random functions, but use the built-in (or other) RNG. The workhorse function is provided below. It does one pass over the terrainData array, adding the values from a new function. The amplitude parameter specifies the largest displacement that could be added, and the "tileSize" parameter determines frequency. Basically, it's the distance betweeen two genuine random values, and everything in between is interpolated. I use cosine interpolation, but you can easily change that and calculate different mu coefficients in the first few lines.
This could calculate several 640x480 with 64 base tile size (so, 8 iterations of onePass) landscapes in a second on my Athlon XP, if I recall correctly (including post-processing operations like ensuring the proper area under-water, generating the map bitmap and so on). You can optimize it further probably, especially if it is guaranteed that TERRAIN_WIDTH and TERRAIN_HEIGHT are always divisible by tileSize. --------------------------- |
Bob
Free Market Evangelist
September 2000
|
Notice that you're calling prand() about 600 times for each pixel in your final image. Instead, you may want to precompute the prand() for each (x,y) location you're interested in, then perform a (separable) convolution to sum up and scale each component of the noise. That should greatly speed up the process. -- |
Sirocco
Member #88
April 2000
|
Quote: Also, a related question, would you think that 40secs is too much to wait for a map to be created? Yes, definitely. If a map can't be generated in under 5 seconds or so I'd be extremely disappointed. 40 seconds is too long to wait for anything when you're a gamer As Bob mentioned all the random number calls are bogging things down. In instances where I've needed fast noise generation I've often employed short arrays of noise offsets and used a single random number call for each vertical line to determine which offset array to use. So instead of doing 307,200 prands for each frame you'd be doing either 640 or 480 (for a 640x480 screen). --> |
SiegeLord
Member #7,827
October 2006
|
Okay, I made several more optimizations, and got the generation speed down to 6-7 seconds. I decided not to utilize the multi-pass approach, since my initial tests did not result in any speed increase... At this point, the generation time is to my liking, and I shall wait until more problems arise before optimizing further. Here's my newest version of the associated code:
Sirocco said: 40 seconds is too long to wait for anything when you're a gamer This shall be a turn based game, and I shall implement a minimum time interval that a turn has to take, as to combat that rushing mentality. "For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18 |
gillius
Member #119
April 2000
|
Is this a network game? If so you could have the computers all generate the map in parallel and break up the work . Gillius |
|