Any interest in a 'ray casting' competition?
Thomas Harte
Member #33
April 2000

Quote:

So it's going to be 'most portable raycaster' compo instead of 'best raycaster' compo? If yes, then I quit.

What, because you suggest:

"IMHO it is better to entitle judges that can't run an entry to just not judge it. Any entry has to work on at least 50% of the judges' computers to be considered a valid entry."

And I say:

"We'll make it a requirement that programs work on all judges' machines ... if any judge thinks an entry does not meet the competition criteria during judging then he indicates as such. If any entry obtains such indications from at least half of the judges then it is disqualified."

i.e. the exact rule you wanted is implemented, so now you're thinking of quitting?

Quote:

But, I found that drawing 640 scaled wall slices at twice the size they were before using 23yrold3yrold's DrawSlice()

23's DrawSlice seems to use a variant of Bresenham's line algorithm to eliminate a divide. I've come up with this instead (for 2-byte pixel modes on systems where a short is two bytes):

void DrawSlice16(BITMAP *src, BITMAP *dst, int texx, int scrx, int scry, fixed adjust)
{
    acquire_bitmap(src);
    acquire_bitmap(dst);

    fixed srcx = 0;
    if(scry < 0)
    {
        srcx -= scry*adjust;
        scry = 0;
    }

    short *DPTR = (short *)src->line[texx];

    while((srcx < (src->h << 16)) && (scry < SCREEN_H))
    {
        ((short *)(dst->line[scry]))[scrx] = DPTR[srcx >> 16];
        srcx += adjust;
        scry++;
    }

    release_bitmap(dst);
    release_bitmap(src);
}

The clever thing about it is that it assumes you already know 'adjust' because it is just a multiple of the distance to the slice, and distance is something you've worked out already. For example, my code does this:

... calculate distance & texoffset to slice in column c...
DrawColumn16(blah blah bblah, (SCREEN_H >> 1) - ((int)((float)(tex->h / distance)) >> 1), ftofix(distance));

Obviously cut out the float stuff - it's just to make the point more easily. I could cut out the divide to calculate y offset by drawing the slice in DrawSlice to a secondary bitmap then blitting that according to half its height. And you can cut out the acquire/releases if you remember to do them outside of drawslice.
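For anyone who wants to try the idea without Allegro, here is a minimal sketch of the same loop over plain arrays. The names `draw_slice16` and `fixed16` and the flat destination buffer are assumptions made for illustration; they are not from the post. It assumes `adjust` is positive and is the number of texels to advance per screen row in 16.16 fixed point, which is the quantity the post says is proportional to distance.

```c
#include <stdint.h>

typedef int32_t fixed16;   /* 16.16 fixed point, in the spirit of Allegro's fixed */

/* Copy one texture column into one screen column.
 * tex    : tex_h texels of a single (pre-rotated) texture column
 * dst    : destination buffer, dst_w x dst_h pixels, row-major
 * scry   : screen row where the slice starts (may be negative)
 * adjust : texels advanced per screen row, 16.16 fixed point, > 0
 */
void draw_slice16(const uint16_t *tex, int tex_h,
                  uint16_t *dst, int dst_w, int dst_h,
                  int scrx, int scry, fixed16 adjust)
{
    fixed16 srcx = 0;

    if (scry < 0) {                 /* skip the texels that fall above the screen */
        srcx -= scry * adjust;
        scry = 0;
    }

    /* One add and one shift per pixel; no divide anywhere in the loop. */
    while (srcx < ((fixed16)tex_h << 16) && scry < dst_h) {
        dst[scry * dst_w + scrx] = tex[srcx >> 16];
        srcx += adjust;
        scry++;
    }
}
```

With a 4-texel column and `adjust` equal to 0.5 (that is, `1 << 15`), each texel is repeated on two consecutive screen rows.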

You can also optimise if you can assume a constant offset in memory from one bitmap line to the next for the destination bitmap. This is always true in Windows and with memory bitmaps, so I've factored it into my code as a once-only runtime test. That turns "((short *)(dst->line[scry]))[scrx]" into a single dereference.
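A rough sketch of what such a once-only test and the resulting inner loop could look like. The `line` array here stands in for Allegro's `BITMAP::line`, and the function names `constant_pitch_bytes` and `fill_column_pitched` are hypothetical, chosen only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the constant byte offset between consecutive rows,
 * or 0 if the rows are not evenly spaced in memory. */
ptrdiff_t constant_pitch_bytes(uint16_t *const *line, int h)
{
    if (h < 2) return 0;

    ptrdiff_t pitch = (ptrdiff_t)((intptr_t)line[1] - (intptr_t)line[0]);
    for (int y = 2; y < h; y++)
        if ((ptrdiff_t)((intptr_t)line[y] - (intptr_t)line[y - 1]) != pitch)
            return 0;
    return pitch;
}

/* Fill rows y0..y1-1 of column x with colour c. The row table is read
 * once; after that each pixel costs one store and one pointer bump. */
void fill_column_pitched(uint16_t *const *line, ptrdiff_t pitch_bytes,
                         int x, int y0, int y1, uint16_t c)
{
    if (y0 >= y1) return;

    unsigned char *p = (unsigned char *)(line[y0] + x);
    for (int y = y0; y < y1; y++) {
        *(uint16_t *)p = c;
        p += pitch_bytes;
    }
}
```

If `constant_pitch_bytes` returns 0, the caller would fall back to indexing `line[y]` for every row.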

And all textures will be drawn rotated by 90 degrees for cache reasons. Just pre-rotate them!
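One possible way to do the pre-rotation is a plain transpose at load time, so that each texture column becomes a contiguous row (which is what indexing by `src->line[texx]` in DrawSlice16 relies on). Strictly speaking a transpose is a rotation plus a mirror; it is used here because it keeps the texels of a column in top-to-bottom order. The function name `transpose_texture` is made up for this sketch.

```c
#include <stdint.h>

/* Write a transposed copy of src (w x h, row-major) into dst (h x w),
 * so that source column x becomes destination row x. */
void transpose_texture(const uint16_t *src, int w, int h, uint16_t *dst)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[x * h + y] = src[y * w + x];
}
```

After this, walking down a wall slice reads consecutive memory addresses instead of jumping a full texture row per texel.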

Fladimir da Gorf
Member #1,565
October 2001

Quote:

I like the idea of a hardware scaling routine, but I of course have no idea how to do that.

OpenLayer ;)

Quote:

To get nice visuals (normalmapping and stuff) one probably will have to use inline assembly, like MMX optimizations

Normal mapping in software... hmmm... sounds like an interesting but rather slow feature.

OpenLayer has reached a random SVN version number ;) | Online manual | Installation video! | MSVC projects now possible with cmake | Now available as a Dev-C++ Devpack! (Thanks to Kotori)

Krzysztof Kluczek
Member #4,191
January 2004

Quote:

Any entry has to work on at least 50% of the judges' computers to be considered a valid entry.

Quote:

We'll make it a requirement that programs work on all judges' machines

Quote:

i.e. the exact rule you wanted is implemented, so now you're thinking of quitting?

I meant that entries should run on the most popular configurations (like an MMX-capable 800MHz+ Pentium running Windows 98/ME/XP), not on every possible configuration judges can have. Failing to run on some machines shouldn't decrease the score of an entry, as long as it runs on at least 50% of them. Making judges give 0 points to entries that they couldn't get to run would require developers to focus more on portability than on the renderer itself, which IMHO isn't the way to go. :)

Quote:

Normal mapping in software... hmmm... sounds like an interesting but rather slow feature.

With MMX it shouldn't take too much CPU time. :)

Thomas Harte
Member #33
April 2000

Quote:

Making judges give 0 points to entries that they couldn't get to run would require developers to focus more on portability than on the renderer itself

The polling method had been intended to be a form of single transferable vote, so in effect entries would be penalised by not being compatible with all hardware.

On the one hand, an entry shouldn't be penalised because somebody with an awkward computer (e.g. mine) comes along. On the other, an entry probably should be penalised if it doesn't run on low-end machines because it is badly written, or if it has a habit of crashing on 25% of configurations. These are the two conflicting interests that inform how this matter should be addressed.

I propose that judges be obliged to give a "0 rating" to any program which they "should" be able to run but cannot. That means entries that crash or in some other way are unusable even though they purport to be. There is also an "unrated" designation for programs that they "shouldn't" be able to run and in fact cannot (e.g. a Windows binary in Linux).

The suggested form remains single transferable vote. For each category a judge should produce an ordered list of programs they think should be awarded the prize. This will include, towards the bottom of the list, programs they tried but don't think should win.

Any competition entries which don't appear on the list are taken to be "unrated".

"Top of list" votes are totted up. This produces a number of "should win" votes for every entry. That is scaled so that people who were unable to rate the entry do not negatively affect this round of voting.

The entry with the fewest votes (or entries, if several are tied for last place) is eliminated and the count is run again, so that anybody who has eliminated entries at the top of their list is taken to vote for their first non-eliminated choice.

This continues until there is only one non-eliminated entry or several entries with equal votes. The remaining non-eliminated entries are declared the winners of that category.
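To make the counting procedure concrete, here is a small sketch of it. Several details are assumptions rather than anything stated above: ballots are arrays of entry indices terminated by -1, an entry missing from a ballot counts as "unrated" by that judge, and the scaling is taken to mean "top-of-list votes divided by the number of judges who rated the entry". The names `tally`, `score_cmp` and `MAX_ENTRIES` are hypothetical.

```c
#define MAX_ENTRIES 16

/* Compare the scaled scores votes/raters of entries a and b by
 * cross-multiplying, so no floating point is needed. */
static int score_cmp(const int *votes, const int *raters, int a, int b)
{
    long lhs = (long)votes[a] * raters[b];
    long rhs = (long)votes[b] * raters[a];
    return (lhs > rhs) - (lhs < rhs);
}

/* Runs the elimination rounds; writes the winning entries into winners[]
 * and returns how many there are. Assumes n_entries <= MAX_ENTRIES and
 * that no ballot lists the same entry twice. */
int tally(int n_entries, int n_judges,
          const int ballots[][MAX_ENTRIES + 1], int *winners)
{
    int raters[MAX_ENTRIES] = {0}, alive[MAX_ENTRIES] = {0};

    for (int j = 0; j < n_judges; j++)          /* who rated what */
        for (int k = 0; ballots[j][k] >= 0; k++)
            raters[ballots[j][k]]++;
    for (int e = 0; e < n_entries; e++)
        alive[e] = raters[e] > 0;

    for (;;) {
        int votes[MAX_ENTRIES] = {0};
        int lo = -1, hi = -1;

        /* Each judge votes for their first non-eliminated choice. */
        for (int j = 0; j < n_judges; j++)
            for (int k = 0; ballots[j][k] >= 0; k++)
                if (alive[ballots[j][k]]) { votes[ballots[j][k]]++; break; }

        for (int e = 0; e < n_entries; e++) {
            if (!alive[e]) continue;
            if (lo < 0 || score_cmp(votes, raters, e, lo) < 0) lo = e;
            if (hi < 0 || score_cmp(votes, raters, e, hi) > 0) hi = e;
        }
        if (lo < 0) return 0;                   /* nothing was rated at all */

        if (score_cmp(votes, raters, lo, hi) == 0) {   /* one left, or a tie */
            int n = 0;
            for (int e = 0; e < n_entries; e++)
                if (alive[e]) winners[n++] = e;
            return n;
        }
        for (int e = 0; e < n_entries; e++)     /* drop the bottom entry/entries */
            if (alive[e] && score_cmp(votes, raters, e, lo) == 0)
                alive[e] = 0;
    }
}
```

Under this particular scaling assumption, an entry that only one judge could run but who placed it first scores 1/1 and beats an entry with two first-place votes from three raters, which is one reason some cap on scaling for rarely runnable entries may be worth discussing.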

It would be possible for judges to also submit a list of entries they could run, etc, but don't think should be considered to win a particular category. But we don't want to complicate things too much for judges.

There'll probably be sufficiently few votes that I can double check any irregularities in voting with the voters in question. Quite probably the official vote will be semi-private with only me knowing who has voted for what, but that doesn't prevent whatever discussion people want to have on the boards, etc.

If it is acceptable, I'll come up with a neat way of expressing this and write it into our Pixelate wikipedia entry. It may sound slightly complicated but it's actually really easy in practice.

Quote:

Normal mapping in software... hmmm... sounds like an interesting but rather slow feature.

I have to admit that I vaguely considered this but don't have any artwork with normal maps. So I'll be focussing on whatever technical accomplishments I can make possible without too heavy an artwork focus! Hopefully including some nice surprises.

For the record, I've already vaguely started although I have yet to even jump the hurdle of fixed point maths giving me absolutely horrid precision. I'd just use floats but the float to int conversion is an absolutely gigantic killer on PowerPC derivatives - much worse than 80x86s.

Krzysztof Kluczek
Member #4,191
January 2004

Quote:

but similarly an entry probably should be penalised if it doesn't run on low end machines if it is badly written or has a habit of crashing on 25% of configurations.

If it's badly written or crashing, then yes. But an entry that has MMX as one of its minimum requirements and runs at unacceptable frame rates on slow machines (like 200MHz) while making use of many effects is perfectly fine by me, and I don't think it should be penalized. Options for turning some features off aren't a solution in this case, because some scores in categories like 'best visuals' would then be based on tests with some features off.

Quote:

I propose that judges be obliged to give a "0 rating" to any program which they "should" be able to run but cannot. That means entries that crash or in some other way are unusable even though they purport to be. There is also an "unrated" designation for programs that they "shouldn't" be able to run and in fact cannot (e.g. a Windows binary in Linux).

I think it's fine when it's put this way. :)

Quote:

I have to admit that I vaguely considered this but don't have any artwork with normal maps. So I'll be focussing on whatever technical accomplishments I can make possible without too heavy an artwork focus! Hopefully including some nice surprises.

Making normal maps for a tileable wall texture is usually quite easy, especially if you are making the textures yourself, as a heightmap is usually one of the side products of the texture-making process. :)

Fladimir da Gorf
Member #1,565
October 2001

Quote:

I have to admit that I vaguely considered this but don't have any artwork with normal maps.

Maybe you could make cheap bump maps by making the textures grayscale and then calculating a normal map from that?

EDIT: By the way, wouldn't SSE and floating points be better for lighting calculations?
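A minimal sketch of the grayscale-to-normal-map idea, treating brightness as height and taking central differences. This is only one common way to do it, not something anyone in the thread has posted; the names `normal_t`, `height_at` and `heightmap_to_normals` and the `strength` parameter are invented for the example. The normals are left unnormalised and in integers; scaling them to unit length would be a separate step.

```c
#include <stdint.h>

typedef struct { int16_t x, y, z; } normal_t;

/* Fetch a height with wrap-around so the result tiles like the texture. */
static int height_at(const uint8_t *hm, int w, int h, int x, int y)
{
    x = (x % w + w) % w;
    y = (y % h + h) % h;
    return hm[y * w + x];
}

/* For a surface z = f(x, y) the normal points along (-df/dx, -df/dy, 1).
 * strength plays the role of the z component: larger values flatten the bumps. */
void heightmap_to_normals(const uint8_t *hm, int w, int h,
                          int strength, normal_t *out)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int dx = height_at(hm, w, h, x + 1, y) - height_at(hm, w, h, x - 1, y);
            int dy = height_at(hm, w, h, x, y + 1) - height_at(hm, w, h, x, y - 1);

            out[y * w + x].x = (int16_t)(-dx);
            out[y * w + x].y = (int16_t)(-dy);
            out[y * w + x].z = (int16_t)strength;
        }
}
```

Because brightness is not really height, the result tends to look plausible on bricks and stone and less so on textures with painted-on detail.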

Thomas Harte
Member #33
April 2000

Quote:

But an entry that has MMX as one of its minimum requirements and runs at unacceptable frame rates on slow machines (like 200MHz) while making use of many effects is perfectly fine by me, and I don't think it should be penalized.

That will be a matter for individual judges. If you turn in an entry that, judging by other software they've used, they should be able to play on their machine, but which toddles along at 2fps, then they should be entitled to mark it down - e.g. a game that looks exactly like Wolfenstein 3D but runs at 10fps on a 1GHz P3.

I really don't think we should be prescriptive about that element.

EDIT: on this topic, can some people try the attached thing and give me an idea of frame rates? I'm getting absolutely tiny numbers on my machine but Allegro pleads no hardware acceleration so it isn't desperately surprising. I definitely need to see if I can get AllegroGL set up for this competition.

Fladimir da Gorf
Member #1,565
October 2001

FPS: 250

AMD64 3200+
1 GB of 333MHz memory

If you fill the whole screen with that, it really might be a bit slow.

Thomas Harte
Member #33
April 2000

Quote:

If you fill the whole screen with that, it really might be a bit slow.

Yeah - wonder what I'm doing wrong. This not being a programming thread let's not discuss it at any length. Note that you can use the cursor keys to move, so you can find out exactly how slow it is when it fills the whole screen!

HoHo
Member #4,534
April 2004

FPS: ~650
p4 3.0@3.8, 512ram r9550se

__________
In theory, there is no difference between theory and practice. But, in practice, there is - Jan L.A. van de Snepscheut
MMORPG's...Many Men Online Role Playing Girls - Radagar
"Is Java REALLY slower? Does STL really bloat your exes? Find out with your friendly host, HoHo, and his benchmarking machine!" - Jakub Wasilewski

Oscar Giner
Member #2,207
April 2002

Testing on a PIII@733 (and a PCI voodoo3 card) it gets 70-90 fps.

[edit]
I compiled with optimizations on: -O2 -ffast-math -funroll-loops -fomit-frame-pointer

[edit2]
But considering that there's almost nothing on screen, it seems like the bottleneck currently is the buffer->screen blit.

HoHo
Member #4,534
April 2004

Recompiled with -O3 -march=pentium4 -msse2 and now I get ~1000fps.

Thomas Harte
Member #33
April 2000

Quote:

But considering that there's almost nothing on screen, it seems like the bottleneck currently is the buffer->screen blit.

Yeah, I figured it would be. Let me not divert this thread!

Krzysztof Kluczek
Member #4,191
January 2004

Quote:

on this topic, can some people try the attached thing and give me an idea of frame rates?

Depending on the camera position, from 150 FPS up (but it's drawing only a single line of pixels on a black background).

Erkle
Member #3,493
May 2003

Has anybody tried looking at the duke3d or doom sources to see how they fill each column?

If the writing above has offended you, you've read it wrong.....fool.
And if you read that wrong it means 'All of the above is opinion and is only to be treated as such.'

Krzysztof Kluczek
Member #4,191
January 2004

Quote:

Has anybody tried looking at the duke3d or doom sources to see how they fill each column?

I can't imagine better method than using fixed point math, so I guess they are using it. Something like:

byte *scr;        // array of 320x200 bytes
int x,y1,y2;      // where to fill
byte *tex;        // selected texture column (array of 256 bytes)

byte *dst = scr + (y1*320+x);
byte *end = scr + (y2*320+x);

unsigned int txpos = y1tex<<24;  // y1 maps to tex[y1tex]
int txadd = float(1<<24)/scale;  // scale = pixels per texel

while(dst<=end)
{
  *dst = tex[txpos>>24];
  txpos += txadd;
  dst += 320;
}

I don't think you can get much better than this. :)

Billybob
Member #3,136
January 2003

Take raycaster A and raycaster B, both very, very similar except that one can run on many platforms and the other can only run on one.
Which is the best?

Though I don't want to vote for a portability requirement, I think it should be a bonus. Or, since it seems there are many different categories a raycaster can win in, maybe a non-portable one will win in everything but the "Best of the Best".

Thomas Harte
Member #33
April 2000

Quote:

I don't think you can get much better than this.

The code I posted earlier does better than that! Yours has a preliminary divide which mine doesn't due to the observation that your variable txadd is just a multiple of distance to the column, and ray casters already know the distance - that's pretty much "what they do". Although I have to admit that you've used an extra 8 bits of accuracy.

Quote:

Has anybody tried looking at the duke3d or doom sources to see how they fill each column?

Ummm, I know that Doom and Rise of the Triad (yuck) use the same code! And Wolfenstein does the equivalent of building a table of compiled sprites in Allegro terms. Which is good because it means you can eliminate all the divides from wall drawing.
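As a rough data-table analogue of that idea (not Wolfenstein's actual approach, which generates code per wall height), one could precompute, for every possible slice height, which texel each screen row should sample. All divides then happen once at start-up. `TEX_H`, `MAX_HEIGHT`, `build_scalers` and `draw_scaled` are assumptions made up for this sketch.

```c
#include <stdint.h>

#define TEX_H      64    /* texels per wall column (assumed) */
#define MAX_HEIGHT 128   /* tallest slice we precompute (assumed) */

/* scaler[h][y] = texel to sample for row y of a slice that is h pixels tall */
static uint8_t scaler[MAX_HEIGHT + 1][MAX_HEIGHT];

/* Call once at start-up; this is the only place a divide occurs. */
void build_scalers(void)
{
    for (int h = 1; h <= MAX_HEIGHT; h++)
        for (int y = 0; y < h; y++)
            scaler[h][y] = (uint8_t)(y * TEX_H / h);
}

/* Draw an h-pixel-tall slice (1 <= h <= MAX_HEIGHT) using only table lookups. */
void draw_scaled(const uint8_t *tex, uint8_t *dst, int stride, int h)
{
    const uint8_t *map = scaler[h];

    for (int y = 0; y < h; y++, dst += stride)
        *dst = tex[map[y]];
}
```

The trade-off is memory and an extra lookup per pixel in exchange for removing the per-slice setup arithmetic; clipping of slices taller than the screen is left out here.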

Krzysztof Kluczek
Member #4,191
January 2004

Quote:

Take raycaster A and raycaster B, both very, very similar except that one can run on many platforms and the other can only run on one.
Which is the best?

Take raycaster A, which looks great but requires a 1GHz+ Pentium with MMX and SSE, and raycaster B, which runs everywhere but just looks like Doom 1/2 - how can you even compare them? Probably 'most portable' could be an additional category.

Quote:

The code I posted earlier does better than that! Yours has a preliminary divide which mine doesn't due to the observation that your variable txadd is just a multiple of distance to the column, and ray casters already know the distance - that's pretty much "what they do". Although I have to admit that you've used an extra 8 bits of accuracy.

I meant only the inner loop, not the preparation code. Extra 8 bits of precision were used primarily to avoid having to mask overflow bits every pixel (unsigned int just overflows and bit shift does the rest). You can probably optimize inner loop using asm (like using loop with CX), but probably not too much. :)
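The overflow trick can be seen in isolation in the sketch below: with a 256-texel column and 8.24 fixed point, the top 8 bits of a 32-bit position are the texel index, so letting the position overflow wraps the index to 0..255 for free. The function name `fill_column_8_24` and the explicit `count` and `stride` parameters are assumptions for the example.

```c
#include <stdint.h>

/* Fill count pixels of a screen column from a 256-texel texture column.
 * txpos and txadd are 8.24 fixed point. */
void fill_column_8_24(const uint8_t tex[256], uint8_t *dst, int count,
                      int stride, uint32_t txpos, uint32_t txadd)
{
    while (count-- > 0) {
        *dst = tex[txpos >> 24];   /* top 8 bits select the texel */
        txpos += txadd;            /* unsigned overflow wraps; well defined in C */
        dst += stride;
    }
}
```

Starting at texel 254 with a step of exactly one texel, the loop reads texels 254, 255, 0, 1, ... without any `& 255` in sight. With 16.16 fixed point the same loop would need an explicit mask to tile the texture.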

Thomas Harte
Member #33
April 2000

I've updated the competition page to include the latest draft of the rules. I've gone out on a limb as to how vote scaling should be calculated and suggested that an entry that doesn't work on more than 25% of machines doesn't get a fully scaled score. Hopefully my initial draft of the rules makes sufficient sense to explain this (although I doubt it - the wording isn't strongly checked).

Also, please could every entrant who has time add their machine specifications to the page? It should help determine the way we should be heading with vote scaling in practical terms and will probably prove to make the whole discussion academic.

Carrus85
Member #2,633
August 2002

The specifications of both my computers have been listed on the page.

EDIT: Also added a note about the raycasting competition to the front page of the wiki (in the news area).

aybabtu
Member #2,891
November 2002

I'll go and add my machine's specifications to the page... it's not gonna be pretty. :P

Trezker
Member #1,739
December 2001

The participants should be able to send their code to others to test it on various platforms during the competition. It isn't safe to write something on one platform and assume it'll work on others, even if you try to follow all the guidelines. [/tip]

HoHo
Member #4,534
April 2004

Added three computers I have constant access to. Actually I have access to three more PCs, but I don't think a p200 and a p3 450 are so important. My roommate has an a64 and I can test some stuff there too if required.

jcnossen
Member #205
April 2000

I noticed that Thomas is using a Mac.
The thing is that I would like to use my WIP script compiler (which is x86 only), and maybe some inline assembly to speed up the inner raycasting loops. The latter is probably something that people other than me might do as well.
I'm going to stay compatible only with x86 and just hope my entry still works for >=75% of the judges. Sorry, Thomas.


