3D accel newbie
Arthur Kalliokoski
Second in Command
February 2005
I just got an old computer that happens to have 3D-accelerated graphics. I've been googling a few times trying to get up to speed on OpenGL, alleggl, etc., but those seem to deal with toy examples that show some object rotating or waving. I've already skimmed through the Red Book and Blue Book. I guess what I'm looking for is a much more hardware-oriented viewpoint on 3D acceleration. I want to find some web pages that answer questions like: does the 3D card have a floating-point unit of its own? Or why do certain constraints come into play, such as texture size limits? Any good links?
Steve Terry
Member #1,989
March 2002
Today's 3D cards are becoming more like processors than the fixed-function 3D cards of a few years ago, hence the term GPU. They can now run generalized fragment programs and pixel shaders with excellent floating-point throughput, in most cases faster than a CPU. Before long, graphics cards will be processors in their own right and real multi-CPU systems will be available (multi-core/multi-processor, plus a generalized graphics processor, sound processor, etc.). Well, we are pretty much already there.
HoHo
Member #4,534
April 2004
Steve Terry said:
Before long, graphics cards will be processors in their own right and real multi-CPU systems will be available (multi-core/multi-processor, plus a generalized graphics processor, sound processor, etc.). Well, we are pretty much already there.

Actually the situation is getting quite interesting. In the old days (and now too) most general-purpose computing (e.g. scene management) was done on the CPU and rendering on the GPU. I plan to reverse that: use a machine with two dual-core CPUs for rendering and a powerful GPU for managing the scene data. Should be fun. Too bad it's only in the planning stage right now; without that monster PC it would also be quite hard to implement well.
Arthur Kalliokoski
Second in Command
February 2005
I guess I was premature posting this anyway. After I left allegro.cc I wound up on the Nvidia website, which had a bit of info about ALUs etc. They also had some SDK stuff with a screenshot showing individual blades of grass visible for 30-50 meters out, and some text blitted on top reporting something like 100K polygons at 35 fps!
Bob
Free Market Evangelist
September 2000
Quote: does the 3D card have a floating-point unit of its own? Or why do certain constraints come into play, such as texture size limits?

If you have particular questions about GPUs, you're free to ask me.

Consumer 3D cards have had multiple floating-point units since the GeForce 256 (4 of them, in fact). The GeForce 7800 GTX has, in comparison, 224 programmable floating-point units (FMAD), and a whole bunch more used for texture filtering, addressing and blending. There are also specialized circuits for interpolating attributes and computing transcendentals.

The texture size constraints come mainly from precision constraints, due to chip area constraints. It's much more expensive to support larger texture sizes because of the adders and multipliers needed for addressing (and addressing textures is very complex!). You also incur a cost in terms of cache tag size.
Arthur Kalliokoski
Second in Command
February 2005
Thanks, Bob! Right now I'm googling for more GL tutorials, more like the Allegro docs than just a bunch of examples like NeHe (which are very impressive). Just today I got gluLookAt to work, because I hadn't noticed I'd always been using all zeros for the up vector. :) I'm also looking through the old Pixelate issues today; I'll have to look at them in depth when I get home again. Searching the Allegro forums for polls on memory etc. doesn't seem to bring up anything about what video cards are capable of, and the few computer-capability threads I saw were several years old. Could I assume that the usual video card of today has 64 MB of video memory? Can it do 1024x1024 textures? How many vertices fit into a GL_TRIANGLE_STRIP? Thanks again.
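In case anyone else trips over the same thing, here is a minimal camera-setup sketch (the eye and target coordinates are just example values); the key point is that the up vector must be a non-zero vector that isn't parallel to the view direction:

```c
#include <GL/gl.h>
#include <GL/glu.h>

/* Minimal camera setup.  gluLookAt's last three arguments are the up
   vector; (0,0,0) gives a degenerate view matrix, which is why nothing
   sensible appears on screen.  +Y is the conventional choice. */
void setup_camera(void)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    gluLookAt(0.0, 5.0, 10.0,   /* eye position (example values)  */
              0.0, 0.0,  0.0,   /* point being looked at          */
              0.0, 1.0,  0.0);  /* up vector: +Y, never all zeros */
}
```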
Bob
Free Market Evangelist
September 2000
Quote: Could I assume that the usual video card of today has 64 MB of video memory?

Most likely. All low-end video cards for the last 2 years have had 128 MB of video memory accessible to them, so 64 MB means a card older than that.

Quote: Can it do 1024x1024 textures?

Every GPU ever built, except the Voodoo 1 to 3.

Quote: How many vertices fit into a GL_TRIANGLE_STRIP?

As many as you want. The data is streamed to the GPU, so there is no limit.
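Rather than assuming, the driver can also be asked directly; a minimal sketch (it needs a current GL context):

```c
#include <stdio.h>
#include <GL/gl.h>

/* Query the driver's limits instead of guessing.  Call this after a GL
   context has been created and made current. */
void print_gl_limits(void)
{
    GLint max_tex = 0;
    glGetIntegerv(GL_MAX_TEXTURE_SIZE, &max_tex);
    printf("Largest supported texture: %d x %d\n",
           (int)max_tex, (int)max_tex);
}
```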
Arthur Kalliokoski
Second in Command
February 2005
All right! Guess I have to figure it out and port a ton of software rendering stuff to make a demo to brag about! Nobody paid any attention to the island thing I hacked into the Allegro software renderer last spring. My mind still boggles over all that hardware on a video card: why don't we have the same on the mobo so we can have our own supercomputers? Or is it too specialized, maybe?

[EDIT] ROAM is supposed to work very well with 3d acceleration. Also I was thinking about (in the software renderer) running the game with flat shading and automatic camera movement while I sleep; it would color each polygon a unique color and keep a database of which polygons were visible from certain areas. I'd have to compress that to run-length bitfields; I haven't seen this elsewhere.
Bob
Free Market Evangelist
September 2000
Quote: why don't we have the same on the mobo so we can have our own supercomputers? Or is it too specialized, maybe?

GPUs cost too much for motherboards. A chipset's selling price (in bulk) is on the order of 20 to 30 USD, and in that 30 USD you need to fit the whole chipset. There is little room (cost-wise) to put a fancy GPU there. We do see the odd integrated GPU every now and then, but they're not very powerful: typically, they're about half the speed of the slowest discrete solution available at the time.

Quote: ROAM is supposed to work very well with 3d acceleration.

Unfortunately, it's not really hardware-friendly. You need to recreate your vertex and index buffers almost every frame, since you'll be generating a new polygon soup. Plus, ROAM has that nasty feature of taking more time to compute the scene when your frame rate drops, which lowers your frame rate further and causes more computation on the next frame, and so on.

Quote: Also I was thinking about (in the software renderer) running the game with flat shading and automatic camera movement while I sleep; it would color each polygon a unique color and keep a database of which polygons were visible from certain areas.

You're probably better off doing some form of BSP instead.
Thomas Harte
Member #33
April 2000
Quote: Unfortunately, it's not really hardware-friendly. You need to recreate your vertex and index buffers almost every frame, since you'll be generating a new polygon soup.

I don't see how this is more true of ROAM than of any other system of adaptive meshing? I also take it there is not yet an analogue to pixel/vertex shaders that allows you to programmatically create a vertex list? I've never owned a card with any sort of programmable functionality, so please excuse my ignorance. EDIT: but I did do quite a lot in the realm of full software 3D before obtaining a 3dfx Voodoo, too many years ago for me to be willing to remember, so don't feel the need to patronise.

Quote: Actually the situation is getting quite interesting. In the old days (and now too) most general-purpose computing (e.g. scene management) was done on the CPU and rendering on the GPU. I plan to reverse that: use a machine with two dual-core CPUs for rendering and a powerful GPU for managing the scene data.

I really don't see how this would be beneficial.
HoHo
Member #4,534
April 2004
Quote: I really don't see how this would be beneficial.

GPUs are not very good for ray tracing, at least not for now; their flexibility is practically nonexistent compared to CPUs. Space-partitioning tree building takes a lot of FP power and could probably be handed off to the GPU, which would leave more CPU time for rendering.
Krzysztof Kluczek
Member #4,191
January 2004
Quote: Space-partitioning tree building takes a lot of FP power and could probably be handed off to the GPU, which would leave more CPU time for rendering.

I don't think so. A GPU is designed to handle a large number of short, very specific tasks in vertex and pixel shaders. It works more or less like this: the same small program is run independently over a huge number of vertices or fragments, and each invocation can only write a small, fixed-size result. It is probably possible to write a pixel shader that builds the tree, but it would be highly redundant, because a pixel shader has very limited output (several floating-point numbers, not enough to output much of the tree). Also, pixel and vertex shaders probably don't have enough temporary memory to build a partitioning tree. GPUs are built to execute simple operations on loads of input data, but they aren't made for complex tasks, unless the task is polygon rendering. I think it would probably be easier to do the space partitioning on the CPU and do the ray tracing with pixel shaders on the GPU.
HoHo
Member #4,534
April 2004
So much for the original topic... I haven't really studied how KD-tree building could be done on a GPU, but I know there are some very efficient sorting algorithms that run on GPUs.

Quote: It is probably possible to write a pixel shader that builds the tree, but it would be highly redundant, because a pixel shader has very limited output (several floating-point numbers, not enough to output much of the tree).

In a KD-tree for the CPU, a single branch or leaf takes 32 bits. There are usually 1-3 triangles per leaf and tens of thousands to several million triangles per scene.

Quote: I think it would probably be easier to do the space partitioning on the CPU and do the ray tracing with pixel shaders on the GPU.

There have been attempts to do so. Probably the latest and most successful one is described here (check the thesis). Also, as weird as it may sound, using shaders in a GPU-based ray tracer will probably be quite complicated. Anyone interested in real-time ray tracing should check out OpenRT; you can register there to get a noncommercial version for Linux.
Bob
Free Market Evangelist
September 2000
Quote: I don't see how this is more true of ROAM than of any other system of adaptive meshing?

Some adaptive meshing schemes are better than others, but most of them are suboptimal; that is, there comes a time when NOT doing adaptive subdivision is faster than doing it, simply because GPUs get faster more quickly than CPUs do.

Quote: I also take it there is not yet an analogue to pixel/vertex shaders that allows you to programmatically create a vertex list?

Not yet, but it should come soon enough. It's been one of the touted features of DX10.

Quote: I know there are some very efficient sorting algorithms that run on GPUs.

They're not all that efficient. See Purcell et al. Sorting doesn't scale with computation; it scales with memory bandwidth (which grows much more slowly).
HoHo
Member #4,534
April 2004
Quote: They're not all that efficient. See Purcell et al. Sorting doesn't scale with computation; it scales with memory bandwidth (which grows much more slowly).

Have you seen this?
Bob
Free Market Evangelist
September 2000
Quote: Have you seen this?
Sure, but that doesn't disprove what I said above.
Arthur Kalliokoski
Second in Command
February 2005
Quote: You're probably better off doing some form of BSP instead.

I'm trying to do landscapes, which I can't see working very well with BSP trees; on a hilltop near a corner of the map, you'd have most polygons visible in a single frame. I'm working toward a 3D car racing game. I've gotten GL to do a "landscape" of colored triangles (I'll probably get textures in it in an hour or two), but I can't get display lists to work right yet, so I'm still doing it with a loop passing parameters to GL functions. Still about 4x faster than my software renderer, though. Although this is only a 466 MHz Celeron, and when I've got 100K colored triangles on screen at once I'm only getting 5-8 fps... Still got a lot to learn here.
Bob
Free Market Evangelist
September 2000
Quote: I've gotten GL to do a "landscape" of colored triangles (I'll probably get textures in it in an hour or two), but I can't get display lists to work right yet, so I'm still doing it with a loop passing parameters to GL functions. Still about 4x faster than my software renderer, though. Although this is only a 466 MHz Celeron, and when I've got 100K colored triangles on screen at once I'm only getting 5-8 fps... Still got a lot to learn here.

You'll want to use Vertex Buffer Objects instead of display lists or immediate mode. That's if you care about performance.
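For reference, a minimal static-geometry VBO sketch. It uses the OpenGL 1.5 entry points (older drivers expose the same calls with an ARB suffix via ARB_vertex_buffer_object, and an old gl.h may need an extension loader or glext.h), and verts/num_verts are placeholders for the landscape data:

```c
#include <GL/gl.h>

/* Upload the vertex data to the GPU once... */
GLuint upload_mesh(const float *verts, int num_verts)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, num_verts * 3 * sizeof(float),
                 verts, GL_STATIC_DRAW);
    return vbo;
}

/* ...then draw from it every frame without re-sending the vertices. */
void draw_mesh(GLuint vbo, int num_verts)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, (const void *)0); /* offset into the VBO */
    glDrawArrays(GL_TRIANGLES, 0, num_verts);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}
```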
Arthur Kalliokoski
Second in Command
February 2005
This computer has an Intel 810 chipset, and the ARB thing in the Alleggl examples says it (or the drivers) doesn't support the ARB extension. :-/
HoHo
Member #4,534
April 2004
Quote: and when I've got 100K colored triangles on screen at once I'm only getting 5-8 fps...

I think that's normal, considering that to my knowledge most q3a levels had fewer triangles in total. If VBOs are not supported, then perhaps EXT_vertex_array can help you a bit. It's uglier to use than VBOs, but it should give some speed boost over immediate mode if the CPU is holding you back.

[edit]

Quote: You'll want to use Vertex Buffer Objects instead of display lists or immediate mode

Funny, I've always thought display lists were the most efficient thing for static geometry. Are VBOs really faster for static stuff?
Krzysztof Kluczek
Member #4,191
January 2004
Quote: I think that's normal, considering that to my knowledge most q3a levels had fewer triangles in total.

I think that 200 000 OpenGL calls per frame (or more if he isn't using triangle strips) is more likely what's slowing it down. That's why vertex arrays were introduced.

Quote: If VBOs are not supported, then perhaps EXT_vertex_array can help you a bit. It's uglier to use than VBOs, but it should give some speed boost over immediate mode if the CPU is holding you back.

EXT_vertex_array is part of the OpenGL 1.1 core, which means it should work everywhere. Also, vertex arrays have an interface quite similar to VBOs, which makes it possible to write an intelligent vertex buffer class with a nice interface that falls back to plain vertex arrays when VBOs aren't supported.

Quote: Are VBOs really faster for static stuff?

Yes, and for dynamic stuff too.
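A rough sketch of that fallback idea, assuming a hypothetical have_vbo() check against the extension string; the drawing function below is the plain OpenGL 1.1 client-side vertex array path, and a VBO path would differ only in binding a buffer first and passing an offset instead of a real pointer:

```c
#include <string.h>
#include <GL/gl.h>

/* Hypothetical capability check: does the driver advertise VBOs? */
static int have_vbo(void)
{
    const char *ext = (const char *)glGetString(GL_EXTENSIONS);
    return ext && strstr(ext, "GL_ARB_vertex_buffer_object") != NULL;
}

/* Plain OpenGL 1.1 client-side vertex array: the data stays in system
   memory and is streamed on each draw call, but it is still one call
   for the whole mesh instead of one call per vertex. */
void draw_client_array(const float *verts, int num_verts)
{
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, 0, verts);
    glDrawArrays(GL_TRIANGLES, 0, num_verts);
    glDisableClientState(GL_VERTEX_ARRAY);
}
```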
Arthur Kalliokoski
Second in Command
February 2005
My Windows DLLs don't have an EXT_vertex_array, but they do have a GL_EXT_compiled_vertex_array... I see the Alleggl stuff has EXT_vertex_array, but I'm having way too much trouble with it on Windows; the version I compiled on my Slackware distro wouldn't respond to any input except ctrl-alt-backspacing my way back to the console, and I need to get some GLUT stuff to get it to compile on my old Mandrake to check it out there. The 100K polygons reference was a comparison to the Nvidia demo thing, so (ignoring further optimizations I'm missing) the Nvidia computer is about 7 times faster. The car racing game will "skip" some vertices to make larger polygons in the distance to cut the total down. "Mipbumping?" Somewhat ROAM-like. The Allegro thing I put in Off-Topic Ordeals last spring had a crude version.
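The vertex-skipping idea could look roughly like this; a hypothetical sketch (the distance thresholds and helper names are made up) that builds an index list at a chosen detail level, usable with glDrawElements:

```c
/* Pick a vertex step for a terrain chunk from its distance to the
   camera: distant chunks use fewer, larger triangles.  The thresholds
   are made-up example values. */
int step_for_distance(float dist)
{
    if (dist <  50.0f) return 1;   /* full detail: every vertex */
    if (dist < 150.0f) return 2;   /* skip every other vertex   */
    if (dist < 400.0f) return 4;
    return 8;                      /* far away: very coarse     */
}

/* Fill 'indices' with two triangles per quad, stepping over a height
   grid that is w vertices wide and h vertices deep.  Returns the number
   of indices written; the caller must size the array accordingly. */
int build_chunk_indices(unsigned int *indices, int w, int h, int step)
{
    int x, z, n = 0;
    for (z = 0; z + step < h; z += step) {
        for (x = 0; x + step < w; x += step) {
            unsigned int i0 = (unsigned int)(z * w + x);
            unsigned int i1 = (unsigned int)(z * w + x + step);
            unsigned int i2 = (unsigned int)((z + step) * w + x);
            unsigned int i3 = (unsigned int)((z + step) * w + x + step);
            indices[n++] = i0; indices[n++] = i2; indices[n++] = i1;
            indices[n++] = i1; indices[n++] = i2; indices[n++] = i3;
        }
    }
    return n;
}
```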
Bob
Free Market Evangelist
September 2000
Quote: My Windows DLLs don't have an EXT_vertex_array

EXT_vertex_array is part of OpenGL 1.1, so if you have OpenGL 1.1 you don't need to check for EXT_vertex_array: it's already available.
Arthur Kalliokoski
Second in Command
February 2005
So much to check, so little time... I just reran the old Allegro thing on this Celeron; it got 12-13 fps at 320x240x16 at the corner of a 128x vertex array, while the GL thing got 23 fps under the same conditions, and z-buffering and backface culling didn't seem to slow it down. The Allegro thing had prerendered lighting on the slopes (there was some sort of fencepost error, but that'd be trivial to fix) and there was an occasional clipping error. I've got to split the texture map into small strips for GL; Thomas Harte was saying to do that for the software thing anyway, and it'd cache better too. I forget (already) what my own software stuff did on the AMD K6-2; it was much faster than Allegro, but fog etc. would have slowed it down again. I can't run it on this Celeron because the stupid VESA implementation sucks so badly and SciTech doesn't grok it, but GL is better even if I don't get all the fancy stuff down. There, I admit it!

[EDIT] NO dll had the EXT_vertex_array.
Bob
Free Market Evangelist
September 2000
Quote: NO dll had the EXT_vertex_array.

Yes, that's expected. I don't think there are any platforms that expose plain old OpenGL 1.0 (where EXT_vertex_array is meaningful). Windows is perpetually stuck at GL 1.1, which means EXT_vertex_array is part of the core; that is, you can use the functions (glVertexPointer and family) as you would any other GL function.