[Al 5.0.5] Works on Linux, doesn't on Windows
Niunio

That's it: some examples of my new Allegro.pas wrapper that work on Linux (Xubuntu) raise a SIGSEGV on Windows XP SP2.

I suspect it's a problem drawing on memory bitmaps, because if the program doesn't do that (i.e. it uses OpenGL only) it works without problems.

For example, it fails at line:
al_put_pixel (i, j, al_color_hsv (hue, sat, 1));
Where:

  i = 0
  j = 9
  hue = 180
  sat = 0.335790485

GDB backtrace is:

Program received signal SIGSEGV, Segmentation fault.
0x3f2a09a2 in ?? ()
(gdb) bt
#0 0x3f2a09a2 in ?? ()
#1 0x3f800000 in ?? ()
#2 0x3f800000 in ?? ()
#3 0x3f800000 in ?? ()
#4 0x0ba708a0 in ?? ()
#5 0x0ba708e4 in ?? ()
#6 0x0142fc08 in ?? ()
#7 0x7c91ee18 in strchr () from ntdll.dll
#8 0x7c936ac8 in ntdll!iswdigit () from ntdll.dll
#9 0xffffffff in ?? ()
#10 0x7c936abe in ntdll!iswdigit () from ntdll.dll
#11 0x7c9368ad in ntdll!iswdigit () from ntdll.dll
#12 0x00040000 in ?? ()
#13 0x40000060 in ?? ()
#14 0x7c92056d in ntdll!RtlFreeThreadActivationContextStack () from ntdll.dll
#15 0x00000000 in ?? () from
#16 0x00000000 in ?? () from
#17 0x001389c0 in ?? ()
#18 0x00000004 in ?? ()
#19 0x00276728 in ?? ()
#20 0x0142fc90 in ?? ()
#21 0x77c06fab in qsort () from C:\WINDOWS\system32\msvcrt.dll
#22 0x00276724 in ?? ()
---Type <return> to continue, or q <return> to quit---
#23 0x00000004 in ?? ()
#24 0x67353d44 in d3d_display_thread_proc ()
   from C:\WINDOWS\system32\allegro-5.0.5-monolith-mt-debug.dll
#25 0x00402284 in INIT () at examples/ex_blit.pas:251
#26 0x00402308 in main () at examples/ex_blit.pas:269

As you see I'm using "allegro-5.0.5-monolith-mt-debug.dll" but it doesn't show a lot...

Another example (actually ex_gldepth) shows an "Assertion failed!" message at file "text.c" line 102, Expression: text. But if I comment out the lines that draw text on the texture it works perfectly.

Maybe it's related to the Allegro 4 polygon3d issue: Free Pascal enables CPU exceptions for data conversion.

Elias

Can you set a breakpoint on al_put_pixel then step through and see how it ends up in qsort? (It shouldn't...)

Thomas Fjellstrom

That stack trace looks broken :o

Make sure you're also compiling your code in debug mode, as well as linking to the debug versions of the system libraries. If that doesn't help, you're blowing up the stack.

Niunio

I've recompiled the Allegro 5 libraries (I was using Matthew's precompiled ones) but stack trace is still broken. ???

Elias said:

Can you set a breakpoint on al_put_pixel then step through and see how it ends up in qsort?

I did, but GDB seems to be unable to step into Allegro. If I debug the C examples it steps in correctly.

[off-topic] BTW, why does it use qsort? I've read the code and I don't understand the benefits of sorting it (and with qsort!!! Remember it's not the fastest in all cases ;) ). Maybe that's why Al5 looks slower than Al4 in software rendering.[/off-topic]

Make sure you're also compiling your code in debug mode, as well as linking to the debug versions of the system libraries.

I have checked it and it is correct.

Quote:

If that doesn't help, you're blowing up the stack.

Question is, how did I blow up the stack? ??? On Linux it looks (and runs) nice.

Maybe I should ask the Free Pascal/Lazarus experts. :-[

Elias
Niunio said:

BTW, why does it use qsort?

That's the point, it does not. So I'm puzzled how you can get a crash inside qsort... :P

Niunio

I've realized that my installation was old (I was using GDB 6!) so I've updated and recompiled... And now I have different information.

I did a step-by-step execution to see if it can debug inside the DLL, for example:

(gdb) s
EXAMPLEBITMAP (W=100, H=100) at examples/ex_blit.pas:29
29 mx := w * 0.5;
(gdb) n
30 my := h * 0.5;
(gdb) n
31 Pattern := al_create_bitmap (w, h);
(gdb) n
0x00401530 in al_create_bitmap ()
(gdb) n
Single stepping until exit from function al_create_bitmap,
which has no line number information.
al_create_bitmap (w=100, h=100)
   at C:\msys\1.0\...\allegro-5.0.5\src\bitmap.c:142
142 {
(gdb) n
143 ALLEGRO_BITMAP *bitmap = do_create_bitmap(w, h);
(gdb) n
144 if (bitmap) {
(gdb) n
145 _al_register_destructor(_al_dtor_list, bitmap,
(gdb) n
...

You see it works.

But it still fails at the same line, though it now shows different information:

Breakpoint 3, EXAMPLEBITMAP (W=100, H=100) at examples/ex_blit.pas:44
44 al_put_pixel (i, j, al_color_hsv (hue, sat, 1));
(gdb) print i
$10 = 0
(gdb) print j
$11 = 9
(gdb) print hue
$12 = 180
(gdb) print sat
$13 = 0.335790485
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0x3f2a09a2 in ?? ()
(gdb) backtrace
#0 0x3f2a09a2 in ?? ()
#1 0x3f800000 in ?? ()
#2 0x3f800000 in ?? ()
#3 0x3f800000 in ?? ()
#4 0x0bae2ec0 in ?? ()
#5 0x00402274 in INIT () at examples/ex_blit.pas:251
#6 0x004022f8 in main () at examples/ex_blit.pas:269
(gdb)

It looks like it fails before it calls al_color_hsv, doesn't it? ???

BTW, I'll ask at the Free Pascal mailing list because there are some odd warning messages that I don't understand, they're from another module but who knows::).

Thomas Fjellstrom

Lots of ??s in stacktraces is a big clue to memory and stack corruption (so long as you have compiled everything with debug symbols).

It's hard to track down if you don't have access to tools like valgrind.

Elias

Where did you set breakpoint 3? I'm still unsure as to where exactly it crashes from what you have shown...

Niunio
Elias said:

Where did you set breakpoint 3?

It's from a translation of Allegro's ex_blit.c example to Pascal. Breakpoint 3 is in the following function:

FUNCTION ExampleBitmap (CONST w, h: INTEGER): ALLEGRO_BITMAPptr;
VAR
  i, j: INTEGER;
  mx, my, a, d, sat, hue: SINGLE;
  State: ALLEGRO_STATE;
  Lock: ALLEGRO_LOCKED_REGIONptr;
  Pattern: ALLEGRO_BITMAPptr;
BEGIN
  mx := w * 0.5;
  my := h * 0.5;
  Pattern := al_create_bitmap (w, h);
  al_store_state (State, ALLEGRO_STATE_TARGET_BITMAP);
  al_set_target_bitmap (Pattern);
  Lock := al_lock_bitmap (Pattern, ALLEGRO_PIXEL_FORMAT_ANY, ALLEGRO_LOCK_WRITEONLY);
  FOR i := 0 TO w - 1 DO
  BEGIN
    FOR j := 0 TO h - 1 DO
    BEGIN
      a := arctan2 (i - mx, j - my);
      d := sqrt (power (i - mx, 2) + power (j - my, 2));
      sat := power (1 - 1 / (1 + d * 0.1), 5);
      hue := 3 * a * 180 / ALLEGRO_PI;
      hue := (hue / 360 - floor (hue / 360)) * 360;
      al_put_pixel (i, j, al_color_hsv (hue, sat, 1)); { <-- breakpoint 3 }
    END;
  END;
  al_put_pixel (0, 0, al_map_rgb (0, 0, 0));
  al_unlock_bitmap (Pattern);
  al_restore_state (State);
  ExampleBitmap := Pattern;
END;

It fails in the 10th iteration of the inner loop, when i=0 and j=9.

As you see, the translation is almost "word-by-word".

I was thinking the problem might be a bad function declaration, but if it is, then why does it run on Linux?

Elias
Niunio said:

I was thinking the problem might be a bad function declaration, but if it is, then why does it run on Linux?

That's the beauty of undefined behavior... it may run. Or not.

The only thing I notice, when you pass State, shouldn't you pass a pointer? I.e. @State?

Also, does it still crash if you simplify the function, e.g. remove all the math and just put black pixels?

SiegeLord

Could this be related to the struct return calling convention mess? Try doing this (or the equivalent in Pascal):

ALLEGRO_COLOR c;
c.r = c.g = c.b = c.a = 1;
al_put_pixel(i, j, c);

Elias

Oh, good point. And I should have thought of it since I need to work around it in the Python wrapper.

Niunio

Elias, SiegeLord, it fails in both cases, but when I try what SiegeLord suggests the program raises an assertion:

Microsoft Visual C++ Runtime Library

Assertion failed!

Program: ...\ex_blit.exe
File: ...\bitmap.c
Line: 200

expression: bitmap != dest && bitmap != dest->parent

Also GDB warns inside of an Allegro function that "bitmap" points to an invalid data space (0x00a00000, VGA? ::)).

I think that's the problem: it doesn't assign the "bitmap" correctly, because it fails in any example that tries to draw anything in a memory bitmap but works if it draws directly on the display (i.e. without changing the target bitmap, or using OpenGL only). Now I feel I should have said this in the first post... :-[

I'll check all functions and procedures that assign/set the target bitmap and/or bitmap options to see if they're correct.

Elias said:

The only thing I notice, when you pass State, shouldn't you pass a pointer? I.e. @State?

I decided to use a more Pascal-like style for the new Allegro.pas version, so I used "VAR" which (as the FPC documentation says) translates the parameter to a pointer when calling C functions. But I think I should check them too.

Elias

Ok, so what did you do for al_put_pixel? Is the color var as well?

Also, how certain are you that everything works if you don't use memory bitmaps? If you comment out the one line to make the bitmap a memory bitmap instead of video bitmap in the example, does everything work?

Niunio
Elias said:

Ok, so what did you do for al_put_pixel? Is the color var as well?

No, it isn't. It doesn't use a pointer in C so I didn't use one in Pascal.

Quote:

Also, how certain are you that everything works if you don't use memory bitmaps? If you comment out the one line to make the bitmap a memory bitmap instead of video bitmap in the example, does everything work?

After reading the documentation again, I'm not sure any more, as it looks like Allegro creates new bitmaps in video memory by default.

I did some more tests with more examples:

"ex_line" is the only example that works without any change.

"ex_gldepth" works if I comment out the lines that draw text on the texture.

"ex_rotate" renders the first frame but then raises a "runtime error 216". ???

Similar behaviour for "ex_warp_mouse": it renders the first frame correctly but fails when I move the mouse. If I comment out all mouse events, it runs but raises a "runtime error 217" at exit.

I've looked for information about those 216 and 217 runtime errors and found claims that 217 is fixed by registry cleaning and that 216 is a trojan. ??? So I updated my antivir (again ::)) and started a full scan, but I'm not sure that's the problem.
__________________________________________

[edit] Ok, I've found an issue:

Free Pascal optimizes enum sizes but GCC does not. This means that Pascal defines an enum as BYTE, WORD or LONGWORD depending on its values, but GCC expects all of them to be LONGWORD. I don't know if this is enough to corrupt the stack but it seems possible.

Also, I should verify the bool size.

I must do more testing.
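
A minimal C sketch of the enum-size mismatch described above, assuming GCC-style defaults (no -fshort-enums). The names demo_flag, c_side and packed_side are hypothetical and only illustrate how a record field after the enum moves when a binding declares the enum as one byte; they are not taken from Allegro or Allegro.pas.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum standing in for a library flag type. */
enum demo_flag { DEMO_A = 0, DEMO_B = 1, DEMO_C = 2 };

/* Layout as the C library sees it: with GCC defaults the enum takes
   an int-sized slot, so "value" starts at offset sizeof(int). */
struct c_side {
    enum demo_flag flag;
    int32_t value;
};

/* Layout a binding would produce if it shrank the enum to one byte
   and packed the record: "value" now starts at offset 1. */
#pragma pack(push, 1)
struct packed_side {
    uint8_t flag;
    int32_t value;
};
#pragma pack(pop)

/* Offset of "value" in each layout; they differ, so the two sides
   would read different bytes for the same field. */
size_t c_value_offset(void)      { return offsetof(struct c_side, value); }
size_t packed_value_offset(void) { return offsetof(struct packed_side, value); }
```

If the two offsets disagree, any record shared between the binding and the DLL is misread on one side. Free Pascal documents a {$PACKENUM} directive that can force a fixed enum size, which may be worth checking against the wrapper's declarations.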

Thomas Fjellstrom

If those enums ever hit the stack, any enums going to pascal from C, will push more onto the stack than expected, and going the other way, C will pop off more than expected.

Elias

The Allegro functions likely use "cdecl" or something which means Pascal has to use the C calling convention (which should include enum sizes and things like that - else nothing would work).

But there's some border cases like the mentioned return of a struct which happens with a few functions like al_map_rgb and al_get_pixel. Like for example if gcc makes the al_map_rgb in the DLL return the color by putting it into the EAX and EDX registers but Pascal expects it on some stack location... the consequence will be undefined behavior. This hit me in the Python wrapper for example [1]

Peter Wang

If it helps, we could add alternative versions of those functions which update an output structure, if there aren't too many.

Niunio

I'm still testing and updating (I don't have much free time) but I think that the problem affects only some operations. For example, if I remove all pixel drawing[1] and text drawing, some examples work nicely (e.g. ex_gldepth). But as I said I must do more testing and checking.

Elias said:

But there's some border cases like the mentioned return of a struct which happens with a few functions like al_map_rgb and al_get_pixel. Like for example if gcc makes the al_map_rgb in the DLL return the color by putting it into the EAX and EDX registers but Pascal expects it on some stack location... the consequence will be undefined behavior.

It looks like an optimization. Is the Allegro "Debug" DLL optimized? If so, how can I deactivate the optimization? I only use CMake to compile and install Allegro. :-/

IIRC the PASCAL calling standard expects the first atomic parameters (INTEGER, BYTE...) in registers.

If it helps, we could add alternative versions of those functions which update an output structure, if there aren't too many.

It may help but what alternative?

References

  1. Maybe it's related to Elias' last post about Python.
Elias
Niunio said:

It may help but what alternative?

Well, for example:

al_get_pixel_color(x, y, &color);
al_get_rgba_color(r, g, b, a, &color);

That's the only two (plus other forms al_map_rgba) I remember from the Python wrapper. Inside the wrapper I'd then implement those other functions using the new functions instead of calling the DLL versions.
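
A rough C sketch of the kind of output-parameter variant suggested here. It does not link against Allegro: demo_color only mirrors the four-float layout of ALLEGRO_COLOR, demo_map_rgba_f stands in for a by-value function such as al_map_rgba_f, and demo_map_rgba_f_out is a hypothetical name for the alternative version.

```c
/* Stand-in with the same field layout as ALLEGRO_COLOR (four floats,
   16 bytes on common platforms, so too large for one register pair). */
typedef struct { float r, g, b, a; } demo_color;

/* Stand-in for a library function that returns the struct by value.
   How those 16 bytes travel back to the caller is decided by the
   compiler's ABI, which is where a foreign caller can disagree. */
static demo_color demo_map_rgba_f(float r, float g, float b, float a)
{
    demo_color c = { r, g, b, a };
    return c;
}

/* Hypothetical alternative: the result is written through an explicit
   pointer, so the caller only needs to pass plain arguments and no
   struct return convention is involved at the language boundary. */
void demo_map_rgba_f_out(float r, float g, float b, float a, demo_color *out)
{
    *out = demo_map_rgba_f(r, g, b, a);
}
```

The by-value call stays inside C, where both sides of it are compiled by the same compiler; a wrapper in another language would then only ever call the pointer-based variant.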

Niunio

I still don't understand. The procedures that are problematic are the ones that draw pixels and text, not the ones that retrieve data... ???

Elias

In the Python wrapper (and anywhere else someone attempts to directly call DLL functions) only those functions (which have a struct as return value) should be affected by the mentioned bug.

Niunio

Ok, I did more testing, tweaking and abracadabring and I think I finally found where the problem is (but I'm not sure why it is).

As you, Elias, said the problem is the "makecolor" family. If I do this:

al_draw_line (a, b, c, d, al_map_rgb_f (1, 1, 1), 1);

it fails, but if I do this:

ColorWhite := al_map_rgb_f (1, 1, 1);
al_draw_line (a, b, c, d, ColorWhite, 1);

then it works.

I suspect the problem is that Free Pascal enables CPU exceptions for data conversion or the optimizations that you said, or both.

Now I'll try to find a workaround, but I've found that font drawing still raises a "Runtime error 217..." at exit. ??? But it draws the text correctly using the workaround above.

Thread #609612. Printed from Allegro.cc