Allegro.cc - Online Community

Credits go to Aaron Bolyard and SiegeLord for helping out!
Old crash in d3d_shutdown still there...
Edgar Reynaldo
Member #8,592
May 2007

My test program for my Eagle library still crashes upon exit in d3d_shutdown. Here is the code that reproduces it :

int GuiTestMain2(int argc , char** argv) {

   (void)argc;
   (void)argv;

   EagleSystem* sys = 0;
   sys = new Allegro5System();
   if (sys->Initialize(EAGLE_FULL_SETUP) != EAGLE_FULL_SETUP) {
      delete sys;
      return 1;
   }
   EagleGraphicsContext* win = sys->CreateGraphicsContext(800,600 , EAGLE_WINDOWED);
   EAGLE_ASSERT(win);

   win->Clear(EagleColor(0,127,64));

   win->FlipDisplay();

   sys->Rest(2.0);

   delete sys;

/// al_uninstall_system();

   return 0;

}

If I uncomment al_uninstall_system(), Allegro shuts down successfully and doesn't crash. If I don't call it, then it crashes here (scroll to bottom to see backtrace) :


c:\ctwoplus\progcode\Eagle5GUI\cbbuild\bin>gdb Libtest-debug.exe
GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mingw32".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from c:\ctwoplus\progcode\Eagle5GUI\cbbuild\bin\Libtest-debug.exe...done.
(gdb) break GuiTestMain.cpp:37
Breakpoint 1 at 0x402813: file C:\ctwoplus\progcode\Eagle5GUI\src\tests\GuiTestMain.cpp, line 37.
(gdb) run
Starting program: c:\ctwoplus\progcode\Eagle5GUI\cbbuild\bin/Libtest-debug.exe
[New Thread 4192.0x16ec]
Creating object at 00c83770 named Allegro5System at 00c83770
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .text
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .rdata
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .data
Eagle : Initialized system.
Creating object at 00c85448 named Allegro5Timer at 00c85438
[New Thread 4192.0x162c]
Allegro5Timer::Destroy this=0xc85438
Allegro5Timer::Create this=0xc85438
[New Thread 4192.0x168c]
Eagle : Initialized the system state.
Eagle : Initialized images.
Eagle : Initialized fonts.
Eagle : Initialized TTF fonts.
[New Thread 4192.0x11b8]
[New Thread 4192.0x11dc]
[New Thread 4192.0x16b0]
Eagle : Initialized audio.
Eagle : Initialized shaders.
Eagle : Initialized primitives.
Eagle : Installed keyboard.
Eagle : Installed mouse.
[New Thread 4192.0x1538]
Eagle : Installed joystick.
Creating object at 00c8cf10 named Allegro5GraphicsContext at 00c8cf10
Creating object at 00c8cf5c named Allegro5Image at 00c8cf5c
[New Thread 4192.0x14a4]
[New Thread 4192.0x1498]
[New Thread 4192.0x10ac]
[New Thread 4192.0x1078]
[New Thread 4192.0xb48]
[New Thread 4192.0x11c4]
Allegro5Timer::Destroy this=0xc85438
Destroying object at 00c85448 named Allegro5Timer at 00c85438
Destroying object at 00c8cf5c named Allegro5Image at 00c8cf5c
Destroying object at 00c8cf10 named Allegro5GraphicsContext at 00c8cf10
Destroying object at 00c83770 named Allegro5System at 00c83770

Breakpoint 1, _fu6__EAGLE_FULL_SETUP () at C:\ctwoplus\progcode\Eagle5GUI\src\tests\GuiTestMain.cpp:37
37         return 0;
(gdb) next
39      }
(gdb)
main (argc=1, argv=0xc836b0) at C:\ctwoplus\progcode\Eagle5GUI\src\tests\Libtest.cpp:41
41      }
(gdb)
__mingw_CRTStartup () at ../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c:260
260     ../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c: No such file or directory.
(gdb)
262     in ../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/crt1.c
(gdb)

Program received signal SIGSEGV, Segmentation fault.
0x677a6ded in d3d_shutdown () at C:\mingw\LIBS\A5GIT\allegro\src\win\d3d_disp.cpp:2683
2683       _al_d3d->Release();
(gdb) bt
#0  0x677a6ded in d3d_shutdown () at C:\mingw\LIBS\A5GIT\allegro\src\win\d3d_disp.cpp:2683
#1  0x67795db3 in win_shutdown () at C:\mingw\LIBS\A5GIT\allegro\src\win\wsystem.c:196
#2  0x677327fe in shutdown_system_driver () at C:\mingw\LIBS\A5GIT\allegro\src\system.c:81
#3  0x67725e33 in _al_run_exit_funcs () at C:\mingw\LIBS\A5GIT\allegro\src\exitfunc.c:92
#4  0x67732b83 in al_uninstall_system () at C:\mingw\LIBS\A5GIT\allegro\src\system.c:313
#5  0x6ab41028 in __dll_exit () at ../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/dllcrt1.c:159
#6  0x6ab410a1 in DllMainCRTStartup@12 (hDll=0x6ab40000, dwReason=0, lpReserved=0x1) at ../mingwrt-4.0.3-1-mingw32-src/src/libcrt/crt/dllcrt1.c:138
#7  0x7799ded4 in ntdll!RtlDecodePointer () from C:\Windows\system32\ntdll.dll
#8  0x7798a959 in ntdll!RtlExitUserProcess () from C:\Windows\system32\ntdll.dll
#9  0x7798a8db in ntdll!RtlExitUserProcess () from C:\Windows\system32\ntdll.dll
#10 0x777c3d77 in KERNEL32!ExitProcess () from C:\Windows\system32\kernel32.dll
#11 0x00000000 in ?? ()
(gdb) info locals
_func_ = "d3d_shutdown"
(gdb)

Can anyone give me any advice on how to debug this? It's been happening for quite some time now, and I am no closer to a (real) solution than I was before :
Relevant threads :
https://www.allegro.cc/forums/thread/613372 ( a little confusing to follow)
https://www.allegro.cc/forums/thread/613778 (I tried toggling WANT_DLL_TLS to on in cmake , but no go)

I'm using Allegro 5 GIT branch, MinGW 4.8.1 and Vista.

SiegeLord
Member #7,827
October 2006

I have to say, your Eagle library is very hard to install. First, your main development branch has a space in it, so doing:

svn checkout svn://svn.code.sf.net/p/eagle5gui/code/trunk eagle5gui-code

does something very silly. This is probably why the subversion snapshot is empty for that branch. That wasted me like 5 minutes :P. Then, your Code::Blocks projects have a few hard-coded include paths... and your post-install scripts seem to fail.

But anyway. The bad news is that I got it compiled, and it didn't crash for me on my desktop. The good news is that it does crash on my laptop (both are Win7). I've no clue what could be different between them at the moment, but there you have it. Typically what I can crash, I can fix... but the fact that it only crashes on the laptop might put a damper on things.

I've attached the binary I created... might be worthwhile to see if it also crashes on your system. I merely replaced the contents of GuiTestMain in one of the test files with what you have in your post.

"For in much wisdom is much grief: and he that increases knowledge increases sorrow."-Ecclesiastes 1:18
[SiegeLord's Abode][Codes]:[DAllegro5]:[RustAllegro]

Aaron Bolyard
Member #7,537
July 2006

I'd like to note SiegeLord's build crashes on my one desktop with an old Nvidia card (9800 GTX+) as described. I'll try it on my other one tomorrow.

Edgar Reynaldo
Member #8,592
May 2007

Sorry, my bad. I didn't think you were going to be compiling. Let me commit all my latest changes first. And sorry about the space in the branch name, that's SourceForge's stupid default name for trunk, and I don't know how to change it. Also, sorry for only having CB projects, I'll update those too. I tried to make everything relative paths but I guess I missed some.

I'll try your binary here in a minute...

It crashes for me, here :

c:\ctwoplus\progcode\Eagle5GUI\TestBinaries\eagle_test>gdb test.exe
...
Reading symbols from c:\ctwoplus\progcode\Eagle5GUI\TestBinaries\eagle_test\test.exe...done.
(gdb) run
Starting program: c:\ctwoplus\progcode\Eagle5GUI\TestBinaries\eagle_test/test.exe
[New Thread 3600.0x128c]
Creating object at 009736f8 named Allegro5System at 009736f8
Eagle : Initialized system.
Creating object at 00975220 named Allegro5Timer at 00975210
[New Thread 3600.0x10b0]
Allegro5Timer::Destroy this=0x975210
Allegro5Timer::Create this=0x975210
[New Thread 3600.0x10fc]
Eagle : Initialized the system state.
[New Thread 3600.0x10d8]
Eagle : Initialized images.
Eagle : Initialized fonts.
Eagle : Initialized TTF fonts.
[New Thread 3600.0x129c]
[New Thread 3600.0x1294]
Eagle : Initialized audio.
Eagle : Initialized shaders.
Eagle : Initialized primitives.
Eagle : Installed keyboard.
Eagle : Installed mouse.
[New Thread 3600.0x11f8]
[New Thread 3600.0xe44]
[New Thread 3600.0x1284]
Eagle : Installed joystick.
Creating object at 0097d090 named Allegro5GraphicsContext at 0097d090
Creating object at 0097d0d8 named Allegro5Image at 0097d0d8
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .text
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .rdata
BFD: C:\Windows\system32\atiumdva.dll: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .data
[New Thread 3600.0x240]
[New Thread 3600.0x1074]
[New Thread 3600.0x1054]
[New Thread 3600.0x12b8]
[New Thread 3600.0x12c0]
[New Thread 3600.0xcfc]
Allegro5Timer::Destroy this=0x975210
Destroying object at 00975220 named Allegro5Timer at 00975210
Destroying object at 0097d0d8 named Allegro5Image at 0097d0d8
Destroying object at 0097d090 named Allegro5GraphicsContext at 0097d090
Destroying object at 009736f8 named Allegro5System at 009736f8

Program received signal SIGSEGV, Segmentation fault.
0x61571f77 in _al_d3d_shutdown_display ()
    at C:/dev/Allegro5/src/win/d3d_disp.cpp:2682
2682    C:/dev/Allegro5/src/win/d3d_disp.cpp: No such file or directory.
(gdb) bt
#0  0x61571f77 in _al_d3d_shutdown_display ()
    at C:/dev/Allegro5/src/win/d3d_disp.cpp:2682
#1  0x61560e91 in win_shutdown () at C:/dev/Allegro5/src/win/wsystem.c:180
#2  0x614f32f2 in shutdown_system_driver () at C:/dev/Allegro5/src/system.c:81
#3  0x614e670b in _al_run_exit_funcs () at C:/dev/Allegro5/src/exitfunc.c:92
#4  0x614f367b in al_uninstall_system () at C:/dev/Allegro5/src/system.c:314
#5  0x6ab411b4 in _CRT_INIT@12 (hDllHandle=hDllHandle@entry=0x6ab40000,
    dwReason=dwReason@entry=0, lpreserved=lpreserved@entry=0x1)
    at C:/git/mingw/mingw-w64-crt-git/src/crt/mingw-w64-crt/crt/crtdll.c:144
#6  0x6ab41325 in __DllMainCRTStartup (
    hDllHandle=hDllHandle@entry=0x6ab40000, dwReason=0,
    lpreserved=lpreserved@entry=0x1)
    at C:/git/mingw/mingw-w64-crt-git/src/crt/mingw-w64-crt/crt/crtdll.c:211
#7  0x6ab41433 in DllMainCRTStartup@12 (hDllHandle=0x6ab40000, dwReason=0,
    lpreserved=0x1)
    at C:/git/mingw/mingw-w64-crt-git/src/crt/mingw-w64-crt/crt/crtdll.c:171
#8  0x7752ded4 in ntdll!RtlDecodePointer () from C:\Windows\system32\ntdll.dll
#9  0x7751a959 in ntdll!RtlExitUserProcess ()
   from C:\Windows\system32\ntdll.dll
#10 0x7751a8db in ntdll!RtlExitUserProcess ()
   from C:\Windows\system32\ntdll.dll
#11 0x77253d77 in KERNEL32!ExitProcess ()
   from C:\Windows\system32\kernel32.dll
#12 0x00000000 in ?? ()
(gdb) info locals
_func_ = "_al_d3d_shutdown_display"
(gdb)

Line 2682 is _al_d3d->Release().

I did a total clean and rebuild of everything, and it still crashes intermittently in the same place. I tried static linking too, but for some reason my exe kept depending on eagled.dll and eagle_a5d.dll, even though I was linking to the static libraries...??? Now the static version won't crash. I'll try the dlls again... I cleaned and rebuilt zlib, physfs and allegro first though. And now when I link the static version, it depends on the dlls again? WTF? The names are similar, could g++ be mixing them up or picking up the wrong one somehow? For example, the dynamic debug import libraries are named libeagled.dll.a and libeagle_a5d.dll.a, and the static debug libraries are named libeagled.a and libeagle_a5d.a. Why are they getting mixed up? I am clearly linking to -leagled and -leagle_a5d in the Libtest project but the dlls are in the dependencies again - why? Maybe it's picking up the .def files somehow? If I rename libeagled.dll.a and libeagle_a5d.dll.a to something else, then it static links properly. Triple WTF am I doing wrong?

Okay did complete rebuild of dynamic and static versions. I can't get the static version to crash anymore, but now the dynamic version seems to crash all the time.

Here's a dynamic build. Try it. And update from svn, I did a big commit, fixing the things you mentioned (mostly except for the batch scripts, sorry bout that...it depends on c:\mingw being there).

In the backtrace, there's a frame called __dll_exit. Clue?

SiegeLord
Member #7,827
October 2006

I spent a bit of time investigating this. So here's my best hypothesis at what happens. You're calling al_init() inside your DLL, which registers al_uninstall_system() to run via atexit. What I think happens is that the DLL with Allegro etc gets unloaded before atexit is called, and when it does get called, everything crashes. So what I think you should do is call al_install_system(ALLEGRO_VERSION_INT, NULL) instead of al_init() (thus disabling the atexit functionality) and call al_uninstall_system() in your A5 backend Shutdown() method. For me, this modification made it not crash.

What led me down this hypothesis is that calling al_init() inside the application before calling the Initialize method made everything work ok.
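A minimal sketch of that change, assuming a backend class shaped roughly like the Allegro5System in this thread. The fake_allegro functions and the BackendSystem class are hypothetical stand-ins so the example compiles without Allegro; only the two-argument form of al_install_system (a version plus an atexit-style pointer) is taken from the real API.

```cpp
#include <cassert>

// Hypothetical stand-ins for al_install_system / al_uninstall_system.
// They only track state; no real Allegro code is involved.
namespace fake_allegro {

typedef int (*AtExitPtr)(void (*)(void));

static bool g_installed = false;
static int g_uninstall_calls = 0;

static void uninstall_system() {
    if (!g_installed) return;  // a second call is a harmless no-op
    g_installed = false;
    ++g_uninstall_calls;
}

static bool install_system(int /*version*/, AtExitPtr atexit_ptr) {
    if (g_installed) return true;
    g_installed = true;
    // An automatic shutdown is registered only when a pointer is supplied.
    if (atexit_ptr) atexit_ptr(uninstall_system);
    return true;
}

}  // namespace fake_allegro

// Sketch of a backend that owns the library lifetime explicitly.
class BackendSystem {
public:
    BackendSystem() : initialized_(false) {}
    ~BackendSystem() { Shutdown(); }

    bool Initialize() {
        // Null atexit pointer: nothing is left to run at process exit.
        initialized_ = fake_allegro::install_system(0, nullptr);
        return initialized_;
    }

    void Shutdown() {
        if (!initialized_) return;
        fake_allegro::uninstall_system();
        assert(!fake_allegro::g_installed);
        initialized_ = false;
    }

private:
    bool initialized_;
};
```

With real Allegro the two stand-in calls would be al_install_system(ALLEGRO_VERSION_INT, NULL) and al_uninstall_system(). Creating the object, calling Initialize(), and deleting it leaves nothing registered for process exit, so teardown happens at a point the program chooses.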

Edgar Reynaldo
Member #8,592
May 2007

Thanks for looking into this. ;)

I will try and investigate the cause further myself as well. You said calling al_init in the example before sys->Initialize made it not crash? Do dlls and the main thread have their own atexit routines? It would seem when my dll exits (notice __dll_exit in the stack trace) it has its own atexit routine which is calling al_uninstall_system (see the back trace from the OP). I will try the code change you suggested and see if it works to alleviate the problem for me.

Edit
I did some further investigation, and it appears that every dll gets its own atexit function, as does the main executable. I counted one address of atexit for my eagle dll, one for the allegro dll, and one for the program using them.

It would have been nice to know that but nowhere is this behaviour documented. I suppose it follows that every linked program or dll gets its own atexit routines, but I thought there would only be one, linked into the dll or program when the libstdc++ is linked. That's what makes sense to me, that every object file would link to the same atexit function in the std c++ lib.

There is also a clue that someone else knew about this in the manual entry for al_init :

Quote:

Like al_install_system, but automatically passes in the version and uses the atexit function visible in the current compilation unit.

So, I can pass the atexit visible in main to my system to pass on to allegro and everything should work fine. Also, your fix seems to do the trick too, passing NULL for atexit and calling al_uninstall_system in my Allegro5System::Shutdown method.
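The first option could look roughly like this. It is a sketch under assumptions: LibraryInitialize, LibraryShutdown and StartFromExecutable are hypothetical names, and the only thing borrowed from Allegro is the pointer type that al_install_system accepts as its second argument.

```cpp
#include <cassert>
#include <cstdlib>

// Same shape as the second parameter of al_install_system.
typedef int (*AtExitPtr)(void (*)(void));

// --- "Library" side (imagine this compiled into a DLL) ---
static bool g_lib_running = false;

void LibraryShutdown() {
    g_lib_running = false;
}

// The library never names atexit itself; it uses whatever the caller hands over.
bool LibraryInitialize(AtExitPtr caller_atexit) {
    if (g_lib_running) return true;
    g_lib_running = true;
    if (caller_atexit) {
        // The handler goes on the caller's exit list, not the library's own.
        return caller_atexit(LibraryShutdown) == 0;
    }
    return true;  // no pointer given: the caller must call LibraryShutdown itself
}

// --- "Executable" side ---
bool StartFromExecutable() {
    // atexit named here is the one visible to the executable's own code.
    const bool ok = LibraryInitialize(std::atexit);
    // If registration succeeded, the library must report itself as running.
    assert(!ok || g_lib_running);
    return ok;
}
```

Passing a null pointer instead gives the second option, where the caller is responsible for calling the shutdown function explicitly.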

Thanks a lot SiegeLord, I don't know if I ever would have figured it out on my own.

Aaron Bolyard
Member #7,537
July 2006

I really wanted to get back to this thread a couple days ago but was having computer trouble (my cooling fan for the CPU decided to break one or more of its legs...).

The reason behind the C runtime having its own state between modules is explained in various places, example: https://msdn.microsoft.com/en-us/library/ms235460.aspx

Basically, each module has its own state. Your DLL could be compiled with a different CRT than, say, the user's program, or however it goes; and even if the CRT is the same, there's a host of other issues from sharing the state between different modules.

This is also why you shouldn't malloc/free or new/delete between DLLs and so on. You may be able to get away with it, but that doesn't mean it won't break in different operating systems, or different configurations even.
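One common way to follow that advice is to pair every allocation with a release function exported from the same module. The sketch below is illustrative only: Image, ImageCreate, ImageDestroy and MakeImage are hypothetical names, and everything sits in one file here even though the create/destroy pair would normally live inside the DLL.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type owned by the "library" module.
struct Image {
    int width;
    int height;
};

static int g_live_images = 0;

// Both functions would be compiled into the same DLL, so new and delete
// always go through that DLL's runtime.
Image* ImageCreate(int width, int height) {
    assert(width > 0 && height > 0);
    ++g_live_images;
    return new Image{width, height};
}

void ImageDestroy(Image* image) {
    if (!image) return;
    --g_live_images;
    delete image;
}

// Caller-side helper: the deleter routes back into the library instead of
// calling delete in the caller's module.
struct ImageDeleter {
    void operator()(Image* image) const { ImageDestroy(image); }
};
typedef std::unique_ptr<Image, ImageDeleter> ImagePtr;

ImagePtr MakeImage(int width, int height) {
    return ImagePtr(ImageCreate(width, height));
}
```

A caller that holds an ImagePtr never frees the memory itself, so it does not matter which CRT the calling program was built against.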

SiegeLord
Member #7,827
October 2006

Hmm, 'current compilation unit' seems wrong too, since that's just an individual object (surely not every object gets its own atexit function?). I'll add an additional note to the documentation explicitly mentioning DLLs.

Edgar Reynaldo
Member #8,592
May 2007

Quote:

Basically, each module has its own state. Your DLL could be compiled with a different CRT than, say, the user's program, or however it goes; and even if the CRT is the same, there's a host of other issues from sharing the state between different modules.

Thanks for your input Aaron.

The thing about atexit though is that functions registered with it are supposed to be called in last-in, first-out order to properly synchronize shutdown. But when every dll and every executable has its own atexit list, you're not guaranteed that the calls will actually execute in that order. It appears that atexit calls registered inside a dll are run by the dll's exit routine (__dll_exit in my case), while calls registered in the object files linked into a program share one atexit list that runs before the global destructors. So the atexit functions registered in main get called first, as any dlls it depends on can't be unloaded until they are no longer in use. Then the dlls' exit routines run, calling their own atexit functions. This is where my dll's atexit routine tried to shut down allegro after the allegro dll was unloaded (theoretically, not proven yet).
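To make the ordering argument concrete, here is a small simulation rather than a real multi-DLL test. ModuleExitList and SimulateExit are hypothetical, the module names are labels only, late_exe_handler is invented for illustration, and the teardown order (executable first, then each dll) is an assumption taken from the reasoning above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Models one module's private exit-handler list. The real lists live inside
// each module's C runtime; this class only imitates their ordering.
class ModuleExitList {
public:
    explicit ModuleExitList(const std::string& module) : module_(module) {}

    void Register(const std::string& handler) {
        assert(!handler.empty());
        handlers_.push_back(handler);
    }

    // Runs this module's handlers last-in, first-out, appending to the log.
    void Run(std::vector<std::string>& log) {
        while (!handlers_.empty()) {
            log.push_back(module_ + ":" + handlers_.back());
            handlers_.pop_back();
        }
    }

private:
    std::string module_;
    std::vector<std::string> handlers_;
};

std::vector<std::string> SimulateExit() {
    ModuleExitList exe("exe");
    ModuleExitList eagle("eagle");
    ModuleExitList allegro("allegro");

    // Registration order across the whole process.
    exe.Register("shutdown_main");
    eagle.Register("shutdown_object_module_function");
    allegro.Register("allegro_atexit_func");
    exe.Register("late_exe_handler");  // registered last overall

    // Assumed teardown: the executable's list runs first, then each dll's.
    std::vector<std::string> log;
    exe.Run(log);
    eagle.Run(log);
    allegro.Run(log);
    return log;
}
```

Each list is last-in, first-out on its own, but the combined order is not: in this model shutdown_main was registered first yet runs second, ahead of both dll handlers.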

Also, it is difficult to ascertain the true order of what is happening as some of the registered atexit functions in a test I'm conducting are not being called even though atexit is returning 0 when they are registered. Also for some reason it appears as though the global destructors for my dll are not being called or else printf is no longer a valid function at that time and its behaviour is unreliable. And, for some reason a static variable inside a constructor is being created twice and the only atexit routine running is the one in the main thread. What is going on with this stuff?

Edit
Now I am running the same test program, using al_init inside my dll and it is not crashing anymore. Shouldn't it be? And now sometimes only some of the atexit routines get called. I wish I knew what was driving these seemingly irrational program behaviours.

Thomas Fjellstrom
Member #476
June 2000

I have (almost) always[1] been of the opinion that depending on global destructors of any kind (statically allocated objects, compiler attributes on functions, atexit, etc) is just a bad idea. Without a lot of extra work and headaches, you can't guarantee proper destruction order. Since you are generally already able to track your allocations, you can just destruct every long-term object yourself when you know you need to.

References

  1. At least since I first learned about it
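A rough illustration of tracking long-lived objects and tearing them down on request. LifetimeScope is a hypothetical helper, not part of Eagle or Allegro, and the reverse-creation order is one reasonable policy among several.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Illustrative owner for long-lived objects: cleanup runs when the program
// asks for it, newest first, rather than at static-destruction time.
class LifetimeScope {
public:
    ~LifetimeScope() { DestroyAll(); }

    // Records a named cleanup action for an object that was just created.
    void Track(const std::string& name, std::function<void()> cleanup) {
        assert(static_cast<bool>(cleanup));
        entries_.push_back(std::make_pair(name, std::move(cleanup)));
    }

    // Runs cleanups newest-first and returns the names in the order they ran.
    std::vector<std::string> DestroyAll() {
        std::vector<std::string> order;
        while (!entries_.empty()) {
            std::pair<std::string, std::function<void()> > entry =
                std::move(entries_.back());
            entries_.pop_back();
            entry.second();
            order.push_back(entry.first);
        }
        return order;
    }

private:
    std::vector<std::pair<std::string, std::function<void()> > > entries_;
};
```

A scope like this would typically be a local in main, so DestroyAll() can be called before main returns and nothing is left for exit-time machinery.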

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Elias
Member #358
May 2000

Quote:

And now sometimes only some of the atexit routines get called.

I'd assume atexit() only works for things using the .exe atexit(), not any DLL atexit.

--
"Either help out or stop whining" - Evert

Edgar Reynaldo
Member #8,592
May 2007

Actually, the dll atexits seem to run now. Go figure. Here's a sample log showing the creation and destruction order of globals and atexit routines :

c:\ctwoplus\progcode\Eagle5GUI\cbbuild\bin>libtest-debug.exe 7

ShutdownVar object 'Eagle global variable (Object.cpp)' created
ShutdownVar object 'Main global variable' created

Registering 'shutdown_main' with atexit. Atexit ptr is 004015B0
atexit(shutdown_main) returned 0

register object shutdown function called. Address of atexit is 670C1140
atexit(shutdown_object_module_function) returned 0

Creating object at 003c8ce0 named Allegro5System at 003c8ce0
al_install_system called. visible atexit ptr is 67701140
atexit(allegro_atexit_func) returned 0

Eagle : Initialized system.

Creating object at 003ca4b8 named Allegro5Timer at 003ca4a8
Allegro5Timer::Destroy this=0x3ca4a8
Allegro5Timer::Create this=0x3ca4a8

Eagle : Initialized the system state.
Eagle : Initialized images.
Eagle : Initialized fonts.
Eagle : Initialized TTF fonts.
Eagle : Initialized audio.
Eagle : Initialized shaders.
Eagle : Initialized primitives.
Eagle : Installed keyboard.
Eagle : Installed mouse.
Eagle : Installed joystick.

Creating object at 003cdb40 named Allegro5GraphicsContext at 003cdb40
Creating object at 003cdb8c named Allegro5Image at 003cdb8c

EagleSystem::Shutdown called

Allegro5Timer::Destroy this=0x3ca4a8
Destroying object at 003ca4b8 named Allegro5Timer at 003ca4a8
Destroying object at 003cdb8c named Allegro5Image at 003cdb8c
Destroying object at 003cdb40 named Allegro5GraphicsContext at 003cdb40

al_uninstall_system called. call # 1
Destroying object at 003c8ce0 named Allegro5System at 003c8ce0

shutdown_main called

Destroying eagle shutdown variable 'Main global variable'

Eagle atexit function shutdown_object_module_function called

Destroying eagle shutdown variable 'Eagle global variable (Object.cpp)'

Allegro atexit function called.

c:\ctwoplus\progcode\Eagle5GUI\cbbuild\bin>

The destruction order seems to be :
1) Main atexit routines
2) Main globals
3) Eagle DLL atexit routines
4) Eagle DLL globals
5) Allegro DLL atexit routines
6) Presumably Allegro DLL globals would come next

Which seems to make sense, given that it is in order from most dependent to least dependent.
