I'm having trouble getting A5 to work when statically linked. Maybe I'm doing something wrong... but I don't know what else to try.
The following program crashes:
The only output is "about to load font...".
Here is the makefile (which builds with no warnings or errors):
CC=gcc
CFLAGS=-Wall -DMINGW32 -DNDEBUG -O2 -DALLEGRO_STATICLINK
LDFLAGS=-static-libgcc -static-libstdc++
LIBS=-lallegro -lallegro_primitives -lallegro_font -lallegro_image -lallegro_color
LIBS+=-lstdc++ -lgdiplus -luuid -lkernel32 -lwinmm -lpsapi -lopengl32 -lglu32 -luser32 -lcomdlg32 -lgdi32 -lshell32 -lole32 -ladvapi32 -lws2_32
LIBS:=$(LIBS:-lallegro%=-lallegro%-static)

test.exe: test.c
	$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@ $(LIBS)
It produces a command like this:
gcc -Wall -DMINGW32 -DNDEBUG -O2 -DALLEGRO_STATICLINK -static-libgcc -static-libstdc++ test.c -o test.exe -lallegro-static -lallegro_primitives-static -lallegro_font-static -lallegro_image-static -lallegro_color-static -lstdc++ -lgdiplus -luuid -lkernel32 -lwinmm -lpsapi -lopengl32 -lglu32 -luser32 -lcomdlg32 -lgdi32 -lshell32 -lole32 -ladvapi32 -lws2_32
Which includes all of the libraries listed here except -lgcc_eh, which gcc says it cannot find.
I'm using MinGW, gcc v4.6.0.
To build Allegro, I downloaded v5.0.3 [1], unzipped it to C:\tools\allegro5, and from that directory typed
mkdir build
cd build
cmake-gui ..
First I selected mingw-makefiles, and generated the makefiles with the default settings.
make
make install
It seemed to work with no problems.
I then invoked cmake-gui again, deselected SHARED and generated the makefiles again; and again did make and make install; again there didn't seem to be any problems.
So I think I've installed allegro properly with shared and static libraries, and I think I'm linking to all the right stuff, and I'm compiling with ALLEGRO_STATICLINK defined as you can see - and it compiles without warnings or errors - but it crashes. (The non-static version works without problems.)
Have I missed something, or is this an allegro bug?
[edit]
By the way, fixed_font.tga is a file from the allegro examples data.
Can you try with the debug version (set CMAKE_BUILD_TYPE to Debug)? It should create an allegro.log which may help, but more importantly you should be able to get a gdb backtrace telling why it crashes. My suspicion is it has to do with gcc 4.6.
allegro.log:
system W C:\tools\allegro5\src\win\wsystem.c:550 load_library_at_path [ 0.00024] Failed to load C:\Programming\scratch\bug\d3d9.dll (error: 126)
system I C:\tools\allegro5\src\win\wsystem.c:543 load_library_at_path [ 0.02992] Loaded C:\Windows\system32\d3d9.dll
d3d I C:\tools\allegro5\src\win\d3d_disp.cpp:879 _al_d3d_init_display [ 0.13966] Render-to-texture: 1
system I C:\tools\allegro5\src\system.c:268 al_install_system [ 0.13970] Allegro version: 5.0.3
backtrace:
0 _al_register_destructor dtor.c 161 0x407782
1 al_create_bitmap bitmap.c 145 0x423591
2 _al_load_tga_f tga.c 355 0x4c85d6
3 _al_load_tga tga.c 553 0x4c8ec5
4 al_load_bitmap bitmap_io.c 219 0x40ba5a
5 _al_load_bitmap_font fontbmp.c 149 0x4c5595
6 al_load_font font.c 350 0x4c472d
7 main test.c 31 0x4013f3
more detailed backtrace:
Thread 1 (Thread 4032.0xaa8):
#0 0x00407782 in _al_register_destructor (dtors=0x8e31f8, object=0x8e5e98, func=0x423596 <al_destroy_bitmap>) at C:\tools\allegro5\src\dtor.c:161
        dtor_owner_count = 0xd8
        __func__ = "_al_register_destructor"
#1 0x00423591 in al_create_bitmap (w=513, h=97) at C:\tools\allegro5\src\bitmap.c:145
        bitmap = 0x8e5e98
#2 0x004c85d6 in _al_load_tga_f (f=0x8e5e30) at C:\tools\allegro5\addons\image\tga.c:355
        image_id = <data>}
        id_length = 0 '\000'
        palette_type = 0 '\000'
        image_type = 2 '\002'
        palette_entry_size = 0 '\000'
        bpp = 32 ' '
        descriptor_bits = 8 '\b'
        first_color = 0
        palette_colors = 0
        left = 0
        top = 0
        image_width = 513
        image_height = 97
        left_to_right = true
        top_to_bottom = false
        c = 5130702
        i = 5297452
        y = 5124283
        compressed = 8
        bmp = 0x50d52c
        lr = 0x4e30bb
        buf = 0x40e8fe "UôB\004EôÇ@\030"
        premul = true
        __func__ = "_al_load_tga_f"
#3 0x004c8ec5 in _al_load_tga (filename=0x4e30bb "fixed_font.tga") at C:\tools\allegro5\addons\image\tga.c:553
        f = 0x8e5e30
        bmp = 0x8e5810
#4 0x0040ba5a in al_load_bitmap (filename=0x4e30bb "fixed_font.tga") at C:\tools\allegro5\src\bitmap_io.c:219
        ext = 0x4e30c5 ".tga"
        h = 0x8e5810
        ret = 0x0
        __func__ = "al_load_bitmap"
#5 0x004c5595 in _al_load_bitmap_font (fname=0x4e30bb "fixed_font.tga", size=0, flags=0) at C:\tools\allegro5\addons\font\fontbmp.c:149
        import_bmp = 0x4
        f = 0x8e5ba0
        backup = {_tls = <data>}
        range = {12582912, 127}
#6 0x004c472d in al_load_font (filename=0x4e30bb "fixed_font.tga", size=0, flags=0) at C:\tools\allegro5\addons\font\font.c:350
        i = 1992920277
        ext = 0x4e30c5 ".tga"
        handler = 0x8e5ba0
#7 0x004013f3 in main (argc=1, argv=0x8e2e90) at test.c:31
        p_font = 0x7efde000
Apparently dtor_owner_count = _al_tls_get_dtor_owner_count(); returned NULL, and then *dtor_owner_count caused a seg fault.
[edit]
"dtor_owner_count = 0xd8" not NULL. I just saw "seg fault" and assumed it must have been null.
Hmm
160    dtor_owner_count = _al_tls_get_dtor_owner_count();
161    if (*dtor_owner_count > 0)
162       return;
and
int *_al_tls_get_dtor_owner_count(void)
{
   thread_local_state *tls;

   tls = tls_get();
   return &tls->dtor_owner_count;
}
So for a valid tls, &tls->dtor_owner_count can't possibly be 0 or 0xd8 - I assume the crash means tls is already NULL and gdb is just confused because the function was inlined.
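One possible reading of the 0xd8 value, offered as an assumption and not something confirmed in this thread: if tls is NULL, then &tls->dtor_owner_count is simply the offset of that field inside the struct, so a small non-zero address is what you would expect to see. The sketch below uses a made-up struct (fake_tls_state, with padding chosen so the field sits at 0xd8) and offsetof, so no invalid pointer is ever dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Allegro's thread_local_state. The real layout
   is not shown in this thread; the padding is picked only so that the
   field lands at offset 0xd8 for illustration. */
typedef struct {
    char other_state[0xd8];
    int dtor_owner_count;
} fake_tls_state;

/* The address that &tls->dtor_owner_count would have for a given base
   address, computed arithmetically instead of through a pointer. */
static uintptr_t field_address(uintptr_t tls_base)
{
    return tls_base + offsetof(fake_tls_state, dtor_owner_count);
}
```

With a base of 0 (a NULL tls), field_address returns the field offset itself, which in this made-up layout is 0xd8. If the real struct has the field at that offset, gdb's value would be consistent with tls being NULL.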
Which means it's probably hitting an old bug with the Tls* functions. For some reason they are only called in DllMain and not when static linking, and nobody has fixed it so far. I don't know the Tls* API, but assuming it's anything like pthread_key_create, there should be no reason to need DllMain and it should be very easy to fix. I could be wrong though.
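To make the pthread comparison concrete, here is a minimal sketch of key-based thread-local storage that allocates each thread's block lazily on first use, so no loader callback is involved. It uses POSIX calls only; the names keyed_state, keyed_key and keyed_tls_get are hypothetical, and this is not Allegro's code.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread state; not Allegro's real thread_local_state. */
typedef struct {
    int dtor_owner_count;
} keyed_state;

static pthread_key_t keyed_key;
static pthread_once_t keyed_once = PTHREAD_ONCE_INIT;

/* Runs once per process. free() is registered as the destructor, so a
   thread's block is released when that thread exits via pthread_exit or
   by returning from its start routine. */
static void keyed_make_key(void)
{
    if (pthread_key_create(&keyed_key, free) != 0)
        abort();
}

/* Returns the calling thread's state, allocating it on first use. */
static keyed_state *keyed_tls_get(void)
{
    keyed_state *t;

    pthread_once(&keyed_once, keyed_make_key);
    t = pthread_getspecific(keyed_key);
    if (t == NULL) {
        t = calloc(1, sizeof *t);          /* zero-initialized block */
        if (t != NULL && pthread_setspecific(keyed_key, t) != 0) {
            free(t);
            t = NULL;
        }
    }
    return t;
}
```

Because the getter creates the key and the per-thread block itself, it behaves the same whether the code lives in a shared library or is linked statically. Whether the Windows Tls* functions can be used the same way is the assumption being made in the post above.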
[edit:]
Here's an example how to use the Tls functions: http://msdn.microsoft.com/en-us/library/ms686991%28v=vs.85%29.aspx
It doesn't mention DllMain. So it looks like whoever implemented this way back only implemented it for the DLL version and not for the static version (which is the much simpler case), and it has been forgotten ever since. However, the question is: why aren't there more reports of Allegro not working with static linking on Windows?
Alright. I'm looking into it now.
The thing is, I don't know anything about DllMain (other than what I read a few seconds ago), and I don't know anything about thread local storage or what its purpose in Allegro is.
From what I understand, Allegro's DllMain function is responsible for allocating thread local storage for each thread Allegro uses on Windows. It is called when the DLL is loaded or unloaded, and also when threads start or exit while the DLL is loaded. (It frees the allocated memory on the unload event.) So I suppose the problem is that when Allegro is statically linked, there is no Allegro DLL, so DllMain is never called and no thread local storage is allocated for new threads. My job then is to find where the threads are actually being created, and put the storage allocation there instead of in DllMain (or as well as in DllMain, depending on how it works). I guess you're suggesting that I check out how the pthreads version works for comparison.
It is strange that no one else has reported this problem. I've seen a bunch of threads where people discuss the steps required to get static linking working on Windows, so presumably at least some of these people did get it to work in the end...
[edit]
By the look of it, that DllMain stuff is only meant to be used for older versions of mingw.
I haven't found the problem yet, but I've got a hunch that it's going to be something to do with bad defines somewhere causing allegro to get confused about which version of mingw I have. (ie. something related to 4.6.0, which is what you guessed in the first place.)
...
indeed, stepping through with the debugger reveals that this function is entered:
static thread_local_state *tls_get(void)
{
   thread_local_state *t = TlsGetValue(tls_index);
   return t;
}
which is inside the following preprocessor condition
#if (defined ALLEGRO_MINGW32 && ( \
   __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2) || \
   (__GNUC__ == 4 && __GNUC_MINOR__ == 2 && __GNUC_PATCHLEVEL__ < 1))) || \
   defined ALLEGRO_CFG_DLL_TLS
I ran the following test:
printf("gcc v%d.%d.%d\n", __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
It prints 4.6.0. So my guess now is that there is something wrong somewhere in the allegro build system.
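To see why the version numbers alone cannot explain this, the quoted #if can be restated as an ordinary function. This is only an illustration: uses_dll_tls is a hypothetical name, is_mingw stands for ALLEGRO_MINGW32 being defined, and forced stands for ALLEGRO_CFG_DLL_TLS being defined.

```c
#include <stdbool.h>

/* Mirrors the quoted preprocessor condition term by term:
   (MinGW && gcc older than 4.2.1) || ALLEGRO_CFG_DLL_TLS. */
static bool uses_dll_tls(bool is_mingw, int major, int minor, int patch,
                         bool forced)
{
    bool old_gcc = major < 4
                || (major == 4 && minor < 2)
                || (major == 4 && minor == 2 && patch < 1);

    return (is_mingw && old_gcc) || forced;
}
```

For gcc 4.6.0 the version part is false, so uses_dll_tls(true, 4, 6, 0, false) is false and uses_dll_tls(true, 4, 6, 0, true) is true. In other words, with this compiler the TLS-API branch can only be selected if ALLEGRO_CFG_DLL_TLS is defined somewhere.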
...
It looks like this is the culprit:
option(WANT_DLL_TLS "Force use of DllMain for TLS (Windows)" on)
I'm recompiling now with WANT_DLL_TLS turned off.
...
lo and behold, the test program ran without crashing.
Now for the final test... I'll recompile the static release version of allegro, and the static release version of my actual game - then try to run it on a computer that doesn't have the allegro dlls.
...
Yes. It all works. Static link, dynamic link, everything.
So, based on that, I suggest the following patch:
--- CMakeLists-old.txt 2011-05-27 13:40:49 +1000
+++ CMakeLists.txt 2011-05-27 13:40:03 +1000
@@ -152,7 +152,7 @@
 endif(NOT IPHONE)
 option(NO_FPU "No floating point unit" off)
-option(WANT_DLL_TLS "Force use of DllMain for TLS (Windows)" on)
+option(WANT_DLL_TLS "Force use of DllMain for TLS (Windows)" off)
 option(WANT_DEMO "Build demo programs" on)
 option(WANT_EXAMPLES "Build example programs" on)
 option(WANT_POPUP_EXAMPLES "Use popups instead of printf for fatal errors" on)
By the way, does anyone want my collection of Allegro 5.0.3 binaries for MinGW 4.6.0?
I now have static release, static debug, and dynamic release. I guess it's still a long way short of all the flavours that the other binary packages have, but it's better than nothing, right?
Oh, indeed, I had read the code wrong. So the situation appears to be like this:
MSVC: Uses __declspec(thread) both for DLL and static versions
MinGW: Uses __thread both for DLL and static versions
The TLS API is only used when forced on with WANT_DLL_TLS or when the above #ifdef is true (for old mingw versions), and is only implemented for DLLs but not for static linking since it probably wasn't important enough (and now is even less so).
Does that sound right?
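For comparison with the TlsGetValue path, this is roughly what the compiler-keyword approach looks like. It is a sketch with hypothetical names (CT_THREAD_LOCAL, ct_state, ct_tls_get), not a copy of Allegro's tls.c.

```c
/* Select the compiler's thread-local keyword as described above:
   __declspec(thread) for MSVC, __thread for GCC-compatible compilers. */
#if defined(_MSC_VER)
#define CT_THREAD_LOCAL __declspec(thread)
#else
#define CT_THREAD_LOCAL __thread
#endif

/* Hypothetical per-thread state; not Allegro's real thread_local_state. */
typedef struct {
    int dtor_owner_count;
} ct_state;

/* One zero-initialized instance per thread, provided by the compiler and
   runtime. No explicit allocation call is needed. */
static CT_THREAD_LOCAL ct_state ct_thread_state;

static ct_state *ct_tls_get(void)
{
    return &ct_thread_state;   /* address of this thread's copy, never NULL */
}

static int *ct_get_dtor_owner_count(void)
{
    return &ct_tls_get()->dtor_owner_count;
}
```

Here the getter cannot return NULL, because the storage exists as soon as the thread does. That is the main practical difference from a key-based getter that depends on some earlier code having stored a value for the thread.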
But for some reason there was this commit setting WANT_DLL_TLS to true:
http://allefant.com/gitweb/?p=allegro.git;a=commitdiff;h=7ee13023bda58f83b66d54327ae8e157b6dc00fc
Which has likely broken static MinGW builds ever since. The official binaries aren't built with cmake, so they don't have WANT_DLL_TLS set and have always worked.
I'll apply your patch if nobody else does first...
Yes. That sounds right.
I don't know much about these different thread systems, but I searched the code for "ALLEGRO_CFG_DLL_TLS" and the only place it comes up is in the #if statement I quoted earlier. As far as I know, the DllMain stuff is only required for gcc v4.2.0 and earlier versions - and the #if already checks for that regardless of ALLEGRO_CFG_DLL_TLS.
As for the commit which made WANT_DLL_TLS default to on, I don't know what the idea was there. My first guess was that the gcc version checks weren't in place at that time - but that's not true. So I don't know. Maybe we should ask Trent about it.
Anyway. Thanks for your help.
I asked Trent and he says it's better to use the TLS API, also with MSVC. So I'll try to change that #ifdef to never use the TLS API when static linking is on instead.
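A rough sketch of what such a guard might look like follows. This is a guess at the shape of the change, not the actual patch: DEMO_USE_TLS_API is a hypothetical macro, and the two defines at the top only simulate a static build in which the CMake option still forces DLL TLS.

```c
/* Simulated build configuration, defined here only for the demonstration. */
#define ALLEGRO_STATICLINK
#define ALLEGRO_CFG_DLL_TLS

/* The condition quoted earlier in the thread, with one extra term in front
   so that a static build never selects the TLS API. */
#if !defined(ALLEGRO_STATICLINK) && ( \
   (defined(ALLEGRO_MINGW32) && ( \
      __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2) || \
      (__GNUC__ == 4 && __GNUC_MINOR__ == 2 && __GNUC_PATCHLEVEL__ < 1))) || \
   defined(ALLEGRO_CFG_DLL_TLS))
#define DEMO_USE_TLS_API 1
#else
#define DEMO_USE_TLS_API 0
#endif

/* Returns 1 if the TLS-API branch would be compiled in, 0 otherwise. */
static int demo_uses_tls_api(void)
{
    return DEMO_USE_TLS_API;
}
```

With both macros defined as above, demo_uses_tls_api() returns 0, so the static build falls through to the compiler-keyword path even though ALLEGRO_CFG_DLL_TLS is set. Whether static builds with very old MinGW compilers would then work is a separate question that this sketch does not address.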
I've posted a patch for this on sourceforge. I don't think I've ever officially sent my patches like that before, so let me know if I've done it wrong.