OS X bug + patch
Dave Vasilevsky

Hi,

I'm Fink's maintainer for Allegro. There's a longstanding serious bug that we've had, and fixed in a not-so-great way, which I'd like to get fixed properly.

The bug: On OS X, Allegro command-line programs work particularly badly. There are two particular issues that cause this:

  • 1. All Allegro programs appear in the Dock (the taskbar/panel equivalent), even if they never present a GUI, and even if they pass SYSTEM_NONE to install_allegro(). This is more than aesthetically ugly--it also slows loading considerably. In extreme cases, such as the Liquid War build process which calls command-line Allegro programs dozens of times, it can even cause the Dock to get confused and crash.

  • 2. This is much rarer. In somewhat strange cases, a program can end up with a "dead bootstrap context" on OS X, which basically means that the Mach microkernel thinks it has given up the right to communicate with other programs via Mach ports. (The easiest way I know of to achieve this is to ssh in, start a screen session, detach, exit ssh, ssh back in, and reattach to screen.) Even with a dead bootstrap context, most command-line programs should still work--but all Allegro programs, even command-line programs, use the NSApplication class which crashes if there's no valid bootstrap context.

The fix: Originally we fixed this with some ugly #define's. Now I've got a proper fix here.

To fix issue #1, Allegro only appears in the Dock if SYSTEM_MACOSX is the current system driver. So programs which already pass SYSTEM_NONE will now do the right thing on OS X.

For issue #2, Allegro's magic main checks if there's a valid bootstrap context. If there is not, then it bypasses most of the magic main (including the NSApplication code) and goes straight to the real main. This way programs that don't use NSApplication will run. (This will break if the program really needs NSApplication, but that would break anyhow because of the bootstrap context.)
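A rough sketch of the control flow described for issue #2, not the actual patch. Every name here is a stand-in for illustration: bootstrap_ok() is stubbed with a flag instead of querying Mach, and run_under_nsapplication() only counts calls instead of touching Cocoa.

```c
#include <stdio.h>

/* Hypothetical test hook: 1 = bootstrap context usable, 0 = dead.
 * A real check would ask Mach; that code is not shown here. */
int fake_bootstrap_valid = 1;

/* Counts how often the Cocoa path was taken (illustration only). */
int cocoa_path_taken = 0;

/* Stand-in for the real bootstrap check. */
int bootstrap_ok(void)
{
    return fake_bootstrap_valid;
}

/* Stand-in for the program's real main(). */
int real_main(int argc, char *argv[])
{
    (void)argv;
    return argc > 0 ? 0 : 1;
}

/* Stand-in for the part of the magic main that sets up NSApplication. */
int run_under_nsapplication(int argc, char *argv[])
{
    cocoa_path_taken++;
    return real_main(argc, argv);
}

/* The decision itself: skip the Cocoa setup when the context is dead. */
int magic_main(int argc, char *argv[])
{
    if (!bootstrap_ok()) {
        fprintf(stderr, "no valid bootstrap context, skipping NSApplication\n");
        return real_main(argc, argv);
    }
    return run_under_nsapplication(argc, argv);
}
```

With the flag set to 0, magic_main() goes straight to real_main() and the Cocoa counter stays at zero, which is the behaviour a command-line tool needs inside a detached screen session.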

Please take a look at the patch, I'd like to get this into Allegro 4.2.

Further unrelated issues: There are a few issues with the OS X build process that are not as serious from my perspective. I'd like to cooperate with you to fix them, but if you want to delay until after 4.2 I guess that's ok:

  • The 'install name' for liballeg is set to something like liballeg-4.2.0.dylib. This is wrong for two reasons: First, the install name should be an absolute, not a relative path. Second, the install name should not change while compatibility is being maintained. So Allegro 4.x.y should all have an install name something like '/usr/local/lib/liballeg-4.dylib' .

  • The allegro-config script is unconditionally munged to point to locations in /usr/local. There should be a way to tell it to look in other locations instead.

Thanks for all your work on Allegro!

vasi

Evert

Thanks! :D

Quote:

Please take a look at the patch, I'd like to get this into Allegro 4.2.

Looks good from my perspective, but I don't really know much about the Mac port. I'd like to hear what Peter Hull has to say about it.

Quote:

Second, the install name should not change while compatibility is being maintained. So Allegro 4.x.y should all have an install name something like '/usr/local/lib/liballeg-4.dylib' .

Slight correction, compatibility is only guaranteed for versions having the same x iff x is even. Otherwise correct. Do double-check that MacOS X is listed as a platform for which we maintain ABI compatibility though.

Quote:

The allegro-config script is unconditionally munged to point to locations in /usr/local. There should be a way to tell it to look in other locations instead.

Hmm... /usr/local is the default installation path under UNIX. allegro-config works properly if Allegro is installed in some other location using the --prefix option for configure. The problem is that the MacOS X port doesn't use configure and that Apple has removed /usr/local.
I agree we need to address this, but I'm not sure how to do this elegantly. The fix.sh script should probably do it.

Dave Vasilevsky

Thanks for your comment!

Quote:

compatibility is only guaranteed for versions having the same x iff x is even

Ah ok, I was under the impression that all 4.x.y were compatible for any x, y. If they're only backwards binary compatible within a minor-version then install name should be .../liballeg-4.2.dylib . But definitely not ...-4.2.0.dylib, unless compatibility is broken with every release.

Quote:

allegro-config works properly if Allegro is installed in some other location using the --prefix option for configure

That's how Fink normally handles things... Another option is to do 'make PREFIX=<whatever>' and then let makefile.osx take care of setting up the install_name and allegro-config.

vasi

Kitty Cat
Quote:

If they're only backwards binary compatible within a minor-version then install name should be .../liballeg-4.2.dylib .

Doesn't Allegro make symlinks from x.y.z -> x.y -> x? Other programs do this. But I don't think you should remove the revision number from the library itself, in case you want to have multiple versions installed (eg. to test differences or something). Especially as the new API starts coming in 4.3, people might still want to hold on to 4.2 as backwards compatibility might start getting shaky.

Peter Hull

OK, I'll look into it! Vasi, I'm sorry if you raised these issues before and they got ignored; hopefully we can get them fixed before RC1.

Unfortunately I can't see your patch

Quote:

The requested URL /files/patches/allegro-osx-cmdline.patch was not found on this server

can you check the URL?

Issue #1 should be fine; the code to notify the Dock was added in specially anyway.
Issue #2 I've never even heard of a dead bootstrap context! I'll take your advice on that one...

With regard to the other stuff, I think the best solution would be for Allegro OSX to have its own ./configure script (remember the problem I posted on [AD] regarding allegro-config and debug vs. non debug installs) but, frankly, the GNU build tools make my head spin and I've never got to the point where I could write my own autoconf.

I'll get back to you tomorrow,

Cheers

Pete

[edit] Oh, regarding Fink, you are talking purely about the 'native' OSX version, right, and not the X version compiled for Mac?

Dave Vasilevsky

Thanks for the input folks.

Quote:

Doesn't Allegro make symlinks from x.y.z -> x.y -> x? Other programs do this.

Yes, it does this. But 'install_name' doesn't just mean the name of the library. It's a technical term having to do with the OS X linker. When a program is linked to Allegro, Allegro's install_name gets embedded in that program and tells the run-time linker where to look for the Allegro library.

So using the full liballeg-4.y.z.dylib as the install_name means if the user ever upgrades Allegro to another revision, the runtime linker will not be able to find it. And that's clearly wrong, since 4.2.1 should be compatible with 4.2.0.

Quote:

But I don't think you should remove the revision number from the library itself, in case you want to have multiple versions installed

Yes, I'm getting the impression that $prefix/lib/liballeg-4.x is the correct install_name, which will allow multiple minor versions (4.2, 4.3) but not multiple revisions (4.2.0, 4.2.1).

Quote:

can you check the URL?

Er, woops. Fixed.

Quote:

I think the best solution would be for Allegro OSX to have its own ./configure script ... but, frankly, the GNU build tools make my head spin

Heh, I know the feeling! I'll try to harass some other Fink devels who are more familiar with autotools...

Quote:

Oh, regarding Fink, you are talking purely about the 'native' OSX version, right, and not the X version compiled for Mac?

Yes, I'm talking about the native version, using Cocoa and Carbon. But we install it in a Unix-like way (via 'make install INSTALLDIR='), not in ~/Library/Frameworks or anything like that.

vasi

Peter Hull

That patch looks fine to me, however (hope you don't mind) I modified it a bit to put the dock notification code in its own function - attached. Looking at various sources on the web, the TransformProcessType function might be the official (i.e. documented) way to do it. Unfortunately it isn't in 10.2 so we can't use it, yet.

Regarding the dylib file names, my understanding of Apple's convention is that 'major' revisions, which would be incompatible, have different filenames (more info here and here). Minor revisions use the linker options -dylib_compatibility_version and -dylib_current_version to distinguish them. These numbers are to prevent a newer executable linking to an older library. The install name thing seems to be if you put a library in a non-standard place, so it can be found. Does fink have its own convention?

Anyway, the current makefile is wrong because it has compatibility version 4.0.0, and I suspect that 4.2 is not binary compatible with 4.0. That should be changed.

Pete

Dave Vasilevsky

Peter,

Your modification to the patch sounds fine.

You're not quite right about install_names. The -install_name argument to libtool (equivalent to -dylib_install_name for ld) specifies where a library should be found. I think you were looking at the distinct -dylib_file argument to ld, which is used to say "the library we're linking to isn't where the install_name says it should be", which is quite different.

Anyhow, for compatibility versions and install_names, Fink sticks to Apple's guidelines. The fact that the compatibility version in Allegro stayed the same led me to believe that all of Allegro 4.x.y were compatible with each other.

Just to make sure I've got it right now: Allegro 4.X and 4.Y are NOT guaranteed compatible, but Allegro 4.X.A and 4.X.B ARE guaranteed compatible both forwards and backwards. Is that correct?

If it is, then according to Apple and Fink's guidelines both, for Allegro 4.X.A the install_name should be $prefix/lib/liballeg-4.X.dylib and the compat version should be 4.X.0 .

Dave

Evert
Quote:

Just to make sure I've got it right now: Allegro 4.X and 4.Y are NOT guaranteed compatible, but Allegro 4.X.A and 4.X.B ARE guaranteed compatible both forwards and backwards. Is that correct?

Almost. Allegro 4.X.A and 4.X.B are guaranteed to be compatible iff X is even. If X is odd, then the version in question is a WIP version, which isn't guaranteed to be compatible with other WIP versions (for obvious reasons).

Peter Hull
Quote:

I think you were looking at the distinct -dylib_file argument to ld

You're right, I was.
I'll try again ... -dylib_file could be used when you want to link something to a 'private' version of a standard library, i.e. when you're building the executable
-dylib_install_name is for when you're building a library that's not designed to go in a standard place (for example, if it's going to go into an application framework)
Is that right?

With regard to version numbers, as I understand it now, you should have only one 4.0 library called liballeg-4.0.dylib, then many separate 4.1 versions, liballeg-4.1.1.dylib, liballeg-4.1.2.dylib, ... then only one 4.2 library, many 4.3 libraries, and so on. Within your 4.even libs, the -dylib_current_version and -dylib_compatibility_version options should be used to distinguish revisions (4.2.1, 4.2.1, etc)

As for symlinks, the linker doesn't mind the symlink name (because it goes off the install_name built into the file itself) so we could have symlinks for liballeg-4.2.x.dylib which all point to the one true 4.2 version of Allegro. In the 4.3 libs, I suppose liballeg-4.3.dylib could be a symlink pointing to the latest 4.3.x version, but it wouldn't be much use.

Does this make any sense?

Pete

[edit]
Here I am again... with a patch for the makefile

Index: makefile.osx
===================================================================
RCS file: /cvsroot/alleg/allegro/makefile.osx,v
retrieving revision 1.51
diff -u -r1.51 makefile.osx
--- makefile.osx 7 Jun 2005 11:56:48 -0000 1.51
+++ makefile.osx 4 Jul 2005 18:39:13 -0000
@@ -51,7 +51,7 @@
 LIB_NAME = lib/macosx/lib$(VERSION)-$(shared_version).dylib
 LIB_MAIN_NAME = lib/macosx/lib$(VERSION)-main.a
 
-DYLINK_FLAGS = -prebind -seg1addr 0x30000000 -dylib_compatibility_version=4.0.0 -dylib_current_version=$(shared_version)
+DYLINK_FLAGS = -prebind -seg1addr 0x30000000 -compatibility_version $(compatibility_version) -current_version $(shared_version)
 
 INSTALL_NAME = -install_name lib$(VERSION)-$(shared_major_minor).dylib
 INSTALL_NAME_EMBED = -install_name "`echo "@executable_path/../Frameworks/$(FRAMEWORK_NAME).framework/Versions/$(shared_version)/$(FRAMEWORK_NAME)" | sed 's!//*!/!g'`"
Index: makefile.ver
===================================================================
RCS file: /cvsroot/alleg/allegro/makefile.ver,v
retrieving revision 1.34
diff -u -r1.34 makefile.ver
--- makefile.ver 1 Apr 2005 08:43:43 -0000 1.34
+++ makefile.ver 4 Jul 2005 18:39:16 -0000
@@ -5,3 +5,6 @@
 # Shared library versions for Unix
 shared_version = 4.2.0
 shared_major_minor = 4.2
+
+# Compatibility version for Mac OS X
+compatibility_version = 4.2.0

It turns out that -dylib_compatibility_version=xxx doesn't actually do anything when passed to gcc, because gcc calls libtool rather than ld directly. It should be -compatibility_version xxx.

Also, does anyone know why -seg1addr 0x30000000 is there?

Pete

Dave Vasilevsky
Quote:

I'll try again ... -dylib_file could be used when you want to link something to a 'private' version of a standard library, i.e. when you're building the executable
-dylib_install_name is for when you're building a library that's not designed to go in a standard place (for example, if it's going to go into an application framework)
Is that right?

Er, not really. Let me try to explain it with an example:

Suppose you have a library 'libfoo.dylib', and an executable 'bar'. When bar is linked to libfoo, the OS X linker will include inside bar the install_name of libfoo (which is a complete path). That way when bar is run, the runtime linker will look for libfoo at that path.

So libfoo ALWAYS needs an install_name, a complete path where it can be found such as /usr/lib/libfoo.dylib, whether it's in a standard or non-standard location. You can look at any lib on your system with 'otool -L /path/to/lib' to see the install_names of the lib, and of any libs that it links to, and you'll see that they all have install_names.

So that's the standard case. Now suppose that you're building bar for a friend, and although your libfoo is in /usr/lib/libfoo.dylib, hers is in /opt/local/libfoo.dylib. So then you'll pass -dylib_file /usr/lib/libfoo.dylib:/opt/local/libfoo.dylib , which specifies that the linker should embed the install_name opt... for libfoo in bar even though that's not the case on your system. Use of this flag is VERY RARE, you can probably ignore it.

Quote:

With regard to version numbers, as I understand it now, you should have only one 4.0 library called liballeg-4.0.dylib, then many separate 4.1 versions, liballeg-4.1.1.dylib, liballeg-4.1.2.dylib, ... then only one 4.2 library, many 4.3 libraries, and so on.

If you're talking about install_names, then yes. Multiple libraries for 4.EVEN will all need the same install_name, but libraries for 4.ODD should have different ones.
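To make the even/odd rule concrete, here is a small sketch that builds an install_name from a version triple. The function name install_name_for() and the example prefix are made up for illustration; the dash-style naming follows the $prefix/lib/liballeg-4.X.dylib form used earlier in the thread, and nothing here is taken from the Allegro makefiles.

```c
#include <stdio.h>

/* Hypothetical helper: write the install_name for Allegro major.minor.rev
 * into buf. Stable (even minor) series share one name across revisions;
 * WIP (odd minor) releases each get their own name.
 * Returns the value of snprintf(). */
int install_name_for(char *buf, size_t size, const char *prefix,
                     int major, int minor, int rev)
{
    if (minor % 2 == 0) {
        /* e.g. 4.2.0 and 4.2.1 both map to .../liballeg-4.2.dylib */
        return snprintf(buf, size, "%s/lib/liballeg-%d.%d.dylib",
                        prefix, major, minor);
    }
    /* e.g. 4.3.1 maps to .../liballeg-4.3.1.dylib */
    return snprintf(buf, size, "%s/lib/liballeg-%d.%d.%d.dylib",
                    prefix, major, minor, rev);
}
```

Under this rule a program linked against 4.2.0 keeps working after an upgrade to 4.2.1, because the path embedded in it does not change, while two WIP releases never stand in for each other.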

Quote:

Within your 4.even libs, the -dylib_current_version and -dylib_compatibility_version options should be used to distinguish revisions (4.2.1, 4.2.1, etc)

NO. Just like install_names, compatibility versions are included in objects linked to a library. So if libfoo has compat version 4.2.2, that will be stored in bar--and then if bar is run on a system where libfoo has compat version 4.2.1, it will refuse to run.

This is why I was asking about both forwards and backwards compatibility. If there's only forwards compatibility, then the compat versions should be different for 4.2.X. If there's also backwards compatibility, then the compat versions must all be the same.
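The check described above can be modelled in a few lines. This is only a model of the comparison, not dyld's code: pack_version() and compat_ok() are hypothetical names, and the 16.8.8 bit packing reflects the documented limits on X.Y.Z version numbers (X up to 65535, Y and Z up to 255).

```c
#include <stdint.h>

/* Pack X.Y.Z into one 32-bit value: 16 bits for X, 8 each for Y and Z. */
uint32_t pack_version(unsigned major, unsigned minor, unsigned rev)
{
    return ((uint32_t)(major & 0xFFFFu) << 16)
         | ((uint32_t)(minor & 0xFFu) << 8)
         |  (uint32_t)(rev & 0xFFu);
}

/* recorded:  compat version copied into the program when it was linked.
 * installed: compat version of the library found at run time.
 * Returns 1 if the program may load the library, 0 if it is refused. */
int compat_ok(uint32_t recorded, uint32_t installed)
{
    return installed >= recorded;
}
```

So bar linked against a libfoo with compat version 4.2.2 is refused on a system whose libfoo has compat version 4.2.1, but the reverse combination loads. Keeping the compat version fixed at 4.2.0 for the whole series makes every combination load.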

Quote:

As for symlinks, the linker doesn't mind the symlink name (because it goes off the install_name built into the file itself) so we could have symlinks for liballeg-4.2.x.dylib which all point to the one true 4.2 version of Allegro. In the 4.3 libs, I suppose liballeg-4.3.dylib could be a symlink pointing to the latest 4.3.x version, but it wouldn't be much use.

Er, sorta. The really proper way to do it is like this:

- Install Allegro as liballeg.4.2.X.dylib, with install name liballeg.4.2.dylib

- Make a symlink at liballeg.4.2.dylib. The runtime linker will find that symlink when it looks at the install_name, and will just follow it. This allows a user to have 4.2.1 and 4.2.2 installed and switch by just changing a symlink, which can be useful for debugging.

- Make another symlink at liballeg.dylib. This way clients can link with just -lalleg .

For 4.3, the install_name is liballeg.4.3.X.dylib, and yeah, there shouldn't really be a symlink at liballeg.4.3.dylib .

Also, for some reason Allegro seems to use a dash (-) instead of a dot (.), and call it liballeg-4 instead of liballeg.4 as is normally done. I guess there's no real harm to doing it that way, it's just weird :-) Up to you if you want to stick with dash or go with dot.

Quote:

It turns out that -dylib_compatibility_version=xxx doesn't actually do anything when passed to gcc, because gcc calls libtool rather than ld directly. It should be -compatibility_version xxx.

Yeah, I take care of that in my patch for Fink, I forgot to bring it up.

Quote:

Also, does anyone know why -seg1addr 0x30000000 is there?

Oooh, more fun explication! Basically, when bar is run and libfoo has to be loaded, the runtime linker has to dynamically figure out where in memory space it can "bind" libfoo. This takes a bit of time, so people try to speed it up by specifying where a library should be, called 'prebinding'.

1) One way, which Allegro uses, is to just specify "this library goes at this address" with seg1addr. Of course, this doesn't work when other libraries also want to use the same space.

2) A better solution is to keep a system-wide registry of where libraries are, which is what Apple does. Fink interfaces with that so that building a Fink package automatically updates the seg_addr_table registry.

3) Even better is to run 10.4, since it has some sort of better solution in place which no longer requires prebinding to run quickly. Unfortunately it's not well documented so I don't really understand it, see 'man dyld' if you want to try to understand it yourself.

The "right thing to do" really depends on what platforms you're building Allegro for, since that affects what linker features can be used. Probably best to let the user set the environment variables LD_PREBIND and MACOSX_DEPLOYMENT_TARGET to whatever's right for them.

Dave

Peter Hull
Quote:

NO.

I didn't explain what I meant very well. If we only fix bugs in the 4.2.x series, and the public API never changes, neither adding nor removing things, the compatibility version will remain at 4.2.0. If we remove or change public APIs - oops - binary compatibility has been broken and it shouldn't be 4.2 any more. If we add things to the API then it could still be 4.2, but the compatibility version needs to be changed to prevent new programs linking to old libraries (which don't have this new thing.) I hope that the public API will not change at all in 4.2, but you never know...

Quote:

Oooh, more fun explication

Well, we have to get our money's worth while you're here ...

I'm almost ready to submit a makefile patch to [AD]. However I've had one thought regarding the previous dead bootstrap issue - if a program is in this situation and it tries to init the MACOSX driver, what happens? Will it crash or return an error code? Is it possible that it could happen?

Cheers

Pete

Dave Vasilevsky
Quote:

If we add things to the API then it could still be 4.2, but the compatibility version needs to be changed

I guess I misunderstood again, I thought changes within 4.2 must stay backwards compatible (ie: no API additions). If things might be added, then the compatibility version should increase, yes.

Quote:

regarding the previous dead bootstrap issue - if a program is in this situation and it tries to init the MACOSX driver, what happens? Is it possible that it could happen?

It's certainly possible, though unlikely. If someone's in a dead bootstrap context, they're necessarily in a text-only situation, such as in a ssh session, so it's a bit weird for them to launch a GUI app.

If it does happen, the user gets a crash at some point, because certain things aren't initialized. If you'd like to make things nicer, you could set a global variable based on whether a dead bootstrap context was detected. Then if the program later attempts to initialize the Mac OS X driver, it could look at that variable and give an appropriate error message.
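A minimal sketch of that suggestion, with made-up names throughout (osx_dead_bootstrap, osx_record_bootstrap_state, osx_driver_init_sketch). It is not how the Allegro driver is actually structured; it only shows the flag being set early and consulted later.

```c
#include <stdio.h>

/* Hypothetical global, set once by the magic main at startup. */
int osx_dead_bootstrap = 0;

/* Called early with the result of the bootstrap check. */
void osx_record_bootstrap_state(int context_is_valid)
{
    osx_dead_bootstrap = !context_is_valid;
}

/* Stand-in for the Mac OS X system driver's init routine.
 * Returns 0 on success, -1 with a message in err on failure. */
int osx_driver_init_sketch(char *err, size_t err_size)
{
    if (osx_dead_bootstrap) {
        snprintf(err, err_size,
                 "Cannot start the Mac OS X driver: no valid bootstrap context");
        return -1;
    }
    if (err_size > 0)
        err[0] = '\0';
    return 0;
}
```

The caller then gets an error return and a readable message instead of a crash somewhere inside uninitialized Cocoa code.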

I'm sure that even if there's not a dead bootstrap context, without a graphical environment available a GUI Allegro app will give some kind of error. Unfortunately I can't test that easily right now, but you might want to try it.

Dave

Peter Wang

What we want to express is that a later version of the 4.2.x series will work in place of an earlier version, but an earlier version will not necessarily work in place of a later version.

In the 4.0.x series we inadvertently guaranteed both backwards and forwards compatibility (on some platforms) so no new symbols were ever introduced, but for 4.2.x I think we want the flexibility of adding things.

EDIT: fixed up forwards/backwards compat

Elias

We already provide backwards compatibility in all WIP versions and even the 4.3 branch, so I'd say forwards compatibility is the whole point of having a stable version. Why would 4.2 be different from 4.0 there?

[Edit.. hm, was mixing up API and ABI. So disregard the 4.3 part..]

Evert

For 4.0, we have backward and `forward' compatibility: It's possible to use a programme compiled with 4.0.3 with the 4.0.1 shared library.
I think the plan was to drop forward compatibility for 4.2: it's possible to use a programme compiled with 4.2.0 with the 4.2.1 shared library, but not necessarily the other way around.

Elias

Yes. So I compile my game with 4.2.1, and it won't run on systems who have the 4.2.0 shared library. I somehow found the 4.0 behavior better, I could compile with 4.0.1, and it would run with the 4.0.0 DLL.

Peter Hull
Quote:

I'm sure that even if there's not a dead bootstrap context, without a graphical environment available a GUI Allegro app will give some kind of error.

I'll give it a try, by faking the return from the bootstrap_ok function. Running it via ssh from another computer works OK, it just starts up the graphical app on the Mac's screen. I've used it in the past to debug fullscreen apps, so it's quite useful if not correct in a multi-user environment.

Regarding versions, I see no harm in reserving the right to add functions (is that breaking forwards compatibility? I forget) but IMO we should resist the urge to do it unless strictly necessary. Work on 4.2 should be bugfixes only because 4.3 will be quite radical and I think we need to concentrate on that, rather than tweaking 4.2.x. Both 3.9 and 4.1 became the de facto stable versions in their later revisions, because they were current for such a long time. Hopefully this time we'll be able to move more quickly to 4.4 !

I'm quite keen to help out on the Mac side of 4.3 development, anyway

Pete

Peter Wang
Quote:

So I compile my game with 4.2.1, and it won't run on systems who have the 4.2.0 shared library.

That should only happen if you make use of some new feature from 4.2.1.

Peter Hull, if you want to begin now, we need to get the new_api_branch back to a compiling state on MacOS X. After that, I can tell you what further tasks need to be done... :-)

Dave Vasilevsky
Quote:

Running it via ssh from another computer works OK

Yes, but what if nobody's logged in via the GUI? That's not a dead bootstrap context, but still has no GUI available.

Thread #504539. Printed from Allegro.cc