Bug in keyboard.c : Race Condition
ZoriaRPG

In Allegro 4.4.3, keyboard.c still contains buggy code, whereby a race condition is set up that is not thread safe. The offending code is here:

#SelectExpand
1 buffer->lock++; 2 3 if (buffer->lock != 1) { 4 buffer->lock--; 5 return; 6 } 7 8 if ((waiting_for_input) && (keyboard_driver) && (keyboard_driver->stop_waiting_for_input)) 9 keyboard_driver->stop_waiting_for_input(); 10 11 if (buffer->end < KEY_BUFFER_SIZE-1) 12 c = buffer->end+1; 13 else 14 c = 0; 15 16 if (c != buffer->start) { 17 buffer->key[buffer->end] = key; 18 buffer->scancode[buffer->end] = scancode; 19 buffer->end = c; 20 } 21 22 buffer->lock--; 23}

In conditions where multiple threads access this, buffer->lock can be -1, and thus the keyboard input is ignored.

This needs an atomic solution that is `compatible with C03 (or C99, C08, but not C11+).

Do any of you have a good solution to this that is either (1) not compiler-specific, or (2) works across most compilers using defs?

This has been a long-standing bug since 4.2 or earlier, and should be addressed.

Chris Katko

Side question: Is Allegro 4 advertised as... thread-safe? I mean, the API dates back to the late-80's for the Amiga.

And isn't accessing system code from multiple threads (like OpenGL) considered bad practice?

ZoriaRPG

Perhaps not, but this single issue is a pandora's box that continues to contribute to several critical issues in using the software. It should be addressed, by someone, be it us, or someone who maintains the library.

The fact that a programme can simply start ignoring keyboard input because of this singular oversight, is something that surely can be resolved.

The issue itself occurs seemingly at random, is very difficult to replicate as the conditions that cause it are not well-known, and it seems to stem from this particular set of functions that if made thread-safe, would resolve it entirely.

If I knew a straightforward way to fix this so that it works with the majority of compilers (so, C03), I would already have fixed it and submitted a patch.

A number of patches already in the 4.4.3 lib were to fix threading issues, so this is not all that different.

Insofar as the origins of Allegro, given that it probably can no longer even compile on M68k architecture, I think that particular point is academic.

Chris Katko

I'm not discounting your valid claims. I'm just wondering if A4 had thread-safety as a design goal, with the original API long predating multi-threading computers. Since you mention other thread patches, clearly they fix thread issues.

I don't know how much A4 is being actively maintained. I'm sure if you submitted a patch, Siegelord/Trent/etc would review it. For example, the bitmap loading routines in A4 don't support GIMP's default bitmap header (BMP being MANY different structures under the same file extension), so you have to use PNG to be productive in A4.

Ever since I got used to A5, I personally haven't felt the need for A4. Everything is so much faster in A5.

Best of luck.

DarkDragon

A quick point of clarification: the OP is misleading. The user program is [i]not[/i] multi-threaded. This bug is completely internal to Allegro 4.4 itself (see keyboard.c) and arises in single-threaded client code.

Regarding porting to Allegro 5: we would if we could. Our application contains hundreds of thousands of lines of code heavily reliant on Allegro 4 features that were stripped out of Allegro 5.

Edgar Reynaldo

To clarify, nothing was 'stripped' from Allegro 5. A5 was a complete re-write. That it bears its namesake is merely coincidental, and arises from being the successor to Allegro 4.

A modern approach to hardware and software required leaving behind the old ways. How do you suggest turning polling into events without redesigning the system? Polling is an inherently bad design and leads to missed input.

What about being CPU bound for software drawing? Sure optimized SW drawing would have been nice, but all the old assembly code had bit-rotted its way to oblivion and needed to be replaced. Moving to hardware accelerated gpu drawing was the only way to go forward. Meet the future. A4 could do limited hwaccel with DX and AllegroGL, but it certainly wasn't designed to do so, and everything was just a hacked on mess. A4 had to be completely redesigned to do this.

I will say though, the one thing A4 does that A5 doesn't is 8-bit Palette drawing. Windows doesn't natively support 8 bit modes anymore. They're lucky to work at all.

But please, submit patches if you want anything to change. The mailing list is a good place to do this, or file a bug report on github so it will be in the tracker.

DarkDragon

Sure, I'm not trying to disparage Allegro 5. But we hear "just switch to Allegro 5" a lot, and as you say that is not a practical solution for us given how different the two versions are.

I've sent a pull request via GitHub fixing some crashes related to alt-tabbing on Windows, and we may send another for this keyboard issue as well.

Chris Katko

I'm not trying to bash you. But Allegro 5 is really not that different, and as others have posted, there's an A4 to A5 guide.

If your game is tiny, rewriting the Allegro code shouldn't be that bad. If your game is large, the Allegro section should be a tiny fraction of your overall code.

I can't for the life of me, figure out why there's so much friction about this issue. You can literally program A5 the same way as A4 with some function changes. You could practically automate the entire thing. You wouldn't BENEFIT from things like events and acceleration as much, because your game would still be framed in a "pre-HW acceleration" mindset. But you load a bitmap, you draw it. You load a font, you draw some text. You load a sound, and you play it. You can make a key[] array the exact same way as Allegro 4 runs and there's even examples in the A5 library that do that.

A4 and A5 are both minimal wrappers over the OS. The libraries are tiny compared to something like OpenGL. So most of what's going on... is simply renaming stuff.

Spend a couple days writing some A5 examples and it won't seem daunting at all.

- Disable all audio.
- Disable all graphics
- Disable everything except text. Get it to compile.
- Then add more basic graphics back in.
- Then add all graphics in.
- Then add all sounds in.

You have now ported your game from Allegro 4... to SDL... to SFML... to Allegro 5... to super-magic-library-17. All those libraries are 99% the same. They're imperative drawing libraries. (You want something REALLY ALIEN try working with a scene-graph based 3D library where you're doing nothing but adding things to trees and all rendering is automatically sorted and handled for you.)

99% of that is simply string matching and replacing. BITMAP becomes ALLEGRO_BITMAP. my_bmp->w becomes al_get_bitmap_width(my_bmp). (Personally, I hate that long name change, but whatever.)

It really sounds like there's either a "fear of the unknown", or, some sort of entitlement/"you should fix it for me" mindset at play here.

That being said, Allegro has a tiny dev team right now and A5 rightly gets the focus. So anyone who WANTS to do the work and make an exhaustive porting guide, auto-porting script, or come up with patch solutions to race conditions in A4 is 100% welcome to do it with open arms. But, the community is completely charity-based and everyone is busy with "life" and work issues that come with age. So without corporate support, there's not much that can be demanded from this community.

I hope I don't sound as crass, unapologetic, or disregard your problems. I'm just trying to elaborate my alternative perspective of the situation.

SiegeLord

I don't know how much A4 is being actively maintained. I'm sure if you submitted a patch, Siegelord/Trent/etc would review it.

The situation with Allegro 4 is that we accept patches, but a new release is unlikely (and we don't work on it beyond helping out with patches).

I do want to understand more information about this issue bug though. That piece of code is inside a private function, so you must be calling some public function to get to it. Which API is this?

Either way... A4 has internal support for mutexes, so you could just wrap that function in a mutex. See... mixer.c which has usage of a mutex for reference. Definitely want to wrap it up in #ifdef ALLEGRO_MULTITHREADED like in that file.

DarkDragon

I'm not trying to bash you. But Allegro 5 is really not that different, and as others have posted, there's an A4 to A5 guide.

I know you're trying to help, but I don't think you understand the magnitude of the work involved in even a "simple" port... Zelda Classic has 700,000 lines of codes (over twice the size of Allegro 5 itself) and makes heavy use of 4.0-only features such as 8-bit color palettes, packfiles, MIDI, the old GUI routines, etc. We are restricted in what core features we can change or rewrite since we must remain backwards-compatible for hundreds of pieces of user-generated content spanning decades.

We are slowly refactoring our backend code to eventually be able to drop Allegro 4, but this is a process that will take years (plural).

If I were starting a completely new project? Sure, I'd use Allegro 5 over Allegro 4, that's a no-brainer.

beoran

Maybe a silly question, but would it help it help if we were to assist in writing (non-core) A5 plugins for 8-bit color palettes, packfiles, MIDI & the old GUI routines?

A5's has some support for paletted bitmaps but this could be improved. Packiles could be ported. MIDI could be an audio plugin. The old GGUI routinescould probably be ported t or replaced by something abitmore up to date.

Edgar Reynaldo

We are slowly refactoring our backend code to eventually be able to drop Allegro 4, but this is a process that will take years (plural).

It sounds like you're better off re-writing everything from scratch. Trying to modify an existing code base with 700,000 lines is insane. If you want to use Allegro 5 as your new graphics driver, you're going to need to adopt event based processing instead of polling. That is the major change between A4 and A5. Allegro 4's keyboard polling can easily be replaced by a state machine that updates itself as each new event comes in. That's what I did in Eagle 5. It fully supports the old style of polling keys, but is event based so you'll never miss input.

Also, bumping this thread because I can.

Also also, did the patch for wwnd.c and wdispsw.c ever get applied???

ZoriaRPG

The patches seem to have been applied, AFAIK.

Any thoughts on the best/proper way to fix this issue? Should I just shove some std::atomic stuff in there and see what happens, or ???

Edgar Reynaldo

Here's what's there :

if (buffer->lock != 1) {
   buffer->lock--;
   return;
}

What about this instead?

if (buffer->lock > 1) {/// more than one thread competing for the lock
   buffer->lock--;
   return;
}

Or, why can't we use a proper mutex instead?

ZoriaRPG
Edgar Reynaldo said:

if (buffer->lock > 1) {/// more than one thread competing for the lock
   buffer->lock--;
   return;
}

I brought this up on the AGN forum, and I suggested almost the same thing. One of the other devs said that it would have no impact on the issue. I used > 0 for it, but ....

Here is the relevant post:
http://www.armageddongames.net/showthread.php?98003-Allegro-pull-requests&p=915326&viewfull=1#post915326

I'm not sure what would happen if you allowed things to compete for this. The problem seems to be that multiple threads can decrease buffer->lock, and it can become a value of -1. Once this occurs, all KB input no tied to a mapped button dies.

DarkDragon

Yes, the inequality does nothing to fix the root cause of the concurrency bug, namely that a context switch can happen after the contents of buffer->lock have been read but before the new value has been written (i.e. someone in the past mistakenly believed that ++ is atomic in C++, when unfortunately this is not the case.)

I wrote a fix using Allegro's mutexes here: https://github.com/ArmageddonGames/allegro5/commit/f805b1ed4ac1aeab631763b699060943e157cfa3

However it's only a 99% fix, due to the way the keyboard handler allows the user to install the keyboard using custom hooks instead of Allegro's keyboard driver. In that case no cleanup code is executed when Allegro is uninstalled, and so the patch above will leak the mutex. I don't think the leak matters at all in practice but I wanted to bring it up here in case you have some ideas.

ZoriaRPG

For some reason, with the new libs using the mutex, my enter key was not working--in other programmes--after I left ZC running for a while. Why allegro would steal that input, I'm unsure.

This may have been caused by an incomplete build.

Thread #616984. Printed from Allegro.cc