Callback Problem
dialmformartian

Im new to callbacks, I am trying to make an input buffer with a callback
so that i wont miss out inputs while doing lazy processing..

this is what i came up with

1typedef struct _message
2{
3 char type;
4 int val;
5 int ex1, ex2, ex3;
6} message;
7typedef struct _input
8{
9 message queue[255];
10 unsigned int lock, start, end;
11} input;
12 
13volatile input buf;
14 
15void mouse_call(int f) {
16 unsigned int i, ex1 = 0, ex2 = 0, ex3 = 0;
17
18 if(f & MOUSE_FLAG_MOVE) {
19 ex1 = mouse_x;
20 ex2= mouse_y;
21 }
22 if(f & MOUSE_FLAG_MOVE_Z)
23 ex3 = mouse_z;
24
25 if(((buf.end+1) & 255) == (buf.start&255)) //What! buffer is full?
26 return;
27 i = buf.end&255;
28 buf.lock++; //Lock
29 if(buf.lock != 1) { //Damn! its locked
30 buf.lock--;
31 return;
32 }
33
34 buf.end++; //Quickly copy stuff
35 buf.queue<i>.type = 0;
36 buf.queue<i>.val = f;
37 buf.queue<i>.ex1 = ex1;
38 buf.queue<i>.ex2 = ex2;
39 buf.queue<i>.ex3 = ex3;
40 
41 buf.lock--;
42}
43END_OF_FUNCTION(mouse_call)

Then in main:

1 LOCK_VARIABLE(buf);
2 LOCK_FUNCTION(mouse_call);
3 LOCK_FUNCTION(key_call);
4 mouse_callback = mouse_call;
5 keyboard_lowlevel_callback = key_call;
6 buf.end = buf.start = buf.lock = 0;
7 install_mouse();
8 install_keyboard();
9 set_keyboard_rate(0, 0);
10 
11 unsigned int k;
12 while(1) {
13 if((buf.start&255) == (buf.end&255)) {
14 //rest(1);
15 continue;
16 }
17 k = buf.start&255;
18 textprintf_ex(screen, font, 0, 0, 3,0, "type = %i val = %i ex1 = %i ex2 = %i ex3 = %i ", buf.queue[k].type, buf.queue[k].val, buf.queue[k].ex1, buf.queue[k].ex2, buf.queue[k].ex3);
19 textprintf_ex(screen, font, 0,10, 3,0, "start = %i end = %i ", (buf.start&255), (buf.end&255));
20 if(buf.queue[k].val == 59 && buf.queue[k].type == 1) {
21 break;
22 }
23 buf.start++;
24 }

but after buf.end reaches 255 the program gets stuck.
The same program works perfect if i make everything in 'input' structure as separate
global variables.

There are No compile Errors

I compile with -lalleg -O -Wall in Dev-C++
Can Someone help

orz

Change the 255 to 256 in the declaration of message queue. When you write over the 256th element, you're clobbering your control values instead.

edit: further commentary:
That's an odd error to make for someone who got the bitwise math right.
I generally haven't used locking mechanisms like that, not sure if it's a good idea or not. Testing suggests that the mouse callback is not re-entrant on windows, but of course that might vary by platform (specifically, it's probably is re-entrant on DOS, if Allegro still supports DOS). Even given that it's re-entrant though, is that really the right way to implement such a thing? I just have the impression that implementing locking yourself is a bad idea unless you're an OS or a threading implementation.
I believe it works when not in a struct due to different memory layout - the elements in the struct are generally in order in memory, while stuff outside of structs are generally in reverse order.

dialmformartian

Sheesh !! what a silly mistake. I was rather mislead by the complexity
of callbacks into thinking that there was no problem with the code.

thanx a lot.

I was also a bit skeptic about the implementation. Are there any other
better ways to implement such a thing.

axilmar

dialmformartian, how do you make sure your main routine does not read a part of the buffer that has invalid values? the 'lock' member does not guarantee you atomic access to the buffer.

orz
Quote:

I was also a bit skeptic about the implementation. Are there any other
better ways to implement such a thing.

I don't know. In my Allegro callbacks, I do not bother locking (A: I think it only matters on DOS, and B: I think it only matters when there is a performance anomaly).

Quote:

dialmformartian, how do you make sure your main routine does not read a part of the buffer that has invalid values? the 'lock' member does not guarantee you atomic access to the buffer.

Hm, yeah, buf.end++ should probably be moved to after the buffer entry becomes valid instead of before, or you have a race condition there. In the example code the race won't actually do anything except cause the wrong values to be displayed once in a while, but for a real-world program that could be a serious issue.

dialmformartian

Yeah like orz said the buf.end++ should move quite fast enough unless the user is faster than the CPU in giving in inputs (does that work that way??). I just wanted to try this after seeing the CGUI source code. CGUI seemed to implement something of the sort of an input buffer, then reads the buffer and issues messages to widgets. I just didn't have the patients to figure how it works exactly. (Can any one help)

Audric

The current system "gives up" (return and discard the event) when mouse_call() and key_call() race each other.

I'd replace the whole "lock" system by a mutex to cover all operations on start and end. The locking and unlocking are only one function call, and when you use the "official" mutex functions it's much easier to verify, from reading the code, that the critical section is indeed protected.

1// in key_call and mouse_call:
2pthread_mutex_lock(&event_handler_mutex); // <--
3if(((buf.end+1) & 255) == (buf.start&255)) { //What! buffer is full?
4 pthread_mutex_unlock(&event_handler_mutex);
5 return;
6}
7i = buf.end&255;
8buf.end++;
9pthread_mutex_unlock(&event_handler_mutex); // <--
10 
11// in main:
12while(1) {
13 
14 pthread_mutex_lock(&event_handler_mutex); // <--
15 if((buf.start&255) != (buf.end&255)) { // have one message to process
16 int start_value = buf.start&255;
17 int end_value = buf.end&255;
18 buf.start++;
19 pthread_mutex_unlock(&event_handler_mutex); // <-- no need to lock DURING the textprintf's
20 textprintf_ex(screen, font, 0, 0, 3,0, "type = %i val = %i ex1 = %i ex2 = %i ex3 = %i ",
21 buf.queue[start_value].type, buf.queue[start_value].val, buf.queue[start_value].ex1,
22 buf.queue[start_value].ex2, buf.queue[start_value].ex3);
23 textprintf_ex(screen, font, 0,10, 3,0, "start = %i end = %i ",
24 start_value, end_value);
25 }
26 else {
27 pthread_mutex_unlock(&event_handler_mutex);
28 //rest(1); // in real situations, you would rest here,
29 // or at least do something useful instead of hogging the mutex :)
30 }
31}

axilmar
Quote:

Yeah like orz said the buf.end++ should move quite fast enough unless the user is faster than the CPU in giving in inputs (does that work that way??). I just wanted to try this after seeing the CGUI source code. CGUI seemed to implement something of the sort of an input buffer, then reads the buffer and issues messages to widgets. I just didn't have the patients to figure how it works exactly. (Can any one help)

I used a critical section. I think it would be better for you to use the library pthreads which is compatible with many platforms.

orz
Quote:

dialmformartian, how do you make sure your main routine does not read a part of the buffer that has invalid values? the 'lock' member does not guarantee you atomic access to the buffer.

Quote:

Hm, yeah, buf.end++ should probably be moved to after the buffer entry becomes valid instead of before, or you have a race condition there. In the example code the race won't actually do anything except cause the wrong values to be displayed once in a while, but for a real-world program that could be a serious issue.

Quote:

Yeah like orz said the buf.end++ should move quite fast enough unless the user is faster than the CPU in giving in inputs (does that work that way??).

Just to make sure we're on the same page here (because it looks to me like we might, possibly, be talking about different things):
What I think axilmar was talking about, and certainly what I was attempting to agree with, is changing this:

   buf.end++; //Quickly copy stuff
   buf.queue<i>.type = 0;
   buf.queue<i>.val = f;
   buf.queue<i>.ex1 = ex1;
   buf.queue<i>.ex2 = ex2;
   buf.queue<i>.ex3 = ex3;

to this:

   buf.queue<i>.type = 0;//Quickly copy stuff
   buf.queue<i>.val = f;
   buf.queue<i>.ex1 = ex1;
   buf.queue<i>.ex2 = ex2;
   buf.queue<i>.ex3 = ex3;
   buf.end++;

The idea here being that if buf.end is incremented before buf.queue is updated, then your main loop will notice that buf.end is no longer equal to buf.start, and read out the values of buf.queue, at a time when they may or may not be initialized. By making that change, the main loop isn't told about the new mouse event until it's guaranteed to be able to correctly read buf.queue. This is known as a race condition; it is a real possibility on windows, though maybe not on DOS. It would probably be extremely rare, at least on single-core CPUs (the OS scheduler would have to preempt mouse_call after buf.end++ and before buf.lock--).

Anyway, onward:

Quote:

I just wanted to try this after seeing the CGUI source code. CGUI seemed to implement something of the sort of an input buffer, then reads the buffer and issues messages to widgets.

I'm not familiar with CGUI, but I am familiar with SDL, which uses an input buffer (patterned after win32 & DirectX I think, though I haven't done much win32 and/or directX programming). It is my opinion that input buffers are superior to the polling interface that Allegro offers for large programs, but inferior for small programs. My current program extends the concept slightly... it timestamps all input, considers time to be an input, and logs all input to file, in order to allow exact program execution to be reproduced, and as a cheap (read: not very good) way to have game mechanics still feel correct when rendering is slow.

Quote:

I just didn't have the patients to figure how it works exactly. (Can any one help)

What's wrong with your current code? It appears to buffer mouse events fine. The keyboard code isn't posted, but I presume it buffers those okay also. What are you asking for beyond general commentary on the concept of input buffers in general and/or the specific implementation in CGUI?

dialmformartian

Hey thanx Orz for clearing the confusion here.

Quote:

It is my opinion that input buffers are superior to the polling interface that Allegro offers for large programs, but inferior for small programs. My current program extends the concept slightly... it timestamps all input, considers time to be an input, and logs all input to file, in order to allow exact program execution to be reproduced, and as a cheap (read: not very good) way to have game mechanics still feel correct when rendering is slow.

i wanted to create input buffer system because my graph
plotter program(which uses polling) didn't seem to process events during some
humongous calculations.but i was relucatant to use CGUI coz A)i didn't want
to give away the gui system I had already pogrammed. B) I always have a do it all myself attitude.

Quote:

What are you asking for beyond general commentary on the concept of input buffers in general and/or the specific implementation in CGUI?

I just wanted anybody who know about CGUI internals to comment.

Quote:

What's wrong with your current code? It appears to buffer mouse events fine.

So does that mean i can use this as a reliable buffer system if im not bothered of
the fact that "The current system "gives up" (return and discard the event) when mouse_call() and key_call() race each other."
as audric said. (May be i could implement mutex later) or should i go for something else like pthread.

Quote:

The keyboard code isn't posted, but I presume it buffers those okay also.

Here it is

1void key_call(int f) {
2 unsigned int i;
3 
4 if(((buf.end+1) & 255) == (buf.start&255))
5 return;
6 i = buf.end&255;
7 
8 buf.lock++;
9 if(buf.lock != 1) {
10 buf.lock--;
11 return;
12 }
13 
14 buf.queue<i>.type = 1;
15 buf.queue<i>.val = f;
16 buf.queue<i>.ex1 = 0;
17 buf.queue<i>.ex2 = 0;
18 buf.queue<i>.ex3 = 0;
19 buf.end++; //changed it as orz said
20 
21 buf.lock--;
22}
23END_OF_FUNCTION(key_call)

and it does work fine.

Audric
dialmformartian said:

So does that mean i can use this as a reliable buffer system if im not bothered of
the fact that "The current system "gives up" (return and discard the event) when mouse_call() and key_call() race each other."

The only severe problem was the array[256], fixed.

Your test code to show the internals is not thread safe, but it will only cause a problem when the buffer is getting full and the buffer's head starts biting its tail.
(It should behave like: "lock, make a copy of the element, remove the element, unlock" : afterwards, you have all the time you want to process the event)
IMO, don't bother. Increase the buffer to 512, 1024 or even more if you actually run into problems. (full buffer means missed events, anyway)

The problem I pointed should not happen often, but consequences on input should be VERY weird:
Missed event: if it's a key_up, key_down, or mouse butto, it will look like a stuck key.
It will also cause an earlier entry in the buffer to be considered new, and happen again. It will be very visible as the mouse will jerk away and come back, or a key you didn't press will happen to be stuck down.

Quote:

(May be i could implement mutex later) or should i go for something else like pthread.

In my example I took the mutex functions that are in the pthread library, but it doesn't mean you need to multi-thread - the allegro hooks should work fine.

axilmar
Quote:

but it doesn't mean you need to multi-thread - the allegro hooks should work fine

But allegro hooks are executed by threads other than the main one.

I would suggest to dialmformartian to use pthreads anyway. Multi-threading programming is quite tricky and sooner or later he will have untraceable problems.

Audric
axilmar said:

But allegro hooks are executed by threads other than the main one.

Yes, Allegro uses threads internally.
Why dedicate threads yourself if the API already does things right? (and dialmformartian uses it rather correctly)
My first hint was only to use to a reliable and standard mutex system : two functions, lock and unlock. Just surround the critical sections, the closer the better.

axilmar
Quote:

Why dedicate threads yourself if the API already does things right?

I did not say that dialmformartian should use his own threads. I said that he should use critical sections to synchronize access to the shared variables because Allegro uses threads.

Quote:

My first hint was only to use to a reliable and standard mutex system : two functions, lock and unlock. Just surround the critical sections, the closer the better

Exactly.

orz

On further thought:

1. Threading/locking/mutexes/whatever:

So far as I can tell, the keyboard/mouse callbacks in Allegro are all executed from a single thread. Thus, there is no benefit to adding mutexes or anything like that.

Thus: don't use pthreads/whatever for this. Use of platform-dependant or other-library-specific threading primitives is unnecessary and undesirable in this context. Even the "lock" variable in the code is useless. The only kind of thead-awareness needed is the the stuff alread there - the "volatile" keyword on the buffer variables and careful ordering of when buf.end and buf.start are incremented.

2. mouse_call:

dailmformation:
You're implementation is perfectly fine as is. Here are a few things I would consider doing anyway, just because I think the resulting code easier to read or maintain or prettier or something.
2a: eliminate the branches and temporary variables - they don't accomplish anything.
this:

   unsigned int i, ex1 = 0, ex2 = 0, ex3 = 0;
  
   if(f & MOUSE_FLAG_MOVE) {
      ex1 = mouse_x;
      ex2= mouse_y;
   }
   if(f & MOUSE_FLAG_MOVE_Z)
      ex3 = mouse_z;

becomes this:
unsigned int i;
and this:

   buf.queue<i>.ex1 = ex1;
   buf.queue<i>.ex2 = ex2;
   buf.queue<i>.ex3 = ex3;

becomes this:

   buf.queue<i>.ex1 = mouse_x;
   buf.queue<i>.ex2 = mouse_y;
   buf.queue<i>.ex3 = mouse_z;

2b. Make the size of the buf.queue array be a numerical constant (an enum or #define) instead of hardwired. Do the same thing for the "type" values (currently 0 is mouse, 1 is keyboard - wrap those values in enums or #defined constants instead). That way it's easier to change the values later if there is some need to do so.
2c. So far as I can tell, there is no point to the use of buf.lock on windows (or, I suspect, any other Allegro platform). You could remove it.

3. Buffered Input:

dialmformartian: Your main loop reading the callbacks looks correct. I would suggest that that the next step be to add an interface/wrapper to hide/encapsulate the details of the buffer from the main loop or anything else that needs to read keyboard/mouse events.

Audric
Quote:

So far as I can tell, the keyboard/mouse callbacks in Allegro are all executed from a single thread

:-[ ouch... sorry for all the bad advice I gave

axilmar
Quote:

So far as I can tell, the keyboard/mouse callbacks in Allegro are all executed from a single thread. Thus, there is no benefit to adding mutexes or anything like that.

But that thread is a different one from the main thread. Adding a mutex would prevent synchronization problems between the main thread and the allegro input thread.

Quote:

You're implementation is perfectly fine as is.

Actually, it is not. There is a possibility that non valid data will be read from the buffer.

orz

But that thread is a different one from the main thread. Adding a mutex would prevent synchronization problems between the main thread and the allegro input thread.
Each variable is written to by only one thread, all variables are volatile, and all writes are atomic. Under these circumstances, buf.start and buf.end together act as a valid synchronization mechanism, and heavier weight thread synchronization objects are not needed. You can of course use a mutex anyway, but since Allegro does not offer mutexes that would require addition library or platform dependent stuff complicating the build process and/or portability.

Quote:

Actually, it is not. There is a possibility that non valid data will be read from the buffer.

That was fixed by moving buf.end++. If you are saying that another issue exists, describe it more specifically.

axilmar
Quote:

and all writes are atomic.

No they are not atomic. A variable increment has 3 CPU instructions: a) load value to register, b) increase value, c) put value back to memory.

For example:

int x = 0;
x++;

becomes:

004113B5  mov         eax,dword ptr [x] 
004113B8  add         eax,1 
004113BB  mov         dword ptr [x],eax

(copied from the disassembly window of the VS debugger).

orz
Quote:

No they are not atomic. A variable increment has 3 CPU instructions: a) load value to register, b) increase value, c) put value back to memory.

I was not refering to the whole read-modify-write aspect, only the write aspect. ie the value is always either equal to what it is before the statement or after the statement, the write portion is never split into two like it could be for a 64 bit integer.
edit: and, side note, I believe increment instructions on x86 accept memory parameters, though of course they don't on most RISC architectures

axilmar
Quote:

I was not refering to the whole read-modify-write aspect, only the write aspect. ie the value is always either equal to what it is before the statement or after the statement, the write portion is never split into two like it could be for a 64 bit integer.
edit: and, side note, I believe increment instructions on x86 accept memory parameters, though of course they don't on most RISC architectures

But the whole point is to make the whole thing deterministically safe. It's not a big deal to include pthreads in your source anyway.

Kitty Cat
Quote:

But that thread is a different one from the main thread.

Depends on the system. Allegro can run single-threaded on Linux (AFAIK), and has to run single-threaded on DOS. And given that the current API will be wrapped around a new API in the future, you can't know how that'll do it either. It's not good practice to assume callbacks or timers are run from a seperate thread than your program.

dialmformartian

where do i get pthread

Kitty Cat
orz
Quote:

But the whole point is to make the whole thing deterministically safe. It's not a big deal to include pthreads in your source anyway.

Quote:

Depends on the system. Allegro can run single-threaded on Linux (AFAIK), and has to run single-threaded on DOS. And given that the current API will be wrapped around a new API in the future, you can't know how that'll do it either. It's not good practice to assume callbacks or timers are run from a seperate thread than your program.

Sigh.
Okay, let's this from three angles:
1. Would adding a pthreads lock to the callback change anything on any specific platform?
2. Is this the manner in which Allegro intends its callbacks to be used?
3. What is the cost of adding additional locking?

Now, the answers for each question:
1. Would adding a pthreads lock to the callbacks change anything on any specific platform?
In order for it to change something, the callback functions would have to be running concurrently with themselves on some platform. This never happens on DOS, Windows, or Linux, the 3 platforms mentioned so far in this forum thread.
DOS: the callbacks appears to happen from within an interrupt - ie with interrupts disabled so they cannot be interrupted.
WIN32: the callbacks appear to be called from a single thread which does nothing but wait for input and call the callbacks when input arrives. Since this is a single thread, it can't call another callback until the previous one returns.
LINUX: the callbacks appear to occur inside the SIGIO handler. According to the GNU docs, no signal handler is called during handling of the same signal, and since both keyboard and mouse use the same signal, neither keyboard nor mouse callbacks can run concurrently with themselves or each other.

2. Is this the manner in which Allegro intends its callbacks to be used?
Allegro documentation says that the keyboard callback "executes in an interrupt context". It doesn't say that as DOS-specific, though obviously it's not literally true for non-DOS platforms. For the mouse callback, all it says is "function must be in locked memory, and must execute very quickly", even less helpful. My interpretation is that this is the correct way to use the Allegro callbacks, though the documentation leaves this unfortunately ambiguous.

3. What is the cost of adding additional locking?
If you use Allegro's built-in locking mechanisms (_enter_critical() and _exit_critical())... well, I don't know what the drawbacks are - Allegro's built-in locking mechanisms are undocumented so far as I can tell, so I presume you're not intended to use them for normal Allegro programs.
If you use platform-specific locking mechanisms, the obvious cost is in portability.
If you use pthreads or another threading API with implementations for a variety of platforms, the obvious cost is the standard cost for adding dependancies - it makes it harder for arbitrary Allegro users to get your code running - they have to get the library themselves, and figure out how to install it and link with it. This cost is greatly reduced if you are the only one likely to compile your code. (additionally, I suspect that using pthreads calls inside a DOS interrupt is a bad idea)

edit: note: this is only considering adding locking to the callback. I would recommend adding locking to any function that reads from the input buffer if and only if your code is multithreaded independent of Allegro.

Thomas Fjellstrom

1. I don't believe pthreads EXISTS on DOS or DS.
2. You should never do anything clever in allegro timers. ever. You're just asking for trouble.
3. It just won't work on platforms without locking features.

axilmar
Kitty Cat said:

Depends on the system. Allegro can run single-threaded on Linux (AFAIK), and has to run single-threaded on DOS.

If one is interested in running the program in DOS, then the app must be coded in a special way so as that DOS is handled appropriately.

It is not very difficult to use #ifdef to include different APIs according to platform.

Orz said:

Would adding a pthreads lock to the callback change anything on any specific platform?

Yes. It would make the program correct in those platforms that input is handled by a different thread.

Orz said:

Is this the manner in which Allegro intends its callbacks to be used?

Allegro has many DOSisms. Any modern library would not use the concept of interrupt to service input, but the concept of input queue. Unfortunately for us, Allegro does not do that, so there are two solutions:

1) fix Allegro to provide a mouse queue, like the keyboard queue.
2) use callbacks to implement your own queues.

I have done #2 in my GUI code.

Orz said:

What is the cost of adding additional locking?

It's minimal, because critical sections are used.

Orz said:

In order for it to change something, the callback functions would have to be running concurrently with themselves on some platform.

Nope, you got it wrong here. The lock will not synchronize the input threads, because there is only one input thread. The lock will synchronize the input thread and the main thread.

Orz said:

If you use Allegro's built-in locking mechanisms (_enter_critical() and _exit_critical())... well, I don't know what the drawbacks are - Allegro's built-in locking mechanisms are undocumented so far as I can tell, so I presume you're not intended to use them for normal Allegro programs.

Exactly.

Orz said:

If you use platform-specific locking mechanisms, the obvious cost is in portability.

It's no big deal to do so; just a couple of #ifdefs.

Orz said:

If you use pthreads or another threading API with implementations for a variety of platforms, the obvious cost is the standard cost for adding dependancies - it makes it harder for arbitrary Allegro users to get your code running - they have to get the library themselves, and figure out how to install it and link with it. This cost is greatly reduced if you are the only one likely to compile your code. (additionally, I suspect that using pthreads calls inside a DOS interrupt is a bad idea)

The cost of debugging multithreaded programs far outweights the cost of installing and linking with a library.

Here is a very interesting link with the following quote:

No variable or structure should be accessed from multiple threads without being accessed in a thread-safe fashion.

Obviously the guy has great experience with threads and Windows, so if you don't accept my comment, please accept his.

Kitty Cat
Quote:

If one is interested in running the program in DOS, then the app must be coded in a special way so as that DOS is handled appropriately.

Not if its coded properly. As long as you use Allegro properly, and don't rely on platform-specific behavior (eg. how callbacks are implemented), it'll work. However, people tend to become lax because "it's not needed right now for my target platform(s)" and thus don't bother with proper code.

Quote:

It is not very difficult to use #ifdef to include different APIs according to platform.

And given that this will be layered on top of the new API in the future, what'll you do if all platforms, instead of just DOS, are then more sensitive to improper code? You'll pretty much lock it out of being able to use the compatibility layer of newer versions and require Allegro 4.2 only.

axilmar
Quote:

Not if its coded properly.

The only 'coded properly' technique with multiple threads is to use synchronization. Didn't you read the link I posted?

Kitty Cat
Quote:

The only 'coded properly' technique with multiple threads is to use synchronization.

Yes, I fully agree with that. I'm talking about Allegro's callbacks, though. :P You can't assume whether Allegro's callbacks are threaded or not.

orz
Quote:

1. I don't believe pthreads EXISTS on DOS or DS.
2. You should never do anything clever in allegro timers. ever. You're just asking for trouble.
3. It just won't work on platforms without locking features.

1. http://moss.csc.ncsu.edu/~mueller/pthreads/
2. I don't consider copying input to a buffer "clever". It's basically the ONLY thing that the input callbacks are useful for.
3. edited: hm... It won't work correctly on platforms with sufficiently weak memory coherency models. Investigating...

Quote:

The cost of debugging multithreaded programs far outweights the cost of installing and linking with a library.Here [flounder.com] is a very interesting link with the following quote:

[bold]No variable or structure should be accessed from multiple threads without being accessed in a thread-safe fashion.[/bold]

You conspicuously omitted the next line "Note that this applies only to variables where concurrent access can produce incorrect results". He then links to an example of his where he recommends using just volatile variables.
You might think that your position is vindicated by his next statement that incrementing a volatile integer is not thread-safe on a multi-CPU system. However that statement does not apply to the special case where only one thread modifies a variable (while other thread(s) read it).

Quote:

Nope, you got it wrong here. The lock will not synchronize the input threads, because there is only one input thread. The lock will synchronize the input thread and the main thread.

There are three different locking scenarios we're looking at here in this thread:
locking type A: input callbacks lock to prevent reentry into the input callbacks
locking type B: input callbacks lock and input reader lock (same lock), to prevent reader from reading not-yet-ready events
locking type C: input readers lock (in a program with multiple threads, not counting those allegro created)

In that particular case, you were apparently talking about type B, while I was talking about type A. It is my position that locking types A and B are pointless. I think everyone is in agreement that type C is necessary if and only if the program is multithreaded (not counting Allegro-created threads that the user code knows nothing about).

However, I'm beginning to suspect that you might be right. After reading through a bunch of x86, PPC, and C docs, it looks like the ordering may not be guaranteed between multiple CPUs under the circumstances in which the callbacks execute on various platforms unless extra effort is made. That's somewhat scary though, as this appears to be what Allegro does internally... possibly I've been up to long, but to my current interpretation, not only is this code slightly flawed in a multi-CPU environment, but so is Allegro keyboard handling on WIN32 and OSX. And I'm not too sure about the Linux version either... executing in the SIGIO handler protects it somewhat, but only if the reading thread is the original thread, and it can also execute in a timer callback instead, or from the main thread if simulate_keypress is called...

Will think more tomorrow. Hopefully.

edit: fixed typo and changed #3 at the top

axilmar
Quote:

Yes, I fully agree with that. I'm talking about Allegro's callbacks, though. :P You can't assume whether Allegro's callbacks are threaded or not.

Well, if the game/library needs to behave correctly in the 90% of the computers out there (i.e. those that run Windows), you need to code threads properly.

Quote:

You conspicuously omitted the next line "Note that this applies only to variables where concurrent access can produce incorrect results". He then links to an example of his where he recommends using just volatile variables.
You might think that your position is vindicated by his next statement that incrementing a volatile integer is not thread-safe on a multi-CPU system. However that statement does not apply to the special case where only one thread modifies a variable (while other thread(s) read it)

Incrementing an integer is not thread-safe, period. If you read the whole thing, you would have seen the analysis of the assembly code.

You are just asking for trouble especially when the integer is an index to a buffer.

Quote:

However, I'm beginning to suspect that you might be right.

No sh1t, Sherlock! :-)

CPUs have gone multicore, you have forgotten that???

orz
Quote:

Incrementing an integer is not thread-safe, period. If you read the whole thing, you would have seen the analysis of the assembly code.

You seem to be ignoring the point of that: when a variable is only written to by one thread, the atomicity of the entire read-modify-write operation becomes irrelevant. Only the atomicity of the write operation itself is relevant, because the variable cannot be modified asynchronously to that code (because only one thread modifies it). And the write is generally atomic, provided that the integer is aligned (caveats: not true on 386sx when paired with a 32 bit compiler, some 8 bit CPU/compiler combinations, a few other arcane places... but to get even a remotely possible failure you have to find a platform that uses one of the handful of CPUs on which that's not true, uses multiple CPUs concurrently, and runs Allegro programs; you'd have an easier time finding a system that uses 10 bit bytes).

Quote:

You are just asking for trouble especially when the integer is an index to a buffer.

That is the part I'm agreeing with now - that using the variable as an index into a changing buffer is unsafe. According to my current understanding, this code has the capacity to return garbage input (though it almost never will). It could be fixed with threading libraries, platform specific locking mechanisms, or more exotic coding methods. However, I think that the Allegro internals have the same flaw in them, so you suffer from this flaw regardless...

Quote:

CPUs have gone multicore, you have forgotten that???

I remembered the multicore part, but did not realize that Intel and others had weakened their memory coherency policies (I expected at least PRAM consistency for writes to volatile variables). And I presumed that Allegro would arrange things in such a way that its input callbacks were usable in a safe manner without extra dependencies or extraordinary effort.

edit:

ARRGHH!
Okay, I was too tired when coming to the conclusion that it wasn't safe. x86 DOES still offer strong memory coherency, I misread the docs as saying the opposite of what they said. Here's my current position (hopefully I will slow down the rate at which I change positions soon):

The code is correct on some platforms only.
Allegro code is correct on some platforms only.
Which platforms?

x86/*: Allegro and this code both work, because of strong memory coherency guarantees on normal x86 systems.

non-x86/linux: Allegro and this code probably work correctly, if the user code is totally single-threaded. Merely having only one thread talk to allegro is not enough though - it must also be the thread that handles signals. On x86 specifically the issue is (I think) irrelevant though because of the x86/* strong coherency guarantees. However, the buf.lock mechanism should be kept, because in (very rare) cases the callback can become reentrant.

PPC/macOSX: Allegro and this code are both incorrect on this platform.

PPC/macOS9 and x86/DOS: Allegro and this code are correct, because (I think) the code all executes in interrupt context anyway.

axilmar
Quote:

That is the part I'm agreeing with now - that using the variable as an index into a changing buffer is unsafe.

Quote:

ARRGHH!

You went into all this trouble just to prove what others said from the beginning? :-)

orz
Quote:

You went into all this trouble just to prove what others said from the beginning? :-)

And to submit a bug report : )

http://www.allegro.cc/forums/thread/590120

Thread #589912. Printed from Allegro.cc