|
Lua coroutines, performance |
Hyena_
Member #8,852
July 2007
|
Last week I ran into serious speed problems with Lua/C++/Allegro 5. Some of my game objects have to listen for triggers like DRAW and STEP 60 times a second. I implemented this in my C++ code by spawning a Lua thread for each triggered object. That means 120 threads are created and destroyed per second for a single object. When my instance count reached 300, the game got really slow.

The first optimization I made was to move all the code from the Lua step event into C++. I gained 20 FPS. That, however, is a desperate solution. Does anyone know why my approach is so slow and how I can reorganize my code to get maximum speed while still using Lua?

Just to mention, I figured out that creating a new coroutine is probably the reason for the slowness, so my current optimization ideas are:

Generally my Lua/C++ architecture is like this:

This is one of the first Lua/C++ systems I have built, so I'm constantly unsure whether professionals would do things differently. I have included a sample Lua file from my project so you can see the architecture I am using. As I invented it myself, I cannot be sure it's the right way to do things.

function metadata(id)
    call("Unit","init",id,0);

    register_event(id,"CREATE", "ev_create");
    register_event(id,"DRAW", "ev_draw");
    register_event(id,"STEP", "ev_step");
    register_event(id,"DESTROY", "ev_destroy");
    register_event(id,"PAIN", "ev_pain");
    register_event(id,"ATTACK", "ev_attack");

    set(id,"default_sprite","spr_korax");
    set(id,"fighting_sprite","spr_korax_combat");

    set(id,"sprite","spr_korax");
    set(id,"node",0);

    set(id,"pose",get(id,"dir"));
    set(id,"image",0);
    set(id,"target",0);
    set(id,"delay",30);
    set(id,"energy",50);
    set(id,"max_energy",50);
end

function ev_create(id)
end

function ev_step(id)
    local node = get(id,"node");
    local room = get(node,"room");

    if (get("room_id") ~= room) then
        return;
    end;

    if (get(id,"energy")<10) then
        set(id,"delay",90);
    end;

    if (get(id,"health")<=0) then
        set(id,"delay",0);
        if (get(id,"position")~="dead") then
            call("Unit","fade",id,0);
            if (get(id,"fade")==255 and get(id,"stay")==false) then
                set(id,"position","dead");
                set(id,"new_image", 0);
                set(id,"new_sprite","spr_korax_dead");
                set(id,"new_pose", "dead");
                sound_play("sfx_death1",1.0,0.0,math.random(0,0.3)+0.7);
                call("Unit","stop_fighting",id,0);
            else
                return;
            end;
        else
            if (get(id,"fade")==255) then
                local old_image = get(id,"image");
                if (old_image<9) then
                    set(id,"new_image",old_image+1);
                end;
            end;
        end;
    end;

    call("Unit","step",id,0);

    if (get(id,"position") == "moving") then
        if not sound_is_playing("sfx_walk1") and math.random(0,1000)<10 then
            sound_play("sfx_walk1",1.0,0.0,math.random(0,0.3)+0.7);
        elseif not sound_is_playing("sfx_neutral1") and math.random(0,1000)<3 then
            sound_play("sfx_neutral1",1.0,0.0,math.random(0,0.3)+0.7);
        elseif not sound_is_playing("sfx_neutral2") and math.random(0,1000)<2 then
            sound_play("sfx_neutral2",1.0,0.0,math.random(0,0.3)+0.7);
        elseif not sound_is_playing("sfx_neutral3") and math.random(0,1000)<1 then
            sound_play("sfx_neutral3",1.0,0.0,math.random(0,0.3)+0.7);
        end;
    elseif (get(id,"position") == "default") then
        --sound_stop("sfx_steps1");
        local node = get(id,"node");
        local exits = node_exits(node);
        local to;

        if (exits ~= nil and get(id,"fighting")==0) then
            for j = 1, #exits, 1 do
                to = exits[j];
                local contents = node_contents(to);
                if (contents ~= nil) then
                    for i = 1, #contents, 1 do
                        local vch = contents[i];
                        local vch_name = get( get(vch,"object_index"), "name" );
                        if (vch_name == "UNIT_SOHNI"
                        and get(vch,"node") == to) then
                            call("Unit","set_fighting",id,vch,0);
                            set(id,"target",vch);
                            return;
                        end;
                    end;
                end;
            end;
        end;
        if (get(id,"target")~=0) then
            to = get(get(id,"target"),"node");
            set(id,"dest",to);
        else
            to = exits[math.random(1,#exits)];
            if (get(id,"fighting")==0 and math.random(0,100)<10) then
                set(id,"dest",to);
                --call("Unit","move_to_node",id,to,0);
            end;
        end;
    elseif (get(id,"position") == "fighting") then
        call("Unit","one_hit",id,get(id,"fighting"),0);
    end;

    if (get(id,"position") == "default" and math.random(0,1000)<10) then
        local directions = { "N", "S", "E", "W", "NE", "SE", "NW", "SW" };
        set(id,"new_pose", directions[math.random(1,#directions)]);

        local sprite = get(id,"sprite");
        local pose = get(id,"new_pose");
        local image = get(id,"image");
        set(id,"new_image",math.random(0,sprite_count(sprite,pose)-1 ));
    end;
end

function ev_draw(id)

end

function ev_destroy(id)
end

function ev_pain(id)
    local pan = (800 + math.random(0,200))/1000;
    if (math.random(0,10)<2) then
        sound_play("sfx_korax_pain1",1.0,0.0,pan);
    elseif (math.random(0,10)<2) then
        sound_play("sfx_korax_pain2",1.0,0.0,pan);
    elseif (math.random(0,10)<2) then
        sound_play("sfx_korax_pain3",1.0,0.0,pan);
    elseif (math.random(0,10)<2) then
        sound_play("sfx_korax_pain4",1.0,0.0,pan);
    else
        sound_play("sfx_korax_pain5",1.0,0.0,pan);
    end;
    local hp = get(id,"health");
    hp = hp-1;
    set(id,"health",hp);
end

function ev_attack(id)
    sound_play("sfx_attack1",1.0,0.0,1.0);
end
|
aniquilator
Member #9,841
May 2008
|
I think the main problem with your code is that you create a lot of threads... Another thing I can see is that you create your object metatables inside your Lua code. Why don't you create them in your C++ code and just pass them to the Lua state?

I'm actually programming my game using Lua too, and in my case I don't get any performance issues because of that. What I do is:

If you have a more specific question, please ask. |
bamccaig
Member #7,536
July 2006
|
I agree with aniquilator. I haven't done much with Lua and haven't even touched it in a couple of years, but I infer the same from your solution. Having a new Lua state for each game object class is silly.

I'm not sure why calling functions from another Lua script would create threads unless you explicitly do this. As I said, it's been a while since I've used Lua, but I don't think it ever creates threads implicitly. I assume that it must be something that you are doing and I assume you are Doin' It Wrong(tm). Apparently Lua doesn't actually have threads, unless that has changed in the past couple of years, so what kind of threads are you referring to?

Apparently you should have a different Lua state for each native thread, which does more or less make sense, but that probably means that you don't want too many threads talking to Lua.

In general, multi-threading adds a lot of complexity to a program and is therefore very error prone. Be sure you really need it before you use it. It could very well be that the complexity outweighs any benefits it may give you.

So limit your Lua states to one (unless you have a good reason not to, like multiple native threads that really do need to interact with Lua), limit the number of threads that you create (none if possible), and keep things simple.

-- acc.js | al4anim - Allegro 4 Animation library | Allegro 5 VS/NuGet Guide | Allegro.cc Mockup | Allegro.cc <code> Tag | Allegro 4 Timer Example (w/ Semaphores) | Allegro 5 "Winpkg" (MSVC readme) | Bambot | Blog | C++ STL Container Flowchart | Castopulence Software | Check Return Values | Derail? | Is This A Discussion? Flow Chart | Filesystem Hierarchy Standard | Clean Code Talks - Global State and Singletons | How To Use Header Files | GNU/Linux (Debian, Fedora, Gentoo) | rot (rot13, rot47, rotN) | Streaming |
Hyena_
Member #8,852
July 2007
|
Lua threads are not actual OS threads. Lua threads are just coroutines. They can yield while waiting for input, for example, or just to sleep for a while. No parallel computing is done. If one Lua thread freezes, the whole game will still freeze.

However, you are telling me the exact thing that I've been suspecting. What you are saying is that I should have one Lua state that holds all my *.lua files. That's a good idea, but how should I do it? I don't understand the concept of modules or libraries in the context of Lua, because if I have my *.lua files as memory files, how do they include each other? Maybe some example pseudocode, please?

The other question is whether I will still be able to use Lua threads. I guess I'll have to spawn a new Lua state for each thread, with each state holding absolutely every function I have in all my Lua source files.

Maybe someone can quickly explain to me the architecture of a typical Lua project. Let's say I have Enemy.lua and Utils.lua and now I want to make external calls from Enemy to Utils within the same Lua state. How is it done in code, and what C++ functionality should be implemented for that?

aniquilator: Just a note about my architecture:
|
bamccaig
Member #7,536
July 2006
|
Hyena_ said:
The other question is that will I be able to still use lua threads? I guess I'll have to spawn a new lua state for each thread while each state holds absolutely every function I have in all my lua source files.

Why would you need a different Lua state for each co-routine? That doesn't make any sense. No.

Hyena_ said:
Maybe someone can quickly explain me the architecture of typical lua project. Let's say I have Enemy.lua and Utils.lua and now I want to make external calls from Enemy to Utils within the same Lua state, how is it done in code and what C++ functionality should be implemented for that?

Each Lua script should be defining functionality. Basically functions. Once each script has been run, those functions should exist and should be callable. At least, AFAIK, that is how it would work.

Hyena_ said:
What do you mean by creating my object metatables inside my Lua code? The metadata function? The thing is that I don't want to compile my game each time a new object or variable is added. There are scripters included in my team who don't want to deal with C++ part of the game. Just a note about my architecture:

It sounds like you're trying to make C++ dynamically typed. |
Hyena_
Member #8,852
July 2007
|
Ok, then one more question. How can I use the same function names in different *.lua files within the same Lua state?

Could you describe the architecture behind having all your variables defined and held in Lua? I have object instances that are obviously held in C++, but now I must create some additional "objects" inside Lua only, which would hold the variables associated with their object type. Basically you are saying that I should have a list of all instances in some global Lua table that contains the variables of each object it holds. When I spawn a new Lua thread for yielding, can I still access that list with the same values inside?

If that works out, I would really consider switching to that system, as you are right that I don't need to hold variables in C++. That would make scripting a lot easier, as I would no longer have to use get and set C++ calls and constructs like:

The functions associated with an object could also be inside its variable list so that this would be possible:

edit:
|
Audric
Member #907
January 2001
|
Yes. The alternative is to simply have your single Lua state load each module once. The order isn't even important! |
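
As a rough illustration of loading every script once into a single state, here is a minimal sketch in plain Lua. The two source strings stand in for Enemy.lua and Utils.lua, and the names `Enemy.ev_step`, `Utils.clamp`, `x` and `speed` are made up for the example. Each file puts its functions in its own global table, so two files can both define an `ev_step` without clashing, and the load order is irrelevant because `Utils` is only looked up when `Enemy.ev_step` actually runs.

```lua
-- Stand-ins for the contents of Enemy.lua and Utils.lua (hypothetical code).
local enemy_src = [[
Enemy = {}
function Enemy.ev_step(obj)
    -- Utils is looked up when ev_step runs, not when this chunk is loaded
    obj.x = Utils.clamp(obj.x + obj.speed, 0, 100)
end
]]

local utils_src = [[
Utils = {}
function Utils.clamp(v, lo, hi)
    if v < lo then return lo end
    if v > hi then return hi end
    return v
end
]]

-- loadstring exists in Lua 5.1; load accepts strings from Lua 5.2 on
local compile = loadstring or load

-- Enemy is loaded before Utils on purpose: the order does not matter
assert(compile(enemy_src))()
assert(compile(utils_src))()

local obj = { x = 95, speed = 10 }
Enemy.ev_step(obj)
print(obj.x)  --> 100
```

On the C++ side the equivalent would presumably be to run each file once against the same state at startup and then look the functions up by table and name when an event fires.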
Hyena_
Member #8,852
July 2007
|
Just out of curiosity - does anybody know why spawning a different Lua thread for each triggered event is so bad for performance?

Let's say I have only one Lua state and generally one Lua thread, and my main loop is also inside this Lua thread. Each time this thread resumes, it asks C++ which events have happened and executes them right away without creating a new thread for that. How can that be faster? Maybe the whole performance thing is just what you get when you have to call 36 000 Lua functions per second (60 fps, 300 objects, 60+60 events per second for step and redraw)?
|
bamccaig
Member #7,536
July 2006
|
Hyena_ said:
Ok, then one more question. How can I use same function names in different *lua files within the same lua state?

You typically should not have multiple functions with the same name. A function name should describe what the function does. Conceptually, if two functions have the same name they should do exactly the same thing. You can use a table as a namespace to group functions under some other name, or you can use a table as an actual object (OOP) if that is applicable.

Hyena_ said:
Could you describe me the architecture behind having all your variables defined and held in Lua. I have object instances that are obviously held in C++ but now I must create some additional "objects" only inside Lua that would hold variables associated with their object type.

What are these Lua objects holding? If the Lua objects represent the C++ objects then I would probably just wrap a userdata pointer to the object in a Lua table with some methods that call the corresponding C++ methods. Otherwise, it sounds like you would be duplicating yourself, and that would mean that things could get out of sync easily... I don't really know what you're doing. You basically want the Lua world to see the C++ world as it is, so you don't want to copy state from C++ into Lua. You want Lua to access the real C++ state through safe interfaces.

Hyena_ said:
Basically you are saying that I should have a list of all instances in some global Lua table that will contain variables for each object it holds.

I don't know if I'm saying that or not. I would generally avoid too much global state, even in Lua. You may need one global variable to wrap everything else in, but that's all you technically need. That said, it might be beneficial to keep things simple to save effort, but if you have non-programmers writing the Lua then you may want to safely wrap everything up so they can't break the entire game with something silly.

Hyena_ said:
When I spawn a new lua thread for yielding then can I still access that list with these same values inside?

If the list is global then probably, yes. Lua co-routines are not threads. They run cooperatively, one at a time, with the Lua API switching between them, so they aren't faster to use or anything. If one of your functions blocks then all co-routines will be blocked also. I don't see any advantage to a lot of co-routine use until you're very familiar with Lua and know when it's appropriate to use them.

Hyena_ said:
The functions associated with an object could also be inside its variable list so that this would be possible:

Lua has special syntax to make this easier:

id.update_position(id);
-- ...is exactly the same as...
id:update_position();

This syntactic sugar just passes the table id into the function as the first argument (which is basically how all OO implementations that I've used actually work under the surface). That said, your name id doesn't seem appropriate if it's a table... Note that there's no sense returning two values only to assign them back to the same table. Your function has a reference to that table already. Shouldn't you just directly assign to it inside of the function instead of returning anything?

Hyena_ said:
Just out of curiosity - does anybody know why spawning a different Lua thread for each triggered event is so bad for the performance?

I just went and read the Lua manual and it seems that Lua's run-time doesn't manage the execution of co-routines. You have to do that. Which means that you never have more than one "thread" in Lua running at a time. You have to explicitly yield your co-routine and then explicitly resume it. Meaning that if you create a ton of co-routines then you'll need to manage when to call which ones... And if they never yield then they aren't actually multitasking at all. They're just an extra expensive function call. Like I said, I think you're doing it wrong. You should probably check with the Lua community (e.g., IRC, mailing list, message board, whatever), but you probably shouldn't be using co-routines the way you are.

Hyena_ said:
Let's say I have only one lua state and generally one lua thread. Now I have my main loop also inside this lua thread. Each time this thread resumes it will ask for the C++ what events have happened and executes them right away without creating a new thread for that. How can it be faster? Maybe the whole performance thing is just what you get when having to call 36 000 lua functions per second (60 fps, 300 objects, 60+60 events per second for step and redraw)?

Get rid of the co-routines and see what happens. It could well be that you're expecting too much of Lua and need to move more of it into native code for speed. I would probably suggest that you have your main game loop in C++ and use it to invoke Lua when appropriate. But again, I'm not an expert on Lua, and have never actually embedded it before. This is just what I would probably do... |
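
To make the two points above concrete, here is a small sketch in plain Lua. The table `unit` and its fields are invented for the example. The first half shows that the colon call is the same as passing the table explicitly, and that the method can write to the table directly instead of returning values. The second half shows that a coroutine only advances when something resumes it.

```lua
local unit = { x = 0, y = 0 }

-- The method writes to the table it receives instead of returning values
function unit.update_position(self, dx, dy)
    self.x = self.x + dx
    self.y = self.y + dy
end

unit.update_position(unit, 1, 2)  -- explicit first argument
unit:update_position(1, 2)        -- colon form, exactly the same call
print(unit.x, unit.y)             --> 2    4

-- A coroutine runs only while the host is inside coroutine.resume
local co = coroutine.create(function()
    local ticks = 0
    while true do
        ticks = ticks + 1
        coroutine.yield(ticks)    -- hand control back to the caller
    end
end)

local _, first = coroutine.resume(co)
local _, second = coroutine.resume(co)
print(first, second, coroutine.status(co))  --> 1    2    suspended
```

Nothing happens between the two `resume` calls: the coroutine sits suspended until the host decides to continue it.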
Hyena_
Member #8,852
July 2007
|
I already made a thread in the Lua forums too. Still waiting for replies though.

My greatest concern about getting rid of my coroutine system is that it would break my yielding. Sometimes a function must yield until some condition is met. Currently that works perfectly, as yielding only stops the thread in which the current function is being executed. If I only use one coroutine to manage the step event of every instance, then if one of the instances yields, the whole loop will block until that instance resumes.

Just a side note: I removed the debug symbols compiler flag and turned on the most aggressive optimizations, and I gained 30 FPS!! I never would have thought that it would give such great results. I am also looking at LuaJIT, but unfortunately it isn't available for Lua 5.2a (which I'm using). Maybe I should switch to Lua 5.1? These gains, however, aren't enough.

Can you please describe the system behind synchronizing C++ and Lua variables for the same object? As you might have noticed, I currently use get(id,"variable") and set(id,"variable",value) calls that find the object in my C++ code and get/set the value. If I have a grand table of objects where each entry contains the variables and functions the object has, I will have to post that data to C++ somehow. You mentioned userdata, but I really haven't ever understood how and why to use it.

Maybe I should make a Lua call from my C++ code wherever I need to get/set a variable of a specific object. That would be done by instantly spawning a thread that fetches the variable from Lua, something like this:

#define LUA(name,func,...) SCRIBE->lua_call_va(name, func, ## __VA_ARGS__)
...
//C++ code
const char *name=NULL;
LUA("Main","get_str","i>s",id,&name);

Similar logic could be applied for setting values. The first argument of the LUA macro is just the name of the Lua state, the second is the function name. i>s means that an integer is taken as an argument and a string is returned.
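
One way to keep per-object yielding without creating and destroying a thread on every event might be to give each object one long-lived coroutine and simply resume it on every STEP. The following is only a sketch of that idea in plain Lua. The names `scripts`, `attach` and `step` are invented here, and in the real game the resume would presumably be done from C++ rather than from a Lua loop.

```lua
-- One long-lived coroutine per object, created once and resumed every STEP.
local scripts = {}   -- object id -> coroutine

local function attach(id, behaviour)
    scripts[id] = coroutine.create(behaviour)
end

-- Called by the host once per frame for every object id
local function step(id)
    local co = scripts[id]
    if co == nil then return end
    local ok, err = coroutine.resume(co, id)
    if not ok then
        error("script for object " .. id .. " failed: " .. tostring(err))
    end
    if coroutine.status(co) == "dead" then
        scripts[id] = nil   -- behaviour finished, drop the coroutine
    end
end

local log = {}
attach(1, function(id)
    for frame = 1, 3 do
        log[#log + 1] = "object " .. id .. " frame " .. frame
        coroutine.yield()   -- wait for the next STEP; only this object pauses
    end
end)
attach(2, function(id)
    log[#log + 1] = "object " .. id .. " done immediately"
end)

for _ = 1, 4 do   -- four simulated frames
    step(1)
    step(2)
end
print(table.concat(log, "\n"))
```

Object 1 yields three times while object 2 finishes on its first step, so one object waiting does not block the others, and no coroutine is created after `attach`.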
|
Thomas Fjellstrom
Member #476
June 2000
|
Instead of coroutines and yielding, you could make it event based. When a coroutine would yield now, it would request an action from the engine and return, then when the action completes/starts, the engine calls lua again to tell it what's happening. Could be a single function in lua that dispatches the events to other lua code, could be separate lua functions for each event. -- |
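
A minimal sketch of that event-based style, written entirely in Lua so it can run on its own. The engine side is simulated by the `pending` list and `complete_all`; those names, `request`, and the `Dialog` table are assumptions made for the example, not part of any real API. Where a coroutine would yield, the script instead registers a callback and returns.

```lua
-- Simulated engine side: requests waiting for something to happen
local pending = {}

local function request(id, action, callback)
    pending[#pending + 1] = { id = id, action = action, callback = callback }
end

local shown = {}
local Dialog = {}

function Dialog.start(id)
    shown[#shown + 1] = "Hello!"
    -- Where a coroutine would yield, ask the engine and return instead
    request(id, "wait_for_key", Dialog.on_key)
end

function Dialog.on_key(id)
    shown[#shown + 1] = "Goodbye!"
end

-- The engine calls back into Lua once the requested action has completed
local function complete_all()
    local batch = pending
    pending = {}
    for i = 1, #batch do
        batch[i].callback(batch[i].id)
    end
end

Dialog.start(7)
complete_all()
print(table.concat(shown, " / "))  --> Hello! / Goodbye!
```

The cost of this design is that a long script has to be split into one function per waiting point, which is the convenience the coroutine version was buying.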
Hyena_
Member #8,852
July 2007
|
That's an interesting idea. All this time I have made games without the concept of coroutines, and I don't even know why I decided to implement them in the first place. Probably because it was easier to make interactive dialogs using them:

However, I've constructed this main Lua loop. Maybe you could look at it and say whether it would do the job:

-- Main Lua script; this gets looped.
-- get_events is a C++ call that returns a list of event tables.
-- object_list contains all the game objects.
object_list = {};

function game_loop()
    while true do
        local events = get_events(); -- { {name="draw",ids={3,4,65,4},args={}}, {name="step",ids={3,4,3},args={}} }
        for i = 1, #events, 1 do
            local objects = events[i].ids;
            local event = events[i].name;
            local args = events[i].args;
            for j = 1, #objects, 1 do
                local id = objects[j];
                local obj = object_list[id];
                -- obj:event(args) would call a method literally named "event",
                -- so index the object by the event name instead.
                obj[event](obj, args);
            end;
        end;
        coroutine.yield();
    end;
end;
|
Thomas Fjellstrom
Member #476
June 2000
|
I wouldn't put the main loop in lua. But I suppose that would work? -- |
Hyena_
Member #8,852
July 2007
|
Right
|
Thomas Fjellstrom
Member #476
June 2000
|
why even bother with threads? just call into the lua code from your C/C++ when needed? -- |
Hyena_
Member #8,852
July 2007
|
Hah! EDIT:
|
bamccaig
Member #7,536
July 2006
|
Hyena_ said:
Can you please describe me the system behind synchronizing C++ and Lua variables for the same object? As you might have noticed, currently I use get(id,"variable") and set(id,"variable",value) calls that find the object in my C++ code and get/set the value. Having a grand table of objects where each entry contains variables and functions the object has I will have to post that data to C++ somehow. You mentioned userdata but I really haven't ever understood how and why to use it.

AFAIK, userdata is effectively a C data type that Lua can't directly interact with. For example, it could be a pointer to your real object. Use C to generate a Lua table with a userdata attribute that contains your pointer. Then your Lua get/set methods could map to C++ functions that call a method of the class: extract the pointer from the userdata attribute of the table, extract the parameter to the Lua method, and then call the actual C function or C++ method on the object. Get the result and put that into Lua as the return value of the function.

It will require a bit of coding to get it done, but it should be about as fast as Lua can be and you won't have to worry about synchronizing Lua with C/C++. Lua will be pulling the values directly out of C/C++ instead. If you need even more speed after this, throw away your dynamic attributes in C++ and declare a static type. Then bind that into Lua so Lua can manipulate it directly. |
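
Userdata itself can only be created from the C side, but the script-facing half of this idea can be sketched in plain Lua. In the example below a proxy table with `__index` and `__newindex` metamethods forwards every field access to `get` and `set`, so scripts can write `unit.energy` while the values still live in one place. The `store` table and the Lua versions of `get`/`set` are stand-ins for the C++ functions from the thread, and `wrap` is a name invented for the example.

```lua
-- Stand-ins for the C++ get/set functions, backed here by a plain table
local store = { [1] = { energy = 50, delay = 30 } }
local function get(id, key) return store[id][key] end
local function set(id, key, value) store[id][key] = value end

-- Every field that is not stored in the proxy itself goes through get/set
local proxy_mt = {
    __index = function(self, key)
        return get(rawget(self, "id"), key)
    end,
    __newindex = function(self, key, value)
        set(rawget(self, "id"), key, value)
    end,
}

local function wrap(id)
    return setmetatable({ id = id }, proxy_mt)
end

local unit = wrap(1)
if unit.energy < 60 then   -- reads through get(1, "energy")
    unit.delay = 90        -- writes through set(1, "delay", 90)
end
print(unit.delay, store[1].delay)  --> 90    90
```

With real userdata the proxy would hold the object pointer instead of an id and the metamethods would be C functions, but the scripts would look the same.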
Hyena_
Member #8,852
July 2007
|
I ran several tests and the results were surprisingly weird. I called my Lua scripts 10 000 times, spawning a new Lua thread for each call. The difference was only 4%! I don't consider that vital, so I will continue using my Lua threads.

However, I got a really big FPS boost by optimizing something that I don't even remember, as I thought it was not so important. I got rid of the separate Lua state for each game object. Now all objects use the same Lua state. Could this really be what boosted performance?

The other thing I did was move some code from Lua to C++, which immediately gave 20 FPS, but that couldn't have been the only boost. I also no longer trigger the draw event for objects outside the current room. Before, I sent the events and checked inside the Lua code whether to return immediately or do some drawing.
|
bamccaig
Member #7,536
July 2006
|
Hyena_ said:
I called my lua scripts spawning a new lua thread for each call 10 000 times. Difference was only 4%! I don't consider it being vital. Therefore I continue using my lua threads. However, I got a really big fps boost by optimizing something that I even don't remember as I thought it was not so important.

How do you even manage 10000 coroutines? AFAICT, you are responsible for stopping and starting them every time... Are you sure they're all running?

Hyena_ said:
I got rid of the different lua state for each game object. Now all objects use the same lua state. Could this really be the boost for performance?

I think you can consider a Lua state as effectively being a new VM instance. It would be like running a new JVM for every Java class or a new .NET runtime for every class. |
Hyena_
Member #8,852
July 2007
|
Sorry for the bad explanation. I called my step event 10 000 times, which meant that most of the computing was done in Lua. I had maybe 15 objects too, so every object got 10 000 step event calls in Lua. The whole run took ~9 seconds: 9.2 seconds with Lua threads and 8.8 seconds without them. edit:
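
For anyone who wants to repeat this kind of measurement in isolation, here is a rough micro-benchmark sketch in plain Lua. The trivial `ev_step` and the helper `time_it` are invented for the example, and the numbers will depend on the machine and Lua version. Because the handler does almost no work, this measures the cost of creating coroutines far more directly than a full game step would, so the gap it shows may be much larger than the 4% reported above.

```lua
-- A deliberately tiny event handler, so call overhead dominates
local function ev_step(id) return id + 1 end

-- Runs fn(i) n times and reports the CPU time used
local function time_it(label, n, fn)
    local start = os.clock()
    for i = 1, n do fn(i) end
    local elapsed = os.clock() - start
    print(string.format("%-24s %.3f s", label, elapsed))
    return elapsed
end

local N = 100000
local direct = time_it("direct call", N, ev_step)
local fresh = time_it("new coroutine per call", N, function(i)
    local co = coroutine.create(ev_step)   -- created and discarded every call
    coroutine.resume(co, i)
end)
```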
|
|