|
Computer chess |
ReyBrujo
Moderator
January 2001
|
Have you compiled with loop unrolling (-funroll-loops for gcc) to test the loop speed? -- |
Kirr
Member #5,060
September 2004
|
Hi Evert, congrats on a working chess program! If it gets a UCI interface (or WB), it will be able to enter tournaments, where it can get a rating and play against some real competition. Good luck making it stronger, I know it's a lot of fun! My ongoing tournament: KCEC -- |
Peter Wang
Member #23
April 2000
|
A couple of quick notes on compilation in Linux. My glibc doesn't include qsort_r() yet so I had to nick one off the web (attached). depend.sh expects echo '\t' to print a tab. This is non-standard; you should be able to use $'\t' instead, which bash and most other modern shells support. When I told the computer to make the first move, it crashed. I can't reproduce it now though (maybe I'll run it through valgrind later). You should handle display expose events. Other than that, it seems to work. Unfortunately I don't know anything about chess.
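In case it's useful to anyone else hitting the same problem: a qsort_r() stand-in can be built on plain qsort() with a file-scope context pointer. It's not reentrant like the real thing, and it's probably not the exact code in src/misc/, but it shows the idea (names here are my own):

```c
#include <stdlib.h>

/* File-scope context for the shim.  The real qsort_r threads this
 * through the call instead, which is what makes it reentrant. */
static void *shim_arg;
static int (*shim_cmp)(const void *, const void *, void *);

static int shim_trampoline(const void *a, const void *b)
{
    return shim_cmp(a, b, shim_arg);
}

/* glibc-style argument order: comparator and context come last.
 * BSD/OS X puts the context *before* the comparator, which is one
 * reason portable code often carries its own copy like this one. */
void my_qsort_r(void *base, size_t nmemb, size_t size,
                int (*compar)(const void *, const void *, void *),
                void *arg)
{
    shim_cmp = compar;
    shim_arg = arg;
    qsort(base, nmemb, size, shim_trampoline);
}

/* Demo comparator: ascending ints, with a (here unused) context. */
static int cmp_int_asc(const void *a, const void *b, void *ctx)
{
    (void)ctx;
    return (*(const int *)a > *(const int *)b) - (*(const int *)a < *(const int *)b);
}
```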
|
Evert
Member #794
November 2000
|
ReyBrujo said: Have you compiled with loop unrolling (-funroll-loops for gcc) to test the loop speed? Hmm, I have fairly aggressive optimisations switched on (those make a difference too!), specifically "-fast", which enables "-O3 -fomit-frame-pointer -fstrict-aliasing -momit-leaf-frame-pointer -fno-tree-pre -falign-loops" but not loop unrolling (I thought it did!); I'll test that. Kirr said: congrats on a working chess program! Thank you! It's actually my third (fourth if you count my Chinese chess one). Quote: If it gets a UCI interface (or WB), it will be able to enter tournaments, where it can get a rating and play against some real competition. I have UCI planned (it looks slightly easier to do). It'll be interesting to see how it fares against other programs. It does decently well on some benchmarks, but not so well (yet!) on others... Quote: Good luck making it stronger, I know it's a lot of fun!
Damn frustrating too! The immediate effect of a lot of improvements I put in is a decrease in tactical strength/speed. Usually that means there is a subtle bug somewhere that is tricky to find. Quote: My ongoing tournament: KCEC [kirill-kryukov.com] Cool! I'll be very interested to see how my program does. Though if Crafty is down in 31st place I can't hope for much, I think... Peter Wang said: A couple of quick notes on compilation in Linux. My glibc doesn't include qsort_r() yet so I had to nick one off the web (attached). Heh, oops! Sorry about that; I have one ready in src/misc/ (probably the same one you found) but didn't get round to using it. Mostly because OS X (and BSD?) libc have it already. Quote: depend.sh expects echo '\t' to print a tab. This is non-standard; you should be able to use $'\t' instead. Ah! Good to know. I should really just replace it with a Perl script I wrote for another project, which works much better (the bash script is hacked together from Allegro's old depend.sh script of several years ago). Quote: When I told the computer to make the first move, it crashed. Hmm... Quote: I can't reproduce it now though (maybe I'll run it through valgrind later). Ok. That could be helpful! Quote: You should handle display expose events. Oh, right! I forget, since they're not needed on OS X... Quote: Other than that, it seems to work. Unfortunately I don't know anything about chess. You have a golden opportunity to learn. |
ReyBrujo
Moderator
January 2001
|
The contents of the loop itself are more complex than the if statements, so it is expected to be slower. A profiling run would tell you whether you can optimize the simple functions too. Also, many programs cache results to avoid having to examine the same moves again. -- |
Kirr
Member #5,060
September 2004
|
Evert said: I have UCI planned (it looks slightly easier to do). It'll be interesting to see how it fares against other programs. It does decently well on some benchmarks, but not so well (yet!) on others... Results on test suites unfortunately don't correlate very well with ratings; playing real engine-vs-engine games is a more reliable way to track progress. So actually implementing the UCI interface will be very useful for your development too, as it will allow you to run test matches. Evert said: Damn frustrating too! The immediate effect of a lot of improvements I put in is a decrease in tactical strength/speed. Usually that means there is a subtle bug somewhere that is tricky to find. Yes, balancing search speed versus evaluation accuracy is a challenge! Test suites are good for a quick check that you haven't broken anything, but test matches are really a sounder way to test changes. Some authors run matches between different builds of their program, some use other engines as opponents. I prefer the latter, as testing against itself may hide some weaknesses that your engine can't exploit but others will. A variety of at least 10 or 20 different opponents is much better. And anyway, in tournaments you'll face other engines, so why not test against them during development too? Evert said: Cool! I'll be very interested to see how my program does. Though if Crafty is down in 31st place I can't hope for much, I think... My list is missing a few recent strong engines, so in fact it's even tougher than it looks. But if you don't set unrealistic goals, like getting into the top 50 from the start, then you'll be OK. -- |
Evert
Member #794
November 2000
|
ReyBrujo said: The contents of the loop itself are more complex than the if statements, so it is expected to be slower.
Yes, but they avoid a branch. Anyway, no matter, things are fine now the way they are. Quote: A profiling run would tell you whether you can optimize the simple functions too. It would, if gprof gave me sensible output on my MacBook, but it doesn't. I have no idea why, but gprof seems to be broken on OS X. Quote: Also, many programs cache results to avoid having to examine the same moves again.
Oh, that is absolutely essential to have. The speed of my program increased by more than an order of magnitude when I first added it (it was about the second thing I did). EDIT: Kirr said: Results on test suites unfortunately don't correlate very well with ratings; playing real engine-vs-engine games is a more reliable way to track progress. So actually implementing the UCI interface will be very useful for your development too, as it will allow you to run test matches.
True. There's also another, more subtle danger: before, I was testing a small number of test positions and optimising my program so it could solve them quickly. However, it actually performed worse on other positions than before: I had been tuning it to solve one particular set of problems very efficiently. Still, that's fine. I see getting it to play tactically well as a first step towards making it play generally OK. Quote: test matches are really a sounder way to test changes. I know. They take a while to run though... Quote: Some authors run matches between different builds of their program, some use other engines as opponents. I prefer the latter, as testing against itself may hide some weaknesses that your engine can't exploit but others will. A variety of at least 10 or 20 different opponents is much better.
True; again, it takes time though. |
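To make the "cache results" suggestion from earlier concrete: in chess engines this is usually a transposition table, a fixed-size hash table indexed by a hash of the position (typically a Zobrist key). A bare-bones always-replace sketch follows; the field names and sizes are invented for illustration, and a real entry would also store the bound type and best move:

```c
#include <stdint.h>

#define TT_SIZE (1u << 16)          /* entry count; real tables are far larger */

struct tt_entry {
    uint64_t key;                   /* full hash, to detect index collisions */
    int      score;                 /* cached search result */
    int      depth;                 /* search depth the score is valid for */
};

static struct tt_entry tt[TT_SIZE]; /* zero-initialised: all slots "empty" */

/* Always-replace scheme: the simplest possible replacement policy. */
void tt_store(uint64_t key, int score, int depth)
{
    struct tt_entry *e = &tt[key & (TT_SIZE - 1)];
    e->key = key;
    e->score = score;
    e->depth = depth;
}

/* Returns 1 and fills *score if a deep-enough entry for key exists. */
int tt_probe(uint64_t key, int depth, int *score)
{
    const struct tt_entry *e = &tt[key & (TT_SIZE - 1)];
    if (e->key == key && e->depth >= depth) {
        *score = e->score;
        return 1;
    }
    return 0;
}
```

The probe only trusts an entry whose stored depth is at least the depth we need now; a shallower cached search would give a less reliable score than just searching again.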
Thomas Fjellstrom
Member #476
June 2000
|
Evert said: -O3 -fomit-frame-pointer -fstrict-aliasing -momit-leaf-frame-pointer -fno-tree-pre -falign-loops -O3 BAD! VERY BAD! Expect odd unrepeatable errors with -O3. It is explicitly marked dangerous and may actually generate bad/wrong code. -- |
Evert
Member #794
November 2000
|
Thomas Fjellstrom said: Expect odd unrepeatable errors with -O3. It is explicitly marked dangerous and may actually generate bad/wrong code.
I've used -O3 in programs doing heavy floating-point calculations and had the resulting code produce absolute garbage. I'm well aware of the dangers of using -O3. |
Arvidsson
Member #4,603
May 2004
|
Very nice-looking project! I will try to compile it sometime soon. I'm intrigued by chess programming, but I will probably start out with something simpler, such as Othello. As you say, the bitboards will be a tough nut to crack, especially since I'm not that well versed even in simple bit operations.
|
Thomas Fjellstrom
Member #476
June 2000
|
Evert said: I've used -O3 in programs doing heavy floating-point calculations and had the resulting code produce absolute garbage. I'm well aware of the dangers of using -O3. Just making sure. It could still be producing broken code though, just that it doesn't always matter much. Like very hard-to-find crashes (see prior messages), or making odd choices (ones you didn't expect it to make)... I find it's better for my sanity to use -O2 and hand-select extra optimisation flags instead of wondering whether -O3 is silently trying to drive me insane. -- |
Evert
Member #794
November 2000
|
Arvidsson said: I will probably start out with something simpler, such as Othello.
You could, though I'm not sure it's actually that much easier to program (maybe, I haven't given it much thought). Draughts might be even simpler. Quote: As you say, the bitboards will be a tough nut to crack, especially since I'm not that well versed even in simple bit operations.
You don't need to use bitboards - plain char arrays are perfectly fine. In fact, some people will tell you that plain char arrays are better and produce faster code... then again, other people will tell you bitboards are where it's at. Thomas Fjellstrom said: It could still be producing broken code though, just that it doesn't always matter much. Like very hard-to-find crashes (see prior messages), or making odd choices (ones you didn't expect it to make)... It's pretty much impossible to work out why the code thinks some moves are good or bad anyway; there's simply too much happening at once. But don't worry, when I see something funky turn up, the first thing I do is switch off optimisations. |
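To illustrate the bitboard idea: each piece set is a single 64-bit word, and whole-board questions become a few logical operations instead of per-square loops. A tiny sketch, with the square numbering and names being my own convention rather than anything from Evert's engine:

```c
#include <stdint.h>

/* One bit per square, a1 = bit 0, h8 = bit 63. */
typedef uint64_t bitboard;

#define RANK_2 0x000000000000FF00ULL
#define RANK_3 0x0000000000FF0000ULL

/* All single-square pawn pushes for white in one expression: shift
 * the whole pawn set one rank up, keep only unoccupied squares. */
bitboard white_pawn_pushes(bitboard pawns, bitboard occupied)
{
    return (pawns << 8) & ~occupied;
}

/* Population count: how many bits (e.g. pushable pawns) are set. */
int bb_count(bitboard b)
{
    int n = 0;
    while (b) {
        b &= b - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

The `b &= b - 1` trick is the classic Kernighan bit-clear loop; modern compilers can also emit a hardware popcount via a builtin.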
Thomas Fjellstrom
Member #476
June 2000
|
Evert said: Even a tic-tac-toe program can be solved using the same approach as a chess program. It's overkill, but it would work - and teach you the basics of the search algorithm. I need to start working on an AI for my SOS game. The problem is it can use an arbitrarily sized board, and there are only a few rules:
It vaguely reminds me of Go, at least in the deceptive simplicity of it all. I'm sure I could program a really stupid AI for it, but to program anything that approaches GOOD would take intelligence I probably don't have right now. Not only do you have to pick a good spot, you have to guard against the other player completing an SOS string. Just how deep do you look for strategy? Especially on a 50x50 or 100x100 board? -- |
Evert
Member #794
November 2000
|
Go is a notoriously difficult game to write programs for, at least programs that play somewhat decently. I'd start with something relatively simple, say a 16x16 board, and then work from there. You can generalise it later. |
Thomas Fjellstrom
Member #476
June 2000
|
Evert said: I'd start with something relatively simple, say a 16x16 board, and then work from there. You can generalise it later. Thanks for the tips. I'll have a look at alpha-beta and transposition tables. -- |
Anomie
Member #9,403
January 2008
|
For a game like that (SOS, which I've heard called Oxo in the past), wouldn't it be sufficient to treat the game board as though its size were two or three spaces larger in each dimension than the smallest board that would contain the current state? The board size would get bigger as the game progressed, but with a game like that I don't think the possible moves would increase dramatically. This thread made me take up chess again (as well as go for the first time), so thank you. ______________ |
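That bounding-box idea in code, roughly. The representation (parallel coordinate arrays of occupied cells, a fixed margin) is invented for the example; the point is just that the search window stays small early in the game:

```c
/* Search window: tight bounding box of the occupied cells plus a
 * fixed margin, clamped to the real board.  An AI would only
 * consider moves inside this window. */
struct window { int x0, y0, x1, y1; };

struct window search_window(const int *xs, const int *ys, int n,
                            int margin, int board_size)
{
    struct window w = { 0, 0, board_size - 1, board_size - 1 };
    if (n == 0)
        return w;                       /* empty board: whole thing */
    int minx = xs[0], maxx = xs[0], miny = ys[0], maxy = ys[0];
    for (int i = 1; i < n; i++) {
        if (xs[i] < minx) minx = xs[i];
        if (xs[i] > maxx) maxx = xs[i];
        if (ys[i] < miny) miny = ys[i];
        if (ys[i] > maxy) maxy = ys[i];
    }
    w.x0 = minx - margin > 0 ? minx - margin : 0;
    w.y0 = miny - margin > 0 ? miny - margin : 0;
    w.x1 = maxx + margin < board_size - 1 ? maxx + margin : board_size - 1;
    w.y1 = maxy + margin < board_size - 1 ? maxy + margin : board_size - 1;
    return w;
}
```

On a 100x100 board with three moves played, this shrinks the branching factor from ten thousand squares to a few dozen.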
Thomas Fjellstrom
Member #476
June 2000
|
Anomie said: (SOS, which I've heard called Oxo in the past) That would be Tic-Tac-Toe wouldn't it? -- |
Anomie
Member #9,403
January 2008
|
Tic-Tac-Toe is just Ooo. Or Xxx, for the mature version. add: Oh, according to Wikipedia OXO was the name of "the first graphical computer game, a version of Tic-tac-toe". I'm talking about the Oxo described here, which sounds like what you're doing, with X's instead of S's. |
Kirr
Member #5,060
September 2004
|
Evert said: True. There's also another, more subtle danger: before, I was testing a small number of test positions and optimising my program so it could solve them quickly. However, it actually performed worse on other positions than before: I had been tuning it to solve one particular set of problems very efficiently. Still, that's fine. I see getting it to play tactically well as a first step towards making it play generally OK. Yes, tactical strength is the start. Also, at larger search depths the difference between "tactical" and "strategic" gets blurred. Some moves that we think are bad strategically are also bad tactically when you search deep enough. A famous example is Rybka - when it first appeared at the end of 2005 it was 1. very strong, and 2. showing rather low node counts. So most people concluded that it was slow and had a very accurate evaluation. Later it turned out that Rybka is in fact a crazily fast searcher with a relatively primitive evaluation. It was manipulating the node count to make itself look slow. Still, from the games you could not tell the difference easily. Do you score piece mobility? Like, scoring higher when there are many squares a piece can go to, and lower if there are few? This may already help with the bad bishops. Evert said: True; again, it takes time though. Yes, it takes a lot of time, but it is necessary if your goal is high. As your program gets stronger, it will be more and more difficult to find an improvement. Most of the time will be spent tuning parameters and testing the difference. And the difference will typically be very small. So small that it will need many thousands of games to detect with significance. By accumulating these small differences you'll still be able to progress, but a strong testing strategy is essential for this. Famous examples are Crafty, and again Rybka. Both use clusters to run automated test matches day and night. 
Evert said: The choice of engine is also important: the opponents must be of comparable strength or slightly better. I don't think I'd learn much from pitting my poor program against the likes of Fruit, Glaurung, Rybka or Crafty! You'd be surprised how many engine authors don't get this simple idea. Many test their engine against itself, many choose opponents that are too weak or too strong. The good choice is opponents of comparable strength. It's difficult to choose without knowing the approximate strength of your engine. There are hundreds of engines to choose from. Some important lists: CCRL 40/4, CCRL 40/40, WBEC, RWBC, UEL. These two engines are fun because they are open source, and the source is just about 2 KB of C code: Toledo Nanochess, micro-Max. (The latter one is from the Netherlands too). These two guys are competing for the smallest C source that can still beat the other one. (Both claim that their engine is stronger). Since your engine is probably more than 2 KB of code, it's worth checking that it can beat these two. -- |
Evert
Member #794
November 2000
|
Kirr said: Yes, tactical strength is the start. Also, at larger search depths the difference between "tactical" and "strategic" gets blurred. Some moves that we think are bad strategically are also bad tactically when you search deep enough.
True. I did a quick test (by hand): running my program against TSCP (at a fixed time of 5s per move), which is listed as one of the weaker programs on the pages you linked to (no particular reason, I just happened to have a copy of it because I wanted to have a look at how it retrieves its principal variation). Both programs played fairly horribly, although TSCP at least put its pieces in semi-sensible locations (it then didn't know what to do with them, of course). I had been hoping that there would be some deep combination that would win material and allow my program to win, but of course that is statistically unlikely if your pieces are badly placed to begin with. Conversely, if your pieces are placed right, there probably is some deep combination somewhere - which my program saw even when TSCP didn't, which meant it gladly gave a pawn to prevent losing a minor piece in some deep combination that was beyond TSCP's horizon. My program then proceeded to trap its bishop on a7 (the classical Bxa7 b6) and entered the endgame two pawns and a knight down. It managed to get down to knight + rook's pawn versus king, which you can draw in some situations, but you need to know what colour square your king needs to be on (the knight must be able to drive the king out of the corner), which was beyond my program's horizon (maybe not if it had been allowed to search for the full 5 seconds, but it was also limited to 20 ply, which it reached in 0.5 seconds at this stage of the game). So, lessons learned: 1. I need a better evaluation function. 2. It can probably still be fairly simple, especially if that means it's also still fast. 3. Search depth alone doesn't mean a thing if your evaluation is really bad (well, I knew that). Quote: A famous example is Rybka - when it first appeared at the end of 2005 it was 1. very strong, and 2. showing rather low node counts. 
Later it turned out that Rybka is in fact a crazily fast searcher with a relatively primitive evaluation. It was manipulating the node count to make itself look slow. Evil! From what I remember, programs like Fruit and Glaurung also have fairly simple evaluation functions (although they're still much better than mine) and simply "out-search" the competition. Quote: Do you score piece mobility? Like, scoring higher when there are many squares a piece can go to, and lower if there are few? This may already help with the bad bishops.
Nope. On the list. Quote: Yes, it takes a lot of time, but it is necessary if your goal is high. As your program gets stronger, it will be more and more difficult to find an improvement. Most of the time will be spent tuning parameters and testing the difference. And the difference will typically be very small. So small that it will need many thousands of games to detect with significance. By accumulating these small differences you'll still be able to progress, but a strong testing strategy is essential for this.
I know. For now, my goal is simply to write the best chess program that I can write. It's a spare-time project, so I have a limited amount of time to spend on it. Quote: You'd be surprised how many engine authors don't get this simple idea. Really? It's not really any different from humans playing humans: you don't learn what your weak spots are unless you play against evenly matched opponents. And without knowing that, you cannot advance. Quote: Many test their engine against itself, For something like Crafty, this is probably fine: the old version is strong enough and tested enough that it isn't likely to have major weaknesses. Quote: It's difficult to choose without knowing the approximate strength of your engine. Indeed. Well, at least I have a temporary target to beat now. Quote: These two engines are fun because they are open source, and the source is just about 2 KB of C code: Toledo Nanochess [nanochess.110mb.com], micro-Max [home.hccnet.nl]. (The latter one is from the Netherlands too). These two guys are competing for the smallest C source that can still beat the other one. (Both claim that their engine is stronger). Since your engine is probably more than 2 KB of code, it's worth checking that it can beat these two.
Yes, it's about 20x that, but that includes variable names of more than one character. Anyway, onwards to play with the evaluation function! |
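For anyone curious what a "still fairly simple" evaluation might look like: material plus a small centralisation bonus. All the numbers below are illustrative centipawn guesses, not tuned values and not Evert's actual eval:

```c
/* White-centric evaluation.  Piece codes: 'P','N','B','R','Q','K'
 * for white, lowercase for black, '.' empty; board is 64 chars,
 * a1 = index 0. */
static int piece_value(char p)
{
    switch (p) {
    case 'P': return 100;  case 'N': return 320;  case 'B': return 330;
    case 'R': return 500;  case 'Q': return 900;
    case 'p': return -100; case 'n': return -320; case 'b': return -330;
    case 'r': return -500; case 'q': return -900;
    default:  return 0;
    }
}

/* Small bonus for pieces near the middle of the board: 6 at the
 * four centre squares, falling off to 0 in the corners. */
static int central_bonus(int sq)
{
    int f = sq & 7, r = sq >> 3;
    int df = f < 4 ? 3 - f : f - 4;    /* distance from centre files */
    int dr = r < 4 ? 3 - r : r - 4;    /* distance from centre ranks */
    return 6 - 2 * (df > dr ? df : dr);
}

int evaluate(const char *board)        /* >0 favours white */
{
    int score = 0;
    for (int sq = 0; sq < 64; sq++) {
        char p = board[sq];
        if (p == '.' || p == 'K' || p == 'k')
            continue;                  /* kings scored separately, if at all */
        int v = piece_value(p);
        score += v + (v > 0 ? central_bonus(sq) : -central_bonus(sq));
    }
    return score;
}
```

This is the "material plus piece-square" family of evaluations; real engines tabulate a per-piece, per-square bonus instead of computing one formula for everything.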
Kirr
Member #5,060
September 2004
|
Evert said: True. I did a quick test (by hand): running my program against TSCP (at a fixed time of 5s per move) [...] It managed to get down to knight + rook's pawn versus king, which you can draw in some situations, but you need to know what colour square your king needs to be on (the knight must be able to drive the king out of the corner), which was beyond my program's horizon [...] Fun! It will be even more fun when you can let the two programs play automatically, so you can sit back and watch. Evert said: So, lessons learned: 1. I need a better evaluation function. 2. It can probably still be fairly simple, especially if that means it's also still fast. 3. Search depth alone doesn't mean a thing if your evaluation is really bad (well, I knew that). [...] 
From what I remember, programs like Fruit and Glaurung also have fairly simple evaluation functions (although they're still much better than mine) and simply "out-search" the competition. Yeah. Historically computer chess was considered a kind of AI application, and a very complex eval was respected. Though some were successful with a very simple eval and fast search (like Fritz 5). It was always a big debate. The turning point happened, I think, with the emergence of Fruit. It was an amazing example of a fast searcher with a clean and simple evaluation. Now most people don't think computer chess is about intelligence anymore, but about out-searching the opponent. Fruit's influence on computer chess was enormous. Rybka, Fritz 11, Naum 4, Loop and Onno owe a lot of their strength to Fruit (they don't admit it though), plus countless Fruit derivatives (Toga, Grapefruit, Cyclone, TheMadPrune, etc). Of course this is mainly about the competition at the top. In the amateur area anything is fine as long as it works. Personally I don't like to see other people's code, as it spoils the fun of coming up with my own solutions. Evert said: Nope. On the list. I'd try general piece mobility first - it is a robust and rightly popular idea. Anything specific to bishop-pawn placements will likely slow you down for too little benefit. (IMHO). Evert said: Really? It's not really any different from humans playing humans: you don't learn what your weak spots are unless you play against evenly matched opponents. And without knowing that, you cannot advance. Actually the most common mistake in testing is simply not testing enough. Often people run a couple of 100-game matches, and based on that decide if the improvement is good or bad. If the improvement is worth 10 Elo points, that's invisible in the noise with this number of games. So their progress is forward or backward depending on random chance. 
And having opponents that are too strong or too weak will of course increase the number of games necessary for the same error bars. Anyway, we can get back to that later when you have the UCI protocol done. Evert said: For something like Crafty, this is probably fine: the old version is strong enough and tested enough that it isn't likely to have major weaknesses. Crafty is being tested against a selection of a few opponents (fewer than 10, IIRC). The opponent selection is very bad; it includes Glaurung and some other much stronger engines. But Bob is able to run millions of games on his cluster, so he can measure very small improvements. The superior testing scheme plays an important role in Crafty's steady improvement over the last few years. Rybka is playing test games against itself, but in this case it's fine because there are simply no other opponents of comparable strength to play with. Evert said: Anyway, onwards to play with the evaluation function! Check this out: CPW (also other materials in the wiki). -- |
Evert
Member #794
November 2000
|
Kirr said: Yeah. Historically computer chess was considered a kind of AI application, and a very complex eval was respected. Though some were successful with a very simple eval and fast search (like Fritz 5). It was always a big debate. The turning point happened, I think, with the emergence of Fruit.
I would think it happened much earlier than that - when it became clear that a Shannon type B approach (only look at good moves, don't waste time with poor moves) could never beat a type A program (look at all moves). The first would be some sort of artificial intelligence (no human considers all moves in a position) whereas the second approach mainly needs a fast searcher. Quote: Fruit's influence on computer chess was enormous. Rybka, Fritz 11, Naum 4, Loop, Onno owe a lot of their strength to Fruit (they don't admit it though), plus countless Fruit derivatives (Toga, Grapefruit, Cyclone, TheMadPrune, etc).
I might be wrong, but wasn't Rybka originally a Fruit derivative? Quote: Personally I don't like to see other people's code as it spoils the fun of coming up with my own solutions.
There are only so many ways in which you can write an alpha-beta search.
I suspect it will actually score the same thing (more or less). Mobility for knights doesn't depend on pawn structure very much as long as they're somewhere in the middle of the board (depending a bit on whether you count destination squares attacked by enemy pawns in your mobility or not); for rooks the main mobility determinant would be whether they're on (semi-)open files. So the main mobility term would (generally) come from bishops, I would imagine. Quote: Actually the most common mistake in testing is simply not testing enough. Often people run a couple of 100-game matches, and based on that decide if the improvement is good or bad. If the improvement is worth 10 Elo points, that's invisible in the noise with this number of games. Quite. A difference of ~100 Elo is something you could see though, but when your program is already strong it'll be hard to advance that much. Quote: Check this out: CPW [chessprogramming.wikispaces.com] (also other materials in the wiki).
Already have. Some of the stuff on there I find very useful (my static exchange evaluator is based on the description given there), some of it not so much (when it's mainly code and not much explanation in the article itself). I've also gone over Ed Schroeder's dissection of the internal workings of Rebel several times, though I wish he'd written it in Dutch, because his English sometimes isn't that good... |
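A toy version of the mobility term being discussed, for knights on a 0x88-style board. The representation and the idea of counting destination squares are standard; the code itself is only an illustration, not from any engine in this thread:

```c
/* 0x88 board: square = 16*rank + file; a square is off-board
 * exactly when (sq & 0x88) is non-zero. */
static const int knight_delta[8] = { 33, 31, 18, 14, -14, -18, -31, -33 };

/* Count the empty squares a knight on sq could move to.  A real
 * eval would weight this (a few centipawns per square), include
 * captures, and do the same for bishops - which is where blocked
 * pawn structures actually show up. */
int knight_mobility(int sq, const char *board /* 128 chars, '.' = empty */)
{
    int n = 0;
    for (int i = 0; i < 8; i++) {
        int to = sq + knight_delta[i];
        if (to >= 0 && to < 128 && !(to & 0x88) && board[to] == '.')
            n++;
    }
    return n;
}
```

The 0x88 trick makes the off-board test a single AND, which is why the layout wastes half of its 128 squares and nobody minds.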
Kirr
Member #5,060
September 2004
|
Evert said: I would think it happened much earlier than that - when it became clear that a Shannon type B approach (only look at good moves, don't waste time with poor moves) could never beat a type A program (look at all moves). The first would be some sort of artificial intelligence (no human considers all moves in a position) whereas the second approach mainly needs a fast searcher. Well, as you say, no strong program looks at all moves to equal depth anyway. However, there was still considerable debate about the relative importance of eval vs search. Fruit's emergence and the subsequent improvement of top engines underlined the importance of fast search and fast evaluation, even in the context of already highly selective search strategies. Evert said: I might be wrong, but wasn't Rybka originally a Fruit derivative? People could be sued for saying something like this openly without hard proof, so let's just say that I believe Fruit's influence on Rybka is greater than is acknowledged. Note that this applies to a number of other commercial engines too; Rybka just receives the most attention. Evert said: I know there is a lot of debate about whether it still is (in which case it's in violation of the GPL) that I'm not particularly interested in diving into. Yes, there was a lot of debate in computer chess circles about it, and you are probably right to avoid getting into it. Evert said: I suspect it will actually score the same thing (more or less). [...] So the main mobility term would (generally) come from bishops, I would imagine. I am happy with whatever works for you! BTW, engine names shown in orange in the KCEC or CCRL lists are all open source. -- |
Evert
Member #794
November 2000
|
Ok, I seem to have a working UCI interface - at least working well enough that I can play a game and everything seems to work more or less as it should. It doesn't do fixed-time-per-move yet and I'm sure there are other options I've missed, but the bottom line is: it's able to play a game through UCI. I'll hopefully finish that up over the weekend and then get to work on the evaluation a bit more. |
Kirr
Member #5,060
September 2004
|
Evert said: Ok, I seem to have a working UCI interface - at least working well enough that I can play a game and everything seems to work more or less as it should. It doesn't do fixed-time-per-move yet and I'm sure there are other options I've missed, but the bottom line is: it's able to play a game through UCI. Cool! Please post any match results! Evert said: I'll hopefully finish that up over the weekend and then get to work on the evaluation a bit more. If you make a simple web page with a short description and a download link, you can announce the engine and enter tournaments. -- |