Computer chess
ReyBrujo
Moderator
January 2001
avatar

Have you compiled with loop unrolling (-funroll-loops for gcc) to test the loop speed?

--
RB
Mitsuko: "I just... thought I'd switch to the side that does the taking."
Mitsuko's last words, Battle Royale

Kirr
Member #5,060
September 2004
avatar

Hi Evert, congrats on a working chess program! If it gets a UCI interface (or WinBoard), it will be able to enter tournaments. There it can get a rating and play against some real competition. :D

Good luck making it stronger, I know it's a lot of fun!

My ongoing tournament: KCEC

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight

Peter Wang
Member #23
April 2000

A couple of quick notes on compilation in Linux. My glibc doesn't include qsort_r() yet, so I had to nick one off the web (attached). Also, depend.sh expects echo '\t' to print a tab; this is non-standard, but you should be able to use $'\t', which bash and other modern shells support.

When I told the computer to make the first move, it crashed. I can't reproduce it now though (maybe I'll run it through valgrind later). You should handle display expose events. Other than that, it seems to work. Unfortunately I don't know anything about chess.
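
For reference, the replacements floating around the web are usually a thin shim over plain qsort(). A minimal sketch, assuming the BSD/OS X argument order (context pointer before the comparator) and single-threaded use; my_qsort_r is a hypothetical name:

#include <stdlib.h>

/* Context and comparator are routed through statics, so this is not
 * thread-safe; fine for a single-threaded engine. */
static void *shim_ctx;
static int (*shim_fn)(void *, const void *, const void *);

static int shim_cmp(const void *a, const void *b)
{
    return shim_fn(shim_ctx, a, b);
}

void my_qsort_r(void *base, size_t nmemb, size_t size,
                void *ctx, int (*cmp)(void *, const void *, const void *))
{
    shim_ctx = ctx;
    shim_fn = cmp;
    qsort(base, nmemb, size, shim_cmp);
}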

Evert
Member #794
November 2000
avatar

ReyBrujo said:

Have you compiled with loop unrolling (-funroll-loops for gcc) to test the loop speed?

Hmm, I have fairly aggressive optimisations switched on (those make a difference too!), specifically "-fast", which enables "-O3 -fomit-frame-pointer -fstrict-aliasing -momit-leaf-frame-pointer -fno-tree-pre -falign-loops" but not loop unrolling (I thought it did!). I'll test that.

Kirr said:

congrats on a working chess program!

Thank you! It's actually my third (fourth if you count my Chinese chess one).

Quote:

If it gets a UCI interface (or WinBoard), it will be able to enter tournaments. There it can get a rating and play against some real competition. :D

I have UCI planned (it looks slightly easier to do). It'll be interesting to see how it fares against other programs. It does decently well on some benchmarks, but not so well (yet!) on others...

Quote:

Good luck making it stronger, I know it's a lot of fun!

Damn frustrating too! The immediate effect of a lot of improvements I put in is a decrease in tactical strength/speed. Usually that means there is a subtle bug somewhere that is tricky to find.
For instance, I had a problem where, in certain positions, the search would take a very long time. This turned out to be an interaction between the check extension (look deeper in lines with checks, with some restrictions) and the recapture extension (look deeper in lines with recaptures, again with some restrictions), which together generated long sequences of pointless checks and silly exchanges. Fixing that is what gave me the Bxf7 in the above example (the current build finds it in 0.45s but doesn't see that it wins a queen until 0.61s).
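
A common shape for such restrictions, as a sketch of the general idea rather than the actual fix used here: grant at most one ply of extension per node, and cap the total extension along a line so checks and recaptures cannot keep feeding each other. All names are illustrative.

int extension(int gives_check, int is_recapture,
              int ext_so_far, int max_ext)
{
    int ext = (gives_check || is_recapture) ? 1 : 0;  /* one ply at most */
    if (ext_so_far + ext > max_ext)                   /* line budget spent */
        return 0;
    return ext;
}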

Quote:

My ongoing tournament: KCEC [kirill-kryukov.com]

Cool! I'll be very interested to see how my program does. Though if Crafty is down in 31st place I can't hope for much, I think...

Peter Wang said:

A couple of quick notes on compilation in Linux. My glibc doesn't include qsort_r() yet, so I had to nick one off the web (attached).

Heh, oops! Sorry about that; I have one ready in src/misc/ (probably the same one you found) but didn't get round to using that. Mostly because OS X (and BSD?) libc have it already.

Quote:

depend.sh expects echo '\t' to print a tab; this is non-standard, but you should be able to use $'\t', which bash and other modern shells support.

Ah! Good to know. I should actually just replace it with a Perl script I wrote for another project, which works much better (the bash script is hacked together from Allegro's old depend.sh script of several years ago).

Quote:

When I told the computer to make the first move, it crashed.

Hmm...

Quote:

I can't reproduce it now though (maybe I'll run it through valgrind later).

Ok. That could be helpful!

Quote:

You should handle display expose events.

Oh, right! I forget, since they're not needed on OS X...

Quote:

Other than that, it seems to work. Unfortunately I don't know anything about chess.

You have a golden opportunity to learn. ;)

EDIT:
I tried adding -funroll-loops; the difference is within the error bar, but it seems slightly slower with -funroll-loops enabled (in the current code; I didn't try with the old version of get_piece()).

EDIT2:
Ok, did that experiment too. With the for loop instead of the if statements, the version with -funroll-loops is 10% faster than the version compiled without that flag. The if statements are 20% faster, so it's closer but still not enough...

ReyBrujo
Moderator
January 2001
avatar

The body of the loop is more complex than the if statements, so it is expected to be slower.

A profile run would tell you if you can optimize simple functions too. Also, many programs cache results to avoid having to evaluate the same moves again.

--
RB
Mitsuko: "I just... thought I'd switch to the side that does the taking."
Mitsuko's last words, Battle Royale

Kirr
Member #5,060
September 2004
avatar

Evert said:

I have UCI planned (it looks slightly easier to do). It'll be interesting to see how it fares against other programs. It does decently well on some benchmarks, but not so well (yet!) on others...

Results on test suites unfortunately don't correlate very well with ratings; playing real engine-versus-engine games is a more reliable way to track progress. So actually implementing the UCI interface will be very useful for your development too, as it will allow you to run test matches.

Evert said:

Damn frustrating too! The immediate effect of a lot of improvements I put in is a decrease in tactical strength/speed. Usually that means there is a subtle bug somewhere that is tricky to find.
For instance, I had a problem where, in certain positions, the search would take a very long time. This turned out to be an interaction between the check extension (look deeper in lines with checks, with some restrictions) and the recapture extension (look deeper in lines with recaptures, again with some restrictions), which together generated long sequences of pointless checks and silly exchanges. Fixing that is what gave me the Bxf7 in the above example (the current build finds it in 0.45s but doesn't see that it wins a queen until 0.61s).

Yes, balancing search speed against evaluation accuracy is a challenge! Test suites are good for a quick check that you haven't broken anything, but test matches are really a more sound way to test changes. Some authors run matches between different builds of their program, some use other engines as opponents. I prefer the latter way, as testing against itself may hide some weaknesses that your engine can't exploit but others will. A variety of at least 10 or 20 different opponents is much better. And anyway, in tournaments you'll face other engines, so why not test against them during development too?

Evert said:

Cool! I'll be very interested to see how my program does. Though if Crafty is down in 31st place I can't hope for much, I think...

My list is missing a few recent strong engines, so in fact it's even tougher than it looks. But if you don't set unrealistic goals, like getting into the top 50 from the start, you'll be OK. :)

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight

Evert
Member #794
November 2000
avatar

ReyBrujo said:

The body of the loop is more complex than the if statements, so it is expected to be slower.

Yes, but they avoid a branch. Anyway, no matter, things are fine now the way they are.
:)

Quote:

A profile run would tell you if you can optimize simple functions too.

It would, if gprof would give me sensible output on my MacBook, but it doesn't. I have no idea why, but gprof seems to be broken on OS X.

Quote:

Also, many programs cache results to avoid having to evaluate the same moves again.

Oh, that is absolutely essential to have. The speed of my program increased by more than an order of magnitude when I first added it (which was the second thing or so that I did).
This is particularly important because chess programs tend to use "iterative deepening": you search a position to 2 ply, then do a 3-ply search, then a 4-ply search... etc. The idea is that a low-ply search is relatively fast and you can use the extra information to speed up the deeper search.
Without a transposition table, I don't think this would work at all.
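
A sketch of what that driver loop typically looks like (hypothetical names throughout; search() is assumed to be the alpha-beta search, which probes and stores transposition-table entries internally, so each pass seeds move ordering for the next):

#include <time.h>

typedef int move_t;
enum { NO_MOVE = 0, MAX_DEPTH = 64, INF = 1000000 };

/* Assumed to exist elsewhere: the engine's alpha-beta search. */
extern int search(int depth, int alpha, int beta, move_t *best);

move_t think(double budget_seconds)
{
    move_t best = NO_MOVE, m = NO_MOVE;
    clock_t start = clock();

    for (int depth = 1; depth <= MAX_DEPTH; depth++) {
        search(depth, -INF, INF, &m);
        best = m;       /* keep the result of the deepest completed pass */

        double used = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (used > budget_seconds / 2)
            break;      /* a deeper pass is unlikely to finish in time */
    }
    return best;
}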

EDIT

Kirr said:

Results on test suites unfortunately don't correlate very well with ratings; playing real engine-versus-engine games is a more reliable way to track progress. So actually implementing the UCI interface will be very useful for your development too, as it will allow you to run test matches.

True. There's also another, more subtle danger: earlier I was testing a small number of test positions and optimising my program so it could solve them quickly. However, it actually performed worse on other positions than before: I had been tuning it to solve one particular set of problems very efficiently.
Most test positions will only measure tactical ability and possibly the evaluation function. It's important to get those right, but it isn't the whole story, because in most chess positions you have to play strategically rather than tactically.
I had a very beautiful test game against my program that showed this a couple of days ago (unfortunately I accidentally deleted it). The program was playing sound moves tactically, but putting its pieces on bad squares (it had two bad bishops at one point) and keeping its king near the centre. All I had to do was develop my pieces and wait for the right moment to sacrifice a knight to draw its king out into the open. It avoided being mated, but by then its position was so horrible it was always going to lose (a lot of) material, and it was an easy win after that.

Still, that's fine. I see getting it to play tactically well as a first step towards making it play generally OK.

Quote:

test matches are really a more sound way to test changes.

I know. They take a while to run though...

Quote:

Some authors run matches between different builds of their program, some use other engines as opponents. I prefer the latter way, as testing against itself may hide some weaknesses that your engine can't exploit but others will. A variety of at least 10 or 20 different opponents is much better.

True; again, it takes time though.
The choice of engine is also important: the opponents must be of comparable strength or slightly better. I don't think I'd learn much from pitting my poor program against the likes of Fruit, Glaurung, Rybka or Crafty!
Are there any engines you'd recommend as test opponents?

Thomas Fjellstrom
Member #476
June 2000
avatar

Evert said:

-O3 -fomit-frame-pointer -fstrict-aliasing -momit-leaf-frame-pointer -fno-tree-pre -falign-loops

-O3 BAD! VERY BAD!

Expect odd unrepeatable errors with -O3. It is explicitly marked dangerous and may actually generate bad/wrong code.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Evert
Member #794
November 2000
avatar

Thomas Fjellstrom said:

Expect odd unrepeatable errors with -O3. It is explicitly marked dangerous and may actually generate bad/wrong code.

I've used -O3 in programs doing heavy floating-point calculations and had the resulting code produce absolute garbage. I'm well aware of the dangers of using -O3.
The proof of the pudding is in the eating though, as they say. So far everything works perfectly fine. In the only case so far where I had problems when using -O3 (instead of -O2), those problems were actually due to passing the wrong size to calloc(), thus allocating insufficient memory to hold an entire array. After fixing that, things were fine.

Arvidsson
Member #4,603
May 2004
avatar

Very nice looking project! I will try to compile it sometime soon. I'm intrigued by chess programming, but I will probably start out with something simpler, such as Othello. As you say, the bitboards will be a tough nut to crack, especially since I'm not that well versed even in simple bit operations.

Thomas Fjellstrom
Member #476
June 2000
avatar

Evert said:

I've used -O3 in programs doing heavy floating-point calculations and had the resulting code produce absolute garbage. I'm well aware of the dangers of using -O3.
The proof of the pudding is in the eating though, as they say. So far everything works perfectly fine. In the only case so far where I had problems when using -O3 (instead of -O2), those problems were actually due to passing the wrong size to calloc(), thus allocating insufficient memory to hold an entire array. After fixing that, things were fine.

Just making sure. It could still be producing broken code though, just that it doesn't always matter much. Like very hard to find crashes (see prior messages), or making odd choices (ones you didn't expect it to make)...

I find it's easier on my sanity to use -O2 and hand-pick extra optimisation flags instead of wondering whether -O3 is silently trying to drive me insane.

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Evert
Member #794
November 2000
avatar

Arvidsson said:

I will probably start out with something simpler, such as Othello.

You could, though I'm not sure it's actually that much easier to program (maybe; I haven't given it much thought). Draughts might be even simpler.
Even a tic-tac-toe program can be solved using the same approach as a chess program. It's overkill, but it would work - and teach you the basics of the search algorithm. :)

Quote:

As you say, the bitboards will be a tough nut to crack, especially since I'm not that well versed even in simple bit operations.

You don't need to use bitboards - plain char arrays are perfectly fine. In fact, some people will tell you that plain char arrays are better and produce faster code... then again, other people will tell you bitboards are where it's at. ;)
It seems you can do whatever works best for you. Writing the search and evaluation properly is more important than your particular choice of board representation.
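
As a concrete illustration of the plain-array alternative - a minimal sketch, not the representation used by any particular engine in this thread:

/* "Mailbox" board: one byte per square instead of bitboards.
 * Piece codes and layout are illustrative. */
enum piece { EMPTY, WP, WN, WB, WR, WQ, WK, BP, BN, BB, BR, BQ, BK };

typedef struct {
    unsigned char square[64];   /* a1 = 0 ... h8 = 63 */
    int white_to_move;
} board_t;

/* Knight moves become simple offset arithmetic. The fiddly part of an
 * 8x8 array is rejecting moves that wrap around the board edge; the
 * classic fixes are a 10x12 array with sentinel border squares, or the
 * 0x88 layout, where (to & 0x88) != 0 flags an off-board square. */
static const int knight_offsets[8] = { 17, 15, 10, 6, -6, -10, -15, -17 };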

Thomas Fjellstrom said:

It could still be producing broken code though, just that it doesn't always matter much. Like very hard to find crashes (see prior messages), or making odd choices (ones you didn't expect it to make)...

It's pretty much impossible to work out why the code thinks some moves are good or bad anyway; there's simply too much happening at once. But don't worry, when I see something funky turn up, the first thing I do is switch off optimisations. :)

Thomas Fjellstrom
Member #476
June 2000
avatar

Evert said:

Even a tic-tac-toe program can be solved using the same approach as a chess program. It's overkill, but it would work - and teach you the basics of the search algorithm. :)

I need to start working on an AI for my Sos game. The problem is that it can use an arbitrarily sized board, and there are only a few rules:

  • Use an S or an O on any open square

  • Make an SOS in any direction (horizontal, vertical, or diagonal) and you get a point, and a free turn

It vaguely reminds me of Go, at least in how a few simple rules produce so much complexity.

I'm sure I could program a really stupid AI for it, but to program anything that approaches GOOD would take intelligence I probably don't have right now.

Not only do you have to pick a good spot, you have to guard against the other player completing an SOS string. Just how deep do you look for strategy, especially on a 50x50 or 100x100 board?

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Evert
Member #794
November 2000
avatar

Go is a notoriously difficult game to write programs for, at least programs that play somewhat decently.
It does sound like something where an alpha-beta type approach and transposition table would be quite useful to have, but the number of possible moves is enormous. Most of these are probably obviously bad or not very good, so you could use some sort of move ordering/depth extensions to help things along. The good thing is that the number of possible moves decreases as the game advances.

I'd start with something relatively simple, say a 16x16 board, and then work from there. You can generalise it later.
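
The core of the suggested approach, as a bare-bones negamax sketch (gen_moves(), make(), unmake() and evaluate() are hypothetical; note that SOS's free-turn rule means the sign should only flip when the turn actually passes to the opponent):

enum { MAX_MOVES = 256 };
typedef int move_t;

/* Assumed provided by the game: move generation, do/undo, and a static
 * evaluation from the side to move's point of view. */
extern int  gen_moves(move_t *out);
extern void make(move_t m);
extern void unmake(move_t m);
extern int  evaluate(void);

/* Negamax with alpha-beta pruning: a position's score is the negation
 * of the opponent's best score one ply deeper. */
int alphabeta(int depth, int alpha, int beta)
{
    if (depth == 0)
        return evaluate();

    move_t moves[MAX_MOVES];
    int n = gen_moves(moves);
    if (n == 0)
        return evaluate();      /* no moves: treat as terminal */

    for (int i = 0; i < n; i++) {
        make(moves[i]);
        int score = -alphabeta(depth - 1, -beta, -alpha);
        unmake(moves[i]);
        if (score >= beta)
            return beta;        /* cutoff: the opponent avoids this line */
        if (score > alpha)
            alpha = score;      /* new best for the side to move */
    }
    return alpha;
}

A transposition table slots in at the top of the function: hash the position, return early if a stored result from a deep-enough search exists, and store the result before returning.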

Thomas Fjellstrom
Member #476
June 2000
avatar

Evert said:

I'd start with something relatively simple, say a 16x16 board, and then work from there. You can generalise it later.

Thanks for the tips. I'll have a look at alpha-beta and transposition tables :)

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Anomie
Member #9,403
January 2008
avatar

For a game like that (SOS, which I've heard called Oxo in the past), wouldn't it be sufficient to treat the game board as though it were maybe two or three spaces larger in each dimension than the smallest board that would contain the current state? The board size would grow as the game progressed, but with a game like that I don't think the possible moves would increase dramatically.
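
A sketch of that bounding-box idea (board layout and names are hypothetical; an entirely empty board would need a special case):

/* Find the smallest rectangle containing all occupied squares, then
 * grow it by a margin, clamped to the real board. 0 means empty. */
void occupied_bounds(const char *board, int W, int H, int margin,
                     int *x0, int *y0, int *x1, int *y1)
{
    *x0 = W; *y0 = H; *x1 = -1; *y1 = -1;
    for (int y = 0; y < H; y++)
        for (int x = 0; x < W; x++)
            if (board[y * W + x] != 0) {
                if (x < *x0) *x0 = x;
                if (y < *y0) *y0 = y;
                if (x > *x1) *x1 = x;
                if (y > *y1) *y1 = y;
            }
    *x0 = (*x0 - margin < 0)  ? 0     : *x0 - margin;
    *y0 = (*y0 - margin < 0)  ? 0     : *y0 - margin;
    *x1 = (*x1 + margin >= W) ? W - 1 : *x1 + margin;
    *y1 = (*y1 + margin >= H) ? H - 1 : *y1 + margin;
}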

This thread made me take up chess again (as well as go for the first time) so thank you. :)

______________
I am the Lorax. I speak for the trees. I speak for the trees for the trees have no tongues.

Thomas Fjellstrom
Member #476
June 2000
avatar

Anomie said:

(SOS, which I've heard called Oxo in the past)

That would be Tic-Tac-Toe, wouldn't it?

--
Thomas Fjellstrom - [website] - [email] - [Allegro Wiki] - [Allegro TODO]
"If you can't think of a better solution, don't try to make a better solution." -- weapon_S
"The less evidence we have for what we believe is certain, the more violently we defend beliefs against those who don't agree" -- https://twitter.com/neiltyson/status/592870205409353730

Anomie
Member #9,403
January 2008
avatar

Tic-Tac-Toe is just Ooo. Or Xxx, for the mature version.

add: Oh, according to Wikipedia OXO was the name of "the first graphical computer game, a version of Tic-tac-toe". I'm talking about the Oxo described here, which sounds like what you're doing, with X's instead of S's.

______________
I am the Lorax. I speak for the trees. I speak for the trees for the trees have no tongues.

Kirr
Member #5,060
September 2004
avatar

Evert said:

True. There's also another, more subtle danger: earlier I was testing a small number of test positions and optimising my program so it could solve them quickly. However, it actually performed worse on other positions than before: I had been tuning it to solve one particular set of problems very efficiently.
Most test positions will only measure tactical ability and possibly the evaluation function. It's important to get those right, but it isn't the whole story, because in most chess positions you have to play strategically rather than tactically.
I had a very beautiful test game against my program that showed this a couple of days ago (unfortunately I accidentally deleted it). The program was playing sound moves tactically, but putting its pieces on bad squares (it had two bad bishops at one point) and keeping its king near the centre. All I had to do was develop my pieces and wait for the right moment to sacrifice a knight to draw its king out into the open. It avoided being mated, but by then its position was so horrible it was always going to lose (a lot of) material, and it was an easy win after that.

Still, that's fine. I see getting it to play tactically well as a first step towards making it play generally OK.

Yes, tactical strength is the start. Also, at larger search depths the difference between "tactical" and "strategic" gets blurred. Some moves that we think are bad strategically are also bad tactically when you search deep enough. A famous example is Rybka - when it first appeared at the end of 2005, it was 1. very strong and 2. showing rather low node counts. So most people concluded that it was slow and had a very accurate evaluation. Later it turned out that Rybka is in fact a crazily fast searcher with a relatively primitive evaluation; it was manipulating the node count to make itself look slow. Still, from the games you could not easily tell the difference.

Do you score piece mobility? Like, scoring higher when there are many squares a piece can go to, and lower if there are few? This may already help with the bad bishops.

Evert said:

True; again, it takes time though.

Yes, it takes a lot of time, but it is necessary if your goal is high. As your program gets stronger, it will be more and more difficult to find an improvement. Most of your time will be spent tuning parameters and testing the difference, and the difference will typically be very small: so small that it will take many thousands of games to detect with significance. By accumulating these small differences you'll still be able to progress, but a strong testing strategy is essential for this.

Famous examples are Crafty and, again, Rybka: both use clusters to run automated test matches day and night.

Evert said:

The choice of engine is also important: the opponents must be of comparable strength or slightly better. I don't think I'd learn much from pitting my poor program against the likes of Fruit, Glaurung, Rybka or Crafty!
Are there any engines you'd recommend as test opponents?

You'd be surprised to know how many engine authors don't get this simple idea. Many test their engine against itself, many choose opponents that are too weak or too strong. The good choice is opponents of comparable strength. It's difficult to choose without knowing the approximate strength of your engine. There are hundreds of engines to choose from. Some important lists: CCRL 40/4, CCRL 40/40, WBEC, RWBC, UEL.

These two engines are fun because they are open source, and the source is just about 2 KB of C code: Toledo Nanochess, micro-Max. (The latter one is from the Netherlands too.) These two guys are competing for the smallest C source that can still beat the other one. (Both claim that their engine is stronger.) Since your engine is probably more than 2 KB of code, it's worth checking that it can beat these two. :D

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight

Evert
Member #794
November 2000
avatar

Kirr said:

Yes, tactical strength is the start. Also, at larger search depths the difference between "tactical" and "strategic" gets blurred. Some moves that we think are bad strategically are also bad tactically when you search deep enough.

True. I did a quick test (by hand): running my program against TSCP (at a fixed time of 5s per move), which is listed as one of the weaker programs on the pages you linked to (no particular reason, I just happened to have a copy of it because I wanted to have a look at how it retrieves its principal variation).
It has a very simple evaluation function, but it's still much more advanced than mine, which has material and some odd tweaks for material imbalances and pawn structure, and that's it. Most of it is supposed to be framework for when I do get around to implementing a better one (I have a long list of things I want and notes on how I want to implement them). It doesn't even have piece-square tables, so my program has no notion of where it should sensibly put its pieces. Well, it has them (I copied them from my previous attempt), but they're unused.
Throughout the game, my program outsearched TSCP by 5 or 6 ply (13 or 14 ply in the end game), and fairly often the main variations agreed for the first four or five moves - so the greater search depth at least partially compensated for the horrible evaluation function.

Both programs played fairly horribly, although TSCP at least put its pieces in semi-sensible locations (it then didn't know what to do with them, of course). I had been hoping that there would be some deep combination that would win material and allow my program to win, but of course that is statistically unlikely if your pieces are badly placed to begin with. Conversely, if your pieces are placed right, there probably is some deep combination somewhere - which my program saw even when TSCP didn't, which meant it gladly gave up a pawn to prevent losing a minor piece in some deep combination that was beyond TSCP's horizon. My program then proceeded to trap its bishop on a7 (the classical Bxa7 b6) and entered the end game with two pawns and a knight down. It managed to get down to knight + rook pawn versus king, which you can draw in some situations, but you need to know what colour square your king needs to be on (the knight must be able to drive the king out of the corner), which was beyond my program's horizon (maybe not if it had been allowed to search for the full 5 seconds, but it was also limited to 20 ply, which it reached in 0.5 seconds at this stage of the game).
I stopped the game when my program reported the promotion in the main variation.

So, lessons learned: 1. I need a better evaluation function. :P 2. It can probably still be fairly simple, especially if that means it's also still fast. 3. Search depth alone doesn't mean a thing if your evaluation is really bad (well, I knew that).
I will probably have to adjust how the different extensions and reductions work too, which means the speed of my program is likely to go down. Still, that's ok, as long as it increases in strength as it does so.
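
Piece-square tables are the cheapest fix for lesson 1: a fixed per-square bonus per piece type, added to the material count. A sketch with purely illustrative values (a real table would also differ between midgame and endgame):

/* Knight piece-square table, in centipawns, from white's point of
 * view: knights are nudged toward the centre and away from the rim.
 * Indexed a1 = 0 ... h8 = 63. Values are made up for illustration. */
static const int knight_pst[64] = {
    -40, -30, -20, -20, -20, -20, -30, -40,
    -30, -10,   0,   5,   5,   0, -10, -30,
    -20,   0,  10,  15,  15,  10,   0, -20,
    -20,   5,  15,  20,  20,  15,   5, -20,
    -20,   5,  15,  20,  20,  15,   5, -20,
    -20,   0,  10,  15,  15,  10,   0, -20,
    -30, -10,   0,   5,   5,   0, -10, -30,
    -40, -30, -20, -20, -20, -20, -30, -40,
};

/* A white knight on sq scores knight_pst[sq]; a black one mirrors the
 * board vertically with sq ^ 56. */
static int knight_bonus(int sq, int is_black)
{
    return knight_pst[is_black ? sq ^ 56 : sq];
}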

Quote:

A famous example is Rybka - when it first appeared at the end of 2005, it was 1. very strong and 2. showing rather low node counts. So most people concluded that it was slow and had a very accurate evaluation. Later it turned out that Rybka is in fact a crazily fast searcher with a relatively primitive evaluation; it was manipulating the node count to make itself look slow.

Evil!

From what I remember, programs like Fruit and Glaurung also have fairly simple evaluation functions (although they're still much better than mine) and simply outsearch the competition.

Quote:

Do you score piece mobility? Like, scoring higher when there are many squares a piece can go to, and lower if there are few? This may already help with the bad bishops.

Nope. On the list.
Should definitely help with bad bishops, although I also had another idea for that: a bishop is marked "bad" if it is on the flank and its forward advance is blocked by its own pawns. I haven't worked out the details, and I'd need to somehow catch situations like Bd4, Pe5/c5/b4/f4, which is also not so great.
Or forget about bad bishops entirely and rely on the concept of colour weakness to handle the situation (although in that case I would still withhold the bonus for having the bishop pair if one of the bishops is "bad").

Quote:

Yes, it takes a lot of time, but it is necessary if your goal is high. As your program gets stronger, it will be more and more difficult to find an improvement. Most of your time will be spent tuning parameters and testing the difference, and the difference will typically be very small: so small that it will take many thousands of games to detect with significance. By accumulating these small differences you'll still be able to progress, but a strong testing strategy is essential for this.

I know. For now, my goal is simply to write the best chess program that I can write. It's a spare-time project, so I have a limited amount of time to spend on it.
But, we'll face that burden once my program gets a little stronger.

Quote:

You'd be surprised to know how many engine authors don't get this simple idea.

Really? It's not really any different from humans playing humans: you don't learn what your weak spots are unless you play against evenly matched opponents. And without knowing that you cannot advance.

Quote:

Many test their engine against itself,

For something like Crafty, this is probably fine: the old version is strong enough and tested enough that it isn't likely to have major weaknesses.

Quote:

It's difficult to choose without knowing the approximate strength of your engine.

Indeed. Well, at least I have a temporary target to beat now. :)

Quote:

These two engines are fun because they are open source, and the source is just about 2 KB of C code: Toledo Nanochess [nanochess.110mb.com], micro-Max [home.hccnet.nl]. (The latter one is from the Netherlands too.) These two guys are competing for the smallest C source that can still beat the other one. (Both claim that their engine is stronger.) Since your engine is probably more than 2 KB of code, it's worth checking that it can beat these two. :D

Yes, it's about 20x that, but that includes variable names of more than one character. ;)
Those will be my next targets.

Anyway, onwards to play with the evaluation function!
(Weekend, bed now, work tomorrow).

Kirr
Member #5,060
September 2004
avatar

Evert said:

True. I did a quick test (by hand): running my program against TSCP (at a fixed time of 5s per move), which is listed as one of the weaker programs on the pages you linked to (no particular reason, I just happened to have a copy of it because I wanted to have a look at how it retrieves its principal variation).
It has a very simple evaluation function, but it's still much more advanced than mine, which has material and some odd tweaks for material imbalances and pawn structure, and that's it. Most of it is supposed to be framework for when I do get around to implementing a better one (I have a long list of things I want and notes on how I want to implement them). It doesn't even have piece-square tables, so my program has no notion of where it should sensibly put its pieces. Well, it has them (I copied them from my previous attempt), but they're unused.
Throughout the game, my program outsearched TSCP by 5 or 6 ply (13 or 14 ply in the end game), and fairly often the main variations agreed for the first four or five moves - so the greater search depth at least partially compensated for the horrible evaluation function.

Both programs played fairly horribly, although TSCP at least put its pieces in semi-sensible locations (it then didn't know what to do with them, of course). I had been hoping that there would be some deep combination that would win material and allow my program to win, but of course that is statistically unlikely if your pieces are badly placed to begin with. Conversely, if your pieces are placed right, there probably is some deep combination somewhere - which my program saw even when TSCP didn't, which meant it gladly gave up a pawn to prevent losing a minor piece in some deep combination that was beyond TSCP's horizon. My program then proceeded to trap its bishop on a7 (the classical Bxa7 b6) and entered the end game with two pawns and a knight down. It managed to get down to knight + rook pawn versus king, which you can draw in some situations, but you need to know what colour square your king needs to be on (the knight must be able to drive the king out of the corner), which was beyond my program's horizon (maybe not if it had been allowed to search for the full 5 seconds, but it was also limited to 20 ply, which it reached in 0.5 seconds at this stage of the game).
I stopped the game when my program reported the promotion in the main variation.

Fun! It will be even more fun when you can let the two programs play automatically, so you can sit back and watch. :)

Evert said:

So, lessons learned: 1. I need a better evaluation function. :P 2. It can probably still be fairly simple, especially if that means it's also still fast. 3. Search depth alone doesn't mean a thing if your evaluation is really bad (well, I knew that).
I will probably have to adjust how the different extensions and reductions work too, which means the speed of my program is likely to go down. Still, that's ok, as long as it increases in strength as it does so.

[...]

From what I remember, programs like Fruit and Glaurung also have fairly simple evaluation functions (although they're still much better than mine) and simply outsearch the competition.

Yeah. Historically computer chess was considered a kind of AI application, and a very complex eval was respected. Though some were successful with a very simple eval and fast search (like Fritz 5). It was always a big debate. The turning point happened, I think, with the emergence of Fruit. It was an amazing example of a fast searcher with a clean and simple evaluation. Now most people don't think computer chess is about intelligence anymore, but about outsearching the opponent. Fruit's influence on computer chess was enormous. Rybka, Fritz 11, Naum 4, Loop and Onno owe a lot of their strength to Fruit (they don't admit it, though), plus there are countless Fruit derivatives (Toga, Grapefruit, Cyclone, TheMadPrune, etc).

Of course this is mainly about the competition at the top. In the amateur arena anything is fine as long as it works. Personally I don't like to look at other people's code, as it spoils the fun of coming up with my own solutions.

Evert said:

Nope. On the list.
Should definitely help with bad bishops, although I also had another idea for that: a bishop is marked "bad" if it is on the flank and its forward advance is blocked by its own pawns. I haven't worked out the details, and I'd need to somehow catch situations like Bd4, Pe5/c5/b4/f4, which is also not so great.
Or forget about bad bishops entirely and rely on the concept of colour weakness to handle the situation (although in that case I would still withhold the bonus for having the bishop pair if one of the bishops is "bad").

I'd try the general piece mobility first - it is a robust and rightly popular idea. Anything specific to bishop-pawn placements will likely slow you down for too little benefit. (IMHO). :)

Evert said:

Really? It's not really any different from humans playing humans: you don't learn what your weak spots are unless you play against evenly matched opponents. And without knowing that you cannot advance.

Actually, the most common mistake in testing is simply not testing enough. Often people run a couple of 100-game matches and, based on that, decide whether the improvement is good or bad. If the improvement is worth 10 Elo points, that's invisible in the noise with that number of games, so their progress is forward or backward depending on random chance. And having too-strong or too-weak opponents will of course increase the number of games necessary for the same error bars. Anyway, we can get back to that later when you have the UCI protocol done. :)

Evert said:

For something like Crafty, this is probably fine: the old version is strong enough and tested enough that it isn't likely to have major weaknesses.

Crafty is tested against a selection of a few opponents (fewer than 10, IIRC). The opponent selection is very bad; it includes Glaurung and some other much stronger engines. But Bob is able to run millions of games on his cluster, so he can measure very small improvements. This superior testing scheme has played an important role in Crafty's steady improvement over the last few years.

Rybka plays test games against itself, but in this case it's fine, because there are simply no other opponents of comparable strength to play with. :)

Evert said:

Anyway, onwards to play with the evaluation function!

Check this out: CPW (also other materials in the wiki). :)

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight

Evert
Member #794
November 2000
avatar

Kirr said:

Yeah. Historically computer chess was considered a kind of AI application, and a very complex eval was respected. Though some were successful with a very simple eval and fast search (like Fritz 5). It was always a big debate. The turning point happened, I think, with the emergence of Fruit.

I would think it happened much earlier than that - when it became clear that a Shannon type B approach (only look at good moves, don't waste time on poor moves) could never beat a type A program (look at all moves). The first would be some sort of artificial intelligence (no human considers all moves in a position), whereas the second approach mainly needs a fast searcher.
The irony (if that is the word) is of course that pretty much all strong modern programs are a bit of a hybrid: they examine all moves, but extend lines that look interesting and reduce lines that seem poor. You then rely on the time you save to give you extra plies that will enable you to re-search nodes that you thought were poor but are actually good, the idea being that this happens relatively rarely.
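
In code, that hybrid usually looks something like this inside the move loop - a sketch with hypothetical helpers and illustrative thresholds, not any particular engine's logic:

typedef int move_t;

/* Assumed provided elsewhere: the search itself plus a few predicates. */
extern int  alphabeta(int depth, int alpha, int beta);
extern void make(move_t m);
extern void unmake(move_t m);
extern int  gives_check(move_t m);
extern int  is_quiet(move_t m);

/* Search the i-th move of a node extended or reduced, re-searching at
 * full depth when a reduced move unexpectedly beats alpha. */
int search_move(move_t m, int i, int depth, int alpha, int beta)
{
    int d = depth - 1;
    if (gives_check(m))
        d++;                    /* extend lines that look interesting */
    else if (i >= 4 && is_quiet(m) && d > 1)
        d--;                    /* reduce late, quiet moves */

    make(m);
    int score = -alphabeta(d, -beta, -alpha);
    if (d < depth - 1 && score > alpha)
        score = -alphabeta(depth - 1, -beta, -alpha);  /* the rare re-search */
    unmake(m);
    return score;
}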

Quote:

Fruit's influence on computer chess was enormous. Rybka, Fritz 11, Naum 4, Loop and Onno owe a lot of their strength to Fruit (they don't admit it, though), plus there are countless Fruit derivatives (Toga, Grapefruit, Cyclone, TheMadPrune, etc).

I might be wrong, but wasn't Rybka originally a Fruit derivative?
I know there is a lot of debate about whether it still is (in which case it's in violation of the GPL), which I'm not particularly interested in diving into.

Quote:

Personally I don't like to look at other people's code, as it spoils the fun of coming up with my own solutions.

There's only so many ways in which you can write an alpha-beta search. ;)
I've mostly looked at other people's source (mainly Crafty and Glaurung) to make sure I interpreted a particular algorithm correctly. I agree it is more fun to try to do your own thing (and sometimes reinvent the wheel, but if it's Robert Hyatt's wheel you reinvented, there's no reason to feel bad about it).

Quote:

I'd try the general piece mobility first - it is a robust and rightly popular idea. Anything specific to bishop-pawn placements will likely slow you down for too little benefit. (IMHO).

I suspect it will actually score the same thing (more or less). Mobility for knights doesn't depend on pawn structure very much as long as they're somewhere in the middle of the board (depending a bit on whether you count destination squares attacked by enemy pawns in your mobility or not); for rooks the main mobility determinant would be whether they're on (semi-)open files. So the main mobility term would (generally) come from bishops, I would imagine.
We'll see. I was going to do mobility first anyway, because it's easier (it doesn't require pattern recognition). The tricky thing will be to do it somewhat fast.
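
A first cut at such a term might look like the following sketch (all names hypothetical; a bitboard engine would replace the per-square count with a popcount of an attack set, which is where the speed comes from):

enum { EMPTY, PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };

/* Assumed provided by the move generator: */
extern int piece_on(int sq);            /* piece type on a square    */
extern int colour_on(int sq);           /* 0 = white, 1 = black      */
extern int count_moves_from(int sq);    /* pseudo-legal destinations */

/* Weighted count of destination squares per piece, in centipawns per
 * square. Minor pieces gain most from mobility; kings are left out. */
int mobility(int side)
{
    static const int weight[] = { 0, 0, 4, 4, 2, 1, 0 };
    int score = 0;

    for (int sq = 0; sq < 64; sq++) {
        int p = piece_on(sq);
        if (p != EMPTY && colour_on(sq) == side)
            score += weight[p] * count_moves_from(sq);
    }
    return score;
}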

Quote:

Actually, the most common mistake in testing is simply not testing enough. Often people run a couple of 100-game matches and, based on that, decide whether the improvement is good or bad. If the improvement is worth 10 Elo points, that's invisible in the noise with that number of games.

Quite. A difference of ~100 Elo is something you could see though, but when your program is already strong it'll be hard to advance that much.

Quote:

Check this out: CPW [chessprogramming.wikispaces.com] (also other materials in the wiki).

Already have. Some of the stuff on there I find very useful (my static exchange evaluator is based on the description given there), some of it less so (when it's mainly code and not much explanation in the article itself). I've also gone over Ed Schroeder's dissection of the internal workings of Rebel several times, though I wish he'd written it in Dutch, because his English sometimes isn't that good...
Lots of ideas to play with!

Kirr
Member #5,060
September 2004
avatar

Evert said:

I would think it happened much earlier than that - when it became clear that a Shannon type B approach (only look at good moves, don't waste time on poor moves) could never beat a type A program (look at all moves). The first would be some sort of artificial intelligence (no human considers all moves in a position), whereas the second approach mainly needs a fast searcher.
The irony (if that is the word) is of course that pretty much all strong modern programs are a bit of a hybrid: they examine all moves, but extend lines that look interesting and reduce lines that seem poor. You then rely on the time you save to give you extra plies that will enable you to re-search nodes that you thought were poor but are actually good, the idea being that this happens relatively rarely.

Well, as you say, no strong program looks at all moves to equal depth anyway. However, there was still considerable debate about the relative importance of eval versus search. Fruit's emergence and the subsequent improvement of the top engines underlined the importance of fast search and fast evaluation, even in the context of already highly selective search strategies.

Evert said:

I might be wrong, but wasn't Rybka originally a Fruit derivative?

People could be sued for saying something like this openly without hard proof, so let's just say that I believe Fruit's influence on Rybka is greater than is acknowledged. Note that this applies to a number of other commercial engines too; Rybka just receives the most attention.

Evert said:

I know there is a lot of debate about whether it still is (in which case it's in violation of the GPL), which I'm not particularly interested in diving into.

Yes there was a lot of debate in computer chess circles about it, and you are probably right to avoid getting into it.

Evert said:

I suspect it will actually score the same thing (more or less). Mobility for knights doesn't depend on pawn structure very much as long as they're somewhere in the middle of the board (depending a bit on whether you count destination squares attacked by enemy pawns in your mobility or not); for rooks the main mobility determinant would be whether they're on (semi-)open files. So the main mobility term would (generally) come from bishops, I would imagine.
We'll see. I was going to do mobility first anyway, because it's easier (it doesn't require pattern recognition). The tricky thing will be to do it somewhat fast.

I am happy with whatever works for you!

BTW, engine names shown in orange in the KCEC or CCRL lists are all open source. :)

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight

Evert
Member #794
November 2000
avatar

Ok, I seem to have a working UCI interface - at least working well enough that I can play a game and everything seems to work more or less as it should. It doesn't do fixed-time-per-move yet and I'm sure there are other options I've missed, but the bottom line is: it's able to play a game through UCI. :)

I'll hopefully finish that up over the weekend and then get to work on the evaluation a bit more.

Kirr
Member #5,060
September 2004
avatar

Evert said:

Ok, I seem to have a working UCI interface - at least working well enough that I can play a game and everything seems to work more or less as it should. It doesn't do fixed-time-per-move yet and I'm sure there are other options I've missed, but the bottom line is: it's able to play a game through UCI. :)

Cool! Please post any match results!

Evert said:

I'll hopefully finish that up over the weekend and then get to work on the evaluation a bit more.

If you make a simple web page with a short description and a download link, you can announce the engine and enter tournaments. :)

--
"Go to the NW of Stonemarket Plaza to the Blue Glyph Keeper Library door and enter."
- Lunabean's Thief Deadly Shadows Walkthrough, day eight
