next and previous filters
next
and previous
let CQL look ahead
and look backwards to determine what follows or precedes the current
position. Each of these modifies the value of the current position as CQL
traverses the positions following or preceding in the PGN file.
Regular expressions
The syntax fornext
and previous
is based on the familiar regular expression
syntax (e.g Wikipedia on regular expressions).
Regular expressions are usually used as a way to search for strings of characters inside text.
For example, the regular expression That was a (good )+game
could be used to find instances of
"That was a good game" or "That was a good good game" or "That was a good good good game" and so on.
To convert the concept of ordinary regular expressions to CQL and apply it on chess positions, two changes need to be made.
Regular expressions on trees
Usually regular expressions are applied to a linear string of characters, like a text file. However, we can imagine a tree each node of which is labeled with a single characters. Each path from the root to a leaf of the tree denotes a particular linear string. We can match a regular expression to the tree by matching the expression to one of the strings from the root to the leaf.Regular expressions in chess
In normal regular expressions, we match a string made up of characters. In chess, instead each character can be any chess position. So a "string" of chess positions is just a linear sequence of chess positions, arranged to make a legal game.The regular expression itself, now uses filters to match a particular linear sequence of chess positions. For example, the regular expression
check*will match any sequence of 0 or more positions in which one side is in check. Likewise
check+ stalematewould match any sequence of 1 or more positions of check, followed by a stalemate.
Regular expressions in chess trees
Now we combine the concepts of regular expressions on trees and of regular expressions in chess to get what we want: regular expressions on chess trees. A chess tree is just a game tree: each node is a position, and a node is connected to its children by chess moves, according to the PGN file.Thus, a chess regular expression matches a chess tree if there is a string of chess positions from the root of the tree to a leaf of the tree that is matched by the regular expression.
Regular expression chess syntax
CQL supports most of the usual character regular expression syntax, applied to filters instead of to other character regular expressions:
regexp | meaning | character example | chess example |
---|---|---|---|
. |
any | .z | . check |
* |
zero or more repetitions | z* | check* |
+ |
one or more repetitions | z+ | check+ |
? |
optional | z? | check? |
{} |
repetition | z{2,3} | check{2 3} |
() |
grouping | (yz)* | (check move from k)* |
next and previous syntax
A next filter consists of the wordnext
, an optional range,
the optional keyword nestban
,
and a list of constituents enclosed in parentheses:
next (check) ; current position is a check next 5 100 (check) next 5 100 nestban (check*) next (check move from k attack (R _) next 5 100 ((move from k move from K)+)A previous filter is the same, except
previous
is used instead of next.
A constituent is a either a filter, or it is formed from another constituent by using the special
regular expression characters: +
, *
, ()
, {}
, or ?
.
next filter semantics
We begin by discussing semantics when all the arguments are filters.next (filter1 filter2 ... filtern)matches the current position if:
- filter1 matches the current position;
- filter2 matches the next position following the current position
- filter3 matches the next position after the next position following the current position
- ... and so on
If variations
is set, then next
will also consider positions that move
into variations. Otherwise, only mainline positions are considered.
For example,
D. Gurgenidze
1.p Polish Chess Federation ty
1985
Left diagram: 8.Qe3+! (current position) 8..Kxe3+ (position 2) 9.Rg3+ (position 3) 9..K-e4 stalemate (position 4 and right diagram)
next (check check check stalemate)The above study matches following 8.Qe3+! because the current position is check, positions 2 and 3 are also checks, and position 4 is stalemate.
regular expression characters with next
A regular expression always matches the longest possible sequence of positions. The specific characters in detail:The '*' symbol
'*' means "repeat 0 or more times".
Thus, consider the filter
next (not check check*)The last 'check' is modified by the '*' so it is repeated 0 or more times. Thus, the expression is equivalent to:
next (not check) ; repeat 0 times or next (not check check) ; repeat 1 times or next (not check check check) ; repeat 2 times or next (not check check check check) ; repeat 3 times ....; and so on foreverTherefore, the next filter matches a position if either:
- The current position is not a check, or
- The current position is not a check and the next position is a check, or
- The current position is not a check and the next position is a check and the following position is a check... and so on.
Because of the rule above about matching longest sequence of position, the actual filter will match the longest possible sequence of checks that it can
starting from a not check
in the current position.
The '+' symbol
The '+' symbol following a constituent means "repeat 1 or more times". Thus,next (move from Q move from [Kk]+ mate )will match any position from which the next move is a move by the White queen; following which is a sequence of one or more positions from which a King moves; following which there is a position that is mate.
{ range } repetition symbol
When the braces '{' and '}' enclose a range they denote repetition of the preceding constituent a number of times that lies inside the range. For example,next (move from Q not move from Q {10 20} move from Q)will match a sequence of moves that begins with a move of the a white queen, then has between 10 and 20 consecutive moves of any piece other than a white queen, and ends with a move of a white queen.
The '?' symbol
The '?' following a constituent means "repeat 0 or 1 times". For example,next (move from Q move from k? mate)means
next (move from Q mate) or next (move from Q move from k mate)That is, either White delivers mate with the Queen, or after White's Queen move, black delivers mate by a King move.
The '()' wildcard symbol
A sequence of filters inside parentheses matches a sequence of consecutive positions that match the filters respectively. This construct is used exclusively with wildcards.For example, suppose you want to match a white queen move followed by a black move followed by a sequence of White checks by a pawn followed by black king moves followed by mate. You can use this:
next (move from Q btm (move from P ; white moves, giving check check ; black to move and is in check )+ ; repeat wtm ; white delivers... mate) ; mate to black
counting with next
next
is countable when it has a range parameter. It
counts the number of matched positions.
For example,
next 5 1000 (check*)will match positions from which, starting at the current position, at least 5 consecutive checks occur.
Because next
with a range is countable, it may be
sorted on:
sort next 5 1000 (check*)The output games can then be sorted based on the longest matched sequence of at least 5 which is just the longest sequence of checks.
Counting sequences: nextban
There is one problem with the codesort next 5 1000 (check*)If there is a sequence of say 10 consecutive checks, then there will be a match on 5 of these. This will make the output difficult to read, with excessive "MATCH" being printed or annotations. It is clearer to require the sequence to start with at the earliest possible move. This can be done using the
nestban
keyword. The nestban
keyword comes just before the left parenthesis. It indicates that the next
filter should not match a position
that is part of a sequence of positions that was already matched by the same filter.
sort next 5 1000 nestban (check*)You can see this used in many of the examples of longest sequence searching.
Note: If a next
filter is inside a transform, then it is implicitly transformed
into multiple distince next
filters. These can sometimes result in confusing comments when using nestban
.
Linearization: using move filter inside next when variations is set
Whenvariations
flag is set in the CQL header, next
evaluates its constituent move filters a bit differently from usual. (The rules below seem complicated, but they give the intuitive behavior).
The problem these rules are designed to address is that the move
filter matches a position Xbased on every possible
move that arises from the position. However, as the next
filter goes does the game tree, it only selects a single move at
a time from X to evaluate. This can mean that the next
filter winds up traversing a move from X that the move
filter already rejected.
Suppose for example that a user wants to match positions that from which a bishop promotes, and from which the game lasts at least 10 moves. A natural way to write this is this:
cql (input test.pgn variations) next 10 1000 (move promote B .* )Here, the user saying that in the current position, one side promotes to a bishop, and that following that move, there are 0 or more other moves. The range
10 1000
in the next says there must be at least 10 positions contained in the matched sequence.
Now conside the following excerpt from a PGN game:
23. e8Q (23. e8B? Rf4+! =) 23. ...Ng1 24. {many moves omitted} ... 44. Rf3 #Call the position before white's 23'd move above X. The user does not want the CQL file to match X, because there is only one move following the bishop promotion. However, the above CQL would match the position without the rule specified below:
- When X is evaluated, the move filter evaluates to true, because there is a matching move (a bishop promotion) from X.
- From X, the
next
filter then evaluates the position after23. e8Q
. This position certainly matches the next constituent of thenext
filter, namely.*
. Thenext
filter goes on to match the position after23...Ng1
and so on, all the way until44. Rf3#
- Since this
represents at least 40 position, the whole
next
filter matches X.
next
filter chooses a single "line" in the game tree, from the root to a terminal position.
All the move
filters are then evaluated as if that "line" gave the only valid moves from a position. The remaining
moves are discarded.
For example, in the PGN excerpt above, there are two lines:
- the line
23. e8Q Ng1 ... 44. Rf3#
is one line. When the next filter tries to choose the this line, the bishop underpromotion disappears from the game. Thus, the move filter in the CQL file,move promote B
no longer matches the position X - the second line is
23. e8B Rf4+
. When thenext
filter selects this line, then the position X now does match the move filter, since there is bishop promotion at move 23. However, this line only has a length of three positions (X, and the positions after the next two moves). Since the length of this line is not long enough to fall within the10 1000
range of thenext
filter, thenext
filter does not match X
Exceptions to linearization
Linearization is not applied in two cases. These rarely arise and this section can be safely skipped on first reading:- If the
next
filter never visits or needs to visit any positions following the current position, then there is nothing to linearize. For example, if the move filter innext (move to _ )
is not linearized, since there is only one filter in thenext filter
so thenext
filter never visits any other positions. - If a
next
filter (or any filter than can modify the current position) is inside anothernext
, the linearization of the outernext
filter does not affect the linearization of the innernext
.
move
filter inside a next
filter by wrapping
the move filter in another next
filter.
To turn off linearization completely, use the --nolinearize
option to cql
(this option is
unsupported).
previous
previous
is just like next
except that it searches backwards. Also, linearization does not apply to previous
Using wildcards to look for maneuvers
Wildcards are useful in isolating the particular pieces you're interested in in a maneuver. For example, in turton.cql, there is a next filter with these subfilters:next (... {not move from $front or $side}* move from $side to $criticalsquare ... )In this particular theme, without going into details, we are particularly interested in the movements of the pieces
$front
and $side
.
We want to track where they go. So the first wildcard operator ensures that that we ignore movements of pieces other than those, and just focus on
those pieces.
pitfalls in using previous and next
There are several issues to keep in mind when using these filters.
- Always remember that the first filter in the
next
orprevious
filter's arguments matches the current position. This can be confusing, since the words 'next' and 'previous' might suggest that the first filter would match the next or previous positions, but they don't. - because
next
and previous each take arguments that are list of filters separated by whitespace, and because these constituent filters can be fairly complex, it is important to be careful as to which filters are matching a position and which are part of a larger filter. It is easy to write say "mainline check" thinking that this is one filter, when in fact it is two filters that will matched to two successive positions inside an argument list.We recommend liberal use of braces to make the code clearer if there is any question. Always begin by testing the simplest possible next or previous filter, then adding in more complex arguments as necessary.
-
It is important to keep in mind that the current position
changes as the next or previous filter matches positions. When that
changes, so do all the
piece
variables. -
next
andprevious
by default print a lot of annotations. These annotations can be confusing or overwhelming with wildcards, because an annotation will be printed for every matched position. Use silent to silence the annotations. Examples