line (deprecated)
The line filter lets CQL look ahead or look backwards to determine what follows or precedes the current position.
Regular expressions
The syntax for line is based on the familiar regular expression syntax.
Regular expressions are usually used as a way to search for strings of characters inside text. For example, the regular expression
That was a (good )+game
could be used to find instances of "That was a good game" or "That was a good good game" or "That was a good good good game" and so on.
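The behaviour of this character regular expression can be checked with any regex engine. The following is a minimal sketch using Python's standard re module; the sample sentences are illustrative only.

```python
import re

# The character regular expression from the text: "good " repeated one or more times.
PATTERN = re.compile(r"That was a (good )+game")

samples = [
    "That was a good game",
    "That was a good good good game",
    "That was a game",  # zero repetitions of "good ", so '+' rejects it
]

# fullmatch requires the whole sentence to match the pattern.
results = [bool(PATTERN.fullmatch(s)) for s in samples]
print(results)
```

The third sample fails because '+' demands at least one repetition; replacing '+' with '*' would accept it.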
To convert the concept of ordinary regular expressions to CQL and apply it on chess positions, two changes need to be made.
Regular expressions on trees
Usually regular expressions are applied to a linear string of characters, like a text file. However, we can imagine a tree, each node of which is labeled with a single character. Each path from the root to a leaf of the tree denotes a particular linear string. We can match a regular expression to the tree by matching the expression to one of the strings from the root to a leaf.
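As a rough illustration of matching a regular expression against a tree of characters, the sketch below enumerates every root-to-leaf string and tests each one. The Node class and the helper names are invented for this example and are not part of CQL.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Node:
    """A tree node labeled with a single character (hypothetical helper type)."""
    char: str
    children: list = field(default_factory=list)

def root_to_leaf_strings(node, prefix=""):
    """Yield the string spelled out by every path from this node to a leaf."""
    text = prefix + node.char
    if not node.children:
        yield text
        return
    for child in node.children:
        yield from root_to_leaf_strings(child, text)

def tree_matches(pattern, root):
    """True if the pattern matches at least one complete root-to-leaf string."""
    return any(re.fullmatch(pattern, s) for s in root_to_leaf_strings(root))

# A small tree whose paths spell "abc", "abd" and "ae".
tree = Node("a", [Node("b", [Node("c"), Node("d")]), Node("e")])

paths = sorted(root_to_leaf_strings(tree))
print(paths)
print(tree_matches(r"ab+d", tree))  # the path "abd" matches
print(tree_matches(r"az*", tree))   # no path consists of "a" followed only by z's
```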
Regular expressions in chess
In normal regular expressions, we match a string made up of characters. In chess, each "character" is instead a chess position. So a "string" of chess positions is just a linear sequence of chess positions, arranged to make a legal game. The regular expression itself now uses filters to match a particular linear sequence of chess positions. For example, the regular expression
check*
will match any sequence of 0 or more positions in which one side is in check. Likewise
check+ --> stalemate
would match any sequence of 1 or more positions of check, followed by a stalemate. (The --> is used to separate different regular expressions.)
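One way to picture filters playing the role of characters is to reduce each position to a single symbol and then reuse an ordinary regex engine. The sketch below does this in Python for an invented four-position sequence; the dictionaries, the encode helper and the symbol choices are assumptions made for illustration, not a description of how CQL works internally.

```python
import re

# Toy positions: each one records only the two facts the example filters test.
# (A stalemated side is never in check, so the two flags cannot both be True.)
game = [
    {"check": False, "stalemate": False},
    {"check": True, "stalemate": False},
    {"check": True, "stalemate": False},
    {"check": False, "stalemate": True},
]

def encode(position):
    """Map a position to one symbol: 'c' = check, 's' = stalemate, '-' = neither."""
    if position["check"]:
        return "c"
    if position["stalemate"]:
        return "s"
    return "-"

symbols = "".join(encode(p) for p in game)

# check*  becomes "c*": zero or more check positions, starting at the second position.
checks = re.match(r"c*", symbols[1:]).group()

# check+ --> stalemate  becomes "c+s": one or more checks, then a stalemate.
found = re.match(r"c+s", symbols[1:]) is not None

print(symbols, len(checks), found)
```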
Regular expressions in chess trees
Now we combine the concepts of regular expressions on trees and of regular expressions in chess to get what we want: regular expressions on chess trees. A chess tree is just a game tree: each node is a position, and a node is connected to its children by chess moves, according to the PGN file. Thus, a chess regular expression matches a chess tree if there is a string of chess positions from the root of the tree to a leaf of the tree that is matched by the regular expression.
Regular expression chess syntax
CQL supports most of the usual character regular expression syntax, applied to filters instead of to other character regular expressions:
regexp | meaning | character example | chess example |
---|---|---|---|
. | any | .z | . --> |
* | zero or more repetitions | z* | check* |
+ | one or more repetitions | z+ | check+ |
? | optional | z? | check? |
{} | repetition | z{2,3} | check{2,3} |
() | grouping | (yz)* | (check --> move from k)* |
line syntax
A line filter consists of
- the keyword line
- optional parameters
- <-- or -->
- constituents separated by <-- or -->
The optional parameters are:
- a range, indicating that the length of the sequence of positions found must be in the range to match;
- firstmatch, indicating that the search should stop after the first sequence of positions matching the constituents is found;
- lastposition, indicating that the position at the end of the sequence of positions that is found is to be returned (rather than the length of the sequence);
- singlecolor, indicating that only positions whose side to move is the same as the side to move of the position at the start of the line are to be considered;
- quiet, indicating that comments automatically generated by line are suppressed (these comments are also suppressed if quiet is set in the CQL header);
- nestban, indicating that a position cannot start a line if that position was already part of a matching sequence of the same line filter (from an earlier position);
- primary: when moving to the next position, only consider positions resulting from primary moves;
- secondary: when moving to the next position, only consider positions resulting from secondary moves.
line --> check //current position is a check
line --> check*>5 // at least 5 checks
line 5 100 nestban --> check*
line --> check --> move from k --> R attacks _
line --> (move from k--> move from K)+
<-- and -->
The arrows <-- and --> are used inside a line filter to denote the direction of motion through the game tree and to separate individual constituents.
A right arrow --> denotes that the direction is towards future moves: the line is looking forward. The --> comes after the line keyword and also between entries in parentheses.
If the --> is replaced by the left arrow <-- then the line will look backwards, towards the past and previous moves.
A given line filter can use only one type of arrow: either <-- or -->.
A constituent is either a filter, or it is formed from another constituent by using the special regular expression characters: +, *, (), {}, or ?.
line filter semantics
We begin by discussing semantics when all the arguments are filters. The filter
line --> filter1 --> filter2 --> filter3 ... --> filtern
matches the current position if:
- filter1 matches the current position;
- filter2 matches the next position following the current position
- filter3 matches the next position after the next position following the current position
- ... and so on
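The rule above can be sketched as a short loop over consecutive mainline positions. This is a toy model only: positions are plain dictionaries, filters are Python functions, variations are ignored, and every name in the block is invented for the example rather than taken from CQL.

```python
from typing import Callable, Sequence

Position = dict                      # hypothetical stand-in for a chess position
Filter = Callable[[Position], bool]  # a filter either matches a position or not

def line_matches(mainline: Sequence[Position], start: int,
                 filters: Sequence[Filter]) -> bool:
    """Check filter1 at `start`, filter2 at the next position, and so on."""
    for offset, flt in enumerate(filters):
        index = start + offset
        if index >= len(mainline):      # ran out of positions before filters
            return False
        if not flt(mainline[index]):
            return False
    return True

def anything(p): return True
def check(p): return p["check"]
def stalemate(p): return p["stalemate"]

# An invented fragment of a game: quiet position, two checks, stalemate.
mainline = [
    {"check": False, "stalemate": False},
    {"check": True, "stalemate": False},
    {"check": True, "stalemate": False},
    {"check": False, "stalemate": True},
]

# Toy counterpart of:  line --> . --> check --> check --> stalemate
pattern = [anything, check, check, stalemate]
from_first = line_matches(mainline, 0, pattern)
from_second = line_matches(mainline, 1, pattern)
print(from_first, from_second)
```

Starting one position later fails because the third filter then lands on the stalemate position, which is not a check.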
If variations is set, then line will also consider positions that move into variations. Otherwise, only mainline positions are considered.
line --> . --> check --> check --> check --> stalemate
regular expression special characters
A regular expression always matches the longest possible sequence of positions. The specific characters in detail:
The '*' symbol
'*' means "repeat 0 or more times".
Thus, consider the filter
line --> not check --> check*
The last 'check' is modified by the '*' so it is repeated 0 or more times. Thus, the expression is equivalent to:
line --> not check // check repeated 0 times
or
line --> not check
     --> check // repeat 1 times
or
line --> not check
     --> check // repeat 1 times
     --> check // repeat 2 times
or
line --> not check
     --> check // repeat 1 times
     --> check // repeat 2 times
     --> check // repeat 3 times
....; and so on forever
Therefore, the line filter matches a position if either:
- The current position is not a check, or
- The current position is not a check and the next position is a check, or
- The current position is not a check and the next position is a check and the following position is a check... and so on.
Because of the rule above about matching the longest sequence of positions, the actual filter will match the longest possible sequence of checks that it can, starting from a not check in the current position.
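The longest-match rule for this particular pattern can be sketched as a greedy scan. The function below is a toy model that takes a list of booleans (True where a position is a check, an encoding assumed for this example) and returns the number of positions matched by not check followed by check*.

```python
def longest_not_check_then_checks(checks, start):
    """Length of the longest sequence matched from `start`, or 0 if none.

    `checks[i]` is True when position i is a check (a hypothetical encoding).
    """
    if start >= len(checks) or checks[start]:
        return 0                      # first constituent `not check` fails
    length = 1                        # the non-check position itself
    i = start + 1
    while i < len(checks) and checks[i]:
        length += 1                   # greedily absorb each following check
        i += 1
    return length

flags = [False, True, True, True, False, True]
lengths = [longest_not_check_then_checks(flags, s) for s in (0, 1, 4)]
print(lengths)
```

From position 0 the scan absorbs all three following checks; from position 1 it fails at once because that position is itself a check.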
The '+' symbol
The '+' symbol following a constituent means "repeat 1 or more times". Thus,
line --> move from Q --> move from [Kk]+ --> mate
will match any position from which the next move is a move by the White queen; following which is a sequence of one or more positions from which a King moves; following which there is a position that is mate.
{ range } repetition symbol
If n is a literal number, like 2, then {n} denotes repetition of the preceding constituent exactly n times. Likewise, {,n} denotes repetition between 0 and n times and {n,} denotes repetition at least n times. If m is also a literal number, then {m,n} denotes repetition between m and n times inclusive.
For example,
line --> move from Q --> not move from Q {10,20} --> move from Q
will match a sequence of moves that begins with a move of a white queen, then has between 10 and 20 consecutive moves of any piece other than a white queen, and ends with a move of a white queen.
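The same range notation exists for character regular expressions, which makes it easy to experiment with. In the sketch below each move is reduced to one letter ('Q' for a white queen move, 'o' for any other move); this encoding and the function name are assumptions made for illustration only.

```python
import re

def q_gap_matches(moves: str) -> bool:
    """Toy counterpart of: move from Q --> not move from Q {10,20} --> move from Q"""
    return re.fullmatch(r"Q[^Q]{10,20}Q", moves) is not None

gap_results = [
    q_gap_matches("Q" + "o" * 10 + "Q"),   # 10 intervening moves
    q_gap_matches("Q" + "o" * 9 + "Q"),    # too few
    q_gap_matches("Q" + "o" * 21 + "Q"),   # too many
]
print(gap_results)

# The four range forms, each paired with strings it should accept.
forms = {
    "z{2}": ["zz"],
    "z{,2}": ["", "z", "zz"],
    "z{2,}": ["zz", "zzz", "zzzz"],
    "z{2,3}": ["zz", "zzz"],
}
all_ok = all(re.fullmatch(p, s) is not None
             for p, strings in forms.items() for s in strings)
print(all_ok)
```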
Version Notes: the CQL 6.0 syntax allowed a space character instead of the comma as a separator. This syntax is now supported but deprecated.
In CQL 6.0, the regular expression symbol {..} was inefficient for a large number of repetitions. That is no longer the case: range-repetition in CQL 6.1 has efficiency comparable to * and +.
The '?' symbol
The '?' following a constituent means "repeat 0 or 1 times". For example,
line --> move from Q --> move from k? --> mate
means
line --> move from Q --> mate
or
line --> move from Q --> move from k --> mate
That is, either White delivers mate with the Queen, or after White's Queen move, Black delivers mate by a King move.
The '()' wildcard symbol
A sequence of filters separated by arrows inside parentheses matches a sequence of consecutive positions that match the filters respectively. This construct is used exclusively with wildcards.
For example, suppose you want to match a white queen move followed by a black move followed by a sequence of White checks by a pawn followed by black king moves followed by mate. You can use this:
line --> move from Q --> btm --> (move from P --> check)+ --> wtm --> mate
value of line filter
By default the line filter has as value the number of positions in the longest sequence of positions that was matched. Thus, a line filter can be sorted by putting it in a sort filter:
sort {line --> check*} >= 5
or equivalently
sort line 5 1000 --> check*
{+} and {*}
Sometimes it can be confusing to clearly distinguish between '+' and '*' as arithmetic operators and as wildcards. If you want to be absolutely sure these symbols are interpreted as wildcards, enclose them in braces.
Linearization: using move inside line when variations is set
When variations is set in the CQL header, line evaluates its constituent move filters differently from usual. (The rules below seem complicated, but they give the intuitive behavior.)
The problem these rules are designed to address is that the move filter matches a position X based on every possible move that arises from the position. However, as the line filter goes down the game tree, it only selects a single move at a time from X to evaluate. This can mean that the line filter winds up traversing a move that the move filter already rejected.
Suppose for example that a user wants to match positions from which a pawn promotes to a bishop, and from which the game lasts at least 10 moves. A natural way to write this is:
cql (input test.pgn variations)
line 10 1000 --> move promote B --> .*
Here, the user is saying that in the current position, one side promotes to a bishop, and that following that move, there are 0 or more other moves. The range 10 1000 in the line filter says there must be at least 10 positions contained in the matched sequence.
Now consider the following excerpt from a PGN game:
23. e8Q (23. e8B? Rf4+! =) 23. ...Ng1 24. {many moves omitted} ... 44. Rf3#
Call the position before White's 23rd move above X. The user does not want the CQL file to match X, because there is only one move following the bishop promotion. However, the above CQL would match the position without the rule specified below:
- When X is evaluated, the move filter evaluates to true, because there is a matching move (a bishop promotion) from X.
- From X, the line filter then evaluates the position after 23. e8Q. This position certainly matches the next constituent of the line filter, namely .*. The line filter goes on to match the position after 23...Ng1 and so on, all the way until 44. Rf3#.
- Since this represents at least 40 positions, the whole line filter matches X.
To prevent this behavior CQL uses a rule called linearization. Under linearization, before evaluating any filters, the line filter chooses a single "line" in the game tree, from the root to a terminal position. All the move filters are then evaluated as if that "line" gave the only valid moves from a position. The remaining moves are discarded.
For example, in the PGN excerpt above, there are two lines:
- the line 23. e8Q Ng1 ... 44. Rf3# is one line. When the line filter tries to choose this line, the bishop underpromotion disappears from the game. Thus, the move filter in the CQL file, move promote B, no longer matches the position X.
- the second line is 23. e8B Rf4+. When the line filter selects this line, the position X now does match the move filter, since there is a bishop promotion at move 23. However, this line only has a length of three positions (X, and the positions after the next two moves). Since this line is not long enough to fall within the 10 1000 range of the line filter, the line filter does not match X.
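The effect of linearization on the PGN excerpt above can be sketched with a toy game tree. Everything in this block is an assumption made for illustration: the Node class, the helper names, the test that treats a move ending in "B" as a bishop promotion, and in particular the 40 placeholder moves standing in for the moves the excerpt omits. It is not a description of CQL's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A position; `children` holds (move, Node) pairs, primary move first."""
    children: list = field(default_factory=list)

def build_line(moves):
    """Return the (move, Node) pair that starts a variation-free line of moves."""
    node = Node()
    for move in reversed(moves[1:]):
        node = Node([(move, node)])
    return (moves[0], node)

def lines_from(node):
    """Yield every line from `node` to a terminal position as a list of moves."""
    if not node.children:
        yield []
        return
    for move, child in node.children:
        for rest in lines_from(child):
            yield [move] + rest

def is_bishop_promotion(move):
    # In the excerpt's notation the promotion is written "e8B".
    return move.endswith("B")

def matches_with_linearization(x, min_positions):
    """Toy version of: line 10 1000 --> move promote B --> .*  evaluated at x."""
    for line in lines_from(x):
        # Only the chosen line's own first move is visible to the move filter.
        first_move_ok = bool(line) and is_bishop_promotion(line[0])
        positions = len(line) + 1      # x itself plus one position per move
        if first_move_ok and positions >= min_positions:
            return True
    return False

def matches_without_linearization(x, min_positions):
    """The unwanted behaviour: move filter sees every move, length uses any line."""
    any_promotion = any(is_bishop_promotion(m) for m, _ in x.children)
    longest = max(len(line) + 1 for line in lines_from(x))
    return any_promotion and longest >= min_positions

# "..." entries are placeholders for the omitted moves, not real moves.
mainline = ["e8Q", "Ng1"] + ["..."] * 40 + ["Rf3#"]
variation = ["e8B", "Rf4+"]
X = Node([build_line(mainline), build_line(variation)])

with_rule = matches_with_linearization(X, 10)
without_rule = matches_without_linearization(X, 10)
print(with_rule, without_rule)
```

With linearization neither line satisfies both conditions at once, so X is rejected; without it, the promotion from the short variation and the length of the mainline are combined and X is wrongly accepted.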
Disabling linearization
Suppose a line filter L contains a filter F that contains a move filter M. If F is a : filter, another line filter, a move previous filter, a find filter, or an echo filter, then linearization of L does not affect M.
Thus, linearization may be disabled for a particular move filter inside a line filter by preceding the move filter with currentposition:, for example
line --> move from q //linearization applies
     --> currentposition: {move from R and move from B} // no linearization
     --> mate
If the primary parameter is set for a line filter, only primary variations for a given move filter are considered. If the secondary parameter is set for a line filter, only secondary variations for a given move filter are considered.
To turn off linearization completely, use the -nolinearize option to cql (this option is unsupported).
Using wildcards to look for maneuvers
Wildcards are useful in isolating the particular pieces you're interested in during a maneuver. For example, in turton.cql, there is a line filter with these subfilters:
line --> ... --> {not move from Front or Side}* --> move from Side to Criticalsquare ...
In this particular theme, without going into details, we are particularly interested in the movements of the pieces Front and Side. We want to track where they go. So the first wildcard operator ensures that we ignore movements of pieces other than those, and just focus on those pieces.