⊢ (or path)
The ⊢ filter comprises the ⊢ symbol followed by ⊢ parameters followed by one or more ⊢ constituents. The ⊢ symbol may be replaced with the keyword path. The ⊢ symbol is typically called a "turnstile", although that term has no special significance in CQL.
Loosely speaking, the ⊢ filter matches a position if its constituents describe, in order, a sequence of moves starting from the current position.
For example, consider the ⊢ filter below:
⊢ ♗×♟h7 check ♚×h7 ♘――g5 check
This filter matches a Greek gift sacrifice: White sacrifices his bishop for Black's h7 pawn with check; Black recaptures with the ♚; White checks with the ♘ on g5.
This filter has no ⊢ parameters and has five constituents:
- ♗×♟h7
- check
- ♚×h7
- ♘――g5
- check
Note that this notation is extremely close to ordinary chess notation.
The ⊢ parameters control which moves are considered in matching constituents, when a path of moves is considered to match, and what specific value is returned by the ⊢ filter. The list of constituents can include regular expressions allowing for repetition of some constituents. Keep in mind that these bells and whistles are just variations on the ordinary chess move notation illustrated above.
terminology
A path is a sequence of consecutive moves appearing in the game tree, that is, in the PGN file. A matching path is a path that satisfies the constraints of a specific ⊢ filter. The start position of a ⊢ filter is the current position when that filter is evaluated. Note that the same ⊢ filter can be invoked with many different start positions.
A ⊢ filter matches moves, not positions as the earlier line filter did. For example, a ⊢ filter returns the number of moves in its matching path, not the number of positions. The position from which a given move is made is called the head of the move; the position that results from making that move is the tail of the move. Sometimes we identify the tail of the move with the move itself.
⊢ parameters
The ⊢ parameters control which moves the ⊢ filter considers, what value it returns, and so on. By default, ⊢ tries to find the longest path of moves that matches the constituents, and returns the length of that path.
parameter | argument | description |
---|---|---|
primary | | only consider primary moves |
◎ | set filter | only consider moves by specified pieces |
◎ capture | set filter | only consider moves by, or capturing, specified pieces |
max | numeric filter | maximize the value of the given filter at the tail of the last move in the path |
verbose | | annotate each move of the path |
keepallbest | | retain comments associated to each matching path |
title | string literal | when verbose is used, describe the path by that string literal |
piecepath | | when ◎ is used, annotate the matched path with the sequence of squares taken by each focused piece |
quiet | | suppress automatically generated comments |
nestban | | if the start position of the path has already been included in another matched path of this filter with a different start position, skip |
firstmatch | | stop after the first matching path is found |
lastposition | | return the tail of the last move in the path |
Most of these parameters are fairly self-explanatory or are similar to parameters of the same name in the line filter. One exception is the new ◎ (or focus) parameter:
◎ (or focus)
◎ constrains the effect of dash constituents within the scope of ⊢. See here for more information on ◎.
⊢ constituents
The body of a ⊢ filter consists of a sequence of one or more constituents. The sequence of constituents may be terminated by a blank line (which will terminate all enclosing ⊢ filters as well).
A constituent is either:
- a ―― constituent, which has the syntax of an ordinary ―― or × filter but without the target filter; or
- a filter constituent, which is an ordinary filter; or
- a chain constituent, which consists of one or more constituents enclosed in parentheses; or
- a * constituent, comprising a constituent followed by *; or
- a + constituent, comprising a constituent followed by +; or
- a ? constituent, comprising a constituent followed by ?; or
- a brace repetition constituent, comprising a constituent followed by a braced repetition signal like {2,} or {1,4}.
The last 4 constituent types above are called repetition constituents. A ―― or filter constituent is called a simple constituent.
matching a ⊢ filter with simple constituents
If each constituent is either a ―― constituent or a filter constituent, then whether a path of moves matches the ⊢ filter is determined as follows.
Suppose the path X of moves consists of consecutive moves X1, X2, ..., Xn.
Let Y be a sequence of simple constituents Y1, ..., Yk.
If Y is nonempty, let Y' denote the sequence Y with its first element removed: Y2, ..., Yk. Similarly, let X' denote X with its first element removed: X2, ..., Xn.
We say a ―― constituent D matches a move M if D matches M when D is considered as a normal ―― filter. Recall that a ―― constituent is simply a normal ―― filter without any target filter.
Then X matches Y if:
- Y and X are both empty; or
- Y1 is a filter constituent that matches the current position, and X matches the sequence Y'; or
- Y1 is a ―― constituent, and
  - Y1 matches the move X1, and
  - X' matches Y' when the current position is set to the tail of X1.
In other words each ―― constituent is paired one-to-one against each move in order. Each filter constituent is checked against the current position. When a ―― constituent is matched against a move, the current position is updated to the tail of the given move. (Of course, the current position is restored after all the matching is complete).
For example, suppose the current position is the start position of the chess game, and the next moves in the PGN file begin
1. e4 e5 2. ♘f3
Suppose the ⊢ filter is
⊢ ♙――e4 not check ♟――□ ―― ♗→d3
Here, the ⊢ body contains 5 constituents. Three of the constituents are ―― constituents and the other two are filter constituents. The given sequence of constituents matches the moves listed because the dash constituent ♙――e4 matches the move e4. After making this move, the current position is set to the resulting position, and the filter constituent not check matches that position. Therefore, the next constituent, ♟――□, indicating a move of a black pawn to an empty square, is checked. This matches the next move listed, e5.
The last ―― constituent, ――, matches any move, specifically ♘f3. Finally, the final constituent, the filter constituent ♗→d3, matches the position after ♘f3.
Because the matched sequence of moves has three moves in it, the given ⊢ filter would have the numeric value 3 when evaluated in that start position on the given PGN file.
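The matching rule for simple constituents can be sketched as a small recursive function. This is an illustrative Python model, not CQL's implementation: the classes Move, Dash, and Filter, the function matches, and the hand-written position facts ("check", "bishop_attacks_d3") are all hypothetical stand-ins introduced here to mirror the worked example above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Move:
    piece: str               # "P" white pawn, "p" black pawn, "N" white knight
    to: str                  # destination square
    captured: Optional[str]  # captured piece, or None for an empty square
    tail: dict               # hand-written facts about the resulting position


@dataclass(frozen=True)
class Dash:                  # stand-in for a ―― constituent
    test: Callable[[Move], bool]


@dataclass(frozen=True)
class Filter:                # stand-in for a filter constituent
    test: Callable[[dict], bool]


def matches(path, body, position):
    """Return True if the list of moves `path` matches the constituents `body`."""
    if not body:
        return not path                 # both must be exhausted together
    first, rest = body[0], body[1:]
    if isinstance(first, Filter):
        # A filter constituent is checked against the current position
        # and consumes no move.
        return first.test(position) and matches(path, rest, position)
    if not path:
        return False                    # a dash constituent needs a move
    move = path[0]
    # A dash constituent consumes one move; matching continues at its tail.
    return first.test(move) and matches(path[1:], rest, move.tail)


start = {"check": False, "bishop_attacks_d3": False}
after = {"check": False, "bishop_attacks_d3": True}
path = [Move("P", "e4", None, after),
        Move("p", "e5", None, after),
        Move("N", "f3", None, after)]
body = [
    Dash(lambda m: m.piece == "P" and m.to == "e4"),        # ♙――e4
    Filter(lambda p: not p["check"]),                       # not check
    Dash(lambda m: m.piece == "p" and m.captured is None),  # ♟――□
    Dash(lambda m: True),                                   # ――
    Filter(lambda p: p["bishop_attacks_d3"]),               # ♗→d3
]
value = len(path) if matches(path, body, start) else None   # 3
```

Note that the value counts moves, not constituents: five constituents are matched, but only the three dash stand-ins consume a move.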
matching the chain constituent
A simple chain constituent is a chain constituent all of whose elements are either simple constituents or other simple chain constituents. In other words, there are no repetition signals inside the chain constituent. A simple chain constituent ( Y1 ... Yn ) matches a sequence of moves X exactly when the constituents with the outer parentheses removed match the sequence.
Likewise, any sequence of simple chain constituents and simple constituents matches a sequence of moves if and only if the sequence of constituents formed by removing the outer parentheses from the chain constituents also matches the sequence.
For example, let us rewrite the earlier sequence of simple constituents to use a chain constituent:
⊢ ♙――e4 not check (♟――□ ―― ♗→d3)
This ⊢ filter now contains three (not five) constituents: two simple constituents and the final chain constituent, which itself contains the three simple constituents:
♟――□ ―― ♗→d3
This filter matches exactly the same paths as the earlier filter, which did not parenthesize the last three constituents.
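The equivalence between a simple chain and its unparenthesized form amounts to flattening nested groups. The Python sketch below illustrates this under a simplifying assumption: constituents are plain strings and a chain is a nested list. The function name flatten is hypothetical and not part of CQL.

```python
def flatten(constituents):
    """Remove chain parentheses, returning a flat list of simple constituents."""
    flat = []
    for item in constituents:
        if isinstance(item, list):      # a chain constituent: recurse into it
            flat.extend(flatten(item))
        else:                           # a simple constituent: keep as is
            flat.append(item)
    return flat


original = ["♙――e4", "not check", "♟――□", "――", "♗→d3"]
chained = ["♙――e4", "not check", ["♟――□", "――", "♗→d3"]]

same = flatten(chained) == original     # True: both describe the same paths
```

The chained body has three top-level constituents while the original has five, yet both flatten to the same sequence of simple constituents.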
matching repetition constituents
If D is a dash filter, F is a possibly empty sequence of filter constituents, and R is a repetition signal, then D F R denotes that the sequence D F is repeated as many times as R indicates.
For example, the single constituent
♕♛―― check +
is equivalent to either
♕♛―― check
or
♕♛―― check ♕♛―― check
or
♕♛―― check ♕♛―― check ♕♛―― check
and so on.
The repetition constituent
♕♛―― check +
will match a sequence of moves if there is some number, at least 1, of repetitions of ♕♛―― check that matches that sequence.
As we see above, this means that there is a sequence of one or more consecutive queen checks.
If C is a chain constituent and R is a repetition signal (*, +, ?, or e.g. {2}), then C R denotes that the chain is repeated the number of times indicated by the repetition signal, which follows standard regular expression notation.
The chain C must contain a ―― filter if it is repeated.
In general, the body of a ⊢ filter can contain multiple repetition constituents. CQL conceptually matches these in two steps:
- Expand each repetition constituent so that it is transformed into a sequence of simple constituents
- Match the resulting sequence of simple constituents
By default, CQL selects the expansion of repetition constituents so as to maximize the length of the matched path of moves.
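The two conceptual steps, expand and then match, can be sketched in Python as follows. This is only a model under stated assumptions: the game is a single line of moves with no variations, each repeated unit is collapsed into one predicate on a move (standing in for a dash constituent together with its trailing filters), and the names expansions, longest_match, queen_check, and any_move are hypothetical. A repetition signal is written as a (min, max) pair, so * is (0, None), + is (1, None), ? is (0, 1), and {1,4} is (1, 4). CQL's actual search is not described in this section and may work differently.

```python
from itertools import product


def expansions(body, limit):
    """Step 1: yield every expansion of `body` into a flat list of predicates.

    `body` is a list of (predicate, min, max) triples; max=None means unbounded,
    in which case the repeat count is capped at `limit`.
    """
    ranges = []
    for _pred, lo, hi in body:
        top = limit if hi is None else min(hi, limit)
        ranges.append(range(lo, top + 1))
    for counts in product(*ranges):
        flat = []
        for (pred, _lo, _hi), n in zip(body, counts):
            flat.extend([pred] * n)
        yield flat


def longest_match(moves, body):
    """Step 2: match each expansion; return the longest matched length or None."""
    best = None
    for flat in expansions(body, len(moves)):
        fits = len(flat) <= len(moves)
        if fits and all(pred(move) for pred, move in zip(flat, moves)):
            if best is None or len(flat) > best:
                best = len(flat)
    return best


# Each move is (piece letter, whether the resulting position is check).
moves = [("Q", True), ("Q", True), ("Q", True), ("N", False)]


def queen_check(move):
    return move[0] == "Q" and move[1]


def any_move(move):
    return True


plus_only = longest_match(moves, [(queen_check, 1, None)])               # 3
plus_then_optional = longest_match(
    moves, [(queen_check, 1, None), (any_move, 0, 1)])                   # 4
```

Enumerating every expansion is exponential in general; it is used here only because it mirrors the two-step description directly.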
Notes on parsing the ⊢ filter
Sometimes it can be difficult to determine whether certain symbols should be treated specially as constituents or as ordinary filters. This can depend on whether a particular token arises at the ⊢ top level or not.
A symbol or word inside of a ⊢ filter is at the ⊢ top level if it is not contained in braces that are themselves inside that ⊢ filter.
For example, consider this ⊢ filter:
⊢ △―― z=x+3 {z=x+3} {――} (legal ――b4) {legal ――b4}
Here, the first △―― is a normal ―― constituent. It is at the top level.
The next item, z=x+3, is also at the top level, which is illegal here because the + will be interpreted as a repetition signal. You must enclose any arithmetic filters in braces if you want to use them in a ⊢ filter.
The second z=x+3 is a normal filter and is perfectly legal. It is not at the top level.
The next ―― is also not at the top level, so it is treated as a filter, not as a ―― constituent. The entire compound filter {――} is a filter constituent at the top level, but what is inside it is just a normal filter.
The first attempt at a legal filter is illegal because it is at the top level: even though it is enclosed in parentheses, it is not enclosed in braces. The second attempt is not at the top level, so the ――b4 is treated as a normal filter modified by the legal keyword.
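The top-level rule above depends only on brace nesting, which a short Python sketch can make concrete. This is not CQL's parser: the function words_with_top_level is hypothetical, it splits on whitespace, braces, and parentheses only, and it assumes the text passed in is the body that follows the ⊢ symbol.

```python
def words_with_top_level(body):
    """Return (word, is_top_level) pairs for a ⊢ body given as a string."""
    result = []
    depth = 0          # current brace nesting depth
    word = ""
    for ch in body + " ":              # trailing space flushes the last word
        if ch in "{}() \t\n":
            if word:
                result.append((word, depth == 0))
                word = ""
            if ch == "{":
                depth += 1             # only braces change the depth;
            elif ch == "}":
                depth -= 1             # parentheses merely separate words
        else:
            word += ch
    return result


example = "△―― z=x+3 {z=x+3} {――} (legal ――b4) {legal ――b4}"
flags = [top for _word, top in words_with_top_level(example)]
# flags == [True, True, False, False, True, True, False, False]
```

The two words inside the parentheses are still reported as top level, which is why the parenthesized legal ――b4 is rejected while the braced one is accepted.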
To further understand the difference between filters and constituents, consider this ⊢ filter:
⊢ {△―― △――} ――
This will match any white-to-move non-terminal position. The first constituent is a filter constituent, {△―― △――}, which matches any white-to-move position. The first △―― matches a position where white makes a move, and then so does the second. These are treated as filters, not as constituents, because they are not at the top level.
On the other hand, if we remove the braces, then
⊢ △―― △―― ――
will not match any position. After the first △―― matches a move, if it does, the current position is set to the result of making that move, which will be a black-to-move position. The second △―― will thus never match.
In the vast majority of ⊢ filters these issues never arise. But it is useful to have a clear understanding of how the parsing works when they do.
status filters
The state of the ⊢ filter can be queried with path status filters, including pathstatus, pathlastposition, pathcount, and pathcountunfocused.