Textual regular expressions
In CQL, regular expressions are most commonly used in the ⊢ filter to search for sequences of positions. In the non-CQL world, regular expressions are most commonly used to search text for sequences of characters. See regular expressions. We call regular expressions designed to search text textual regular expressions, to distinguish them from the chess kind.
In its simplest form, a textual regular expression is just a sequence of alphanumeric characters and spaces, which matches exactly that text. Each textual regular expression will either match or not match a given string.
Some more examples:
Bor?gov
would match either Borgov or Bogov, but not Borrgov, since a ? denotes zero or one repetition.
Bor*gov
would match Bogov or Borgov or Borrgov or Borrrgov and so on, because * denotes zero or more repetitions.
Likewise,
Bor+gov
would match Borrrrgov but not Bogov, as + requires at least one repetition.
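The three repetition operators above can be compared side by side. The following is only a sketch: it uses Python's re module as a convenient stand-in for the C++ ECMAScript engine that CQL uses (the two agree on these basic operators), and the helper name matches is hypothetical. It tests whether the whole string matches, which mirrors the statements above but may differ from how CQL itself applies a pattern.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


names = ["Bogov", "Borgov", "Borrgov", "Borrrgov"]

# ? allows zero or one r, * allows zero or more, + requires at least one.
for pattern in ["Bor?gov", "Bor*gov", "Bor+gov"]:
    accepted = [name for name in names if matches(pattern, name)]
    print(pattern, "accepts", accepted)
```

Run as written, Bor?gov accepts only Bogov and Borgov, Bor*gov accepts all four names, and Bor+gov accepts every name except Bogov.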
Bo[r-t]gov
would match Borgov or Bosgov or Botgov, since brackets denote a range of characters, just like in piece designators.
Bor{3,8}gov
would match Borrrrrgov, or any such string with between 3 and 8 r's.
Bo.gov
would match Boxgov or Bo_gov or Bo gov. Here, . matches any character (just as, in the rest of CQL, . matches any square or any position).
Borgov[0-9]+
would match Borgov followed by a nonempty sequence of digits, like Borgov0358.
Parentheses denote grouping, e.g.
Bor(go)+v
would match Borgov and Borgogov and Borgogogov and so on.
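The character-class, bounded-repetition, wildcard and grouping examples above can be checked in the same way. Again this is a sketch under assumptions: Python's re module stands in for CQL's C++ ECMAScript engine, the helper matches is hypothetical, and the non-matching candidates (Bougov, Borrgov, Bogov, Borv) are added here purely for contrast.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# Each pattern from the text, paired with strings to try against it.
examples = {
    "Bo[r-t]gov": ["Borgov", "Bosgov", "Botgov", "Bougov"],
    "Bor{3,8}gov": ["Borrgov", "Borrrrrgov"],
    "Bo.gov": ["Boxgov", "Bo_gov", "Bo gov", "Bogov"],
    "Borgov[0-9]+": ["Borgov", "Borgov0358"],
    "Bor(go)+v": ["Borv", "Borgov", "Borgogov", "Borgogogov"],
}

for pattern, candidates in examples.items():
    for text in candidates:
        print(f"{pattern!r} vs {text!r}: {matches(pattern, text)}")
```

In each list, only the contrast strings fail: Bougov is outside the range r to t, Borrgov has too few r's, Bogov has no character for the . to consume, Borgov alone has no digits, and Borv has no repetition of the group go.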
Escaping special characters in regular expressions
Some characters have special meanings within textual regular expressions, namely
{}()[].?+*\^$
To use such a character literally inside a regular expression, precede it with a backslash. For example, to match the three-character string (c), use the regular expression \(c\). To search for a literal period in an event, use \. instead of . by itself.
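The effect of escaping can be seen by trying a pattern with and without the backslashes. This sketch again assumes Python's re module as a stand-in for CQL's regex engine, with a hypothetical helper matches; it does not cover how a backslash must be written inside a CQL string.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# Unescaped, the parentheses only group, so the pattern matches a bare "c".
print(matches(r"(c)", "c"))
print(matches(r"(c)", "(c)"))

# Escaped, the parentheses are literal characters.
print(matches(r"\(c\)", "(c)"))

# An unescaped . matches any character; an escaped \. matches only a period.
print(matches(r"Bo.gov", "Boxgov"))
print(matches(r"Bo\.gov", "Boxgov"))
print(matches(r"Bo\.gov", "Bo.gov"))
```

The six lines print True, False, True, True, False, True in that order.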
Other special characters
There are a host of more obscure regular expression facilities, like \w, \W and many others. The two such facilities most commonly useful in CQL are \d, which matches any digit (and is thus equivalent to [0-9]); and \n, which matches a newline character.
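As a small illustration of \d and \n, the sketch below compares \d with [0-9] and matches a string containing a newline. It assumes Python's re module rather than CQL's own engine. One Python-specific detail: the re.ASCII flag is passed so that \d is limited to the digits 0 through 9, since Python would otherwise also accept other Unicode digits.

```python
import re

# Two spellings of "Borgov followed by one or more digits".
with_d = re.compile(r"Borgov\d+", re.ASCII)
with_class = re.compile(r"Borgov[0-9]+")

for text in ["Borgov0358", "Borgov", "Borgov12a"]:
    print(text,
          with_d.fullmatch(text) is not None,
          with_class.fullmatch(text) is not None)

# \n in a pattern matches a newline character in the searched text.
two_lines = "Borgov\nBogov"
print(re.fullmatch(r"Borgov\nBogov", two_lines) is not None)
```

Both spellings agree on every test string: only Borgov0358 is accepted, and the final line prints True.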
Version of regular expressions used
CQL currently uses the default ECMAScript variant of regular expressions as interpreted in C++. However, we recommend that users generally use only the basic regex syntax discussed here unless they are already familiar with this variant. Moreover, we have observed some lack of uniformity in the implementation of this standard among the different C++ compilers on which CQL is typically compiled.
Comparison of textual regular expressions to regular expressions with ⊢
The textual regular expression operators ., *, {}, +, parentheses, and ? have the same meanings as the corresponding operators in ⊢, except that they match characters rather than positions. Note that the ⊢ counterpart of the regular expression . is now the center dot ∙, for compatibility reasons.
Brackets are used in piece designators in much the same way as they are in textual regular expressions.
The | textual regular expression operator corresponds to the CQL or filter and is not a regular expression operator in ⊢.
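A brief sketch of the | operator in a textual regular expression follows. It assumes Python's re module as a stand-in for CQL's engine, a hypothetical helper matches, and arbitrary example strings (Harmon, Beltik, Belgov) chosen only for illustration.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# | accepts either alternative in full.
for text in ["Borgov", "Harmon", "Beltik"]:
    print(text, matches(r"Borgov|Harmon", text))

# Parentheses limit the alternation to part of the pattern.
print(matches(r"B(or|el)gov", "Belgov"))
print(matches(r"B(or|el)gov", "Bogov"))
```

The first pattern accepts Borgov and Harmon but not Beltik; the grouped pattern accepts Belgov but not Bogov.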