A string is a sequence of characters. A string filter is a filter whose value is a string. A literal string is a string enclosed in quotation marks. For example, "rooks" is a literal string with 5 characters.

As of CQL 6.1, strings are first-class datatypes. They can be assigned to variables, returned as the result of functions, compared, and used as arguments to sort.

Strings can be compared for equality using == and != just like other data types.

If x and y are strings, then x + y is their concatenation:

	"pin" + "mate" == "pinmate"

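Strings can be ordered with <=, <, >= and > according to alphabetical order (as the examples show, uppercase letters sort before lowercase letters, and the empty string sorts before everything else):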

      "The file h1" > "The file H1"
      "" < "a"
      "A" < "a"  

Strings can be stored in variables like integers or sets of squares; they can be the result of another CQL expression; they can be passed to functions:

      X= if (Y>0) "check" else "mate"
      X == "check"

Table of filters manipulating strings

The following filters handle strings specifically:

      Filter              Description                               Example
      ~~                  regular expression matching               player ~~ "Ka.*ov"
      \i                  get a capturing group                     \2 =="4a"
      \-i                 index of a capturing group                \-2 ==4
      #                   length of a string                        #"pin"==3
      +                   concatenate strings                       "x"+"y"=="xy"
      ascii               conversion to and from ASCII              ascii "A"==65
                                                                    ascii 65=="A"
      currenttransform    current transform                         message currenttransform
      PGN field filters   specified PGN field                       player white == "Kasparov"
                                                                    sort event
      originalcomment     the comment in the PGN file               originalcomment ~~ "Eval: (\d+)"
      dictionary          store and retrieve strings                dictionary D["hi"]="bye"
      fen                 get FEN of current position as a string   Y=fen
      in                  substring                                 "et" in "Reti"
      indexof             index of a substring                      indexof ("n" "pin")==2
      int                 convert string to int                     int "23"==23
      lowercase           convert string to lowercase               lowercase "Tal"=="tal"
      makesquare          convert string to square                  a3==makesquare "a3"
      max, min            max or min of its arguments               x=max("a" "b")
                                                                    y=min("a" "b")
      readfile            read a string from a file                 X=readfile "cook.cqo"
      settag              set value of PGN tag                      settag("CustomTag" "Troitzky")
      sort                sort string filters                       sort player white
      sort by string      sort by a string value                    sort date
      str                 convert arguments to string               str("X is: " X)
      tag                 get a PGN tag value                       tag "CustomTag"=="value"
      uppercase           convert string to uppercase               uppercase "Tal"=="TAL"
      writefile           write a string to a file                  writefile("cook.cqo" X)

Predefined strings

There are special predefined strings: \n is the string consisting of the linefeed character; \t is the tab character; \" is the quote character; \r is the carriage return character; \\ is the backslash character. Note that these predefined strings are not specially interpreted inside quoted strings (although they may be specially interpreted when used inside regular expressions with ~~):
      message ("The value of x is: " \n x)
      y = "pin" + \n + "mate"
      #("pin" + \n) == 4
      #"pin\n" == 5
      "pin\n"[3]== \\

These predefined strings are treated the same as quoted strings and are considered to be string literals.
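
For readers who know Python, the distinction can be illustrated with a rough analogy. This is Python, not CQL, and the analogy itself is an assumption of this sketch: a CQL quoted string is compared to a Python raw string, in which a backslash is an ordinary character, while the predefined string \n is compared to a one-character linefeed string. The names LINEFEED, quoted and concatenated are hypothetical.

```python
# Python analogy only; none of these names are part of CQL.
LINEFEED = "\n"      # plays the role of CQL's predefined string \n (one character)
quoted = r"pin\n"    # raw string: like CQL's quoted "pin\n", nothing is interpreted

concatenated = "pin" + LINEFEED

print(len(concatenated))   # 4
print(len(quoted))         # 5
print(quoted[3] == "\\")   # True: index 3 holds a literal backslash
```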

Capturing groups

If i is a literal nonnegative integer, then \i can refer to the i'th capturing group after a ~~ operation; \0 refers to the entire matched sequence of characters. See capturing groups in ~~ for more information. For example:
    \0 == "ello23"
    \1 == "23"

Indexing into strings

Strings are zero-indexed: the first character is at character position 0. Suppose i is a non-negative integer and x is a string.

If i < #x then x[i] is the character (that is, the length-1 string) at index i. If i >= #x then x[i] fails to match the position.

If i is negative, then it is first converted into #x + i and then the above rules are used. Thus, x[-1] is the last character of x (or it fails to match if x has length 0). Similarly, x[-2] is the next-to-last character of x unless x has fewer than 2 characters, in which case it fails to match.

More formally, an expression of the form x[i] for a string x simply matches if x and i each match the current position and i is nonnegative and less than #x.

An expression x[i] matches the current position whenever either x[i] simply matches or x[#x+i] simply matches.
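
The indexing rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not CQL code: the function name cql_index is hypothetical, and returning None stands in for "fails to match the position".

```python
from typing import Optional

def cql_index(x: str, i: int) -> Optional[str]:
    """Model x[i]: return the length-1 string at index i, or None on failure."""
    if i < 0:
        i += len(x)          # a negative index is converted once to #x + i
    if 0 <= i < len(x):
        return x[i]          # the "simply matches" case
    return None              # models "fails to match the position"

print(cql_index("hello", 5))               # None
print(cql_index("hello", -100))            # None
print(cql_index("hello" + "goodbye", 5))   # g
print(cql_index("hello", -1))              # o
```

Note that, unlike Python's own indexing, an out-of-range index in this model is not an error; it simply yields no value.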

An expression of the form x[m : n] matches the position whenever x, m and n match the position.

 "hello"[5] // false; does not match position
 "hello"[-100] // false; does not match
 ("hello" + "goodbye")[5]=="g"
 ("hello" + "goodbye")[#"hello"+3]=="d"
 "filename.cql"[-4:] == ".cql"

If m and n are nonnegative integers and x is a string, then x[m:n] is the string consisting of those characters of x whose indices lie between m and n-1 inclusive.

If m is missing, it is taken to be 0. If n is missing, it is taken to be #x. If m is negative, it is replaced (non-recursively) by #x + m; likewise, a negative n is replaced by #x + n. Thus, x[:5] is the string of the first 5 characters of x:

 "mate"[0:2] == "ma"
 "mate"[1:2] == "a"
 "mate"[1:100]== "ate"
 "mate"[1:1]== ""
 "mate" [1:-1]== "at" 
 "mate" [-2:-1] == "t"
 "mate" [2:1]== ""

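The range rules can be sketched as a small Python model. This is not CQL code and cql_slice is a hypothetical name. One point is an assumption of the sketch rather than something stated above: a bound that is still out of range after the single negative-index conversion is clamped to the nearest valid position, which is consistent with "mate"[1:100] == "ate".

```python
from typing import Optional

def cql_slice(x: str, m: Optional[int] = None, n: Optional[int] = None) -> str:
    """Model x[m:n] following the rules described in the text."""
    length = len(x)
    if m is None:
        m = 0                # a missing m is taken to be 0
    if n is None:
        n = length           # a missing n is taken to be #x
    if m < 0:
        m += length          # converted once, not recursively
    if n < 0:
        n += length
    # Assumption: bounds still out of range are clamped to [0, #x].
    m = max(0, min(m, length))
    n = max(0, min(n, length))
    return x[m:n]            # empty when m >= n

print(cql_slice("mate", 1, -1))          # at
print(cql_slice("mate", 2, 1) == "")     # True
print(cql_slice("filename.cql", -4))     # .cql
```
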
Assignment of strings

Strings can be assigned just like numbers:
  x = "a"
  x = "ab"

Indexed strings (when the string being indexed is a variable) can also be assigned.

  x="a" // x is "a"
  x[0]="b" // x is now "b"
  x[0]="hello" // x is now "hello"
  x[-2]="c" // x is "helco"
  x[5]="z" // expression fails to match; x is still "helco"
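
A minimal Python model of indexed assignment, under stated assumptions: this is not CQL code, the name cql_assign_index is hypothetical, and because Python strings are immutable the function returns a (matched, new_value) pair instead of modifying a variable in place.

```python
from typing import Tuple

def cql_assign_index(x: str, i: int, value: str) -> Tuple[bool, str]:
    """Model x[i]=value: replace the single character at index i with value."""
    if i < 0:
        i += len(x)                  # negative index converted once to #x + i
    if not 0 <= i < len(x):
        return False, x              # fails to match; x is unchanged
    return True, x[:i] + value + x[i + 1:]

print(cql_assign_index("a", 0, "b"))        # (True, 'b')
print(cql_assign_index("b", 0, "hello"))    # (True, 'hello')
print(cql_assign_index("hello", -2, "c"))   # (True, 'helco')
print(cql_assign_index("helco", 5, "z"))    # (False, 'helco')
```

The assigned value need not have length 1, so an indexed assignment can change the length of the string.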

String ranges (when the index expression contains :) can similarly be assigned, and can be used to prepend or append to strings:

  x="a" // x is "a"
  x[0:0]="b" // x is "ba"
  x[2:2]="his" // x is "bahis"
  x[-3:-1]="HEY" // x is "baHEYs"
  x[2:4]="Z" // x is "baZYs"
  x[:2]="VV" // x is "VVZYs"
  x[2:]="" // x is "VV"
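
Range assignment can be modeled the same way. The sketch below is Python, not CQL; cql_assign_range is a hypothetical name, and two behaviors are assumptions rather than documented facts: out-of-range bounds are clamped, and a range whose end precedes its start is treated as an empty range at the start position.

```python
from typing import Optional

def cql_assign_range(x: str, m: Optional[int], n: Optional[int], value: str) -> str:
    """Model x[m:n]=value: replace the characters at indices m..n-1 with value."""
    length = len(x)
    m = 0 if m is None else m
    n = length if n is None else n
    if m < 0:
        m += length                  # converted once, not recursively
    if n < 0:
        n += length
    m = max(0, min(m, length))       # assumption: clamp to [0, #x]
    n = max(m, min(n, length))       # assumption: n < m acts as an empty range
    return x[:m] + value + x[n:]

print(cql_assign_range("a", 0, 0, "b"))           # ba    (prepend)
print(cql_assign_range("pin", 3, 3, "mate"))      # pinmate (append)
print(cql_assign_range("baHEYs", 2, 4, "Z"))      # baZYs
print(cql_assign_range("VVZYs", 2, None, ""))     # VV
```

An empty range such as [0:0] or [#x:#x] replaces nothing, which is why it prepends or appends.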

Performance notes when dealing with long strings

CQL is not particularly efficient when dealing with long strings, and does not generally support strings of more than a billion characters at all. CQL 6.1 sometimes makes unnecessary copies of string subexpressions, which can hurt performance when dealing with long strings. To avoid extra copies, use += rather than + for appending to a variable, and in general try to keep strings in variables.

CQL can manipulate strings longer than a billion characters so long as the length of the string is never evaluated and the string is never indexed into; this technique, however, is not supported. For example, a multigigabyte file can be read using readfile and then each line parsed using:

    BigFile=readfile "bigfile.pgn"
    while (BigFile~~".*"){
      Line=\0 ...}

The warnings in this section apply only to long strings either generated in a CQL loop or read with readfile. For the kinds of strings typically found in PGN files (comments, tag values and so on), the issues discussed in this section do not arise.


Most of the string features, the scheme for integrating them into CQL 6.1 while maintaining back-compatibility with CQL 6.0, and considerable implementation assistance, are due to Robert Gamble.