CVC4 Native Input Language

The native input language consists of a sequence of symbol declarations and commands, each followed by a semicolon (;).

Any text after the first occurrence of a percent character and to the end of the current line is a comment:

%%% This is a native language comment

How to Use This Document

The example code given in this document can be run on CVC4 (found here). To look at models of the systems constrained by the example code, CVC4’s produce-models option must be turned on, on start-up. This can be done with the OPTION command.

Once all assertions are made, check the satisfiability using the CHECKSAT command. If the result is sat, to check a model, use the COUNTERMODEL command.

The general framework of commands to run the example commands, are thus, as follows:

OPTION "produce-models";
%type example code here
CHECKSAT;
%if result is sat, then
COUNTERMODEL;

Most examples follow this framework, with some slight variations that should be obvious to use. For more information on these commands, see Commands.

Type System

CVC4’s type system includes a set of built-in types which can be expanded with additional user-defined types. The type system consists of first-order types, subtypes of first-order types, and higher-order types, all of which are interpreted as sets. For convenience, we will sometimes identify below the interpretation of a type with the type itself.

First-order types consist of basic types and structured types. The basic types are $\texttt{BOOLEAN}$, $\texttt{REAL}$, $\texttt{BITVECTOR}(n)$ for all $n \gt 0$, $\texttt{STRING}$ as well as user-defined basic types (also called uninterpreted types). The structured types are array, tuple, record types, ML-style user-defined (inductive) datatypes, finite set, and relation types.

Note: Currently, subtypes consist only of the built-in subtype $\texttt{INT}$ of $\texttt{REAL}$. Support for CVC3-style user-defined subtypes may be added in a later release.

Function types are the only higher-order types. More precisely, they are just second-order types since function symbols in CVC4, both built-in and user-defined, can take as argument or return only values of a first-order type.

Basic Types

CVC4’s basic types are divided into the following types.

The BOOLEAN Type

The BOOLEAN type is interpreted as the two-element set of Boolean values TRUE; FALSE. The operations allowed on Boolean values are described in Logical Symbols.

Note: CVC4’s treatment of this type differs from CVC3’s where BOOLEAN is used only as the type of formulas, but not as value type. CVC3 follows the two-tiered structure of classical first-order logic which distinguishes between formulas and terms, and allows terms to occur in formulas but not vice versa (with the exception of the IF-THEN-ELSE construct). CVC4 drops the distinction between terms and formulas and defines the latter just as terms of type BOOLEAN. As such, formulas can occur as subterms of possibly non-Boolean terms.

Example:

% Boolean constants
a, b: BOOLEAN;

ASSERT a AND b => TRUE;
ASSERT (NOT b) <=> a OR b;

The REAL Type

The REAL type is interpreted as the set of real numbers.

Note that these are the (infinite precision) mathematical reals, not the floating point numbers. Support for floating point types is planned for future versions.

Arithmetic operators, defined in Arithmetic, are allowed on the Real type.

Example:

% Boolean constants
a, b: BOOLEAN;

% REAL constants
x,y: REAL;

% Boolean-valued function, can be used as a predicate symbol
p: REAL -> BOOLEAN;
ASSERT  a => p(x - y);

% Real-valued function with Boolean argument
f: BOOLEAN -> REAL;

% The argument of f can be a formula ...
ASSERT f(a OR b) > 0.3  OR  p(x);

% ... even a quantified one
ASSERT y = f(FORALL (r: REAL): r < r + 1);

The INT Type

The INT type is interpreted as the set of integer numbers and is considered a subtype of REAL. The latter means, in particular, that it is possible to mix integer and real terms in expressions without the need of an explicit upcasting operator. It also means that the arithmetic and relational operations allowed on REAL values are allowed on INT values, as well.

Note that these are the (infinite precision) mathematical integers, not the finite precision machine integers used in most programming languages. The latter are modeled by bit vector types.

Example:

x, y : INT;
QUERY ((2 * x + 4 * y <= 1) AND ( y >= x)) => (x <= 0);
z : REAL;
QUERY (2 * x + z <= 3.5) AND (z >= 1);

The STRING Type

The STRING type is interpreted as the set of strings. Literals of type STRING are represented as characters enclosed in double quotes ("). The alphabet consists of the printable US ASCII characters from the 8-bit Extended ASCII character set. Printable characters are the characters between 0x20 and 0x7e. Non-printable characters can be represented by the following escape sequences.

\a, \b, \e, \f, \n, \r, \t, \v represent the corresponding ASCII characters in C++ convention. \ooo represents a single ASCII character where ooo consists of exactly three digits in the octal encoding of the character (from 0 to 377). Going beyond value 377 might give unexpected results.

\xNN represents a single ASCII character, where NN consists of exactly two digits in the hexadecimal encoding of the character.

The operators allowed on strings are described in Strings.

Example:

s1, s2 : STRING;
ASSERT s1 = "abc";
CHECKSAT s2 /= s1; %s2 = ""

The BITVECTOR Types

For every positive integer $n$, the type BITVECTOR(n) is interpreted as the set of all bit vectors of size $n$. A bitvector literal of type BITVECTOR(n) is represented as 0binX where X represents $n$ binary bits of the bitvector. For example, 0bin0010 represents 2 in type BITVECTOR(n) where $n = 4$.

The operations allowed on bitvectors are defined in Bit Vectors.

Example:

bv1, bv2 : BITVECTOR(4);
ASSERT bv1 = 0bin0101;
CHECKSAT bv2 = bv1;

User-defined Basic Types

Users can define new basic types (often referred to as uninterpreted types in the SMT literature). Each such type is interpreted as a set of unspecified cardinality but disjoint from any other type. A user-defined type is defined with the TYPE annotation. Once defined, it can be used to annotate variables, just like any other type.

User-defined basic types are created by declarations like the following:

% User declarations of basic types:

MyBrandNewType: TYPE;

Apples, Oranges: TYPE;

A : TYPE;

%declaring variable of type A
x : A;

%y is a type synonym for A
y : TYPE = A;

Structured Types

CVC4’s structured types are divided into the following families.

Array Types

Array types are created by the mixfix type constructors ARRAY _ OF _ whose arguments can be instantiated by any value type. Array elements can be accessed using the arrays name followed by square brackets ([ ]) containing the index to be accessed.

Note that the only constraints on the indices of the array are its type. For example, arrays with integer indices can have negative indices. This isn’t true in most programming languages.

Example:

I : TYPE;

%% Array types:

% Array with indices from I and values from REAL
Array1: ARRAY I OF REAL;

% Array with integer indices and array values
Array2: ARRAY INT OF (ARRAY INT OF REAL);

% Array with integer pair indices and integer values
IntMatrix: ARRAY [INT, INT] OF INT;

% Array with integer indices and integer values
ar : ARRAY INT OF INT;

ASSERT ar[1] = 17;

An array type of the form ARRAY T1 OF T2 is interpreted as the set of all total maps from T1 to T2. The main difference with the function type T1 -> T2 is that arrays, contrary to functions, are first-class objects of the language, that is, values of an array type can be arguments or results of functions.

Tuple Types

Tuple types are created by the mixfix type constructors [_], [_, _], [_, _, _], and so on, whose arguments can be instantiated by any value type.

Example:

IntArray: TYPE = ARRAY INT OF INT;

% Tuple type declarations

EmptyTupleType: TYPE = []; % this is a unit type

RealPair: TYPE = [REAL, REAL];

MyTuple: TYPE = [ REAL, IntArray, [INT, INT] ];

A tuple type of the form [T1, ..., Tn] is interpreted as the Cartesian product $T_1 \times ... \times T_n$. Note that while the types $(T_1, \ldots, T_n) \to T$ and $(T_1 \times \ldots \times T_n) \to T$ are semantically equivalent, they are operationally different in CVC4. The first is the type of functions that take $n$ arguments of respective type $T_1, \ldots, T_n$, while the second is the type of functions that take one argument of an n-tuple type.

Record Types

Similar to, but more general than tuple types, record types are created by type constructors of the form [# l1: _, …, ln: _ #] where $n \geq 0$; l1, …, ln are field labels, and the arguments can be instantiated with any first-order types.

A literal of type [# l1: t1, …, ln: tn #], where t1, …, tn are types, is represented as (# l1 := v1, …, ln := vn #), where v1, …, vn are values of the respective types.

The order of the fields in a record type is meaningful: permuting the field names gives a different type.

Example:

MyType: TYPE;

% Record declaration

RecordType: TYPE = [# id: REAL, age: INT, info: MyType #];

r1 : [# label1 : INT, label2 : REAL, label3 : STRING #];

r2 : [# label1 : INT, label2 : REAL, label3 : STRING #];

CHECKSAT r1 = r2;

Note that record types are non-recursive. For instance, it is not possible to declare a record type called Person containing a field of type Person. Recursive types are provided in CVC4 by the more general inductive data types. (As a matter of fact, both record and tuple types are implemented internally as inductive data types).

Inductive Data Types

Inductive data types in CVC4 are similar to inductive data types of functional programming languages. They can be parametric or not.

Non-Parametric Data Types

Non-parametric data types are created by declarations of the form

\[\begin{array}{l} \mathrm{DATATYPE} \\ \begin{array}{ccc} \ \ A_1 & = & C_{1,1} \mid C_{1,2} \mid \cdots \mid C_{1,m_1}, \\ \ \ A_2 & = & C_{2,1} \mid C_{2,2} \mid \cdots \mid C_{2,m_2}, \\ \ \ \vdots & = & \vdots \\ \ \ A_n & = & C_{n,1} \mid C_{n,2} \mid \cdots \mid C_{n,m_n} \\ \end{array} \\ \mathrm{END}; \end{array}\]

where each $A_i$ is a type name and each $C_{i, j}$ is either a constant symbol or an expression of the form

\[cons(sel_1 : T_1, \ldots, sel_k : T_k)\]

where $T_1, \ldots, T_k$ are any first-order types, including any $A_i$. Such declarations define the data types $A_i, \ldots, A_n$. For each data type $A_i$, they introduce:

constructors symbols $cons$ of type $(T_1, \ldots, T_k) \rightarrow A_i$,
selector symbols $sel_j$ of type $A_i \rightarrow T_j$, and
tester symbols $is_{cons}$ of type $A_i \rightarrow \texttt{BOOLEAN}$.

Note that permitting more than one data type to be defined in the same declarations allows the definition of mutually recursive types.

Example:

% simple enumeration type

% implicitly defined are the testers: is_red, is_yellow and is_blue
% (similarly for the other data types)
DATATYPE
  PrimaryColor = red | yellow | blue
END;

% infinite set of pairwise distinct values ..., v(-1), v(0), v(1), ...
DATATYPE
  Id = v (id: INT)
END;

% ML-style integer lists
DATATYPE
  IntList = nil | ins (head: INT, tail: IntList)
END;

% AST for lamba calculus
DATATYPE
  Term = var (index: INT)
       | apply (arg_1: Term, arg_2: Term)
       | lambda (arg: INT, body: Term)
END;

% Trees
DATATYPE
  Tree = tree (value: REAL, children: TreeList),
  TreeList = nil_tl
           | ins_tl (first_t1: Tree, rest_t1: TreeList)
END;

% A mutually-recursive specification
DATATYPE
  nat = succ(pred: nat) | zero,
  list = cons(car: tree, cdr: list) | null,
  tree = node(children: list) | leaf(data:nat)
END;
one, two, three: nat;
l1, l2, l3: list;
t1, t2, t3 : tree;

Constructor, selector and tester symbols defined for a data type have global scope. So, for example, it is not possible for two different data types to use the same name for a constructor.

An inductive data type is interpreted as a term algebra constructed by the constructor symbols over some sets of generators. For example, the type IntList defined above is interpreted as the set of all terms constructed with nil and ins over the integers.

Parametric Data Types

Parametric data types are infinite families of (non-parametric) data types with each family parametrized by one or more type variables. They are created by declarations of the form:

\[\begin{array}{l} \mathrm{DATATYPE} \\ \begin{array}{ccc} \ \ A_1[X_{1,1}, \ldots, X_{1,p_1}] & = & C_{1,1} \mid C_{1,2} \mid \cdots \mid C_{1,m_1}, \\ \ \ A_2[X_{2,1}, \ldots, X_{2,p_2}] & = & C_{2,1} \mid C_{2,2} \mid \cdots \mid C_{2,m_2}, \\ \ \ \vdots & = & \vdots \\ \ \ A_n[X_{n,1}, \ldots, X_{n,p_n}] & = & C_{n,1} \mid C_{n,2} \mid \cdots \mid C_{n,m_n} \\ \end{array} \\ \mathrm{END}; \end{array}\]

where each $A_i$ is a type parametrized by the type variables $X_{i,1}, \ldots X_{i, p_i}$ and each $C_{i, j}$ is either a constant symbol or an expression of the form:

\[\mathit{cons}(\ \mathit{sel}_1: T_1,\ \ldots,\ \mathit{sel}_k: T_k\ )\]

where $T_1, \ldots, T_k$ are any first-order types, possibly parametrized by $X_1, \ldots X_p$, including any $A_i$.

Example:

% Parametric pairs
DATATYPE
  Pair[X, Y] = pair (first: X, second: Y)
END;

% Parametric lists

DATATYPE
  List[X] = nil | cons (head: X, tail: List[X])
END;


% Parametric trees using the list type above

DATATYPE
  Tree[X] = node (value: X, children: List[Tree[X]]),
END;

% Parametric mutually recursive datatype
DATATYPE
  tree[X] = node(children: list) | leaf(data: X),
  list = cons(car: tree, cdr: list) | null
END;

The declarations above define infinitely many types of the form Pair[S,T], List[T] and Tree[T] where S and T are first-order types. Note that the identifier List above, for example, by itself does not denote a type. In contrast, the terms List[Real], List[List[Real]], List[Tree[INT]], and so on do.

Restriction to Inductive Types

By adopting a term algebra semantics, CVC4 allows only inductive data types, that is, data types whose values are essentially (labeled, ordered) finite trees. Infinite structures such as streams or even finite but cyclic ones such as circular lists are then excluded. For instance, none of the following declarations define inductive data types, and are rejected by CVC4:

DATATYPE
 IntStream = s (first:INT, rest: IntStream)
END;

DATATYPE
 RationalTree = node1 (child: RationalTree)
              | node2 (left_child: RationalTree, right_child:RationalTree)
END;

DATATYPE
  T1 =  c1 (s1: T2),
  T2 =  c2 (s2: T1)
END;

Concretely, a declaration of $n \geq 1$ datatypes $T_1, \ldots, T_n$ will be rejected if for any one of the types $T_1, \ldots, T_n$, it is impossible to build a finite term of that type using only the constructors of $T_1, \ldots, T_n$, and free constants of types other than $T_1, \ldots, T_n$.

Inductive data types are the only types where the user also chooses names for the built-in operations to:

construct a value of the type (with the constructors),
extract components from a value (with the selectors), or
check if a value was constructed with a certain constructor or not (with the testers). For all the other types, CVC4 provides predefined names for the built-in operations on the type.

Set Types

Finite set types are created by the type constructors SET OF _ whose arguments can be instantiated by any value type. The empty set of type X is represented as {} :: SET OF X; the singleton set containing x of type X is represented as {x} : SET OF X; and, the finite set consisting of elements x1, …, xn of type X is represented as {x1, …, xn} : SET OF X.

Two sets are of comparable types if they are of same type, or one’s type is a sub-type of the other’s. The operations allowed on set types are described in Sets.

Example:

set1, set2 : SET OF BOOLEAN;
ASSERT set1 = {TRUE};
ASSERT set2 /= {} :: SET OF BOOLEAN;
CHECKSAT; %set2 = {FALSE}

Relation Types

Relation types are created by the mixfix type constructors SET OF [T1, ..., Tn] where $n \gt 0$, and $T_1, \ldots, T_n$ are CVC4 types. Literals of the relation type are represented as {(v11, ..., vn1), ..., (v1m, ..., vnm)} where all tuples have the same size $n_1 = \ldots = n_m$ and $m \geq 0$.

The operations allowed on relation types are defined in Relations.

Example:

r1, r2 : SET OF [INT, BOOLEAN];
ASSERT r1 = {(1, TRUE), (0, FALSE)};
ASSERT r2 /= r1;
CHECKSAT; %r2 = {} :: SET OF [INT, BOOLEAN]

Function Types

Function ($\to$) types are created by the mixfix type constructors _ -> _, (_, _) -> _, (_, _, _) -> _, and so on. Their arguments can be instantiated by any first-order type.

% Function type declarations

UnaryFunType: TYPE = INT -> REAL;

BinaryFunType: TYPE = (REAL, REAL) -> ARRAY REAL OF REAL;

TernaryFunType: TYPE = (REAL, BITVECTOR(4), INT) -> BOOLEAN;

%Variables of function type

x, y : UnaryFunType;

z : INT -> REAL;

A function type of the form $(T_1, \ldots, T_n) \to T$ with $n \gt 0$ is interpreted as the set of all total functions from the Cartesian product $T_1 \times \ldots \times T_n$ to $T$.

The example above also shows how to introduce type names. A name like UnaryFunType above is just an abbreviation for the type $\texttt{INT} \times \texttt{REAL}$ and can be used interchangeably with it.

In general, any type defined by a type expression E can be given a name with the declaration

name : TYPE = E;

Type Checking

In CVC4, formulas and terms are statically typed at the level of types (as opposed to subtypes) according to the usual rules of first-order many-sorted logic, with the main difference that formulas are just terms of type BOOLEAN:

each variable has one associated first-order type,
each constant symbol has one or more associated first-order types,
each function symbol has one or more associated function types,
the type of a term consisting just of a variable is the type associated to that variable,
the type of a term consisting just of a constant symbol is the type associated to that constant symbol,
the term obtained by applying a function f to the terms $t_1, \ldots, t_n$ is $T$ if $f$ has type $(T_1, \ldots, T_n) \times T$ and each $t_i$ has type $T_i$.

Attempting to enter an ill-typed term will result in an error.

Another significant difference with standard many-sorted logic is that some built-in symbols are parametrically polymorphic. For instance, the function symbol for extracting the element of any array has type $(\texttt{ARRAY}\ T_1\ \texttt{OF}\ T_2, T_1) \to T_2$ for all first-order types $T_1, T_2$.

Type Ascription

By the type inference rules above some terms might have more than one type. This can happen with terms built with polymorphic data type constructors that have more than one return type for the same input type. In that case, a type ascription operator (::) must be applied to the constructor to specify the intended return type.

DATATYPE
  List[X] = nil | cons (head: X, tail: List[X])
END;

ASSERT y = cons(1, nil::List[REAL]);

DATATYPE
  Union[X, Y] = left(val_l: X) | right(val_r: Y)
END;

ASSERT y = left::Union[BOOLEAN, REAL](TRUE);

The constant symbol nil declared above has infinitely many types (List[REAL], List[BOOLEAN], List[[Real,Real]], List[List[REAL]], …). CVC4’s type checker requires the user to indicate explicitly the type of each occurrence of nil in a term. Similarly, the injection operator left has infinitely many return types for the same input type, for instance: $\texttt{BOOLEAN} \to \text{Union}[\texttt{BOOLEAN}, \texttt{REAL}]$, $\texttt{BOOLEAN} \to \text{Union}[\texttt{BOOLEAN}, [\texttt{REAL}, \texttt{REAL}]]$, $\texttt{BOOLEAN} \to \text{Union}[\texttt{BOOLEAN}, \texttt{List}[\texttt{REAL}]]$, and so on. Applications of left need to specify the intended returned typed, as shown above.

Terms and Formulas

In addition to type expressions, CVC4 has expressions for terms and for formulas (i.e., terms of type BOOLEAN). By and large, these are standard first-order terms built out of typed variables, predefined theory-specific operators, free (i.e., user-defined) function symbols, and quantifiers. Extensions include an if-then-else operator, lambda abstractions, and local symbol declarations, as illustrated below. Note that these extensions still keep CVC4’s language first-order. In particular, lambda abstractions are restricted to take and return only terms of a first-order type. Similarly, variables can only be of a first-order type.

A number of built-in function symbols (for instance, the arithmetic ones) are used as infix operators. All user-defined symbols are used as prefix ones.

User-defined, i.e., free, function symbols include constant symbols and predicate symbols, respectively nullary function symbols and nullary function symbols with a BOOLEAN return type. These symbols are introduced with global declarations of the form $f_1, \ldots, f_m : T$, where $m \gt 0$, $f_i$ are the names of the symbols and $T$ is their type.

Example:

% integer constants

a, b, c: INT;

% real constants

x, y, z: REAL;

% unary function

f1: REAL -> REAL;

% binary function

f2: (REAL, INT) -> REAL;

% unary function with a tuple argument

f3: [INT, REAL] -> BOOLEAN;

% binary predicate

p: (INT, REAL) -> BOOLEAN;

% Propositional "variables"

P, Q: BOOLEAN;

Like type declarations, function symbol declarations like the above have global scope and must be unique. In other words, it is not possible to declare a function symbol globally more than once in the same lexical scope. This entails among other things that globally-defined free symbols cannot be overloaded with different types and that theory symbols cannot be redeclared globally as free symbols.

Global Symbol Definitions

As with types, a function symbol can be defined as the name of another term of the corresponding type. With constant symbols, this done with is a declaration the form f of:T = t; where f is a function symbol, T is a type, and t is the expression representing the term.

Example:

c: INT;

i: INT = 5 + 3*c;  % i is effectively a shorthand for 5 + 3*c

j: REAL = 3/4;

t: [REAL, INT] = (2/3, -4);

r: [# key: INT, value: REAL #] = (# key := 4, value := (c + 1)/2 #);

f: BOOLEAN = FORALL (x:INT): x <= 0 OR x > c ;

A restriction on constants of type BOOLEAN is that their value can only be a closed formula, that is, a formula with no free variables.

A term and its name can be used interchangeably in later expressions. Named terms are often useful for shared subterms (terms used several times in different places) since their use can make the input exponentially more concise. Named terms are processed very efficiently by CVC4. It is much more efficient to associate a complex term with a name directly rather than to declare a constant and later assert that it is equal to the same term. This point is explained in more detail later in section Commands.

More generally, in CVC4 one can associate a term to function symbols of any arity. For non-constant and non-proposition function symbols this is done with a declaration of the form:

\[f:(T_1, \ldots, T_n) \to T = \mathrm{LAMBDA}(x_1:T_1, \ldots, x:T_n): t\;;\]

where $t$ is any term of type $T$ with free variables in $\{x_1, \ldots, x_n\}$. The lambda binder has the usual semantics and conforms to the usual lexical scoping rules: within the term $t$ the declaration of the symbols $x_1, \ldots, x_n$ as local variables of respective type $T_1, \ldots, T_n$ hides any previous declarations of those symbols that are in scope.

As a general shorthand, when $k$ consecutive types $T_i, \ldots, T_{i+k-1}$ in the lambda expression $\mathrm{LAMBDA}(x_1:T_1, \ldots, x:T_n): t$ are identical, the syntax $\mathrm{LAMBDA}(x_1:T_1, \ldots, x_i,\ldots, x_{i+k-1}:T_i,\ldots, x:T_n): t$ can also be used.

Example:

% Global declaration of x as a unary function symbol

x: REAL -> REAL;

% Local declarations of x as variable (hiding the global one)

f: REAL -> REAL = LAMBDA (x: REAL): 2*x + 3;

p: (INT, INT) -> BOOLEAN = LAMBDA (x,i: INT): i*x - 1 > 0;

g: (REAL, INT) -> [REAL, INT] = LAMBDA (x: REAL, i:INT): (x + 1, i - 3);

Note that lambda definitions are not recursive: the symbol being defined cannot occur in the body of the lambda term. They should be understood as macros. For instance, any occurrence of the term f(t) where f is as defined above will be treated as if it was the term (2*t + 3).

##Local Symbol Definitions Constant and function symbols can also be declared locally anywhere within a term by means of a let binder. This is done with a declaration of the form:

\[\mathrm{LET}\ f = s \ \mathrm{IN}\ t ;\]

where $t$ is a term with no free variables, possibly a lambda term. Let binders can be nested arbitrarily and follow the usual lexical scoping rules. The following general form:

\[\mathrm{LET}\ f_1 = t_1, f_2 = t_2, \ldots, f_n = t_m \ \mathrm{IN}\ t ;\]

can be used as a shorthand for

\[\mathrm{LET}\ f_1 = t_1\ \mathrm{IN}\ \mathrm{LET}\ f_2 = t_2\ \mathrm{IN}\ \ldots \ \mathrm{LET}\ f_n = t_m \ \mathrm{IN}\ t ;\]

Example:

t: REAL =
  LET x1 = 42,
      g = LAMBDA(x:INT): x + 1,
      x2 = 2*x1 + 7/2
  IN
     (LET x3 = g(x1) IN x3 + x2) / x1;

Note that the same symbol = is used, unambiguously, in the syntax of global declarations, let declarations, and as a predicate symbol.

Note: A LET term with a multiple symbols defines them sequentially. A parallel version of the LET construct will be introduced in a later version.

Built-in theories and their symbols

In addition to user-defined symbols, CVC4 terms can use a number of predefined symbols: the logical symbols, such as AND, OR, etc., as well as theory symbols, function symbols belonging to one of the built-in theories. They are described next, with the theory symbols grouped by theory.

Logical symbols

The logical symbols in CVC4’s language include the equality and disequality predicate symbols, respectively written as = and /=, the multiarity disequality symbol DISTINCT, together with the logical constants TRUE, FALSE, the connectives NOT, AND, OR, XOR, $\Rightarrow$, $\Leftrightarrow$, and the first-order quantifiers EXISTS and FORALL, all with the standard many-sorted logic semantics.

The binary connectives have infix syntax and type $(\texttt{BOOLEAN}, \texttt{BOOLEAN}) \to \texttt{BOOLEAN}$. The symbols = and /=, which are also infix, are instead parametrically polymorphic, having type $(T,T) \to \texttt{BOOLEAN}$ for every first-order type $T$. They are interpreted respectively as the identity relation and its complement.

The DISTINCT symbol is both overloaded and polymorphic. It has type $(T, \ldots ,T) \to \texttt{BOOLEAN}$ for every sequence $(T, \ldots ,T)$ of length $n \gt 0$ and first-order type $T$. For each $n \gt 0$, it is interpreted as the relation that holds exactly for tuples of pairwise distinct elements.

The syntax for quantifiers is similar to that of the lambda binder.

Here is an example of a formula built just of these logical symbols and variables:

A, B: TYPE;

q: BOOLEAN = FORALL (x,y: A, i,j,k: B):
               i = j AND i /= k => EXISTS (z: A): x /= z OR z /= y;

Binding and scoping of quantified variables follows the same rules as in let expressions. In particular, a quantifier will shadow in its scope any constant and function symbols with the same name as one of the variables it quantifies:

A: TYPE;
i, j: INT;

% The first occurrence of i and of j in f are constant symbols,
% the others are variables.

f: BOOLEAN =  i = j AND FORALL (i,j: A): i = j OR i /= j;

Optionally, it is also possible to specify instantiation patterns for quantified variables. The general syntax for a quantified formula $\psi$ with patterns is

\[Q\;(x_1:T_1, \ldots, x_k:T_k):\; p_1: \ldots\; p_n:\; \varphi\]

where $n \geq 0$, Q is either FORALL or EXISTS, $\varphi$ is a term of type BOOLEAN, and each of the $p_i$’s, a pattern for the quantifier $Q\;(x_1:T_1, \ldots, x_k:T_k)$, has the form

\[\mathrm{PATTERN}\; (t_1, \ldots, t_m)\]

where $m \gt 0$ and $t_1, \ldots, t_m$ are arbitrary binder-free terms (no lets, no quantifiers). Those terms can contain (free) variables, typically, but not exclusively, drawn from $x_1, \ldots, x_k$. (Additional variables can occur if $\psi$ occurs in a bigger formula binding those variables.)

Example:

A: TYPE;
b, c: A;
p, q: A -> BOOLEAN;
r: (A, A) -> BOOLEAN;

ASSERT FORALL (x0, x1, x2: A):
         PATTERN (r(x0, x1), r(x1, x2)):
         (r(x0, x1) AND r(x1, x2)) => r(x0, x2) ;

ASSERT FORALL (x: A):
         PATTERN (r(x, b)):
         PATTERN (r(x, c)):
         p(x) => q(x) ;

ASSERT EXISTS (y: A):
         FORALL (x: A):
           PATTERN (r(x, y), p(y)):
           r(x, y) => q(x) ;

Patterns have no logical meaning: adding them to a formula does not change its semantics. Their purpose is purely operational, as explained in the Instantiation Patterns section.

In addition to these constructs, CVC4 also has a general mixfix conditional operator of the form

\[\mathrm{IF}\ b\ \mathrm{THEN}\ t\ \mathrm{ELSIF}\ b_1\ \mathrm{THEN}\ t_1\ \ldots\ \mathrm{ELSIF}\ b_n\ \mathrm{THEN}\ t_n\ \mathrm{ELSE}\ t_{n+1}\ \mathrm{ENDIF}\]

with $n \geq 0$ where $b, b_1, \ldots, b_n$ are terms of type BOOLEAN and $t, t_1, \ldots, t_n, t_{n + 1}$ are terms of the same first-order type $T$:

% Conditional term
x, y, z, w: REAL;

t: REAL =
  IF x > 0 THEN y
  ELSIF x >= 1 THEN z
  ELSIF x > 2 THEN w
  ELSE 2/3 ENDIF;

Arithmetic

The real arithmetic theory has two types: REAL and INTEGER with the latter a subtype of the first. Its built-in symbols for the usual arithmetic constants and operators over the type REAL, each with the expected type: all numerals 0, 1, …, as well as - (both unary and binary), +, *, /, <, >, <=, >=. Application of the binary symbols are in infix form. Note that + is only binary, and so an expression such as +4 is ill-formed.

Rational values can be expressed in decimal or fractional format, e.g., 0.1, 23.243241, 1/2, 3/4, and so on. A leading 0 is mandatory for decimal numbers smaller than one (e.g., the syntax .3 cannot be used as a shorthand for 0.3). However, a trailing 0 is not required for decimals that are whole numbers (e.g., 3. is allowed as a shorthand for 3.0). The size of the numerals used in the representation of natural and rational numbers is unbounded; more accurately, bounded only by the amount of available memory.

Strings

The following are the operations allowed on string values (with corresponding examples):

CONCAT(X1, …, Xn) concatenates strings X1, …, Xn and returns the concatenated string.
LENGTH(X) returns the numeric length of string X.
CONTAINS(X, Y) returns TRUE if string X contains string Y, and false otherwise.
INDEXOF(X, Y, n) returns an integer representing the index of string X, after index n, at which string Y is located, as follows. If X contains Y beginning at X[i] and i >= n, then INDEXOF(X, Y, n) returns n. Otherwise, it returns -1. String indices begin at 0.
```
s1, s2, s3 : STRING;
i1, i2 : INT;
ASSERT s1 = "abc";
ASSERT s2 = "def";
ASSERT s3 = CONCAT(s1, s2);
CHECKSAT i1 = LENGTH(s3);
CHECKSAT CONTAINS(s3, s2);
CHECKSAT CONTAINS(s2, s1);
CHECKSAT i2 = INDEXOF(s3, s2, 3);
CHECKSAT i2 = INDEXOF(s3, s2, 4);
```

REPLACE(X, Y, Z) where X, Y, and Z are strings, replaces Y in X by Z.

s1, s2 : STRING;
ASSERT s1 = "abcdef";
CHECKSAT s2 = REPLACE(s1, "abc", "xyz");
SUBSTR(X, m, n) returns the sub-string of string X from index m to index n. Indices begin at 0.
s1, s2 : STRING;
i1, i2 : INT;
ASSERT s1 = "abcd";
ASSERT i1 = 1;
ASSERT i2 = 2;
CHECKSAT s2 = SUBSTR(s1, i1, i2);

PREFIXOF(X, Y) returns TRUE if string X is a prefix of string Y.

SUFFIXOF(X, Y) returns TRUE if string X is a suffix of string Y.

s1, s2, s3 : STRING;
ASSERT s1 = "abcdef";
ASSERT s2 = "abc";
CHECKSAT PREFIXOF(s2, s1);
CHECKSAT SUFFIXOF(s2, s2);

STRING_TO_INTEGER(X) returns an integer value corresponding to string X as follows. If X is a string of digits, STRING_TO_INTEGER(X) returns the set of digits as an integer numeral. If there are non-digit characters in X, then it returns -1.

INTEGER_TO_STRING(X) returns a string value corresponding to the integer X. If X is non-negative, INTEGER_TO_STRING returns the value of X as a string literal, otherwise it returns the empty string “”.

s1, s2, s3, s4 : STRING;
i1, i2 : INT;
ASSERT s1 = "123";
ASSERT s2 = "x1a2";
ASSERT i1 = STRING_TO_INTEGER(s1);
ASSERT i2 = STRING_TO_INTEGER(s2);
ASSERT s3 = INTEGER_TO_STRING(i1);
ASSERT s4 = INTEGER_TO_STRING(i2);
CHECKSAT;

UINT16_TO_STRING(X) returns a string value corresponding to integer X, if X is within the 16-bit unsigned integer range (0 to 65535), and returns -1 otherwise.

UINT32_TO_STRING(X) returns a string value corresponding to integer X, if X is within the 32-bit unsigned integer range (0 to 4294967295), and returns -1 otherwise.

i1, i2, i3, i4 : INT;
s1, s2, s3, s4 : STRING;
ASSERT i1 = -5;
ASSERT i2 = 65535;
ASSERT i3 = 4294967295;
ASSERT i4 = 4294967296;
ASSERT s1 = UINT16_TO_STRING(i1);
ASSERT s2 = UINT16_TO_STRING(i2);
ASSERT s3 = UINT32_TO_STRING(i3);
ASSERT s4 = UINT32_TO_STRING(i4);
CHECKSAT;

CHARAT(X,n) - returns a string representing the character at index n in string X. Indices are counted from 0, and CHARAT returns “” when n is out of bounds.
```
s1, s2 : STRING;
ASSERT s1 = "abcd";
CHECKSAT s2 = CHARAT(s1, 2);
```

Bit Vectors

The following are the operations allowed on bitvector values (with corresponding examples):

X « n - left shift the bitvector X by n bits; returns a bitvector of m + n bits, where m is the number of bits in X (« preserves the left-most bit(s) and appends 0(s) onto the right).
```
bv1 : BITVECTOR(4);
bv2 : BITVECTOR(6);
ASSERT bv1 = 0bin1000;
CHECKSAT bv2 = bv1 << 2;
```
X » n - right shift the bitvector X by n bits; returns a bitvector of m bits, where m is the number of bits in X (» shifts out the right-most bit(s) and appends 0(s) onto the left).
```
bv1, bv2 : BITVECTOR(4);
ASSERT bv1 = 0bin0010;
CHECKSAT bv2 = bv1 >> 2;
```

BVPLUS(n, X1, X2) - returns an n-bit sum of bitvector X1 and bitvector X2. In case of overflowing bits, the bits are ignored. The function prepends 0s to the literals, if necessary.

bv1 : BITVECTOR(4);
bv2, bv3 : BITVECTOR(5);
bv4 : BITVECTOR(6);
ASSERT bv1 = 0bin1111;
ASSERT bv2 = 0bin11000;
CHECKSAT bv3 = BVPLUS(5, bv1, bv2);
CHECKSAT bv4 = BVPLUS(6, bv1, bv2);

BVSUB(n, X1, X2) - returns an n-bit difference of X1 - X2. In case of overflowing bits, the bits are ignored. The function prepends 0s to the literals, if necessary.

bv1, bv2 : BITVECTOR(4);
bv3 : BITVECTOR(3);
bv4 : BITVECTOR(5);
ASSERT bv1 = 0bin0100;
ASSERT bv2 = 0bin1100;
CHECKSAT bv3 = BVSUB(3, bv1, bv2);
CHECKSAT bv4 = BVSUB(5, bv1, bv2);
CHECKSAT bv5 = BVSUB(5, bv2, bv1);

BVMULT(n, X1, X2) - returns the n-bit product of X1 and X2. In case of overflowing bits, the bits are ignored. The function prepends 0s to the literals, if necessary.
```
bv1, bv2, bv3 : BITVECTOR(8);
ASSERT bv1 = 0bin00001001;
ASSERT bv2 = 0bin11111011;
CHECKSAT bv3 = BVMULT(8, bv1, bv2);
```
BVUDIV(X1, X2) - returns the quotient of the unsigned binary division (truncated towards 0) of X1 by X2, where X1, X2, and the return value have the same number of bits.

BVUREM(X1, X2) - returns the remainder of the unsigned binary division (truncated towards 0) of X1 by X2, where X1, X2, and the return value have the same number of bits.

bv1, bv2, bv3, bv4 : BITVECTOR(5);
ASSERT bv1 = 0bin11010;
ASSERT bv2 = 0bin00101;
CHECKSAT bv3 = BVUDIV(bv1, bv2) AND bv4 = BVUREM(bv1, bv2); %bv3 = 0bin00101, bv4 = 0bin00001

BVSDIV(X1, X2) - performs 2’s complement signed division on X1 and X2.
BVSREM(X1, X2) - returns the remainder on performing 2’s complement signed division on X1 and X2 (sign follows dividend).
BVSMOD(X1, X2) - same as BVSREM except that the sign follows the divisor in this case.
BVSHL(X1, X2) - shift left bitvector X1 by X2 bits. The X2 left-most bits of X1 are lost and the X2 right-most bits of X1 are replaced by 0.
BVLSHR(X1, X2) - logically shift right bitvector X1 by X2 bits. The X2 right-most bits of X1 are lost and the X2 left-most bits of X1 are replaced by 0.
```
bv1, bv2, bv3 : BITVECTOR(4);
ASSERT bv1 = 0bin1111;
ASSERT bv2 = 0bin0001;
CHECKSAT bv3 = BVSHL(bv1, bv2);
CHECKSAT bv3 = BVLSHR(bv1, bv2);
```
BVASHR(X1, X2) - arithmetically shift right X1 by X2 bits. The X2 left-most bits of the result are the same as the left-most bits of X1.
```
bv1, bv2, bv3 : BITVECTOR(4);
ASSERT bv1 = 0bin1000;
ASSERT bv2 = 0bin0001;
ASSERT bv3 = BVASHR(bv1, bv2);
CHECKSAT;
```
BVUMINUS(X) - 2’s complement unary minus on bitvector X of size n to return a bitvector of size n.
~ X - bitwise complement of bitvector X; returns a bitvector of m bits, where m is the number of bits in X.
```
bv1, bv2 : BITVECTOR(4);
ASSERT bv1 = 0bin1010;
CHECKSAT bv2 = ~ bv1;
```

X & Y - concatenate bitvectors X and Y of size m and n (respectively) to return a bitvector of size m + n.

bv1, bv2 : BITVECTOR(4);
bv3 : BITVECTOR(8);
ASSERT bv1 = 0bin0000;
ASSERT bv2 = 0bin1111;
CHECKSAT bv3 = bv1 & bv2;

BVXOR(X1, X2) - returns XOR of bitvectors X1 and X2 having same size.
BVNAND(X1, X2) - returns NAND of bitvectors X1 and X2 having same size.
BVNOR(X1, X2) - returns NOR of bitvectors X1 and X2 having same size.

BVXNOR(X1, X2) - returns XNOR of bitvectors X1 and X2 having same size.

bv1, bv2, bv3 : BITVECTOR(4);
ASSERT bv1 = 0bin1010;
ASSERT bv2 = 0bin1111;
CHECKSAT bv3 = BVXOR(bv1, bv2);
CHECKSAT bv3 = BVNAND(bv1, bv2);
CHECKSAT bv3 = BVNOR(bv1, bv2);
CHECKSAT bv3 = BVXNOR(bv1, bv2);

BVCOMP(X1, X2) - takes two bitvectors X1 and X2 of the same size, and returns a bitvector of size 1 as follows. BVCOMP(X1, X2) returns 0bin0 if X1 and X2 are unequal, and 0bin1 otherwise.
```
bv1, bv2, bv3 : BITVECTOR(4);
bv4 : BITVECTOR(1);
ASSERT bv1 = 0bin1100;
ASSERT bv2 = 0bin1100;
ASSERT bv3 = 0bin1001;
CHECKSAT bv4 = BVCOMP(bv1, bv2);
CHECKSAT bv4 = BVCOMP(bv1, bv3);
```
BVLT(X1, X2) - takes two bitvectors X1 and X2 of the same size, and returns TRUE if X1 is lesser than X2, and FALSE otherwise.
BVLE(X1, X2) - takes two bitvectors X1 and X2 of the same size, and returns TRUE if X1 is lesser than or equal to X2, and FALSE otherwise.
BVGT(X1, X2) - takes two bitvectors X1 and X2 of the same size, and returns TRUE if X1 is greater than X2, and FALSE otherwise.

BVGE(X1, X2) - takes two bitvectors X1 and X2 of the same size, and returns TRUE if X1 is greater than or equal to X2, and FALSE otherwise.

bv1, bv2 : BITVECTOR(4);
b : BOOLEAN;
ASSERT bv1 = 0bin0010;
ASSERT bv2 = 0bin1001;
CHECKSAT b = BVLT(bv1, bv2);
CHECKSAT b = BVLE(bv2, bv2);
CHECKSAT b = BVGT(bv2, bv1);
CHECKSAT b = BVGE(bv2, bv1);

SX(X1, n) - takes a bitvector X1 of size m, an integer n and sign-extends X1 to size n, as follows. If n > m, the resultant bitvector has the first bit of X1 for the first n - m bits followed by the bits of X1. If n = m, the resultant bitvector is the same as X1. If n < m, the resultant bitvector has only the last n bits of X1.
```
bv1 : BITVECTOR(4);
bv2 : BITVECTOR(2);
bv3 : BITVECTOR(6);
ASSERT bv1 = 0bin1010;
ASSERT bv2 = SX(bv1, 2);
ASSERT bv3 = SX(bv1, 6);
CHECKSAT; %bv2 = 0bin10, bv3 = 0bin111010
```
BVZEROEXTEND(X, m) - takes a bitvector X of size n and returns a bitvector of size n+m by prefixing m zeros to X.
```
bv1 : BITVECTOR(4);
bv2 : BITVECTOR(6);
ASSERT bv1 = 0bin1111;
CHECKSAT bv2 = BVZEROEXTEND(bv1, 2);
```
BVREPEAT(X, m) - takes a bitvector X of size n and returns a bitvector of size n*m by repeatedly concatenating X to itself m times.
```
bv1 : BITVECTOR(2);
bv2 : BITVECTOR(6);
ASSERT bv1 = 0bin10
CHECKSAT bv2 = BVREPEAT(bv1, 3);
```
BVROTL(X, m) - takes a bitvector X of size n and returns a bitvector of size n by rotating the first (left-most) m bits of X to the end (right).
BVROTR(X, m) - takes a bitvector X of size n and returns a bitvector of size n by rotating the last (right-most) m bits of X to the beginning (left).
```
bv1, bv2 : BITVECTOR(4);
ASSERT bv1 = 0bin1100;
CHECKSAT bv2 = BVROTL(1, bv1);
CHECKSAT bv2 = BVROTR(1, bv1);
```
BVSLT(X1, X2) - takes two signed bitvectors X1 and X2 of the same size, and returns TRUE if X1 is lesser than X2, and FALSE otherwise.
BVSLE(X1, X2) - takes two signed bitvectors X1 and X2 of the same size, and returns TRUE if X1 is lesser than or equal to X2, and FALSE otherwise.
BVSGT(X1, X2) - takes two signed bitvectors X1 and X2 of the same size, and returns TRUE if X1 is greater than X2, and FALSE otherwise.

BVSGE(X1, X2) - takes two signed bitvectors X1 and X2 of the same size, and returns TRUE if X1 is greater than or equal to X2, and FALSE otherwise.

bv1, bv2 : BITVECTOR(8);
ASSERT bv1 = 0bin11111110;
ASSERT bv2 = 0bin00000001;
CHECKSAT BVSLT(bv1, bv2); %sat
CHECKSAT BVSLE(bv1, bv1); %sat
CHECKSAT BVSGE(bv1, bv2); %unsat
CHECKSAT BVSGT(bv2, bv1); %sat

User-Defined Functions and Types

The theory of user-defined functions, also know in the SMT literature as the theory of ‘Equality over Uninterpreted Functions’, or EUF, is in effect a family of theories of equality parametrized by the basic types and the free symbols a user can define during a run of CVC4.

This theory has no built-in symbols (other than the logical ones). Its types consist of all and only the user-defined types. Its function symbols consist of all and only the user-defined free symbols.

Arrays

The theory of arrays is a parametric theory of (total) unary maps. It comes equipped with mixfix polymorphic selection and update operators, respectively

_[_] and _ WITH [_] := _ .

The semantics of these operators is the expected one: for all first-order types $T_1$ and $T_2$, if $a$ is of type $\texttt{ARRAY} T_1\ \texttt{OF}\ T_2$, $i$ is of type $T_1$, and $v$ is of type $T_2$,

a[i] denotes the value that a associates to index $i$,
a WITH [i] := v denotes a map that associates v to index i and is otherwise identical to a. Sequential updates can be chained with the shorthand syntax _ WITH [] := _, …, [] := _.

A: TYPE = ARRAY INT OF REAL;
a: A;
i: INT = 4;

% selection:

elem: REAL = a[i];

% update

a1: A = a WITH [10] := 1/2;

% sequential update
% (syntactic sugar for (a WITH [10] := 2/3) WITH [42] := 3/2)

a2: A = a WITH [10] := 2/3, [42] := 3/2;

Since arrays are just maps, equality between them is extensional, that is, for two arrays of the same type to be different they have to map at least one index to differ values.

Datatypes

The theory of inductive data types is in fact a family of theories parametrized by a data type declaration specifying constructors and selectors for one or more user-defined data types.

No built-in operators other than equality and disequality are provided for this family in the native language. Each user-provided data type declaration, however, generates constructor, selector and tester operators as described in the Inductive Data Types section.

Tuples and Records

Semantically both records and tuples can be seen as special instances of inductive data types. CVC4 implements them internally indeed as data types. In essence, a record type $[\#\ l_0:T_0, \ldots, l_n:T_n\ \#]$ is encoded as a data type of the form

\[\begin{array}{l} \mathrm{DATATYPE} \\ \ \ \mathrm{Record} = \mathit{rec}(l_0:T_0, \ldots, l_n:T_n) \\ \mathrm{END}; \end{array}\]

Tuples of length n are in turn special cases of records whose field names are the numerals from 0 to n-1. Externally, tuples and records have their own syntax for constructor and selector operators.

Records of type $[\#\ l_0:T_0, \ldots, l_n:T_n\ \#]$ have the associated record constructor $(\#\ l_0 := \_, \ldots, l_n := \_ \#)$ whose arguments must be terms of type $T_0, \ldots, T_n$, respectively.
Tuples of type $[T_0, \ldots, T_n]$ have the associated tuple constructor $(\_, \ldots, \_)$ whose arguments must be terms of type $T_0, \ldots, T_n$, respectively. Note that the empty tuple can be created with $( )$; however, a single-element tuple cannot be directly created, as $(\_)$ is interpreted as a parenthesized (non-tuple) expression. The selector operators on records and tuples follows a dot notation syntax.

% Record construction and field selection
Item: TYPE = [# key: INT, weight: REAL #];

x: Item = (# key := 23, weight := 43/10 #);
k: INT = x.key;
v: REAL = x.weight;

% Tuple construction and projection
y: [REAL, INT, REAL] = ( 4/5, 9, 11/9 );
first_elem: REAL = y.0;
third_elem: REAL = y.2;

Differently from data types, records and tuples are also provided with built-in update operators similar in syntax and semantics to the update operator for arrays. More precisely, for each record type $[\#\ l_0:T_0, \ldots, l_n:T_n\ \#]$ and each $i = 0, \ldots, n$, CVC4 provides the operator

\[\_\ \mathrm{WITH}\ .l_i\ := \_\]

The operator maps a record $r$ of that type and a value $v$ of type $T_i$ to the record that stores $v$ in field $l_i$ and is otherwise identical to $r$. Analogously, for each tuple type $[T_0, \ldots, T_n]$ and each $i = 0, \ldots, n$, CVC4 provides the operator

\[\_\ \mathrm{WITH}\ .i\ := \_\]

with similar semantics.

% Record updates

Item: TYPE = [# key: INT, weight: REAL #];

x:  Item = (# key := 23, weight := 43/10 #);

x1: Item = x WITH .weight := 48;

% Tuple updates

Tup: TYPE = [REAL,INT,REAL];
y:  Tup = ( 4/5, 9, 11/9 );
y1: Tup = y WITH .1 := 3;

Updates to a nested component can be combined in a single WITH operator:

Cache: TYPE = ARRAY [0..100] OF [# addr: INT, data: REAL #];
State: TYPE = [# pc: INT, cache: Cache #];

s0: State;
s1: State = s0 WITH .cache[10].data := 2/3;

Note that, differently from updates on arrays, tuple and record updates are just additional syntactic sugar. For instance, the record x1 and tuple y1 defined above could have been equivalently defined as follows:

% Record updates

Item: TYPE = [# key: INT, weight: REAL #];

x:  Item = (# key := 23, weight := 43/10 #);

x1: Item = (# key := x.key,  weight := 48 #);

% Tuple updates

Tup: TYPE = [REAL,INT,REAL];
y:  Tup = ( 4/5, 9, 11/9 );
y1: Tup = ( y.0, 3, y.1 );

Sets

The operations allowed on set types are:

Union - X Y returns the set representing the union of sets X and Y of comparable types.
Intersection - X & Y returns the set representing the intersection of sets X and Y of comparable types.

Set Subtraction - X - Y returns the set representing the difference of sets X and Y of comparable types.

set1, set2, set3 : SET OF INT;
ASSERT set1 = {1, 2, 3, 4};
ASSERT set2 = {4, 5, 6, 7};
CHECKSAT set3 = set1 | set2;
CHECKSAT set3 = set1 & set2;
CHECKSAT set3 = set1 - set2;

Membership - X IS_IN Y returns TRUE if element X is a member of set Y, and false otherwise. X and the elements of Y must have comparable types.
Subset - X <= Y returns TRUE if set X is a subset of set Y, and FALSE otherwise. X and Y must have comparable types.

Cardinality - CARD(X) returns an integer representing the cardinality of set X.

set1, set2, set3 : SET OF STRING;
ASSERT set1 = {} :: SET OF STRING;
ASSERT set2 = {"a", "b", "c"};
CHECKSAT "b" IS_IN set2;
CHECKSAT set1 <= set2;
CHECKSAT 0 = CARD(set1);

Complement - ~ X represents the complement of X in the universe of X’s type. Look below for how the universe is defined by CVC4. Complement only works when CVC4 is called with the –sets-ext option.
Universe - UNIVERSE :: SET OF T represents the universe set of type T, that is, the set containing all the values of type T. CVC4 creates the universe of type T, by including all terms of type T specified so far, in it. UNIVERSE only works when CVC4 is called with the –sets-ext option.
```
set1, set2, set3 : SET OF INT;
ASSERT set1 = {1, 2, 3};
CHECKSAT z = ~ x; % z = {} :: SET OF INT
ASSERT set2 = {4, 5, 6};
CHECKSAT z = ~ x; % z = {4, 5, 6}
```

Relations

The operations allowed on relation types are:

Transpose - TRANSPOSE(X) returns the transpose of relation X. If X has type SET OF [T1, …, Tn], then TRANSPOSE(X) has type [Tn, …, T1].
```
r1 : SET OF [INT, STRING];
r2 : SET OF [STRING, INT];
ASSERT r1 = {(1, "one"), (2, "two"), (3, "three")}
CHECKSAT r2 = TRANSPOSE(r1);
```

Transitive Closure - TCLOSURE(X) returns the transitive closure of X if X is a homogeneous binary relation.

r1, r2 : SET OF [INT,INT];
ASSERT r1 = {(1,2),(2,3)}
CHECKSAT r2 = TCLOSURE(r1); %r2 = {(1,2), (2,3), (1,3)}

Join - X JOIN Y returns the relational join of X and Y if they are joinable. Two relations X and Y are joinable if the type of X is {(T11, …, T1n, T)} and the type of Y is {(T, T21, …, t2N)}. Given these types of X and Y, the type of X JOIN Y is {(T11, …, T1n, T21, …, T2n)}. The join consists of all elements (x11, …, x1n, x21, …, x2n) such that (x11, …, x1n, x) belongs to X, and (x, x21, …, x2n) belongs to Y, for some x.
```
r1 : SET OF [INT, STRING];
r2 : SET OF [STRING, BOOLEAN];
r3 : SET OF [INT, BOOLEAN];
ASSERT r1 = {(1, "1"), (2, "2"), (3, "3"), (4, "4")};
ASSERT r2 = {("2", TRUE), ("3", FALSE), ("9", TRUE)};
CHECKSAT r3 = (r1 JOIN r2); %r3 = {(2, TRUE), (3, FALSE)}
```

Product - X PRODUCT Y returns the Cartesian product of X and Y - the set that contains all ordered pairs (x1, …, xn, y1, …, ym) such that (x1, … xn) belongs to X and (y1, … ym) belongs to Y.

r1 : SET OF [INT, BOOLEAN];
r2 : SET OF [STRING, REAL];
r3 : SET OF [INT, BOOLEAN, STRING, REAL];
ASSERT r1 = {(1, TRUE), (2, FALSE)};
ASSERT r2 = {(".", 9.2), (",", 5.6)};
CHECKSAT r3 = (r1 PRODUCT r2); %r3 = {(1, TRUE, ".", 9.2), (1, TRUE, ",", 5.6), (2, FALSE, ".", 9.2), (2, FALSE, ",", 5.6)};

Commands

In addition to declarations of types and function symbols, the CVC4 native language contains the following commands:

ASSERT F - Add the formula F to the current logical context $\Gamma$.
CHECKSAT F - Check if the formula F is satisfiable in the current logical context ($\Gamma \not\models_T \mathrm{NOT}\ F$).
COUNTERMODEL - After an invalid QUERY or satisfiable CHECKSAT, print a model that makes the formula invalid/satisfiable. The model is provided in terms of concrete values for each free symbol.
OPTION o v - Set the command-line option flag o to value v. The argument o is provide as a string literal enclosed in double-quotes and v as an integer value.
POP - Restore the system to the state it was in right before the most recent call to PUSH.
POP n - Equivalent to POP repeated n times.
PUSH - Save (checkpoint) the current state of the system.
PUSH n - Equivalent to PUSH repeated n times.
QUERY F - Check if the formula F is valid in the current logical context ($\Gamma\models_T F$).
RESTART F - After an invalid QUERY or satisfiable CHECKSAT, repeat the check but with the additional assumption F in the context.
PRINT t - Parse and print back the term t.
TRANSFORM t - Simplify the term t and print the result.
WHERE - Print all the formulas in the current logical context $\Gamma$. The remaining commands take a single argument, given as a string literal enclosed in double-quotes.
ECHO s - Print string s
INCLUDE f - Read commands from file f.
TRACE f - Turn on tracing for the debug flag f.
UNTRACE f - Turn off tracing for the debug flag f.

The following CVC3 commands are currently unsupported in CVC4.

CONTINUE
PUSHTO
POPTO
RESTART

QUERY

The command QUERY F invokes the core functionality of CVC4 to check the validity of the formula F with respect to the assertions made thus far, which constitute the context $\Gamma$. The argument F must be well typed term of type BOOLEAN, as described in Terms and Formulas. The execution of this command always terminates and produces one of three possible answers: valid, invalid, and unknown.

A valid answer indicates that $\Gamma \models_T F$. After a query returning such an answer, the logical context $\Gamma$ is exactly as it was before the query.
An invalid answer indicates that $\Gamma \not\models_T F$, that is, there is a model of the background theory $T$ that satisfies $\Gamma \cup \{\mathrm{NOT}\ F\}$. When QUERY F returns invalid, the logical context $\Gamma$ is augmented with a set $\Delta$ of ground (i.e., variable-free) literals such that $\Gamma\cup\Delta$ is satisfiable in $T$, but $\Gamma\cup\Delta\models_T \mathrm{NOT}\ F$!. In fact, in this case $\Delta$ propositionally entails NOT F, in the sense that, every truth assignment to the literals of $\Delta$ that satisfies $\Delta$ falsifies $F$. We call the new context $\Gamma\cup\Delta$ a counterexample for $F$.
An unknown answer is similar to an invalid answer in that additional literals are added to the context which propositionally entail NOT F. The difference in this case is that CVC4 cannot guarantee that $\Gamma\cup\Delta$ is actually satisfiable in $T$.

CVC4 may report unknown when the context or the query contains non-linear arithmetic terms or quantifiers. In all other cases, it is expected to be sound and complete, i.e., to report Valid if $\Gamma \models_T F$ and Invalid otherwise. After an invalid (resp. unknown) answer, counterexamples (resp. possible counterexamples) can be obtained with a WHERE or COUNTERMODEL command. Since the QUERY command may modify the current context, if one needs to check several formulas in a row in the same context, it is a good idea to surround every query by a PUSH and POP invocation in order to preserve the context:

``` PUSH; QUERY <formula>; POP;
```

CHECKSAT

The command CHECKSAT F behaves identically to QUERY NOT\ F. It returns unsat if NOT F is valid, sat if it is invalid, and unknown otherwise.

Here, we explain some of the above commands in more detail.

PUSH AND POP

CVC4’s context can be understood as being stratified in assertion levels which are added and removed in a LIFO fashion. The system starts at assertion level 0 and every execution of the PUSH command creates a new assertion level after the current one. Each formula added to the context either by the user, via an ASSERT command, or internally by the system is added to the most recent assertion level. Executing POP n removes from the context the most recent n assertion levels and all the formulas in them. Note that formulas at assertion level 0 cannot be retracted.

Note: Symbol declarations are always added at assertion level zero. So they are not affected by POP commands.

Instantiation Patterns

CVC4 processes each universally quantified formula in the current context by adding instances of the formula obtained by replacing its universal variables with ground terms. Patterns restrict the choice of ground terms for the quantified variables, with the goal of controlling the potential explosion of ground instances. In essence, adding patterns to a formula is a way for the user to tell CVC4 to focus only on certain instances which, in the user’s opinion, will be most helpful during a proof.

In more detail, patterns have the following effect on formulas that are found in the logical context or get added to it later while CVC4 is trying to prove the validity of some formula F.

If a formula in the current context starts with an existential quantifier, CVC4 Skolemizes it, that is, replaces it in the context with the formula obtained by substituting the existentially quantified variables by fresh constants and dropping the quantifier. Any patterns for the existential quantifier are simply ignored.

If a formula starts with a universal quantifier $\mathrm{FORALL}\; (x_1:T_1, \ldots, x_n:T_n)$, CVC4 adds to the context a number of instances of the formula, with the goal of using them to prove the query F valid. An instance is obtained by replacing each $x_i$ with a ground term of the same type occurring in one of the formulas in the context, and dropping the universal quantifier. If $x_i$ occurs in a pattern $\mathrm{PATTERN}\; (t_1, \ldots, t_m)$ for the quantifier, it will be instantiated only with terms obtained by simultaneously matching all the terms in the pattern against ground terms in the current context $\Gamma$.

Specifically, the matching process produces one or more substitutions $\sigma$ for the variables in \sigma which satisfy the following invariant: for each $i = 1, \ldots, m$, $\sigma(t_i)$ is a ground term and there is a ground term $s_i$ in $\Gamma$ such that $\Gamma \models_T \sigma(t_i) = s_i$. The variables of $(x_1:T_1, \ldots, x_n:T_n)$ that occur in the pattern are instantiated only with those substitutions (while any remaining variables are instantiated arbitrarily).

The Skolemized version or the added instances of a context formula may themselves start with a quantifier. The same instantiation process is applied to them too, recursively.

Note that the matching mechanism is not limited to syntactic matching but is modulo the equations asserted in the context. Because of decidability and/or efficiency limitations, the matching process is not exhaustive. CVC4 will typically miss some substitutions that satisfy the invariant above. As a consequence, it might fail to prove the validity of the query formula F, which makes CVC4 incomplete for contexts containing quantified formulas. It should be noted though that exhaustive matching, which can be achieved simply by not specifying any patterns, does not yield completeness anyway since the instantiation of universal variables is still restricted to just the ground terms in the context, whereas in general additional ground terms might be needed.