An Invitation to Liquid Types: Unifying Type Theory and Model Checking
This was work for a graduate seminar in model checking that I wrote with Sammy Thomas. I later adapted it into a Papers We Love NYC presentation. As it was written for a specialist audience, the prose is a lot denser than my ordinary style, but some of you may find it interesting anyway.
Overview and Motivation
Enterprising formal methods researchers could do worse than position their work in the type-theoretic space. While inexpressive relative to their counterparts in research languages, even “simple” type systems[^1] enjoyed by conventional industrial languages like Java or C# provide a powerful mechanism to statically reject invalid programs.
We call “simple” type systems those derived from the Hindley-Milner subset of System F, which we can intuit as being more or less equivalent in expressiveness to Scala, Haskell, or polymorphic Java. A critical feature of H-M type systems is that both typechecking and reconstruction (inference) is decidable and efficient.
Under the Curry-Howard correspondence[1] we can consider, at least to a first approximation, that a program passing the typechecker corresponds to a proof in a constructive logic[2]. Also, as a logical system, type systems enjoy strong metatheoretic properties. Soundness (that is, invalid programs will always be rejected), reconstruction (that is, types can be inferred via terms’ usage without human annotation), and termination are guaranteed, even when typechecking programs whose behaviour depends on runtime user input or data an “well-typed programs don’t go wrong”[3] has entered the programmer’s popular lexicon[4]; after all, type theory is inarguably the branch of formal methods that software developers most commonly exploit to reason about the behaviour of their programs.
By “not going wrong”, informally, we mean that there is always a way to evaluate a program or program expression down to its final value. A program that can go wrong – for instance, attempting division by zero, calling the wrong method on an object, or iterating off the end of an array – will have to fail at runtime by throwing an exception or perhaps being killed by the operating system. If a well-typed program truly can’t go wrong, then the absence of such failure modes ought to have been proven by the type system.
However, as a form of abstract interpretation[5], Milner’s bon
mot is in reality circumscribed by how much information is lost by the
abstract transformation: in particular, both the typed and untyped program in
Figure 1 can, indeed, go wrong - the type system proves the shape of the
input datatype (morally, an iterable, indexable collection of ints) but lost
facts about particular well-typed terms. Concretely: Since at the type level
an empty list is indistinguishable from a non-empty one, it’s impossible to
reject a program that requires a non-empty list as input.
def avg(xs):
return sum(xs) / len(xs)
avg(42) # Runtime error: int is not iterable
def avg(xs: list[int]) -> int
return sum(xs) // len(xs)
avg([]) # Runtime error: ZeroDivisionError
Figure 1: an ill-typed program that goes wrong, and a well-typed one that also goes wrong[4:1].
Towards a type-theoretic approach
This particular soundness hole can be tackled by making the typelevel
definition of a list more precise. Be it via refactoring the subtyping
hierarchy for List,[6], indexed
types[7], or generalized algebraic
datatypes[8], we need to place constraints on terms of type
List[T] by a statement in some particular constraint domain. In our case,
our domain would need to be rich enough to encode and solve for an nonemptiness
constraint. In general, the richer the constraint domain, the richer and more
general qualifications we can place on a type, but the higher the proof burden
on the humans and proof systems in the loop. In the words of Rondon et al,
type checking [over a constraint domain] is shown to be decidable modulo the
decidability of the domain.
A constraint domain may simply be the existing type system. For instance, the
typelevel definition of the Peano naturals as a type Nat with type
constructors Z and S[N <: Nat] would allow a new encoding of natural
numbers within the type system. In contrast to the term 3 (of type
int), S[S[S[Z]]] is itself a type (namely, a subtype of Nat) with no
inherent runtime semantics (for a more complete example, see[9]).
from __future__ import annotations
from typing import Any, Generic, TypeVar
N,T = TypeVar("N", bound=Nat), TypeVar("T")
class Nat: pass
class Z(Nat): pass
class S(Nat, Generic[N]): pass
# Note: `Two` and `Four` are typedefs for subtypes of Nat, not terms with a concrete value!
Two = S[S[Z]]
Four = S[S[Two]]
Figure 2: A typelevel encoding of the natural numbers.
Representing numbers this way would let us embed a second type parameter L <: Nat to a collection type Vec[L, T], the vector of L elements of type T (see Figure 3).
As a concrete example, [1,2] might be typed as List[S[S[Z], Int], and
cons as 𝛼 → List[l, 𝛼] → List[S[l], 𝛼]. The two constructors nat and
cons enforce the invariant that the typelevel value of L always conforms to
the length of the list. In so doing, we say that Vec is indexed by Nat.
Notice that this is not the same as being indexed by values of type Nat or
int; terms and types in the Python type system occupy different syntactic
domains and are not interchangable in this way.
@dataclass
class Vec(Generic[N,T]): # A Vector of N elements of type T
l: list[T]
def nil() -> Vec[Z, T]: return Vec([])
def cons(t: T, l: Vec[N, T]) -> Vec[S[N], T]: return Vec([t] + l.l)
empty: Vec[Z, int] = nil()
one_three: Vec[Two, int] = cons(1, cons(3, nil()))
four_four: Vec[Four, int] = cons(4, cons(4, nil())) # error: Expression of type "Vec[S[S[S[S[Z]] | Z]], int]" cannot be assigned to declared type "Vec[S[S[S[S[Z]]]], int]"
def avg(l: Vec[S[N], int]) -> int:
return sum(l.l) // len(l.l)
avg(42) # error: Argument of type "Literal[42]" cannot be assigned to parameter "l"
avg(empty) # error: Argument of type "Vec[Z, int]" cannot be assigned to parameter "l"
assert(avg(one_three) == 2)
Figure 3: A vector of elements whose length is indexed by the type system, and
an implementation of avg that rejects ill-typed arguments, but with a
difficult to understand error.
While some may consider contorting an existing type system in this way to be a hack, the benefits are clear: as we have not introduced any new mechanism into an ordinary type system, we know that our existing type-metatheoretic properties of soundness, inference, and termination remain. Also, most typechecking algorithms run in polynomial – if not linear – time. In this way, once written, we get our proofs from our type system for free, and the procedure in which they are generated is decidedly non-magical.
Just as clear, however, are the downsides: that typecheckers operate on program fragments definitionally restrict us to shallow propositions: it’s not difficult to imagine needing constraints over the naturals that are wildly impractical, if not impossible, to state entirely in the language of parametrically-polymorphic types. For instance, while we can prove the length of a Vec is nonzero, we cannot state that elements contained within the Vec are themselves nonzero, nor could we specify a constraint over the index type itself (such as enforcing a list of even length), so we have to lower expectations accordingly.
Further, ergonomic issues abound: a type error should be treated as a form
of counterproof, but often reported errors are at the wrong level of abstraction;
for instance, decoding the two error-producing expressions in Figure 3 requires
exposes implementation details of Nat and Vec while obscuring relevant
ones, making applying the “counterproof” a frictive experience. The type error
on line 7 of Figure 1 is far shorter and more descriptive than its
equivalent on line 24 of Figure 3.
Towards types that depend on terms
A large burden has been placed on application developers to contort the type
system as we’ve just done. Fundamentally, that terms and types occupy distinct
syntactic domains makes a type annotation like Vec[sum(range(10)), int]
nonsensical: In Python’s type system, Vec’s type arguments can only be other
types.
A dependent type system[10], however, is one in which types and terms can intermingle. This obviates the need for awkward typelevel programming with the Peano naturals, but complexity is only pushed around, not eliminated. In its full bloom, we lose full automation of dependent type checking; anyone who has needed to write a Coq proof longer than its associated program knows this in their bones. If we want to maintain full push-button automation for a dependent type theory, we’ll have to think about ways to give up on completeness while still being useful.
Towards a model-theoretic approach
Certainly, the logical statement at no point will len(xs) be zero feels very
much a safety property, suggesting a straightforward “throw a tool for model
checking like SLAM[11], BLAST[12] or Houdini[13] at it”
solution. Much like our non-higher order type systems, model checking
techniques are sound, push-button in their human-in-the-loop requirements, and
can scale up to tackle the practicalities of exploring the execution space in
real-world software.
Indeed, enumerating paths through a program is a property of model-checking that is not well-covered in the type theory space. Type systems are typically path-insensitive as most typechecking algorithms act upon program fragments, and then compose to operate over the program as a whole. This “context-free” nature means the typechecker knows where it’s going, but it doesn’t know where it’s been[14], limiting the richness of the space of counterproofs that it can generate.
The model checking approach is not strictly optimal, however. Hidden within
our straightforward type List[T] is in fact something problematic for model
checkers: under Curry-Howard, a type variable corresponds to quantifying over
propositions in constructive logic[1:1]. In that way, the reading of this
a polymorphic type is: for all propositions T, the proposition List(T) holds.
Generally, therefore, reasoning about inductive data structures, something trivial for a typechecker, can cause a model checker to wander into the realm of undecidability. The encoding schemes for symbolically abstracting states into statesets such as BDDs and subsets of first-order logic, typically eschew quantifiers altogether. Either users are therefore forced to either give up full automation in their models[15], or model checker implementers are required to add new axioms on a per-data structure basis[16] or hope that their system’s heuristics[17] will not cause verification to spiral out of control.
Unifying the two approaches with logically-qualified types
A refinement type[18] is the pairing of an ordinary,
polymorphic type (called the base type) and some logical predicate that
refines it. For example, {v: int | 0 <= v ∧ v < n} refines the base type
of the integers to be bound between 0 and some other value n. Since n is a
program-level term, a refinement type is also a dependent type (in particular,
type zoologists will recognise a familiar sight: this is the dependent type
Fin n). A type definition like this, for instance, could allow the (partial)
function select: List[𝛼] -> int -> 𝛼 to become the dependent total function
select: Vec[n, 𝛼] -> Fin n -> 𝛼, statically-rejecting out-of-bounds indices.
In addition to safety improvements, prior work[10:1] showed the clear
performance benefits of being able to elide runtime bounds checks.
As before, the expressiveness of a refinement type depends on the expressiveness of the refining predicate’s constraint domain. Critically: If the constraint domain is made up of conjunctions of Boolean predicates over program terms, we can call such a refinement type a logically-qualified (LiQuid) type[19]. Ideally, we would be able to pick and choose parts from both the model- and type-theoretic approaches to get the benefits of each. In particular, we’d like to exploit the structure of a refinement type: given the base type and refinement predicate are expressed in different constraint domains, the hope is that we can apply pure type-theoretic mechanisms to the former and model-checking mechanisms to the latter, gain the advantages of both, and avoid destructive interference between them.
@solvent.check
def avg(xs: list[int] | len(xs) > 0) -> int
return sum(xs) // len(xs)
avg(42) # Type-checking error
avg([]) # Type-checking error
Figure 4: a logically-qualified program, checked with the Solvent liquid typechecker for Python, that does not go wrong.
When we say “the best” annotation, that is to say: trivially typing everybody to ⊤ won’t do!
The procedure described in the authors’ initial work[19:1] is primarily motivated by the type reconstruction problem: given a program absent type annotations, infer “the best” annotation for program expressions as opposed to the type checking problem, where a program already annotated with types is verified to be consistent with the theory.
Like all good inverse problems, type reconstruction is the meatier of the two problems. For some type theories, only typechecking is decidable, so it’s clearly in some sense the “harder” problem; and, being able to infer typelevel correctness without human intervention gives us full push-button automation.
Also like all good inverse problems, type reconstruction has degrees of freedom
that checking does not - if several possible valid typings exist, which should
the algorithm deduce? Nobody would claim that inferring lambda x: 2 * x
as object -> object would be terribly useful, correct though it might be. On
the other side of the spectrum, if we had a full program analysis oracle that
told us that double is only ever called with argument 21, the singleton
refinement type {v : int | v = 42} would similarly be correct, but overfits
to the input data in a way that doesn’t provide a useful precondition on the
return value being used later in the program. The authors must aim for the
best of both worlds: a dependent type system rich enough to prove useful safety
properties, while not being so expressive that reconstruction becomes
impossible.
From program expressions to base types
In the paper, the only valid base types are int, bool, and list[int].
This doesn’t mean that these are the only types that program terms can type to,
only that the terms that appear within a refinement type must be drawn from a
small fixed number of built-in types. Contiguous arrays of bytes would appear
later in their refinement work for C[20], and liquid
records and algebraic datatypes would appear later
still[21].
Base types are inferred via the Hindley-Milner (H-M) type inference algorithm[1:2], an absolute bog-standard procedure that we describe here only so that it can be compared to the refinement predicate inference procedure that follows.
H-M is in essence, a lazy type-checker. Instead of checking the types of expressions, as it comes upon them, it records typing constraints for later consideration; using type variables as placeholders as needed. A unifier computes a solution from these constraints, mapping type variables to types, so that type-checking succeeds. If the constraint set is not satisfiable–one or more constraints clash–then we can report a type error.
The constraint set may also be underspecified; some term will map only to a type variable and not a concrete type, so as to say “any type will do here”. This generality is critical: unification is guaranteed to provide the most general (or minimal) sequence of substitutions in its mapping; in this way, we say the principal type for each term is reconstructed. While this ensures that we do not unnecessarily constrain a term’s type, there are also operational implications: because a term’s principal type will always be generated, there’s no value in recomputing it once additional constraints have been computed. As a result, there’s no need to eagerly unify the full constraint set at once: lazy unification means inference and typechecking can be done incrementally[1:3] or interleaved with each other, which is important for more sophisticated bidirectional typing algorithms.
From base types to logical qualifiers
We’ve previously seen the refining predicate’s constraint domain ought to be carefully chosen: in order to maintain decidable reconstruction and automation, we have to give up on arbitrary program expressions and propositions that we might enjoy in a higher-order logic like Coq, Agda, or HOL. To that end: concretely, refinement predicates are drawn from arithmetic and relational program expressions involving integer, boolean, and array sorts, and conjunctions thereof. (While not exhaustively enumerated in the paper, the precise possible predicate types are documented in the authors’ implementation of liquid types for OCaml.) With a bit of squinting, we can see that these comprise the quantifier-free theory of linear arithmetic and uninterpreted functions (QF-UFLIA)!
It’s clear that liquid types are more restrained in their expressivity than full dependent types, for which type checking and reconstruction quickly wander into the realm of undecidability[22]. But, the tradeoff pays dividends: Since QF-UFLIA is decidable, so too is the typechecking and reconstruction of a liquid type, since off-the-shelf SMT solvers like Yices or Z3 can be used to check whether refinement predicate in this logic is valid. It’s interesting that the initial liquid types paper[19:2] mentions using an SMT solver almost as an afterthought, whereas the subsequent work([23], [21:1], [20:1], [24]) makes sure to mention it explicitly and right in the abstract, suggesting that bringing this particular kind of magic to bear resonated with the research community.
Subtyping as implication
Subtype polymorphism is an important component of many type systems, but introduces complications that language designers need to carefully consider since subtyping, anecdotally, tends to interact with language features in non-trivial and unanticipated ways[1:4].
Subtyping is defined in terms of a subsumption relation between two types:
recalling Liskov[25], T1 is a subtype of T2 if occurrences of
the more abstract T2 can be transparently substituted for T1. This
intuition lets us build a straightforward subtyping relationship for refinement
predicates: Stated in the language of logic: T1 is stronger as it places
more constraints on its inhabitants than T2. Framed in a logical context,
asking whether the inhabitants of one type are subsumed by another is
equivalent to asking whether the is therefore checking the validity of the
implication T1 $\implies$ T2 under some context $\Gamma$. This gives us True
and False as the extrema of our subtyping relation, as we would expect; we
will see shortly that by constraining the space of possible subtypes that we
explore, we form a lattice with True and False as our top and bottom.
Figure 5: The Dec-<:-Base typing rule: typing judgements follow from an
equality check on the base types and a validity check on the refinement
predicate subtyping relation. Notice that the subtyping rule does not hold
if the base types are unequal but subtypes.
If programmers were manually annotating all their program terms with refinement types, we could treat the internals of the SMT solver as magic, perform a simple validity check, and go home early. However, recall that our goal is type reconstruction, not just typechecking; we need to divine a predicate appropriate for the refinement type that is neither too weak (i.e. too abstract to say anything meaningful) nor too strong (i.e. doesn’t overfit to precisely the concrete values inhabited by the type). We saw before that the principal type property of H-M type inference made “a good base typing” simply fall out through unification; by contrast, conceivably, any number of “good refinement typings” could be synthesized.
from itertools import product, count
def max1(x, y) -> {V | True}: x if x > y else y
def max2(x, y) -> {V | V == x or V == y}: x if x > y else y
def max3(x, y) -> {V | V >= x and V >= y}: x if x > y else y
def max4(x, y) -> {V | [(x==a and y==a+b and V == y) or (y==a and x==a+b and V == x) or
(x==a-b and y==a and V == y) or (y==a-b and x==a and V == x)
for a,b in product(count(), 2)]} }: ...
Figure 6: Four plausible typings of the max() function, each of varying
levels of goodness: note that max4 depends upon an infinite number of
qualifiers!
Recalling how the technique was applied to great effect whilst contemplating model-checking solutions([11:1], [12:1], [13:1]), a liquid type reconstruction algorithm that consumes a set of predefined qualifiers which they treat as a sort of basis set that refinements will be constructed from. So, a refinement predicate will be a conjunction of such qualifiers. Naturally, this induces a bias into the reconstruction output: whoever wrote down the set of qualifiers for predicate abstraction to choose from is in some sense determining the “basis set” that will comprise a reconstructed liquid type.
quals = [0 < V,
_ <= V,
V < _,
len(_) <= V]
Figure 7: The built-in qualifiers from the paper, in Solvent’s internal
qualifier DSL. The underbar token stands in for any appropriately-typed
program term and V is the distinguished value variable $v$.
By restricting the shape of each clause in the conjunction to ones derived from the templates in the qualifier list, we’re again giving up full expressivity. This time, instead of constricting the overall logic to ensure decidability, we’re ensuring that the inference algorithm has a finite – but hopefully robust! – search space to explore during predicate inference.
Refinement type constraint generation
Constraint generation for the refinement predicate portion of the liquid type follows a similar trajectory to that which we saw for base types. The solver begins by lifting the inferred base types into a refinement template with a constraint variable in place of a concrete predicate.
Since we’ve seen that program terms can be substituted in for the _ wildcard
in the qualifier list (see Figure 7), all refinement templates include a set of
scope constraints: intuitively, a scope constraint is an assertion in the
typing environment that a program term is in use at the particular point that
the type itself comes into scope. A scope constraint says nothing about
whether the program term needs to be used in a refining predicate, only that
it may be.
A flow constraint, by comparison, is an assertion of some relationship
between program terms. Critically, flow constraints are path sensitive: they
are an assumption in the typing environment that depends not on program values
but the control flow through the program. For instance, an if-statement
would fork the typing environment into two flows: one in which the then
branch is taken and another when the else branch is taken. At each leaf of
the AST we concretize a subtype containing a trivial qualifier: for instance,
return x qualifies the return type to be exactly x.
Figure 8: A function annotated with refinement templates – $K_1$, $K_2$, and $K_3$ are constraint variables, and the typing context includes a flow-sensitive subtyping relationship on the type of the return value, which is dependent on which program branches are taken.
Given the constraints shown in Figure 7, an obvious but suboptimal type for
max()'s return value suggests itself. We know that either branch of the
if must be taken; so, the disjunction of the subtype constraints would
technically work (see max2() in Figure 6). This type, of course, only has
two inhabitants and wildly overfits to the input arguments; we’d like it to
generalize as much as possible.
Abstraction from a fixed set of qualifiers
To generate a more general return type that that of Figure 6’s max2(), we’ll
construct an abstraction of the type using our built-in qualifiers and the
accumulated constraints as building blocks. What we gain from these building
blocks – a lattice formed by the conjunction of some fixed set of boolean
predicates – is exactly the requirements to perform Cartesian predicate
abstraction.
In particular, we take our scope constraints, having accumulated all possible
program terms, and substitute each constraint into the qualifier placeholder.
Given the typing environments in Figure 7, we’d be left with the qualifier set
0 < V, V < x, V < y, x <= V, y <= V. (The final qualifier does not appear
because we have no scope constraint that types to a list.)
Figure 9: The strongest possible guess for max()'s return type: the
conjunction of every qualifier, expanded with every scope constraint. It is
likely that, when transformed into an implication and checked by Z3, it will
prove to be too strong, and a refinement process to weaken the formula will
occur.
As the stronger the formula, the more precise the type, so we might hope that
the strongest possible formula – conjoining the cross product of qualifiers
with scope constraints – would produce the most qualified liquid type.
However, it’s clear that the formula is too strong: v = x contradicts v < x. Intuitively: removing v < x, and any other contradictory clause, ought
to yield a formula that does satisfy the subtyping relation. This happens to
be precisely the predicate abstraction refinement loop: while the proposed
refinement predicate does not follow from every flow-sensitive typing
environment, filter out the individual clauses that themselves do not follow,
weakening the type, and re-check.
{ int ;|; v = y } <:& { int ;|: (\underline{0 \leq v}) & ; \wedge ; & (x \leq v) & ; \wedge ; & (y \leq v) & ; \wedge ; & (x \lt v) & ; \wedge ; & (\underline{y \lt v}) } \
& { int ;|: & & (x \leq v) & ; \wedge ; & (y \leq v) } \end{split} $$
Figure 10: Predicate abstraction for the max() example: clauses that fail
the implication check are underlined. The final refinement type can be read as
“an int $v$ that is no less than both x and y”.
Note that while the SMT solver could compute a validity counterproof, the inference engine does not exploit this. While new qualifiers could conceivably be mined from the counterexample, this is not explored in this work. As a result, the weakening process is strictly one that removes qualifiers from the conjunction.
Extensions for recursive functions
Type reconstruction is complicated for recursive functions since the refinement
predicates now need to capture an inductive invariant for each iteration of the
computation. Let’s consider the following summation function that behaves
equivalently to sum(range(k)):
@solvent.infer
def my_sum(k):
"Computes sum(range(k))."
if k < 0:
return 0
else:
s = my_sum(k - 1)
return s + k
As before, the argument k will be typed to int with an unknown refinement
predicate variable $K_1$, as will the body of the function but with $K_2$.
An approximation for the subtyping relations for $K_1$ and $K_2$. Notice that
we have placed the term-level assignment to s rather than a type assertion
for s in the typing context for expository reasons. The intuition expressed
here will be made precise shortly.
Placing a function call and value assignment in a typing context as we’ve done
here would be invalid in a non-dependently typed language, and is arguably at
best an abuse of notation even here. After all, the typing context should
contain the type of s, not its runtime value. However, the type of s
does relate to the program term k-1 because s’s type – the return type
of the sum() function – depends on that program term. At runtime, we’d
perform beta-reduction and substitute all occurrences of k with the reduced
value of k-1. We perform a similar (capture-avoiding) substitution at the
dependent type level, substituting the program expression $k - 1$ for the
free variable $k$. This leaves us with the final set of constraints for $K_2$:
With our constraints finalized, we proceed with constraint solving just like
in the sum() example, where we learn from the base case’s constraints that
$K_2 <: { int | 0 \leq v \wedge k \leq v}$. (It’s possible that the
other path constraint will further remove one or more of these predicates in
subsequent refinement rounds if this type is too strong). Moving to solving
the recursive case, we substitute the program expression k-1 for the free
variable k in $K_2$:
In so doing, we have eliminated the free variables $k$ and $v$, lowered s’s
type assertion into its conjuncts, and yielded a straightforward typing
constraint to solve as we did with max().
\neg(k < 0); ; (0 \leq s); ; (k-1 \leq s) ; & \vdash ; { int ;|; V = s + k } & <: ; & (int ;|: 0 \leq v & ; \wedge ; & k \leq v) \ \end{split} $$
This substitution has let us infer some interesting facts: s must
be nonnegative, as must k: Z3 can therefore infer enough facts about their
sum to conclude that the type of sum() must be $int \rightarrow { int |
0 \leq v \wedge k \leq v}$.
Liquid Types as a heuristic for Quantifier Instantiation
In a more standard program logic, we might describe the behavior of sum with the predicate: $\forall k \cdot 0 \leq sum(k) \wedge k \leq sum(k)$. This expresses the same thing as our refinement type, but using a quantifier. In this particular example, Z3’s pattern-based quantifier instantiation probably wouldn’t have any difficulty checking this verification condition. However, in general, figuring out a good way to instantiate quantified variables is extremely difficult[17:1], and is a subject of extensive and active research.
Liquid types eschew the need for quantifiers by factoring reasoning about them into the type system[26]. Quantified variables can be replaced with function types with refinements on base types. Because refinements can only reference variables in scope, this essentially limits quantifier instantiation to program terms that a function is called with. In practice, the authors claim that this turns out to be a very useful heuristic[27]. It’s worth keeping in mind though, that this introduces bias–it only works for functions that are case-splits over a single data type’s constructors.
Inferring sum()'s closed form with custom qualifiers and manual annotations
One final observation: We know the closed form of this summation: $\Sigma_{i=0}^{n} = \frac{n * (n+1)}{2}$. While a hypothetical oracle could hand down a refinement type with this predicate, we know our current set of qualifiers lack this particular expression shape, so it will never be found.
However, there’s nothing that limits application programmers from inserting their own bias into the reconstruction algorithm if they so choose: if we know we’d like a type with a particular shape to be inferred, and can express it the liquid type system’s qualifier DSL, then the system can incorporate it into reconstruction.
Simply adding (_ * (_ + 1)) / 2 == V into the qualifier list alone doesn’t
suffice, however: because the closed form only holds for nonnegative
integers, and my_sum() can consume any integer at all, the validity check on
this new qualifier will fail and predicate abstraction will filter it from the
valid qualifier list. We can handle the nonnegativity requirement in two ways:
First, the qualifier can be made an implication (_ >= 0) => (_ * (_ + 1)) / 2, which is vacuously true in the base case and satisfied in the recursive
step case. Second, we could abstain from adding the implication to the
qualifier, and instead manually refine the type of the input argument k to
only consume non-negative values, which has the same effect. Both solutions
are shown below:
# Our two custom qualifiers: the precondition required for the closed form
# to hold, and the closed form itself.
nonneg = _ >= 0
closed_form = (_ * (_ + 1)) / 2 == V
# This implementation treats the precondition as part of the qualifier, yielding
# the implication as part of the return type's predicate:
#
# (k:int) -> {int | k <= V and 0 <= V and (-(k >= 0) or k * k + 1 / 2 == V })
@solvent.infer(user_quals=[nonneg.implies(closed_form)])
def my_sum(k):
"Computes sum(range(k))."
if k < 0:
return 0
else:
s = my_sum(k - 1)
return s + k
# By contrast, this implementation adds the nonnegativity constraint as a
# refinement of the argument to the sum: This simplifies the custom qualifier set
# and the return type's predicate, at the expense of explicitly stating the input
# type, and restricting the input domain. The final inferred type is:
#
# (k:{int | V >= 0 }) -> {int | k <= V and 0 <= V and (k * k + 1 / 2 == V })
@solvent.infer(user_quals=[closed_form])
def my_sum(k: {int | nonneg}):
"Computes sum(range(k))."
if k < 0:
return 0
else:
s = my_sum(k - 1)
return s + k
Two different ways for an application developer to customise the type
inference procedure in order to reconstruct the closed form for sum(), each
with expressivity and automation tradeoffs. Notice that the reconstructed
type could be further simplified to remove redundant clauses.
Subsequent Work
The original liquid types paper focused on recursive programs in OCaml with an eye towards provably eliding bounds checks, in the style of Dependent ML[10:2] and previous ML refinement type work[18:1]. Generalizing followup work came soon after, focusing on imperative programs in C[20:2] and lazy functional languages[21:2]. More recently, follow-up work has been done in integrating liquid types into a gradual typing environment[24:1] and industrial languages like TypeScript[28].
More broadly, integration of SMT solvers into programming languages has been explored by both higher-order function languages[29] and imperative, OO-style ones[15:1].
Summary
Dependent type reconstruction is an impossibility if careful constraints are not placed on the type system. Leveraging SMT for discharging subtyping checks lets us maintain all the properties of simple polymorphic type systems while increasing expressiveness. We retain soundness – certainly the most important type-theoretic guarantee – because we know Z3 and its theories are themselves sound. And subtyping and termination guarantees remain because our infinite space of qualifiers is reduced to a finite lattice.
References
Pierce, Benjamin C. “Types and Programming Languages” (2002). MIT Press. ↩︎ ↩︎ ↩︎ ↩︎ ↩︎
Mimram, Samuel. “Program = Proof” (2021). ↩︎
Milner, Robin. “A theory of type polymorphism in programming” (1978). Journal of Computer and System Sciences. ↩︎
Vazou, Niki, et al. “LiquidHaskell: Experience With Refinement Types in the Real World”. Programming Languages Design and Implementation (PLDI) 2014. ↩︎ ↩︎
Cousot, Patrick and Cousot, Radhia. “Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints” (1977). POPL. ↩︎
ScalaCats NonEmptyList documentation. ↩︎
Zenger, Christoph. “Indexed Types” (1997). Theoretical Computer Science. ↩︎
Xi, Hongwei and Pfenning, Frank. “Dependent Types in Practical Programming” (1999). POPL. ↩︎
“Shapeless: Generic programming for Scala”. https://github.com/milessabin/shapeless ↩︎
Xi, Hongwei and Pfenning, Frank. “Eliminating Array Bound Checking Through Dependent Types” (1998). PLDI. ↩︎ ↩︎ ↩︎
Ball, Thomas and Rajamani, Sriram. “The SLAM Project: Debugging System Software via Static Analysis” (2002). POPL. ↩︎ ↩︎
Beyer, Dirk, et al. “Lazy Abstraction” (2002). POPL. Berkeley Lazy Abstraction Software Verification Tool (BLAST). ↩︎ ↩︎
Flanagan, Cormac and Qadeer, Shaz. “Predicate Abstraction for Software Verification” (2002). POPL. ↩︎ ↩︎
McMillan, Kenneth L. “Lazy Annotation for Program Testing and Verification” (2010). CAV. [“The Road to Nowhere” approach] ↩︎
Leino, K. Rustan M. “Dafny: An Automatic Program Verifier for Functional Correctness” (2010). LPAR. ↩︎ ↩︎
Ranise, Silvio and Zarba, Calogero G. “A Theory of Singly-Linked Lists and Its Extensible Decision Procedure” (2006). SEFM. ↩︎
Leino, K. Rustan M. and Pit-Claudel, Clément. “Trigger Selection Strategies to Stabilize Program Verifiers” (2016). CAV. ↩︎ ↩︎
Freeman, Tim and Pfenning, Frank. “Refinement Types for ML” (1991). PLDI. ↩︎ ↩︎
Rondon, Patrick, et al. “Liquid Types” (2008). PLDI. ↩︎ ↩︎ ↩︎
Rondon, Patrick, et al. “Low-Level Liquid Types” (2010). POPL. ↩︎ ↩︎ ↩︎
Vazou, Niki, et al. “Refinement Types for Haskell” (2014). ICFP. ↩︎ ↩︎ ↩︎
Wells, J. B. “Typability and type checking in System F are equivalent and undecidable” (1999). Annals of Pure and Applied Logic. ↩︎
Kawaguchi, Ming, et al. “Type-based Data Structure Verification” (2009). PLDI. ↩︎
Lehmann, Nico and Tanter, Eric. “Gradual Liquid Type Inference” (2018). OOPSLA. ↩︎ ↩︎
Liskov, Barbara. “Data Abstraction and Hierarchy” (1987). OOPSLA Keynote. ↩︎
Jhala, Ranjit and Vazou, Niki. “Refinement Types: A Tutorial” (2020). Foundations and Trends in Programming Languages. ↩︎
Vazou, Niki. “Liquid Haskell: Haskell as a Theorem Prover” (2016). PhD Thesis, UC San Diego. ↩︎
Feldthaus, Asger and Møller, Anders. “Adding Refinement Types to TypeScript” (2020). ECOOP. ↩︎
Swamy, Nikhil, et al. “F*: A Higher-Order Effectful Language Designed for Program Verification” (2016). POPL. ↩︎