Polymorphic variant inference that beats OCaml
29 May 2023You may remember that a few months ago, I spent quite some time implementing type inference for polymorphic variant patterns in Polaris. In that post, I settled on roughly the same approach as OCaml. As it turns out, we can do much better!
A quick recap
The issue with inference for these kinds of patterns is that sometimes, polymorphic variant patterns should be inferred to a closed variant type and sometimes to an open type
# This should have type f : < A, B > -> Bool
let f(x) = match x {
A -> true
B -> false
}
# This should have type g : forall r. < A, B | r > -> Bool
let g(x) = match x {
A -> true
B -> false
_ -> false
}
The solution taken by both Polaris and OCaml is to always infer an
open variant type and then later check the patterns for exhaustiveness
and close the type when appropriate (by setting the row variable to the
empty variant <>
).
This is not enough
Consider this function (This may seem a bit artificial, but I ran into something similar in a real Polaris script!)
let f(x) = match x {
A -> B
y -> y
}
What is the type of y
? Both OCaml and current Polaris
will infer < A, B | ?r >
, and therefore infer the
full type
f : forall r. < A, B | r > -> < A, B | r >
.
But if you look closely, you will notice that this is not actually
correct! y
can never be A
, because in
that case, the pattern above it will match.
It gets a bit more complicated still. If we change the function slightly,
let f(x) = match x {
(A, 5) -> B
(y, _) -> y
}
the first pattern may not match, even if the first component is
A
, so y
can be A
now!
Fortunately, there is a relatively simple rule to figure out when to
refine variants. We know that all further patterns cannot contain a
variant, whenever a pattern is otherwise irrefutable
(i.e. will always match). For example, ((A, x), _)
is
irrefutable apart from the variant pattern A
, but
(A, 5)
is not, because 5
is not
irrefutable.
We can then infer the correct type for the function above
(forall r. < A, B | r > -> < B | r >
), by
refining the variant for the remaining patterns.
But how do you implement refinement?
This is a little harder for Polaris than it would be for OCaml, since Polaris’ type system is constraint based. This means we usually cannot match on types directly during type inference as they may still contain unification variables that will later be substituted by the constraint solver.
Right now, match expressions are inferred like this:
- Type inference infers the type of the scrutinee (
x
in the example above. In the difficult case, this is just a unification variable) - All patterns are independently checked against this type
- The body of each case is inferred and the types are all unified. The resulting type is the type of the result of the match expression
To refine variant types, we need to take a slightly different approach:
- We still infer the type of the scrutinee and use this as the working type
- Patterns are checked in order from top to bottom.
- After checking one pattern, we see if the pattern is irrefutable apart from a single variant pattern. If it is, we remember that variant constructor, as well as a path to it in the full type.
- Accordingly, we emit a constraint
RefineVariant (original_type, path, variant, refined_type)
, whereoriginal_type
is the type we matched the pattern against andrefined_type
is a fresh unification variable. - When encountering one of these constraints, the constraint solver
will traverse the path in the type (which is now fully constructed),
remove the desired variant and bind the resulting type to
refined_type
. - We continue by recursing on the other patterns, except that this
time, we check them against
refined_type
. - Inference for the case bodies remains unchanged.
And that’s it! For only a bit more effort, this should infer polymorphic variants much more precisely than OCaml!