Apparently Polaris' constraint solver is non-confluent

27 August 2023

With a constraint based type checker, the order in which constraints are solved should generally not matter. This is called confluence and it is a pretty important property, since it ensures that the constraint solver can process constraints out of order and apply a whole bunch of other clever optimizations. It also makes the type system much more predictable for programmers.

But… currently changing the order in which constraints in Polaris are solved can infer wildly different types and that is pretty bad.

The culprit: program argument constraints

One of the more fundamental features of Polaris are program calls. For example !ls is an expression that runs the program ls and returns it’s output. Of course, programs can also take arguments, but what types should those arguments have? In the end, the operating system is only going to pass strings, but in polaris, lists of strings (and lists of lists of … of lists of strings), numbers and even functions are all valid program arguments. This is achieved by emitting a ProgArg constraint for the type of each argument, which is resolved correctly and quite elegantly for all of the above types.

There is just one catch: What if the argument type is an unsolved unification variable? For example, what should the type of the following function be?

let f(x) = !ls x

!ls returns a string, but we know nothing about its argument other than that it has a ProgArg constraint. We cannot solve the constraint, so we should emit an ambiguity error and reject the program.

But that’s stupid! Obviously, the most natural type for this function is String -> String and that is almost surely what the programmer meant when they wrote it.

So what polaris currently does is that, if the constraint solver fails to make progress, it defaults ProgArg constraints to String, correctly inferring f : String -> String for this example.

This defaulting breaks confluence

This is easier to explain if we assume Polaris had functional dependencies (which it might get at some point). Non-confluence should manifest without them as well, but even if it doesn’t, that just delays the issue.

Let’s say, we define a type class C with two functional dependencies on it’s arguments

class C(a, b) | a -> b, b -> a
instance C(String, Number)
instance C(List(String), String)

What this effectively means is that a constraint C(String, x) implies x ~ Number and C(y, String) implies y ~ List(String) (and vice versa).

Okay, now imagine we task our constraint solver with solving this sequence of constraints

1: ProgArg(?a)
2: C(?a, ?b)
3: ProgArg(?b)

None of these can be solved directly, so our constraint solver will fail to make progress and start defaulting program arguments.

If it starts with constraint (1), it will default ?a ~ String. This unblocks the second constraint, now C(String, ?b), which will be solved by our first instance and imply ?b ~ Number. Constraint (3) is just ProgArg(Number) now, which is trivially solved, leaving us with the successful substitution

?a := String
?b := Number

This actually doesn’t seem too unreasonable, but now consider what happens if our constraint solver decides to solve the constraints in reverse.

If it starts at (3), it defaults ?b ~ String, again unblocking (2), but with the second instance this time. This implies ?a ~ List(String) now, which again leaves our third constraint to be trivially solved as it is just ProgArg(List(String)) now. Therefore, our final substitution this time is

?a := List(String)
?b := String

which completely contradicts the one from before, just because we changed the order in which constraints were solved (or emitted!).

So what are we going to do?

One solution would be to default all ProgArg constraints at once. If we did this above, we would substitute both ?a := String and ?b := String, which would actually lead to the cosntraint solver rejecting the program above, because the functional dependency on C means that C(String, String) is impossible to solve.

This does restore confluence, however it might lead to some very counter intuitive errors. Removing a type signature somewhere might make the constraint solver give up and default a type somewhere completely different that would have been resolved correctly before.

I think there is a better way

Generalizing program argument constraints

Program argument constraints predate type classes by a lot, but now that we finally have them, we might not actually need to default program arguments at all anymore! Remember that the entire reason we cared about defaulting was that we wanted to infer a type for declarations like let f(x) = !ls x.

Now that we have type classes, we can just treat program argument constraints ~exactly like regular type class constraints and let users abstract over them! If we do this, we don’t need to pick a concrete type for x. We can just generalize the function and infer a type f : forall a. ProgArg(a) => a -> String.

Not only is this fully confluent, it also gives the programmer the power to do much more than they could before and possibly even leaves the door open for custom ProgArg instances in the future!