How to read inference rules
28 July 2023The notation used is probably one of the largest barriers of entry to type inference papers, but it is rarely explained explicitly, so… I’m going to do just that!
For starters, inference rules are really nothing more than implications. The inference rule
really just means “if A then B”. These are usually given a name (in this case, creatively, ) to make it easier to refer to them in the rest of the paper.
Now, even though these are technically just implications, it’s usually not a great idea to read them from top to bottom. Inference rules denote relations, but it usually usually makes more sense to read them as (possibly non-deterministic) functions. For example, a judgement for typing function application might look like this. (where means “In a context , the type of an expression is infered to ”)
Naively, one might read this as
If has type in a context and has type in , then has type inbut a much better way to read it, that is much closer to an actual implementation, would be
In order to infer a type for in a context , one first needs to infer a type for with shape in . Now also needs to infer to type in , so that the result (i.e. the type of ) is .
Read this way, the inference rule maps very closely onto an actual implementation! Seriously, compare the corresponding pseudocode to that second description
infer Γ (App e1 e2) = let (τ1 -> τ2) = infer Γ e1 let τ3 = infer Γ e2 unify τ1 τ3 return τ2
The only major difference between this code (which skips error
handling, just like inference rules) and the inference rule is that the
fact that the type of
needs to be equal to
is explicit in the code (unify τ1 τ3
).
Reading off the algorithm like this is possible if the inference rules are syntax directed, i.e. if there is only ever a single rule that might match on a given expression. This is not always the case, so sometimes it’s better to imagine non-deterministically choosing the correct rule to apply, rather than just pattern matching.
And that’s… pretty much all you need to know to read inference rules!
There are a few common conventions in type systems that might be a bit surprising, so let’s go over those as well
Environments and extension
Type inference needs an environment to keep track of the types of variables. This is usually called and extended as .
For example, this inference rule for (annotated) let bindings checks under the environment Γ, extended with the binding .
Extracting information from the environment is achieved through “pattern matching” on the environment, for example in this inference rule for variables.
Unification variables
Unification variables don’t exist in theoretical type systems, but they still map very directly onto a similar concept. Instead of generating a fresh unification variable, inference rules just “guess” a new type (they’re relations, remember?).
For example, this typing rule for (unannotated) lambdas just pulls the type out of thin air.
Lists
Something you will see pretty often in papers by Simon Peyton Jones are lists that are represented by an overline. E.g. the syntax for uncurried function application might be , where consists of 0 or more expressions.
Skolems
Similarly, skolems don’t exist as a separate concept. Instead, “unbound” type variables are treated as skolems, although these obviously cannot conflict with any other type variables in scope! In an implementation, this would be achieved by generating a fresh skolem, but in inference rules, this is expressed by the side condition that the type variable should not occur “free in the environment”, written , where ftv denotes the set of free type variables (= skolems) in .
For example, a rule for let bindings with polymorphic types (that need to be skolemized) might look like this
Where to go from here
Great, with a little practice, you should be able to read inference rules now! I would recommend you read Practical type inference for higher rank types, which is a great, relatively beginner friendly paper about type inference that even contains a full implementation at the end! (And despite the name, is not just about higher rank types).