How to read inference rules

28 July 2023

The notation used is probably one of the largest barriers of entry to type inference papers, but it is rarely explained explicitly, so… I’m going to do just that!

For starters, inference rules are really nothing more than implications. The inference rule

AB(Name)

really just means “if A then B”. These are usually given a name (in this case, creatively, Name) to make it easier to refer to them in the rest of the paper.

Now, even though these are technically just implications, it’s usually not a great idea to read them from top to bottom. Inference rules denote relations, but it usually usually makes more sense to read them as (possibly non-deterministic) functions. For example, a judgement for typing function application might look like this. (where Γe:τ means “In a context Γ, the type of an expression e is infered to τ”)

Γe1  :  τ1τ2      Γe2  :  τ1Γe1(e2)  :  τ2(App)

Naively, one might read this as
If e1 has type τ1τ2 in a context Γ and e2 has type τ1 in Γ, then e1(e2) has type τ2 in Γ
but a much better way to read it, that is much closer to an actual implementation, would be
In order to infer a type for e1(e2) in a context Γ, one first needs to infer a type for e1 with shape τ1τ2 in Γ. Now e2 also needs to infer to type τ1 in Γ, so that the result (i.e. the type of e1(e2)) is τ2.

Read this way, the inference rule maps very closely onto an actual implementation! Seriously, compare the corresponding pseudocode to that second description

infer Γ (App e1 e2) =
    let (τ1 -> τ2) = infer Γ e1
    let τ3 = infer Γ e2
    unify τ1 τ3
    return τ2

The only major difference between this code (which skips error handling, just like inference rules) and the inference rule is that the fact that the type of e2 needs to be equal to τ1 is explicit in the code (unify τ1 τ3).

Reading off the algorithm like this is possible if the inference rules are syntax directed, i.e. if there is only ever a single rule that might match on a given expression. This is not always the case, so sometimes it’s better to imagine non-deterministically choosing the correct rule to apply, rather than just pattern matching.

And that’s… pretty much all you need to know to read inference rules!

There are a few common conventions in type systems that might be a bit surprising, so let’s go over those as well

Environments and extension

Type inference needs an environment to keep track of the types of variables. This is usually called Γ and extended as Γ,x  :  τ.

For example, this inference rule for (annotated) let bindings checks e2 under the environment Γ, extended with the binding x  :  τ1.

Γe1  :  τ1      Γ,x  :  τ1e2  :  τ2Γlet x  :  τ1=e1 in e2  :  τ2Let

Extracting information from the environment is achieved through “pattern matching” on the environment, for example in this inference rule for variables.

Γ,x  :  τx  :  τVar

Unification variables

Unification variables don’t exist in theoretical type systems, but they still map very directly onto a similar concept. Instead of generating a fresh unification variable, inference rules just “guess” a new type (they’re relations, remember?).

For example, this typing rule for (unannotated) lambdas just pulls the type τ out of thin air.

Γ,x  :  τe  :  τ1Γλxe  :  ττ1Lambda

Lists

Something you will see pretty often in papers by Simon Peyton Jones are lists that are represented by an overline. E.g. the syntax for uncurried function application might be e1(e), where e consists of 0 or more expressions.

Skolems

Similarly, skolems don’t exist as a separate concept. Instead, “unbound” type variables are treated as skolems, although these obviously cannot conflict with any other type variables in scope! In an implementation, this would be achieved by generating a fresh skolem, but in inference rules, this is expressed by the side condition that the type variable should not occur “free in the environment”, written aftv(Γ), where ftv denotes the set of free type variables (= skolems) in Γ.

For example, a rule for let bindings with polymorphic types (that need to be skolemized) might look like this

Γe1  :  τ1      aftv(Γ)      Γ,x  :  a.τ1e2  :  τ2Γlet x  :  a.τ1=e1 in e2  :  τ2

Where to go from here

Great, with a little practice, you should be able to read inference rules now! I would recommend you read Practical type inference for higher rank types, which is a great, relatively beginner friendly paper about type inference that even contains a full implementation at the end! (And despite the name, is not just about higher rank types).