How to read inference rules

28 July 2023

The notation is probably one of the largest barriers to entry for type inference papers, but it is rarely explained explicitly, so… I’m going to do just that!

For starters, inference rules are really nothing more than implications. The inference rule

$\frac{A}{B} (Name)$

really just means “if A then B”. These are usually given a name (in this case, creatively, $Name$) to make it easier to refer to them in the rest of the paper.

Now, even though these are technically just implications, it’s usually not a great idea to read them from top to bottom. Inference rules denote relations, but it usually makes more sense to read them as (possibly non-deterministic) functions. For example, a rule for typing function application might look like this (where $Γ ⊢ e : τ$ means “in a context $Γ$, the type of an expression $e$ is inferred to be $τ$”).

$\frac{Γ ⊢ e_{1} : τ_{1} \to τ_{2} \quad Γ ⊢ e_{2} : τ_{1}}{Γ ⊢ e_{1} (e_{2}) : τ_{2}} (App)$

Naively, one might read this as

If $e_{1}$ has type $τ_{1} \to τ_{2}$ in a context $Γ$ and $e_{2}$ has type $τ_{1}$ in $Γ$, then $e_{1} (e_{2})$ has type $τ_{2}$ in $Γ$.

but a much better way to read it, one that is much closer to an actual implementation, would be

In order to infer a type for $e_{1} (e_{2})$ in a context $Γ$, one first needs to infer a type for $e_{1}$ with shape $τ_{1} \to τ_{2}$ in $Γ$. Next, the type inferred for $e_{2}$ in $Γ$ needs to be $τ_{1}$, so that the result (i.e. the type of $e_{1} (e_{2})$) is $τ_{2}$.

Read this way, the inference rule maps very closely onto an actual implementation! Seriously, compare the corresponding pseudocode to that second description:

infer Γ (App e1 e2) =
    let (τ1 -> τ2) = infer Γ e1
    let τ3 = infer Γ e2
    unify τ1 τ3
    return τ2

The only major difference between this code (which skips error handling, just like inference rules) and the inference rule is that the fact that the type of $e_{2}$ needs to be equal to $τ_{1}$ is explicit in the code (unify τ1 τ3).
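To make that concrete, here is a small runnable sketch of the same rule in Python. Everything in it is my own assumption rather than part of any paper: the class names (`TCon`, `TFun`, `Var`, `App`), the environment as a plain dict, and the use of a simple equality check where the pseudocode says `unify` (there are no unification variables yet, so equality is enough). Unlike the pseudocode, it does raise errors when a premise fails.

```python
from dataclasses import dataclass

# Types: a base type like Int, or a function type τ1 -> τ2.
@dataclass(frozen=True)
class TCon:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

# Expressions: variables and applications e1(e2).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

def infer(env, expr):
    """Infer a type for expr in env (a dict from variable names to types)."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, App):
        fn_type = infer(env, expr.fn)      # premise 1: Γ ⊢ e1 : τ1 -> τ2
        if not isinstance(fn_type, TFun):
            raise TypeError(f"not a function type: {fn_type}")
        arg_type = infer(env, expr.arg)    # premise 2: Γ ⊢ e2 : τ3
        if arg_type != fn_type.arg:        # stands in for `unify τ1 τ3`
            raise TypeError(f"expected {fn_type.arg}, got {arg_type}")
        return fn_type.res                 # conclusion: Γ ⊢ e1(e2) : τ2
    raise TypeError(f"unknown expression: {expr}")

INT, BOOL = TCon("Int"), TCon("Bool")
env = {"is_zero": TFun(INT, BOOL), "n": INT}
print(infer(env, App(Var("is_zero"), Var("n"))))  # TCon(name='Bool')
```

Each premise above the line becomes one recursive call, and the conclusion below the line becomes the return value.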

Reading off the algorithm like this is possible if the inference rules are syntax-directed, i.e. if there is only ever a single rule that might match on a given expression. This is not always the case, so sometimes it’s better to imagine non-deterministically choosing the correct rule to apply, rather than just pattern matching.

And that’s… pretty much all you need to know to read inference rules!

There are a few common conventions in type systems that might be a bit surprising, so let’s go over those as well.

Environments and extension

Type inference needs an environment to keep track of the types of variables. This is usually called $Γ$ and extended as $Γ, x : τ$.

For example, this inference rule for (annotated) let bindings checks $e_{2}$ under the environment $Γ$ extended with the binding $x : τ_{1}$.

$\frac{Γ ⊢ e_{1} : τ_{1} \quad Γ, x : τ_{1} ⊢ e_{2} : τ_{2}}{Γ ⊢ \text{let } x : τ_{1} = e_{1} \text{ in } e_{2} : τ_{2}} (Let)$

Extracting information from the environment is achieved through “pattern matching” on the environment, for example in this inference rule for variables.

$\frac{}{Γ, x : τ ⊢ x : τ} (Var)$
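In code, the two rules for let bindings and variables might look roughly like the sketch below. The representation is my assumption: the environment is a dict, types are plain strings, and `Lit`, `Var` and `Let` are hypothetical expression classes. Extending $Γ$ becomes building a new dict, and “pattern matching” on $Γ$ becomes a lookup. I’m also assuming that a newer binding shadows an older one with the same name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lit:            # an integer literal
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Let:            # let x : τ1 = e1 in e2
    name: str
    annotation: str   # τ1, here just the name of a type
    bound: object     # e1
    body: object      # e2

def infer(env, expr):
    if isinstance(expr, Lit):
        return "Int"
    if isinstance(expr, Var):
        # Var: find the binding x : τ in the environment.
        return env[expr.name]
    if isinstance(expr, Let):
        bound_type = infer(env, expr.bound)             # Γ ⊢ e1 : τ1
        if bound_type != expr.annotation:
            raise TypeError(f"expected {expr.annotation}, got {bound_type}")
        extended = {**env, expr.name: expr.annotation}  # Γ, x : τ1
        return infer(extended, expr.body)               # Γ, x : τ1 ⊢ e2 : τ2
    raise TypeError(f"unknown expression: {expr}")

program = Let("x", "Int", Lit(1), Var("x"))
print(infer({}, program))  # Int
```

Note that `extended` is a fresh dict, so the binding for `x` is only visible while checking the body, just like $Γ, x : τ_{1}$ only appears in the second premise.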

Unification variables

Unification variables don’t exist in theoretical type systems, but they still map very directly onto a similar concept. Instead of generating a fresh unification variable, inference rules just “guess” a new type (they’re relations, remember?).

For example, this typing rule for (unannotated) lambdas just pulls the type $τ$ out of thin air.

$\frac{Γ, x : τ ⊢ e : τ_{1}}{Γ ⊢ λ x \to e : τ \to τ_{1}} (Lambda)$
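Here is a rough sketch of how an implementation might replace that guess with a fresh unification variable. All names (`TUni`, `fresh`, `resolve`, `unify`, the global `solution` table) are hypothetical, and this is one simple way to do it, not the way any particular paper or compiler does. The `App` case also uses a fresh variable for the result type, which is a small variation on the earlier pseudocode that avoids pattern matching on a type that may still be unknown.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class TCon:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

@dataclass(frozen=True)
class TUni:           # unification variable: a placeholder for a "guessed" type
    id: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:            # λ x -> e
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

fresh_ids = count()
solution = {}         # maps ids of solved unification variables to types

def fresh():
    return TUni(next(fresh_ids))

def resolve(t):
    """Replace every solved unification variable in t by its solution."""
    if isinstance(t, TUni) and t.id in solution:
        return resolve(solution[t.id])
    if isinstance(t, TFun):
        return TFun(resolve(t.arg), resolve(t.res))
    return t

def occurs(var_id, t):
    """Does the unification variable var_id appear in the (resolved) type t?"""
    if isinstance(t, TUni):
        return t.id == var_id
    if isinstance(t, TFun):
        return occurs(var_id, t.arg) or occurs(var_id, t.res)
    return False

def unify(a, b):
    a, b = resolve(a), resolve(b)
    if a == b:
        return
    if isinstance(b, TUni) and not isinstance(a, TUni):
        a, b = b, a   # make sure any unification variable is on the left
    if isinstance(a, TUni):
        if occurs(a.id, b):
            raise TypeError(f"infinite type: {a} occurs in {b}")
        solution[a.id] = b
    elif isinstance(a, TFun) and isinstance(b, TFun):
        unify(a.arg, b.arg)
        unify(a.res, b.res)
    else:
        raise TypeError(f"cannot unify {a} with {b}")

def infer(env, expr):
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Lam):
        param_type = fresh()   # the rule's guessed τ
        body_type = infer({**env, expr.param: param_type}, expr.body)
        return TFun(param_type, body_type)
    if isinstance(expr, App):
        fn_type = infer(env, expr.fn)
        arg_type = infer(env, expr.arg)
        result_type = fresh()
        unify(fn_type, TFun(arg_type, result_type))
        return result_type
    raise TypeError(f"unknown expression: {expr}")

BOOL = TCon("Bool")
negate = Lam("x", App(Var("not"), Var("x")))
print(resolve(infer({"not": TFun(BOOL, BOOL)}, negate)))
# TFun(arg=TCon(name='Bool'), res=TCon(name='Bool'))
```

The rule simply picks the right $τ$ up front; the implementation picks a placeholder and lets `unify` find out later what the guess should have been.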

Lists

Something you will see pretty often in papers by Simon Peyton Jones is a list written with an overline. E.g. the syntax for uncurried function application might be $e_{1} (\overline{e})$, where $\overline{e}$ consists of 0 or more expressions.
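In an implementation, an overlined list usually just turns into a loop. The sketch below shows one way uncurried application could be checked. The rule it implements is my own guess at what such a rule would say (a function type with a list of argument types, each argument checked against the matching one), and `Call`, `TFun` and the string-based base types are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TFun:           # function type with a list of argument types
    args: tuple       # plays the role of an overlined τ
    res: object

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Call:           # e1(e, e, ...)
    fn: object
    args: tuple       # plays the role of the overlined e

def infer(env, expr):
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Call):
        fn_type = infer(env, expr.fn)
        if not isinstance(fn_type, TFun):
            raise TypeError(f"not a function type: {fn_type}")
        if len(fn_type.args) != len(expr.args):
            raise TypeError("wrong number of arguments")
        # One premise per element of the list.
        for arg, expected in zip(expr.args, fn_type.args):
            actual = infer(env, arg)
            if actual != expected:
                raise TypeError(f"expected {expected}, got {actual}")
        return fn_type.res
    raise TypeError(f"unknown expression: {expr}")

env = {"add": TFun(("Int", "Int"), "Int"), "x": "Int"}
print(infer(env, Call(Var("add"), (Var("x"), Var("x")))))  # Int
```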

Skolems

Similarly, skolems don’t exist as a separate concept. Instead, “unbound” type variables are treated as skolems, although these obviously cannot conflict with any other type variables in scope! In an implementation, this would be achieved by generating a fresh skolem, but in inference rules, this is expressed by the side condition that the type variable should not occur “free in the environment”, written $a \notin ftv (Γ)$, where $ftv$ denotes the set of free type variables (= skolems) in $Γ$.

For example, a rule for let bindings with polymorphic types (that need to be skolemized) might look like this:

$\frac{Γ ⊢ e_{1} : τ_{1} \quad \overline{a} \notin ftv (Γ) \quad Γ, x : \forall \overline{a} . τ_{1} ⊢ e_{2} : τ_{2}}{Γ ⊢ \text{let } x : \forall \overline{a} . τ_{1} = e_{1} \text{ in } e_{2} : τ_{2}}$
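The side condition itself is easy to compute. Here is a minimal sketch of $ftv$ and the check $\overline{a} \notin ftv (Γ)$, assuming a hypothetical representation of types (`TVar`, `TCon`, `TFun`, `Forall`) and a dict for the environment. A real implementation would more likely generate fresh skolems than check this condition, as described above; this just spells out what the rule literally says.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:           # a type variable such as a
    name: str

@dataclass(frozen=True)
class TCon:
    name: str

@dataclass(frozen=True)
class TFun:
    arg: object
    res: object

@dataclass(frozen=True)
class Forall:         # ∀ a1 ... an. τ
    params: tuple
    body: object

def ftv(t):
    """The set of free type variables of a single type."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TFun):
        return ftv(t.arg) | ftv(t.res)
    if isinstance(t, Forall):
        return ftv(t.body) - set(t.params)   # bound variables are not free
    return set()

def ftv_env(env):
    """ftv(Γ): the union of the free type variables of every binding in Γ."""
    result = set()
    for t in env.values():
        result |= ftv(t)
    return result

def side_condition_holds(env, params):
    """The premise that none of the variables in params is free in Γ."""
    return ftv_env(env).isdisjoint(params)

identity = Forall(("a",), TFun(TVar("a"), TVar("a")))
print(side_condition_holds({"id": identity}, ("a",)))  # True: a is bound by ∀
print(side_condition_holds({"y": TVar("a")}, ("a",)))  # False: a is free in Γ
```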

Where to go from here

Great, with a little practice, you should be able to read inference rules now! I would recommend you read “Practical type inference for higher rank types”, which is a great, relatively beginner-friendly paper about type inference that even contains a full implementation at the end! (And despite the name, it is not just about higher rank types.)