In programming language semantics, normalisation by evaluation (NBE) is a method of obtaining the normal form of terms in the λ-calculus by appealing to their denotational semantics. A term is first interpreted into a denotational model of the λ-term structure, and then a canonical (β-normal and η-long) representative is extracted by reifying the denotation. This essentially semantic, reduction-free approach differs from the more traditional syntactic, reduction-based description of normalisation as a sequence of reductions in a term rewrite system, where β-reductions are allowed deep inside λ-terms.
NBE was first described for the simply typed lambda calculus.[1] It has since been extended both to weaker type systems such as the untyped lambda calculus[2] using a domain theoretic approach, and to richer type systems such as several variants of Martin-Löf type theory.[3] [4] [5] [6]
Consider the simply typed lambda calculus, where types τ can be basic types (α), function types (→), or products (×), given by the following Backus–Naur form grammar (→ associating to the right, as usual):
(Types) τ ::= α | τ1 → τ2 | τ1 × τ2
These can be implemented as a datatype in the meta-language; for example, for Standard ML, we might use:
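(* One possible datatype of types; the constructor names Basic, Arrow and Prod match
   the implementation below, while representing base types by a string name is an
   assumption of this sketch. *)
datatype ty = Basic of string        (* base types α *)
            | Arrow of ty * ty       (* function types τ1 → τ2 *)
            | Prod  of ty * ty       (* product types τ1 × τ2 *)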
(Syntax Terms) s,t,… ::= var x | lam (x, t) | app (s, t) | pair (s, t) | fst t | snd t
Here lam/app (resp. pair/fst, snd) are the intro/elim forms for → (resp. ×), and x ranges over variables. These terms are intended to be implemented as a first-order datatype in the meta-language:
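(* A possible first-order datatype of terms; representing variables by strings is
   consistent with the string-keyed environment used in lookup below. *)
datatype tm = var of string                               (* variables *)
            | lam of string * tm | app of tm * tm         (* intro/elim for → *)
            | pair of tm * tm | fst of tm | snd of tm     (* intro/elim for × *)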
The denotational semantics of (closed) terms in the meta-language interprets the constructs of the syntax in terms of features of the meta-language; thus, lam is interpreted as abstraction, app as application, etc. The semantic objects constructed are as follows:
(Semantic Terms) S,T,… ::= LAM (λx. S x) | PAIR (S, T) | SYN t
Note that there are no variables or elimination forms in the semantics; they are represented simply as syntax. These semantic objects are represented by the following datatype:
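(* A datatype of semantic objects, matching the constructors used by reflect, reify
   and meaning below: semantic functions are meta-language functions, and anything
   else is residual syntax. *)
datatype sem = LAM of (sem -> sem)
             | PAIR of sem * sem
             | SYN of tm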
There is a pair of type-indexed functions that move back and forth between the syntactic and semantic layers. The first function, usually written ↑τ, reflects the term syntax into the semantics, while the second, written ↓τ, reifies the semantics back into a syntactic term. Their definitions are mutually recursive, as follows:
\begin{align}
\uparrow_{\alpha} t &= \mathbf{SYN}\ t \\
\uparrow_{\tau_1 \to \tau_2} v &= \mathbf{LAM}\ (\lambda S.\ \uparrow_{\tau_2} (\mathbf{app}\ (v, \downarrow^{\tau_1} S))) \\
\uparrow_{\tau_1 \times \tau_2} v &= \mathbf{PAIR}\ (\uparrow_{\tau_1} (\mathbf{fst}\ v), \uparrow_{\tau_2} (\mathbf{snd}\ v)) \\[1ex]
\downarrow^{\alpha} (\mathbf{SYN}\ t) &= t \\
\downarrow^{\tau_1 \to \tau_2} (\mathbf{LAM}\ S) &= \mathbf{lam}\ (x, \downarrow^{\tau_2} (S\ (\uparrow_{\tau_1} (\mathbf{var}\ x)))) \quad \text{where } x \text{ is fresh} \\
\downarrow^{\tau_1 \times \tau_2} (\mathbf{PAIR}\ (S, T)) &= \mathbf{pair}\ (\downarrow^{\tau_1} S, \downarrow^{\tau_2} T)
\end{align}
These definitions are easily implemented in the meta-language:
(* reflect : ty -> tm -> sem *)
fun reflect (Arrow (a, b)) t =
      LAM (fn S => reflect b (app (t, reify a S)))
  | reflect (Prod (a, b)) t =
      PAIR (reflect a (fst t), reflect b (snd t))
  | reflect (Basic _) t =
      SYN t
(* reify : ty -> sem -> tm *)
and reify (Arrow (a, b)) (LAM S) =
      let val x = fresh_var ()   (* fresh_var is assumed to generate a fresh variable name *)
      in  lam (x, reify b (S (reflect a (var x))))
      end
  | reify (Prod (a, b)) (PAIR (S, T)) =
      pair (reify a S, reify b T)
  | reify (Basic _) (SYN t) =
      t
By induction on the structure of types, it follows that if the semantic object S denotes a well-typed term s of type τ, then reifying the object (i.e., ↓τ S) produces the β-normal η-long form of s. All that remains is, therefore, to construct the initial semantic interpretation S from a syntactic term s. This operation, written ∥s∥Γ, where Γ is a context of bindings, proceeds by induction solely on the term structure:
\begin{align}
\| \mathbf{var}\ x \|_{\Gamma} &= \Gamma(x) \\
\| \mathbf{lam}\ (x, s) \|_{\Gamma} &= \mathbf{LAM}\ (\lambda S.\ \| s \|_{\Gamma, x \mapsto S}) \\
\| \mathbf{app}\ (s, t) \|_{\Gamma} &= S\ (\| t \|_{\Gamma}) \quad \text{where } \| s \|_{\Gamma} = \mathbf{LAM}\ S \\
\| \mathbf{pair}\ (s, t) \|_{\Gamma} &= \mathbf{PAIR}\ (\| s \|_{\Gamma}, \| t \|_{\Gamma}) \\
\| \mathbf{fst}\ s \|_{\Gamma} &= S \quad \text{where } \| s \|_{\Gamma} = \mathbf{PAIR}\ (S, T) \\
\| \mathbf{snd}\ t \|_{\Gamma} &= T \quad \text{where } \| t \|_{\Gamma} = \mathbf{PAIR}\ (S, T)
\end{align}
In the implementation:
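The context Γ needs a concrete representation; one possible choice, consistent with the patterns used in lookup and meaning below, is a list-like datatype (the constructor names empty and add are part of this sketch):

datatype ctx = empty
             | add of ctx * (string * sem)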
(* lookup : ctx -> string -> sem *)
fun lookup (add (remdr, (y, value))) x =
      if x = y then value else lookup remdr x
(* meaning : ctx -> tm -> sem *)
fun meaning G t =
      case t of
        var x        => lookup G x
      | lam (x, s)   => LAM (fn S => meaning (add (G, (x, S))) s)
      | app (s, t)   => (case meaning G s of LAM S => S (meaning G t))
      | pair (s, t)  => PAIR (meaning G s, meaning G t)
      | fst s        => (case meaning G s of PAIR (S, T) => S)
      | snd t        => (case meaning G t of PAIR (S, T) => T)
Note that there are many non-exhaustive cases; however, if applied to a closed well-typed term, none of these missing cases are ever encountered. The NBE operation on closed terms is then:
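(* nbe : ty -> tm -> tm *)
(* interpret the closed term in the empty context sketched above, then reify at the given type *)
fun nbe a t = reify a (meaning empty t)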
As an example of its use, consider the syntactic term SKK defined below:
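(* the standard combinators K = λx. λy. x and S = λx. λy. λz. x z (y z);
   the concrete variable names are only illustrative *)
val K   = lam ("x", lam ("y", var "x"))
val S   = lam ("x", lam ("y", lam ("z", app (app (var "x", var "z"), app (var "y", var "z")))))
val SKK = app (app (S, K), K)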
This is the well-known encoding of the identity function in combinatory logic. Normalising it at an identity type produces:
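nbe (Arrow (Basic "a", Basic "a")) SKK
(* yields lam ("v0", var "v0"), i.e. the identity λx. x;
   the concrete fresh name depends on the assumed fresh_var *)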
The result is actually in η-long form, as can be easily seen by normalizing it at a different identity type:
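nbe (Arrow (Arrow (Basic "a", Basic "b"), Arrow (Basic "a", Basic "b"))) SKK
(* yields lam ("v1", lam ("v2", app (var "v1", var "v2"))),
   the η-long form λf. λx. f x; the fresh names again depend on fresh_var *)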
Using de Bruijn levels instead of names in the residual syntax makes reify a pure function, in that there is no need for fresh_var.[8]
The datatype of residual terms can also be chosen to be a datatype of terms in normal form. The type of reify (and therefore of nbe) then makes it clear that the result is normalized. And if the datatype of normal forms is typed, the type of reify (and therefore of nbe) makes it clear that normalization is type-preserving.[9]
Normalization by evaluation also scales to the simply typed lambda calculus with sums (+),[7] using the delimited control operators shift and reset.[10]