Day 9: Capturing the Notion of Parametricity

Today was spent implementing System F, also known as the polymorphic lambda calculus, or λ2. I am studying System F as a prerequisite for a solid grasp of System Fω, which is itself a prerequisite for moving forward with my planned work on 1ML. My studies today turned up an interesting clue connecting the original motive driving Reynolds’ work on the polymorphic lambda calculus with the pursuit of modularity. The latter has been one of the driving urges of my life for at least a decade, and it is at the root of my interest in 1ML. I have made some notes in this entry recording what I learned. (I expect I will eventually end up reworking and incorporating them into my WIP notes on modularity).

A few notes on the history of System F

Jean-Yves Girard first defined System F in his PhD thesis in 1972. Writting about that work around 2010, Girard characterized the calculus thus:

System F, contrary to simply typed λ-calculus, is constructed around Curry-Howard, as the isomorphic image of intuitionistic second-order propositional calculus.

(Girard 2011)

Two years later, Reynolds would introduce the same formalism with slightly different aspect and the explicitly computational aim of developing a theory of the “type structure of programming languages”¹. According to Nederpelt and Geuvers,

Independent of Girard, J.C. Reynolds invented a similar typed lambda calculus that he called the polymorphic lambda calculus. He constructed this system in order to capture the notion of parametricity.

(Nederpelt and Geuvers 2014)

λ2 “captures the notion of parametricity” in a way that is immanent to the syntax of the calculus itself: it enables polymorphic functions by extending the simply typed lambda calculus with an additional abstraction mechanism for building functions of types that return terms. This augments the simply typed lambda calculus’ terms that depend on terms with terms that depend on types.

Consider the identity function on integers:

let int_id : int -> int = fun x -> x

λ2 allows parameterizing such functions by including functions that can take a type as an argument, returning expressions whose types may depend on the type parameter supplied. We can depict this in the pseudo OCaml below (which is quite close to valid 1ML):

let id (t : type) : t -> t = fun x -> x

This polymorphic id function takes a type t as an argument and returns an identity function on values of that type. With this function in hand, can then define any number of specialized id functions by supplying an appropriate type as a parameter:

let int_id : int -> int       = id (int : type)
let str_id : string -> string = id (string : type)

(* The identity function on functions of some parameterized type *)
let fun_id (t : type) : (t -> t) -> (t -> t) = id (t -> t : type)

Reynolds introduced λ2 in the 1974 paper “Towards a theory of type structure” (Reynolds 1974). The innovation of polymorphic functions just explained is described therein as a solution to “the old but neglected problem of polymorphic functions”, but this problem is only given secondary importance. The principle aim of the work is to formalize a theory of representation independence for programs in order to ensure a program can maintain the seam meaning regardless of the internal representation of its primitive types:

We start with the belief that the meaning of syntactically valid program in a “type-correct” language should never depend upon the particular representations used to implement its primitive types.

… this property of reference independence should hold for user-defined types as well as primitive types. The introduction of a user-defined type t should partition a program into an “outer” region in which t behaves like a primitive type and is manipulated by various primitive operations which are used but not defined, and an “inner” region in which the representation of t is defined in terms of other types, and the primitive operations on t are defined in terms of this representation. We expect that the meaning of such a program will remain unchanged if the inner region is altered by changing the representation of the type and redfining its primitive operations in a consistent manner.

(Reynolds 1974)

It seems to me that we can clearly recognize the reference independence and encapsulation described here as questions of modularity. Modularity is precisely what enables us to seal internal representations of types and expose both primitive and user defined types in a consistent, “outer” interface. In MLs, parametric modularity is also what enables us to maintain the meaning of a program (or at least whatever invariants of the program we wish to enforce) even when the “inner region” is altered by supplying different module inputs. This makes it seems that the achievement of 1ML (which is to unify the core and module languages of the ML family by elaborating the modules into System Fω) is an advance that continues to radicalize and unfold the potential of λ2 in close alignment with its primary objective and purpose.

Today’s Progress

Overall, I’m pleased with my progress today. I completed my program for today, as laid out in Day 8. This is encouraging because it means I set realistic goals and was able to stay on task. However, I was helped by the fact that some scheduled socializing fell through… so I may still need to tweak my expectations around feasible output.

PLT

Themis

Parsing

I generalized the parser utility functions I was using for the simply typed lambda calculus so that a new parser can be produced by providing all the requisite parsing parts. This is what it now looks like to create parser front-ends for the simply typed lambda calculus and λ2, respectively:

module Simply_typed = Make (struct
    module Tokens = Simple_token
    module Lexer = Simple_lexer
    module Ast = Simple.Term
    module Parser = Simple_parser
  end)

module L2 = Make (struct
    module Tokens = L2_token
    module Lexer = L2_lexer
    module Ast = L2.Term
    module Parser = L2_parser
  end)

Deriving
I refactored out and functorized the definition of a derivation, including concepts like statement, declaration, context, and judgment, as defined in (NO_ITEM_DATA:rob2014type.)

The basic tools for working with a derivations in a calculi can now be generated for use in an module defining derivation rules with
```
include Derivation.Make (Term) (Type)
```
where, as expected, Term and Type define the terms and types of the calculus.

λ2 / System F

Defined a parser using ocamllex and menhir
Wrote unit tests for the parser
Started writing the type checker
Finished reading Chapter 3 of (Nederpelt and Geuvers 2014), which presents λ2
Spent some time reading Girard and Reynolds, both very fun :)

Tooling

I wrote some OCaml yas snippets, and am planning to expand this into a library of usable snippets to expedite OCaml code construction. Today I added

module declarations
let foo = function | pattern -> exp constructs

Category Theory

The category theory reading group met to agree on a schedule, a text, and a pace today. That’s great progress :)

Tomorrow’s Program

PLT

Finish implementing λ2
Read Chapter 4 of (Nederpelt and Geuvers 2014)

Community

I have some pairing and coffee chats scheduled

Writing

The remarks on System F and its history in Reynolds work in this post has been helpful in maturing my own burgeoning understanding, and it helped me catch the thread of a deep connection between System F and modularity tracing right back to its origin in Reynold’s work.

References

Girard, Jean-Yves. 2011. The Blind Spot : Lectures on Logic. European Mathematical Society. http://www.worldcat.org/oclc/757486610?referer=xid.

Nederpelt, Rob P., and Herman Geuvers. 2014. Type Theory and Formal Proof: An Introduction. Cambridge University Press.

Reynolds, John C. 1974. “Towards a Theory of Type Structure.” In Lecture Notes in Computer Science, 408–25. Lecture Notes in Computer Science. Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-06859-7_148.

NO_ITEM_DATA:rob2014type.

Footnotes:

For reasons explained in Day 6, and because my studies here at the Recurse Center need to be skewed towards the practical, my current reading around the topic is orbitting around Reynold’s work rather than Girard’s.

Synechepedia