Notes on 1ML
1ML is an experimental dialect of ML. It’s aim and principle distinguishing feature is the unification of the module and core languages into a single level.
I find ML modules important and interesting for reasons I will try to articulate in The Measure of a Module. These notes record aspects and properties of 1ML which I have learned and discovered while studying the theory and implementation of the language.
Comments
(; ... ;)
designate wrapped (possibly inline) comments;;
designates the start of a comment that extends to the end of a line
Modules
According to one perspective, in 1ML “everything is just (‘a mode of use of’) modules” (Rossberg). Therefore, in thinking through the nature of 1ML it is reasonable that we begin with its modules.
Module definition syntax
To define a named module, we follow this schema:
ModuleName (; : MODULE_TYPE ;) = { ;; declarations and definitions };
The MODULE_TYPE
and declarations and defintions
are commented out to
indicate that they are optional.
The empty module
The minimal module is the empty module. We can assign M
to be the empty module
thus:
M = {};
Module types
In dialects of ML, every value has a type. Modules are values in 1ML, so they too have a type.
Consider module M
with one type declaration and two value definitions:
M = { type t = (int, text); x = 5; y = (x, "five"); };
Since we opted out of specifying a type for M
, 1ML will infer the following
type for the module:
M : { type t = (int, text); x : int; y : (int, text); };
Unsurprisingly, the type of a module is just the types of all the values which that module provides together with the types declared in the module.
Note that the module’s type is based on what it provides
, not what it
contains. This is an important distinction, because we can seal information into
modules by specifying module types that omit some of the module’s contents:
M : { type t; y: t; } = { type t = (int, text); x = 5; y = (x, "five"); };
Since the above module specifies a type for the module that excludes x
, x
is
no longer available from outside of the module:
1ML> result = M.y result : (int, text); 1ML> result = M.x stdin:1.10-1.13: field `x' unbound in expression
We can also seal in the types, which results in abstract types. To seal in
types, we specify the module signature using :>
, thus
M :> { type t; y: t; } = { type t = (int, text); x = 5; y = (5, "five"); };
The internal structure of the type M.t
is now hidden from the outside world.
We’ll see the implications of this in the following section.
The key points to take away form this brief discussion of 1ML modules:
- A module is a value in 1ML.
- Every module has a type.
- The type of a module specifies the module’s public interface.
A concrete example: implementing a stack
Fair warning: the following code is a working implemention that can be run in Rossberg’s prototype interpreter. As Rossberg notes in the README for the project
- The language is very basic, and misses lots of features you would expect from a real ML, especially datatypes.
- It is slow and probably buggy. In particular, type checking is implemented very naively, as a direct transliteration of the rules, with no worries about efficiency. Type inference is best considered work in progress. :)
To the list of missing features we can add the critical UI feature of pattern matching.
As a result of these limitations, the following code is less clear than it otherwise could be (and eventually will be!) in a more robust implementation.
Specifying the ADT
One of the most importnat features of modern programming is abstract data types (hereafter, ADTs), which encapsulate some data within a module, providing access to it only through operations that are associated with the module.
(Goguen 1999)
For an ADT essence is form, since ADTs are defined by specification of their interface, and this only gives the externally visible form while saying nothing on the matter of implementation. So let’s recite the specification of a stack, here expressed in terms of the module type:
type STACK = { type t a; empty 'a : t a; push 'a : t a -> a -> t a; ;; Only needed to work around a bug in the type checker that's preventing use ;; of the clearer and more concise type declaration: ;; pop 'a: t a -> opt (t a, a) type pair a = (t a, a); pop 'a : t a -> opt (pair a); isEmpty 'a : t a -> bool; size 'a : t a -> int; };
Uncoded into words:
- A stack is a
type t
of some values of another typea
. - For all types
a
(the “for all” here signaled by the'
), we have anempty
stack (of typet a
). E.g., ifint
is a type, then we have anempty
stack of typet int
. - For all types
a
, if we are given a stack (t a
) and a value of typea
, then we canpush
the value onto the stack. - Let’s us call the type
(t a, a)
(which is a pair with a stack as its first element and a value that could be an element of the stack as its second element)pair a
. - For all types
a
, if we have a stackt a
, we canpop
off the first element to get an optional pairsome (s, x) : opt (pair a)
consisting of the stack remaining after the first element is removed together with the removed element itself. Unless, if the given stack is empty to start with, in which case we end up withnone : opt (pair a)
. - For all types
a
, given a stackt a
, we can check if itisEmpty
. - For all types
a
, given a stackt a
, we can discover it’ssize
.
This elaboration cheats in some constraints that are not captured in the
interface defined by the module’s type. For instance, the type doesn’t tell us
that the second item of the pair pop s
comes from the stack s
, nor does it
say that the new stack returned as the first item in that pair must have the
returned element removed. These are further specifications that we could encode
in a language with dependent types, but part of 1ML’s aim is to realize first
class modules without recourse to dependent types.
As a complement to the long term pursuit of fully dependent typing disciplines, programmers in the ML tradition have developed methods of using properly abstracted types to create “trusted kernels”. This practice provides strong guarantees of reliable behavior without having to tangle with the complex (and still nascent) challenges of full dependent typing. (For more on this practice, see Oleg’s Leightweight Static Guarantees).
In our specification of the stack ADT above we say nothing at all about the
internal structure of type t a
. Consequently, nothing is revealed or claimed
about the way the stack is actually implemented or represented inside the
module. Therefore, it is impossible for a user of a module satisfying that
signature to violate the invariants which we put into our implementation. Before
considering examples of this safety in practice, let’s look at the
implementation:
Implementing the ADT
Stack :> STACK = { type t a = {size: int; content: list a}; empty 'a = {size = 0; content = List.nil}; push 'a (s : t a) x = let size' = s.size + 1 in let content' = s.content in {size = size' + 1; content = List.cons x content'}; peek 'a (s : t a) = let content = s.content in List.head content; type pair a = (t a, a); pop 'a (s : t a) = let content = s.content in caseopt (List.tail content) (fun () => Opt.none) (fun content' => caseopt (List.head content) (fun () => Opt.none) (fun x => let size' = s.size - 1 in let s' = {size = size'; content = content'} in Opt.some (s', x))); isEmpty 'a (s : t a) = let content = s.content in List.isNil content; size 'a (s : t a) = s.size };
The stack is implemented as a record holding the current size
of the stack and
the content
of the stack, represented as a list. There’s nothing interesting
going on here, and I suspect this implementation is largely self-explanatory for
anyone familiar with typed functional programming.
The likely exception is the definition for pop
. The caseopt
in use there is
the way case analysis is encoded in 1ML in absence of pattern matching. It took
me some time to come to grips with how to read these functions, so I offer the
following annotated elaboration:
;; equivalent to a case statement in Haskell or OCaml, `caseopt` is a ;; function which performs case analysis of values of type `opt`. caseopt (List.tail content) ;; Matches the `none` alternative. (fun () => Opt.none) ;; Matches the `some x` alternative, binding the value of the ;; `x` to the `content` variable in the head of the lambda. (fun content' => caseopt (List.head content) ;; This is just a nested case analysis for the other optional value (fun () => Opt.none) (fun x => let size' = s.size - 1 in let s' = {size = size'; content = content'} in Opt.some (s', x)));
Once pattern matching and record field punning is implemented, pop
could look
like something like this:
pop 'a (s : t a) = match List.tail s.content with | none => none | some content => match List.head content with | none => none | some x => some ({size = s.size - 1; content}, x)
TODO On abstract types, sealing, and security
TODO Wait, why do the modules look just like the records?
Types
TODO The size of a types
1ML’s novelty is in unifying the core and module languages from traditional MLs into a single language. The gimmick used to achieve this is a clever elaboration of the module language into System Fω, whereinin which the core language is already at home. However, type inference in System F is undecidable, and the impressive type inference enjoyed in most Haskell and {S,OCa}ml code is achieved by the famous, constrained version of System F known as the Hindley-Milner type system.