436 lines
13 KiB
Markdown
436 lines
13 KiB
Markdown
---
|
|
title: a little ats program
|
|
date: 2022-09-16
|
|
tags: [computer, ATS, cool languages]
|
|
show-toc: true
|
|
...
|
|
|
|
[ats] is a language that is a little infamous among type system likers, for its
|
|
complexity, its weird syntax, and its thin documentation. well i won't be beat,
|
|
so here's a little implementation of simply-typed lambda calculus. right now it
|
|
doesn't actually use many ats-specific features, since i just didn't end up
|
|
needing them.
|
|
|
|
[ats]: http://www.ats-lang.org
|
|
|
|
it does leak memory without gc though, so maybe i should have.
|
|
oh well one thing at a time
|
|
|
|
# installation, et cetera
|
|
|
|
instead of trying to install ats myself, i just used [nix], since i already
|
|
have it installed as the least annoying package manager for idris. i just run
|
|
`nix shell nixpkgs#ats2` and i'm in a shell with `patscc` and `patsopt`
|
|
available. if you want to try building it yourself, that's, uh, your choice.
|
|
|
|
[nix]: https://nixos.org
|
|
|
|
to compile the program, i wrote a makefile. just a small one!
|
|
|
|
```makefile
|
|
ATSCC := patscc
|
|
ATSFLAGS := \
|
|
-DATS_MEMALLOC_GCBDW -lgc \
|
|
-atsccomp \
|
|
'gcc -I$${PATSHOME} -I$${PATSHOME}/ccomp/runtime \
|
|
-L$${PATSHOME}/ccomp/atslib/lib -std=gnu99'
|
|
|
|
%: %.dats
|
|
$(ATSCC) $< $(ATSFLAGS) -o $@
|
|
```
|
|
|
|
(in some installs the compiler is just called `atscc` for some reason.)
|
|
|
|
the first line of `ATSFLAGS` tells it which allocator to use. in this case,
|
|
`libgc`. the rest tells it to use `-std=gnu99`, so that it can find `alloca()`;
|
|
the rest is the same as the default.
|
|
|
|
one last thing. the first two lines of the program have to be this. i didn't
|
|
look into it too hard but they seem to define... something... and if you forget
|
|
to include them the c output by ats isn't even syntactically correct. ouch.
|
|
|
|
```ats
|
|
#include "share/atspre_define.hats"
|
|
#include "share/atspre_staload.hats"
|
|
```
|
|
|
|
# types
|
|
|
|
let's start by defining the syntax for types. it's going to be a single base
|
|
type, and functions.
|
|
|
|
$$
|
|
\newcommand\Typ{\mathit{typ}}
|
|
\newcommand\OR{\mathrel|}
|
|
\Typ ::= \bullet \OR \Typ_1 \to \Typ_2
|
|
$$
|
|
|
|
so obviously, my first attempt was to just write this:
|
|
|
|
```ats
|
|
datatype typ = Base | Arr of (typ, typ)
|
|
```
|
|
|
|
i'm going to need to check two types for equality, so let's do that next.
|
|
|
|
```ats
|
|
fun eq_typ (a : typ, b : typ) .<a>. :<> bool =
|
|
case (a, b) of
|
|
| (Base(), Base()) => true
|
|
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
|
|
| (_, _) => false
|
|
```
|
|
|
|
alas: (it is talking about the `.<a>.`)
|
|
|
|
```
|
|
/home/niss/ats/lam.dats: 149(line=7, offs=33) -- 150(line=7, offs=34):
|
|
error(2): the static identifier [a] is unrecognized.
|
|
```
|
|
|
|
surprise! as far as i can tell ats only allows termination checking to be
|
|
based on _static_ (compile time) information. so i need to index the
|
|
datatype with, uh, something. when we define functions on syntax like this,
|
|
they basically always recurse on a subterm. so the height of the syntax tree
|
|
will do. what i _actually_ have are these more messy definitions.
|
|
|
|
```ats
|
|
datatype typ(int) =
|
|
| Base(1)
|
|
| {h1, h2 : nat} Arr(1 + max(h1, h2)) of (typ(h1), typ(h2))
|
|
|
|
typedef typ = [h : nat] typ(h)
|
|
```
|
|
|
|
the base type $\bullet$ has height 1, and an arrow type has a height one more
|
|
than the max height of its subterms. `nat` is a version of `int` constrained to
|
|
be at least zero, rather than a separate type, so that's why it says `int` one
|
|
place and `nat` the other. the typechecker has a linear arithmetic solver for
|
|
reasoning about inequalities and expressions like `1 + max(h1, h2)`.
|
|
|
|
i don't want to be talking about "types of height $h$" unless i really have to,
|
|
though, so there's a definition for just _a_ type too. the syntax `[x : t1] t2`
|
|
is an existential type that is automatically wrapped and unwrapped, so in this
|
|
case it is a `typ(h)` for some, unknown, `h`. also, names can be overloaded by
|
|
their arity so `typ(h)` and `typ` can both be defined in the same scope.
|
|
|
|
so now `eq_typ` has to have a more intricate type to talk about height. the
|
|
definition is still the same though.
|
|
|
|
```ats
|
|
fun eq_typ {ha : nat} .<ha>. (a : typ(ha), b : typ) :<> bool =
|
|
case (a, b) of
|
|
| (Base(), Base()) => true
|
|
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
|
|
| (_, _) => false
|
|
|
|
overload = with eq_typ
|
|
```
|
|
|
|
since i only actually need to care about one of the arguments' height for the
|
|
termination proof, the other can still just be a `typ`. the last line lets me
|
|
write `a = b` instead of `eq_typ(a, b)`, which is nice.
|
|
|
|
next, i want to be able to print types. ideally this would be a function that
|
|
returns a string, but string stuff involves pointer weirdness i haven't worked
|
|
out yet. so it just prints. the only two precedence levels in these types are
|
|
"left of an arrow" and "not left of an arrow", so that can just be a `bool`.
|
|
unfortunately `print` has no information about its termination behaviour so
|
|
there's no point bothering with that stuff here either. :confused:
|
|
|
|
```ats
|
|
fun print_typ (parens : bool, t : typ) : void =
|
|
case t of
|
|
| Base() => print "•"
|
|
| Arr(a, b) => begin
|
|
if parens then print("(");
|
|
print_typ(true, a);
|
|
print(" → ");
|
|
print_typ(false, b);
|
|
if parens then print(")");
|
|
end
|
|
|
|
fn print_typ (t : typ) : void = print_typ(false, t)
|
|
overload print with print_typ
|
|
```
|
|
|
|
# terms
|
|
|
|
types are done, now let's move on to terms. we just want names, lambdas, and
|
|
application.
|
|
|
|
$$
|
|
\newcommand\Term{\mathit{term}}
|
|
\newcommand\Var{\mathit{var}}
|
|
\Term ::= \Var \OR \lambda \Var : \Typ. \Term \OR \Term_1 \; \Term_2
|
|
$$
|
|
|
|
for the same reason as before, terms are indexed by their height, but _also_ by
|
|
the number of variables in scope, since i want de bruijn indices to be
|
|
intrinsically well-scoped. why even have a type system otherwise. so a
|
|
`term(n, h)` has $n$ variables and height $h$.
|
|
|
|
```ats
|
|
datatype term(int, int) =
|
|
| {n, i : nat | i < n} Var(n, 1) of int(i)
|
|
| {n, h : nat} Lam(n, 1 + h) of (typ, term(1 + n, h))
|
|
| {n, h1, h2 : nat} App(n, 1 + max(h1, h2)) of (term(n, h1), term(n, h2))
|
|
|
|
typedef term(n : int) = [h : nat] term(n, h)
|
|
```
|
|
|
|
a variable is [a de bruijn index][db], which is a natural less than `n`. to
|
|
express this we need to bridge the static world, where `n` lives, and the
|
|
dynamic, where `Var`'s field lives. so i'm using a singleton type `int(i)`,
|
|
whose only value is `i` itself. with this type, you can constrain the static
|
|
`i`, and in turn the value of the field.
|
|
|
|
[db]: https://en.wikipedia.org/wiki/De_Bruijn_index
|
|
|
|
|
|
# contexts
|
|
|
|
```ats
|
|
typedef ctx(n: int) = list(typ, n)
|
|
```
|
|
|
|
a typing context is just a list of types of the appropriate length. since
|
|
there's no binding in types here, that's all we need.
|
|
|
|
looking up into a context is just list lookup, with some constraints to make it
|
|
work.
|
|
|
|
```ats
|
|
fun lookup {n, i : nat | i < n} .<n>. (g : ctx(n), i : nat(i)) :<> typ =
|
|
let val+ list_cons(x, g) = g in
|
|
if i = 0 then x else lookup(g, i - 1)
|
|
end
|
|
```
|
|
|
|
from the constraints $n \ge 0, i \ge 0, i < n$, the compiler can see that, as
|
|
$n \ge 1$, the list is nonempty and `list_cons` _is_ the only case. it can also
|
|
propagate the $i \ne 0$ condition from the `else` to see the recursive call is
|
|
ok too, and this function is total.
|
|
|
|
:::aside
|
|
while writing this post i discovered that this is actually a restatement of
|
|
the type of `xs[i]` in the standard library and i could have just used that.
|
|
oops!
|
|
:::
|
|
|
|
|
|
# typechecker
|
|
|
|
i'm going to write the type checker using exceptions for error handling, because
|
|
using an error monad without `do` notation was just too annoying. but from the
|
|
outside it is still going to use a `result` type.
|
|
|
|
```ats
|
|
datatype result(t@ype, t@ype) =
|
|
| {a, e : t@ype} Ok(a, e) of a
|
|
| {a, e : t@ype} Err(a, e) of e
|
|
```
|
|
|
|
the funky looking `t@ype` refers to a type of unknown size, as opposed to `type`
|
|
which can only be instantiated with pointer-sized types. that will not come up
|
|
again here but "`t@ype`" is definitely the funniest part of ats's syntax.
|
|
|
|
the errors we need here are:
|
|
|
|
- a function takes an argument of type $A$, but is given one of type $B$;
|
|
- a term of a non-arrow type is being applied.
|
|
|
|
```ats
|
|
datatype err =
|
|
| Clash of (typ, typ)
|
|
| NotArr of typ
|
|
|
|
fn print_err (e : err) : void = … // incredibly uninteresting
|
|
overload print with print_err
|
|
|
|
|
|
typedef typ_res = result(typ, err)
|
|
|
|
fn print_res (r : typ_res) : void = …
|
|
overload print with print_res
|
|
```
|
|
|
|
so onto the typechecker. like i said, it's going to use a local exception for
|
|
errors, but then catch it on the outside so that it has a pure type.
|
|
|
|
```ats
|
|
fn typ_of {n : nat} (g : ctx n, t : term(n)) :<> typ_res =
|
|
let
|
|
exception ERR of err
|
|
fun go {n, h : nat} .<h>. (g : ctx(n), t : term(n, h)) :<!exn> typ =
|
|
case t of
|
|
| Var(i) => lookup(g, i)
|
|
| Lam(a, t) => Arr(a, go(list_cons(a, g), t))
|
|
| App(s, t) => begin
|
|
case go(g, s) of
|
|
| Arr(a1, b) =>
|
|
let val a2 = go(g, t) in
|
|
if a1 = a2 then b else $raise ERR(Clash(a1, a2))
|
|
end
|
|
| b => $raise ERR(NotArr(b))
|
|
end
|
|
in
|
|
$effmask_exn(try Ok(go(g, t)) with ~ERR(e) => Err(e))
|
|
end
|
|
|
|
fn typ_of (t : term(0)) :<> typ_res = typ_of(list_nil, t)
|
|
```
|
|
|
|
most of this is fine. a bit noisy i guess. but there are a couple of new things:
|
|
|
|
- in the type of `go`, the `:<!exn>` part means it can throw exceptions, but has
|
|
no other effects.
|
|
- `$effmask_exn(e)` hides the `exn` effect in `e`, because ats doesn't
|
|
distinguish different exceptions in effects so it can't tell that `ERR` is the
|
|
only one that is ever raised.
|
|
- exceptions are linear, and the pattern `~ERR(e)` is consuming the outermost
|
|
constructor.
|
|
- exceptions are the only reason we needed `alloca` in this program
|
|
:woozy_face:
|
|
|
|
|
|
# the end
|
|
|
|
ok, a quick `main` function, and that's it!
|
|
|
|
```ats
|
|
implement main0() = begin
|
|
print("type of [λx:•. x] is: ");
|
|
print(typ_of(Lam(Base, Var(0))));
|
|
print("\ntype of [λx:•. x x] is: ");
|
|
print(typ_of(Lam(Base, App(Var(0), Var(0)))));
|
|
print("\n")
|
|
end
|
|
```
|
|
|
|
and it works, at least on these two examples.
|
|
|
|
```
|
|
type of [λx:•. x] is: • → •
|
|
type of [λx:•. x x] is: expected an arrow type, but got [•]
|
|
```
|
|
|
|
uh. thank's for reading. maybe i'll write another one if i think of another
|
|
nice little program that actually uses more of ats's weirdness.
|
|
|
|
--------------------------------------------------------------------------------
|
|
|
|
<details>
|
|
<summary>full code</summary>
|
|
```ats
|
|
#include "share/atspre_define.hats"
|
|
#include "share/atspre_staload.hats"
|
|
|
|
datatype typ(int) =
|
|
| Base(1)
|
|
| {h1, h2 : nat} Arr(1 + max(h1, h2)) of (typ(h1), typ(h2))
|
|
|
|
typedef typ = [h : nat] typ(h)
|
|
|
|
fun eq_typ {ha : nat} .<ha>. (a : typ(ha), b : typ) :<> bool =
|
|
case (a, b) of
|
|
| (Base(), Base()) => true
|
|
| (Arr(a1, a2), Arr(b1, b2)) => eq_typ(a1, b1) && eq_typ(a2, b2)
|
|
| (_, _) => false
|
|
overload = with eq_typ
|
|
|
|
fun print_typ (parens : bool, t : typ) : void =
|
|
case t of
|
|
| Base() => print "•"
|
|
| Arr(a, b) => begin
|
|
if parens then print("(");
|
|
print_typ(true, a);
|
|
print(" → ");
|
|
print_typ(false, b);
|
|
if parens then print(")");
|
|
end
|
|
|
|
fn print_typ (t : typ) : void = print_typ(false, t)
|
|
overload print with print_typ
|
|
|
|
|
|
datatype term(int, int) =
|
|
| {n, i : nat | i < n} Var(n, 1) of int(i)
|
|
| {n, h : nat} Lam(n, 1 + h) of (typ, term(1 + n, h))
|
|
| {n, h1, h2 : nat} App(n, 1 + max(h1, h2)) of (term(n, h1), term(n, h2))
|
|
|
|
typedef term(n : int) = [h : nat] term(n, h)
|
|
|
|
|
|
typedef ctx(n: int) = list(typ, n)
|
|
|
|
fun lookup {n, i : nat | i < n} .<n>. (g : ctx(n), i : int(i)) :<> typ =
|
|
let val+ list_cons(x, g) = g in
|
|
if i = 0 then x else lookup(g, i - 1)
|
|
end
|
|
|
|
|
|
datatype result(t@ype, t@ype) =
|
|
| {a, e : t@ype} Ok(a, e) of a
|
|
| {a, e : t@ype} Err(a, e) of e
|
|
|
|
|
|
datatype err =
|
|
| Clash of (typ, typ)
|
|
| NotArr of typ
|
|
|
|
fn print_err (e : err) : void =
|
|
case e of
|
|
| Clash(a, b) => begin
|
|
print("clash between types [");
|
|
print(a);
|
|
print("] and [");
|
|
print(b);
|
|
print("]");
|
|
end
|
|
| NotArr(t) => begin
|
|
print("expected an arrow type, but got [");
|
|
print(t);
|
|
print("]");
|
|
end
|
|
overload print with print_err
|
|
|
|
|
|
typedef typ_res = result(typ, err)
|
|
|
|
fn print_res (r : typ_res) : void =
|
|
case r of Ok(t) => print(t) | Err(e) => print(e)
|
|
overload print with print_res
|
|
|
|
|
|
fn typ_of {n : nat} (g : ctx n, t : term(n)) :<> typ_res =
|
|
let
|
|
exception ERR of err
|
|
fun go {n, h : nat} .<h>. (g : ctx(n), t : term(n, h)) :<!exn> typ =
|
|
case t of
|
|
| Var(i) => lookup(g, i)
|
|
| Lam(a, t) => Arr(a, go(list_cons(a, g), t))
|
|
| App(s, t) => begin
|
|
case go(g, s) of
|
|
| Arr(a1, b) =>
|
|
let val a2 = go(g, t) in
|
|
if a1 = a2 then b else $raise ERR(Clash(a1, a2))
|
|
end
|
|
| b => $raise ERR(NotArr(b))
|
|
end
|
|
in
|
|
$effmask_exn(try Ok(go(g, t)) with ~ERR(e) => Err(e))
|
|
end
|
|
|
|
fn typ_of (t : term(0)) :<> typ_res = typ_of(list_nil, t)
|
|
|
|
|
|
implement main0() = begin
|
|
print("type of [λx:•. x] is: ");
|
|
print(typ_of(Lam(Base, Var(0))));
|
|
print("\ntype of [λx:•. x x] is: ");
|
|
print(typ_of(Lam(Base, App(Var(0), Var(0)))));
|
|
print("\n")
|
|
end
|
|
```
|
|
</details>
|