blog/posts-wip/2023-06-05-quox.md

316 lines
12 KiB
Markdown
Raw Normal View History

2023-12-27 19:00:25 -05:00
---
title: intro to quox
date: 2023-06-05
tags: [computer, quox (language)]
show-toc: true
bibliography: quox.bib
link-citations: true
...
<figure class='shaped floating' style='shape-outside: url(/images/quox.png)'
aria-label=''>
<img src=/images/quox.png
alt='quox sprite from quest of ki'
title='quox sprite from quest of ki'>
<figcaption>(this is a quox)</figcaption>
</figure>
hello for _a while_ now i've been working on a language called quox. the
one-sentence, meaningless summary is "qtt and xtt mashed together".
:::aside
wow, q and x! what an amazing coincidence!
:::
but maybe i should say what those are. i'm going to _try_ to aim this at someone
who knows normal languages. i guess we'll see how successful that is. so first,
# dependent types {#dt}
maybe you already know this one. skip it if you want. (maybe you know all of
this but you came to say hi anyway. hi!)
all a <dfn>dependent type</dfn> is is a type that is allowed to talk about run
time values. like a dependent pair might be `(len : ) × Array len String` for
a length paired with an array of strings with that length. a dependent function
with a type like `(len : ) → (x : A) → Array len A` takes a length and element
`x` as arguments, and returns an array of that many copies of `x`.
even ~~parametric polymorphism~~ generics are a specific form of dependent type:
you take a type as a parameter, and get to use it in the types of the other
arguments.
:::aside
<details>
<summary>but i can do that in rust/c++/haskell too</summary>
yeah! well, partially. in rust you can have like
```rust
fn replicate<const N: usize, A: Clone>(val: A) -> [A; N] {
[(); N].map(|_| val.clone())
}
```
but it's a bit more restricted:
- `N` has to always be known at compile time. you can't, for example, have the
length come from a config file or command-line argument
- in rust [(at the time of writing)]{.note} and c++, only certain number-ish
types can be used in this way. in ghc-haskell you have more choice for what
data can be used in types, but you—or template haskell—have to rewrite
functions for the type and value level, and have "singleton" types to bridge
between compile time and run time
so yeah, you can get some of the way there, but not completely.
</details>
:::
dependent types let you classify values more precisely than before, so you can
do things like have ASTs that reflect their local variables in the type.
in quox, and most uses of this technique, it's enough to just keep the _number_
of variables in scope.
[(there are two counts in quox; see [below](#xtt) for why.)]{.note}
in a definition like
<!-- i need a default quantity so i can write this without any "what's that" -->
```quox
def backwards-plus : ω. → ω. =
λ a b ⇒ plus b a
```
:::aside
<details>
<summary>what does all that mean</summary>
- the `ω` before each argument means you have no restrictions on how you can
use it. see [below](#qtt). i want to have a default so you could just write
``, but i can't decide what the default should _be_
- functions are curried, which means they take their arguments one by one, like
in haskell or ocaml, rather than in a tuple. doing it this way makes writing
dependencies (and quantities) easier.
- a function is written as `λ var1 var2 ⇒ body`
- all those funky symbols have ascii alternatives, so you if you like it better
you can also write
```quox
def backwards-plus : #.Nat -> #.Nat -> Nat =
fun a b => plus b a
```
</details>
:::
the right hand side `λ a b ⇒ plus b a` is necessarily a `Term 0 0`, with
no local variables. the body of the function is a `Term 0 2`, because it has two
term variables in scope.
typing contexts also know how many variables they bind, so you can know for sure
you are keeping the context properly in sync with the term under consideration.
and if you forget, then the compiler, uh, "reminds" you. since it's notoriously
easy to make off-by-one errors and similar mistakes when dealing with variables,
so having the computer check your work helps a lot.
--------------------------------------------------------------------------------
you might not want to have every property you will ever care about be always
reflected in types. quox's expressions have their scope size in their type,
because dealing with variables is ubiquitous and fiddly, but they don't have
like, a flag for whether they're reducible. i _do_ care about that sometimes,
but it's easier to have it as a separate value:
```idris
-- in Data.So in the standard library
data Oh : Bool -> Type where
Oh : So True
-- in Data.DPair (simplified for now)
data Subset : (a : Type) -> (p : a -> Type) -> Type where
Element : (x : a) -> p x -> Subset a p
isRedex : Definitions -> Term d n -> Bool
whnf : (defs : Definitions) -> WhnfContext d n ->
Term d n -> Subset (Term d n) (\t => So (not (isRedex defs t)))
```
a term is a <dfn>redex</dfn> (reducible expression) if the top level AST node is
something that can be immediately reduced, like a function being applied to an
argument, or a definition that can be unfolded. <dfn>whnf</dfn> ([weak head
normal form][whnf]) reduces the top of the expression until there are no more
reductions to do, and then returns the result, along with a proof that there are
no more.
[whnf]: https://en.wikipedia.org/wiki/Lambda_calculus_definition#Weak_head_normal_form
datatype arguments can be of any type, but also, data constructors can restrict
the values of those arguments in their return types. (this is what makes them
useful in the first place.) in this case, `So` only has one constructor, only
usable when its argument is `True`, meaning that constructing a value of type
`So p` is only possible if the expression `p` reduces to `True`.
:::aside
<details>
<summary>`So` considered harmful, or whatever</summary>
in a lot of cases you need to write the property inductively, i.e., as a
datatype, like
```idris
data NotRedex : Definitions -> Term d n -> Type
-- DPair is similar to Subset
whnf : (defs : Definitions) -> WhnfContext d n ->
Term d n -> DPair (Term d n) (\t => NotRedex defs t)
```
the reason for this is that it is often easier to define other functions by
matching on the proof rather than the original term.
but in this case that is not necessary and writing a function into `Bool` is
easier.
</details>
:::
other parts of the compiler, like equality checking, can similarly require
a proof that their arguments are not redexes, so that they don't have to keep
calling `whnf` over and over, or risk wrongly failing if one argument isn't
reduced enough.
# qtt (quantitative type theory) {#qtt}
:::note
(idris (2) has this one too, so i can still use real examples for now)
:::
having this extra safety is nice, but it would be even nicer if it we could be
sure it wouldn't affect run time efficiency. for a long time, dependently typed
languages have tried to use heuristics to elide constructor fields that were
already determined by other fields, at least as far back as 2003 [@indices].
but these analyses are anti-modular, in that a constructor field can only be
erased if it is not matched against _anywhere_ in the whole program.
maybe we should try telling the computer what we actually want.
in qtt [@qtt; @nuttin], every local variable is annotated with
a <dfn>quantity</dfn>, telling us how many times we can use it at run time. in
quox [(and idris2)]{.note}, the possible choices are `0` (not at all;
<dfn>erased</dfn>), `1` (exactly once; <dfn>linear</dfn>), and `ω` (any number
of times; <dfn>unrestricted</dfn>, and the default in idris and not written). if
a variable is marked with `0`, then you can't do anything with it that would
affect run time behaviour. for example,
- you can only match on values if their type has one or zero cases. if you
"have" a variable of the empty type `v : {}`, you're already in an unreachable
branch, so it's fine to abort with
`case0 v return 〈whatever〉 of { }`.
if you have an erased pair, it's fine to split it up, but the two parts will
still be erased.
matching on something like `Bool` isn't possible, because the value is no
longer there to look at.
- type signatures only exist at compile time so you can do whatever you want
there.
- equality proofs don't have any computational behaviour (unlike in [some other
type theories][hott]), so [coercion](#xtt) works with an erased proof
[hott]: https://homotopytypetheory.org
as well as erasure, there is also linearity. a linear variable must be used
exactly once in a linear context (and any number of times in an erased context,
like in types or proofs talking about it). this is useful for things like file
handles and other kinds of resources that have strict usage requirements. it's
similar to passing a variable by value in rust, where after you do so, you can't
use it yourself any more.
:::aside
there's no equivalent to <dfn>borrowing</dfn> inside the type system, but
i think with a careful choice of builtins, it would be possible to do a similar
thing in an external library.
_[rust person voice]_ it would be less _ergonomic_ as library, but having
a borrow checker inside the language would immediately blow my _complexity
budget_. :crab:
:::
i don't have much to say about this, honestly, but ask any rust user about the
benefits of tracking resource ownership in types.
--------------------------------------------------------------------------------
so where do these quantities come from? from the types, of course. a function
type in quox, which looks like `ρ.(x : A) → B`, has a quantity ρ attached,
which describes how a function value of that type can use its argument.
an identity function `λ x ⇒ x` can have type `1.A → A` or `ω.A → A`, but not
`0.A → A`. and a "diagonal" function `λ x ⇒ (x, x)` can only be `ω.A → A × A`.
a whole definition can be erased (and if it's a definition of a type, it has to
be, since types don't exist at run time), like
```quox
def0 TwoOfThem : ★ = ×
```
finally, you can mark a specific term with a quantity. say you want to write
a function that returns some number, plus an erased proof that it's even.
obviously you can't mark the whole definition as erased with `def0`, since
you want the number itself. and giving the return type as `(n : ) × Even n`
makes the proof appear at run time, which might be unwanted if it's something
big. so you can erase the second half of the pair by writing
`(n : ) × [0. Even n]`. a value of a "boxed" type `\[π. A]` is written `\[e]`
if `e : A`. for a slightly bigger example, you might want a decidable equality
that gives you _erased_ proofs, so you can use them in coercions, but they don't
show up at run time.
```quox
def0 Not : ω.★ → ★ = λ A ⇒ ω.A → {}
def0 Either : ω.★ → ω.★ → ★ = ⋯ -- constructors Left and Right
def0 Dec : ω.★ → ★ = λ A ⇒ Either [0. A] [0. Not A]
def Yes : 0.(A : ★) → 0.A → Dec A = λ A y ⇒ Left [0. A] [0. Not A] [y]
def No : 0.(A : ★) → 0.(Not A) → Dec A = λ A n ⇒ Right [0. A] [0. Not A] [n]
def0 DecEq : ω.★ → ★ = λ A ⇒ ω.(x y : A) → Dec (x ≡ y : A)
```
you can also use the same construction to have some unrestricted parts of an
otherwise linear structure.
:::aside
still missing from this story, in my opinion, is some form of compile-time
irrelevance. a lot of the time, you don't care about the content of a proof,
only that it is satisfied, so if division has a type like
`div : 1.ℚ → 1.(d : ) → 0.(NonZero d) → ℚ`, you want some way to get
`div x y p₁` and `div x y p₂` to always be equal, without even having to look at
`p₁` and `p₂`. there's no way to do that yet, because it doesn't seem to fit
into qtt cleanly. maybe a single squash type..?
:::
# xtt ("extensional" type theory) {#xtt}
:::aside
but not _that_ extensional type theory
:::
[@xtt]
# other stuff {#misc}
- crude but effective [@crude; @mugen]
- bidirectional typechecking [@bidi]
- ...
# i still don't know how to actually write a program {.unnumbered}
i know. that's ok. i'm just trying to communicate why someone might,
hypothetically, care.
did it work?
# references {#ref}