# When to use rankNTypes?

This is my first Haskell post on this site, hopefully the first of many!

I will explain how I came to use rankNTypes, why they are so useful. Hopefully, when reading this post, you will recognize when you might want to use them in your own situation.

On a side note: if you are just looking for a catch-all data type for Directed Acyclic Graphs, there is a really good blog post about DAGs in Haskell by Twan van Laarhoven. I will just focus on the RankNTypes part. Since the rank2Types switch is currently just a synonym for the rankNTypes switch, I will just talk about the latter.

In 'regular' Haskell, there is just one `forall`

in a type, and it is what a type starts with. It may be implicit or explicit, either way `forall a. a -> a`

is always interpreted as `forall a. (a -> a)`

. RankNTypes allow you to write the brackets differently, that is: `forall`

may occur to the left hand side of a type:

` (forall name. WithDAG name a) -> String`

As an example for its use, I present Directed Acyclic Graphs (or DAGs), which I used in my research on Networks on Chips.

```
{-# OPTIONS_GHC -Wall #-} -- show almost all warnings
{-# LANGUAGE RankNTypes, GeneralizedNewtypeDeriving, RoleAnnotations #-}
module MyDAGS (main, WithDAGT, runDAG, freezeRefN, addNode, showDAG, RefN, Node, DAG) where
import Control.Monad.State
```

## What is a DAG?

A DAG is a set of nodes together with a function from each node to a list (or set, depending on who defines "DAG") of nodes, in such a way that the function forms no cycles.

The "no cycle" requirement means that we will get an invariant which we need to keep consistent. This involves keeping track of which DAG each node belongs to. To do this, we add a type variable to each DAG, node and reference.

```
newtype DAG name = DAG [Node name] deriving Show
data Node name = Node String [RefN name] deriving Show
newtype RefN name = RefN Int deriving Show
-- add a node to the DAG, return a reference to the new node (and the new DAG)
hiddenAddNode :: String -> [RefN n] -> DAG n -> (RefN n, DAG n)
hiddenAddNode label refs (DAG nodes)
= ( RefN (length nodes)
, DAG (nodes ++ [Node label refs])
)
```

Note that the `name`

part is not used anywhere, this is called a Phantom Type, and it will help us keep track of nodes and which references belong to which DAGs. If we refrain from using cyclic definitions, then "new" nodes can just be built from "old" ones. Since the old ones point to existing nodes, the new nodes will point backwards in the latest DAG:

```
(dagnode1,dag1) = hiddenAddNode "node1" [] (DAG [])
(dagnode2,dag2) = hiddenAddNode "node2" [dagnode1] dag1
-- dag2 == DAG [Node "node1" [],Node "node2" [RefN 0]]
```

Unfortunately, Haskell will just match "name" to any type we throw at it, so we could also write:

```
(dagnode1,dag1) = hiddenAddNode "node1" [] (DAG [])
(dagnode2,dag2) = hiddenAddNode "node2" [dagnode1] (DAG [])
-- dag2 == DAG [Node "node2" [RefN 0]]
-- ouch! Node 0 refers to Node 0, a cycle!
```

## When to use rankNTypes

So here is the problem we wish to solve: `"node2"`

points to itself, and we only used `hiddenAddNode`

and `DAG []`

. The problem is that the second `(DAG [])`

refers to an old version of the DAG, while `dagnode1`

is a reference to a newer version. As a solution, we pass the `DAG`

implicitly, such that we are forced to use the newest version all the time. Monads are great for this; we should be able to write:

```
someDAG :: String
someDAG = showDAG (do dagnode1 <- addNode "node1" []
_dagnode2 <- addNode "node2" [dagnode1]
return ())
```

The type of `DAG n -> (DAG n, RefN n)`

is precisely the right type for the state monad. The state monad also allows us to get to the internal state via `get`

and `put`

, which is what we want to prevent. Therefore we create our own monad instead, and use it for addNode:

```
newtype WithDAGT name a = WithDAG (State (DAG name) a)
deriving (Monad,Applicative,Functor)
addNode :: String -> [RefN name] -> WithDAGT name (RefN name)
addNode label refs = WithDAG (state (hiddenAddNode label refs))
```

We have to supply an initial empty DAG somewhere, somehow. This is where rankNTypes come in. I learned the trick from this paper by Simon Peyton Jones.

```
runDAG :: (forall name. WithDAGT name a) -> (a, DAG ())
runDAG (WithDAG st) = runState st (DAG [])
```

The cool thing about `runDAG`

is that it allows us to get all the values. The 'name' of the corresponding DAG, however, is marked as `()`

. This prevents us from ever using that DAG inside `runDAG`

ever again. This is because `name`

does not match with `()`

, which is exactly what that `forall`

was for. In fact, it even prevents us from getting references out of the `WithDAG`

:

` someNode = fst (runDAG (addNode "node1" []))`

gives the following type error:

```
Couldn't match type 'a' with 'RefN name'
because type variable 'name' would escape its scope
```

There is an easy fix:

```
freezeRefN :: RefN name -> WithDAGT name (RefN ())
freezeRefN (RefN i) = return (RefN i)
_someNode :: RefN () -- returns "RefN 0"
_someNode = fst (runDAG (do node<- addNode "node1" []
freezeRefN node))
```

Once again, we get a reference which we will never be able to use inside `runDAG`

. This means that `runDAG`

, `freezeRefN`

, and `addNode`

are perfectly safe to export. We are able to write `someDAG`

with them, since we can implement `showDAG`

with just `runDAG`

:

```
showDAG :: (forall name. WithDAGT name a) -> String
showDAG wd = show (snd (runDAG wd))
main :: IO ()
main = putStrLn someDAG
-- output: DAG [Node "node1" [],Node "node2" [RefN 0]]
```

You'll see that we also exported `RefN`

, `Node`

and `DAG`

. These are just the types, and that is fine too, as long as we do not export their constructors: we want all construction inside `WithDAGT`

. If we do export the constructors, we are in trouble, since it would then be trivial to cast a `RefN name`

to a `RefN name2`

. In fact, there is this new feature (as of GHC 7.8) called `coerce`

that can do just that. This is where I believe things get ugly, but there is a fix. I won't get into the details, but the following lines will make sure that `coerce`

is not allowed.

```
type role RefN nominal
type role WithDAGT nominal nominal
```

Another downside to our solution, is that Haskell is not able to derive all the types automatically anymore. Instead, we get type errors every time we forget to put a forall to the left hand side of an arrow. This can be a bit annoying, and sometimes even very annoying, but I'll save that for another blog post. In GHC 7.10.1, RankNTypes do "just work", but they produce awkward error messages and sometimes require unintuitive work-arounds. This makes rankNTypes feel kind of un-Haskell like to me. People are currently improving the warnings for GADTs, which will influence rankNTypes as well: rankNTypes support will probably only get better over time.

## Exercises

If you wish to check whether you've understood this post, here are some exercises for you:

- Run main (either by renaming the module to Main, or by using ghci).
- Create a DAG from a tree structure using only
`addNode`

and ultimately`runDAG`

(you can use all the Prelude functions, and need to define a tree structure yourself.) - This one is quite tricky: there are several ways to solve this, and I don't believe any of those solutions make the code prettier (mail me if they do).

The type`Node name`

is not present in any of the 'safe' functions (`runDAG`

,`freezeRefN`

, and`addNode`

). Hence it should be possible to create another safe version with the following definition instead:

` data Node = Node String [Int]`

- Write
`getNode`

, which looks up a Node in the DAG according to its reference, and then`freezeNode`

, such that this works:

```
someNode = fst (runDAG (do nodeRef <- addNode "node1" []
realNode <- getNode nodeRef
freezeNode realNode))
```