<aside> ⚠️
WARNING: This page discusses recursion schemes in Haskell
</aside>
For any recursive datatype, such as
data Tree a = Leaf a | Branch [Tree a]
we can factor out self-reference into a type parameter, as in
data TreeF a s = LeafF a | BranchF [s]
This new type represents a “single layer” of a tree, where the type of leaves is given by a and the type of sublayers is given by s.
The type TreeF is called the base functor for Tree. We can recover the original datatype as
-- Infinite self-application of a functor: Fix f ≅ f (Fix f)
data Fix f = Fix (f (Fix f))
Fix (TreeF a) ≅ Tree a
These gymnastics are useful because we can now write a function encapsulating recursion:
-- Collapse a structure by folding it bottom-up
fold :: Functor f => (f r -> r) -> Fix f -> r
fold f (Fix whole) = f (fold f <$> whole)
The input f r -> r specifies how to collapse a single layer of our structure, granted that sublayers have already been collapsed. It is sometimes called an f-algebra. fold then handles the work of applying that function to each layer of the structure, bottom-up.
For instance, we can specify a recursive sum-of-leaves like this:
type Tree a = Fix (TreeF a)
sumTree :: Tree Int -> Int
sumTree = fold \\case
LeafF a -> a
BranchF (as :: [Int]) -> sum as
The function fold (also known as cata) is a recursion scheme — it gives a strategy for walking over a recursive structure, without tying itself to any particular structure.
There are loads of recursion schemes; this is just one. Read on!
data-fix package provides a basic collection of recursion schemes over the Fix datatype. It also works with the Mu and Nu combinators, which I know less about.recursion-schemes package is a more powerful option, providing more generality, better ergonomics, and a greater variety of recursion schemes, at the cost of more complexity.