summaryrefslogtreecommitdiff
path: root/haskell/composition.md
diff options
context:
space:
mode:
authorTom Smeding <tom@tomsmeding.com>2021-08-15 23:10:16 +0200
committerTom Smeding <tom@tomsmeding.com>2021-08-15 23:12:35 +0200
commit32617e00f49c2259bf99767c1f658691bc7fa20e (patch)
tree5fee1fb5aa016053046507fc26761e60d85934a6 /haskell/composition.md
parentb09bcc44c74b7cd2cccf786fde9e0211e54233ec (diff)
Add haskell/composition post
Diffstat (limited to 'haskell/composition.md')
-rw-r--r--haskell/composition.md218
1 files changed, 218 insertions, 0 deletions
diff --git a/haskell/composition.md b/haskell/composition.md
new file mode 100644
index 0000000..aadb6c3
--- /dev/null
+++ b/haskell/composition.md
@@ -0,0 +1,218 @@
+## Function composition in Haskell
+
+In Haskell, the dot operator `(.)`, written infix like `f . g`, is _function composition_.
+For example, suppose you have two functions:
+
+```haskell
+addone :: Int -> Int
+addone n = n + 1
+
+timestwo :: Int -> Int
+timestwo n = n * 2
+```
+
+then you can _compose_ those two functions like `timestwo . addone`:
+
+```haskell
+-- computes 2 * (n + 1), which is 2 * n + 2
+both :: Int -> Int
+both = timestwo . addone
+```
+
+And we can try it out in `ghci`:
+
+```
+Prelude> both 10
+22
+Prelude> timestwo (addone 10)
+22
+Prelude> 2 * (10 + 1)
+22
+```
+
+This works well, like we expect.
+These functions are intentionally very simple; the point of this post is to explain how function composition works, not how to write complex Haskell functions.
+
+Now suppose we have an additional, two-argument function to play with:
+
+```haskell
+plus :: Int -> Int -> Int
+plus a b = a + b
+```
+
+If we give `plus` some arguments, we can apply `timestwo` to the result:
+
+```
+Prelude> timestwo (plus 3 4) -- 2 * (3 + 4) = 14
+14
+```
+
+Getting inspiration from the first example with `timestwo` and `addone`, we might want to try writing a function like `both` that composes `timestwo` and `plus` using `(.)`.
+
+```haskell
+both2 = timestwo . plus
+```
+
+But the compiler won't accept this:
+
+```
+ • Couldn't match type ‘Int -> Int’ with ‘Int’
+ Expected type: Int -> Int
+ Actual type: Int -> Int -> Int
+ • Probable cause: ‘plus’ is applied to too few arguments
+ In the second argument of ‘(.)’, namely ‘plus’
+ In the expression: timestwo . plus
+ In an equation for ‘both2’: both2 = timestwo . plus
+ |
+11 | both2 = timestwo . plus
+ | ^^^^
+```
+
+Why is this case different than the first example with `both`?
+The difference is that `plus` takes two arguments, not one.
+As always in Haskell, let's take a look at the types to see what's going on here.
+
+
+### Types
+
+What are the types of the functions we're using?
+
+```haskell
+addone :: Int -> Int
+timestwo :: Int -> Int
+plus :: Int -> Int -> Int
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+```
+
+Remember that you can look up the type of standard library functions online: e.g. [here for `(.)`](https://hackage.haskell.org/package/base-4.14.1.0/docs/Prelude.html#v:.).
+
+Function composition, `(.)`, takes a function from type `b` to type `c` and a function from type `a` to type `b`, and returns a function from type `a` to type `c`.
+Note that we can also write the type of `(.)` slightly differently, without the final pair of parentheses:
+
+```haskell
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+(.) :: (b -> c) -> (a -> b) -> a -> c
+```
+
+These both mean the exact same thing.
+
+When at first, we wrote `timestwo . addone` as the definition of `both`, we were really applying `(.)` to two arguments: `timestwo` and `addone`.
+Indeed, we could also have written this:
+
+```haskell
+both :: Int -> Int
+both = (.) timestwo addone
+```
+
+which means exactly the same thing as `both = timestwo . addone`.
+
+Let's do some of the compiler's work here and figure out how the types match up.
+<small>(If you want to learn more about how you can do type inference by hand, one guide is [this one](http://www.vex.net/~trebla/haskell/type-inference.html).)</small>
+
+- The first argument to `(.)` is `timestwo` of type `Int -> Int`.
+ From the type signature of `(.)` we read that the first argument to `(.)` _should_ have type `b -> c` for certain `b` and `c`; here we get that `b` and `c` are both `Int`.
+- The second argument to `(.)` is `addone`, again of type `Int -> Int`.
+ This means that `a -> b` (which, from the previous point, we know to be `a -> Int`) is really `Int -> Int`, so `a` is also `Int`.
+
+We don't apply any more arguments to `(.)`, so the return type is `a -> c`; since we know that both `a` and `c` are `Int`, this is `Int -> Int`, so `both` indeed has the right type signature.
+
+Now let's try to do the same thing with our second attempt in `both2`, namely `timestwo . plus`, which is the same as `(.) timestwo plus`.
+Remember the types:
+
+```haskell
+timestwo :: Int -> Int
+plus :: Int -> Int -> Int
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+```
+
+- First argument to `(.)`: given is `timestwo :: Int -> Int`, which should match `b -> c`.
+ So: `b` and `c` are both `Int`.
+- Second argument to `(.)`: given is `plus :: Int -> Int -> Int`, which should match `a -> b`; since we already know that `b` is `Int`, this is `a -> Int`.
+ So `Int -> Int -> Int`, which is the same as `Int -> (Int -> Int)`, should match `a -> Int` for some `a`:
+
+ ```
+ Int -> (Int -> Int)
+ a -> Int
+ ```
+
+ But this can never be true!
+ We can make `a` equal to `Int`, sure, but `Int -> Int` will never match `Int`.
+ Remember the compiler error we got earlier:
+
+ ```
+ • Couldn't match type ‘Int -> Int’ with ‘Int’
+ Expected type: ...
+ ```
+
+ Suspiciously similar, isn't it?
+
+
+### Seen it coming
+
+Could we have seen this coming?
+Of course.
+Remember that the following Haskell types mean exactly the same:
+
+```haskell
+a -> b -> c -> d -> e
+a -> (b -> (c -> (d -> e)))
+```
+
+This can be explained (or remembered) in two ways:
+
+1. `->` is right-associative: its parentheses collect to the right.
+ This is exactly like you may have learned that subtraction is left-associative (`1 - 2 - 3 = (1 - 2) - 3`) and exponentiation is right-associative (`1 ^ 2 ^ 3 = 1 ^ (2 ^ 3)` = 1<sup>2<sup>3</sup></sup>).
+2. Multi-argument functions in Haskell aren't multi-argument functions; instead, they're functions that take one argument and return a function taking the rest.
+
+The _second_ interpretation is the one that you should use, because it is by far the most useful one to really understand what is going on in Haskell with functions -- and since Haskell is a functional programming language, (almost) everything is functions, so understanding them is important!
+
+Now look again at the type of `(.)` (all three mean the same):
+
+```haskell
+(.) :: (b -> c) -> (a -> b) -> a -> c
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+(.) :: (b -> c) -> ((a -> b) -> (a -> c))
+```
+
+<small>(Why can't we remove the parentheses around the `b -> c` and `a -> c` in there? That is because it the type would then suddenly mean something different. Try to work that out for yourself!)</small>
+
+We can read this as: `(.)` takes a function `g :: b -> c`, a function `f :: a -> b` and a value `x :: a`.
+It applies `f` to `x`, applies `g` to the result, and returns the result of that.
+Indeed, `(.)` can be implemented like this:
+
+```haskell
+(.) :: (b -> c) -> (a -> b) -> a -> c
+(.) g f x = g (f x)
+```
+
+Finally, now look again at our failed attempt to write `both2` as `timestwo . plus`.
+`plus` really takes only a single argument (an `Int`eger), and it returns a _function_ that takes a second integer and returns the sum of those two integers.
+So the value `f x` in the definition of `(.)` is a function in the case of `both2`!
+And `timestwo` takes an `Int` as input, not a function.
+So this is not going to work, as we've seen in a different way above.
+
+
+### Conclusion
+
+We figured out why `timestwo . plus` doesn't work in two different ways: by following what the compiler does, and by reasoning on a somewhat more abstract level about what `(.)` does.
+Both arrive at the same conclusion, which is always good: if you have two different ways of deriving something and they give the same result, then the chances are already smaller that _both_ are wrong.
+
+Is there a way to _fix_ the error, and write the composition of `timestwo` and `plus` somehow?
+Yes; there are multiple ways, in fact.
+In this case, the best one is probably to just write it out:
+
+```haskell
+both2 a b = timestwo (plus a b)
+```
+
+Arguably, this makes it the clearest what's going on.
+
+If you really want a [_pointfree_](https://wiki.haskell.org/Pointfree) (also jokingly called "pointless") version that doesn't mention any variable names, try one of the following:
+
+```haskell
+both2 = (timestwo .) . plus -- same as: (\f -> timestwo . f) . plus
+both2 = curry (timestwo . uncurry plus)
+```
+
+But that's only for fun.
+Don't actually do that.