summaryrefslogtreecommitdiff
path: root/haskell/composition.md
blob: c7348a24c797a74ad25a18ec0a7fbf5a557e83cf (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
## Function composition in Haskell

(This post is intended for Haskell beginners.)

In Haskell, the dot operator `(.)`, written infix like `f . g`, is _function composition_.
For example, suppose you have two functions:

```haskell
addone :: Int -> Int
addone n = n + 1

timestwo :: Int -> Int
timestwo n = n * 2
```

then you can _compose_ those two functions like `timestwo . addone`:

```haskell
-- computes 2 * (n + 1), which is 2 * n + 2
both :: Int -> Int
both = timestwo . addone
```

And we can try it out in `ghci`:

```
Prelude> both 10
22
Prelude> timestwo (addone 10)
22
Prelude> 2 * (10 + 1)
22
```

This works well, like we expect.
These functions are intentionally very simple; the point of this post is to explain how function composition works, not how to write complex Haskell functions.

Now suppose we have an additional, two-argument function to play with:

```haskell
plus :: Int -> Int -> Int
plus a b = a + b
```

If we give `plus` some arguments, we can apply `timestwo` to the result:

```
Prelude> timestwo (plus 3 4)             -- 2 * (3 + 4) = 14
14
```

Getting inspiration from the first example with `timestwo` and `addone`, we might want to try writing a function like `both` that composes `timestwo` and `plus` using `(.)`.

```haskell
both2 = timestwo . plus
```

But the compiler won't accept this:

```
    • Couldn't match type ‘Int -> Int’ with ‘Int’
      Expected type: Int -> Int
        Actual type: Int -> Int -> Int
    • Probable cause: ‘plus’ is applied to too few arguments
      In the second argument of ‘(.)’, namely ‘plus’
      In the expression: timestwo . plus
      In an equation for ‘both2’: both2 = timestwo . plus
   |
11 | both2 = timestwo . plus
   |                    ^^^^
```

Why is this case different than the first example with `both`?
The difference is that `plus` takes two arguments, not one.
As always in Haskell, let's take a look at the types to see what's going on here.


### Types

What are the types of the functions we're using?

```haskell
addone :: Int -> Int
timestwo :: Int -> Int
plus :: Int -> Int -> Int
(.) :: (b -> c) -> (a -> b) -> (a -> c)
```

Remember that you can look up the type of standard library functions online: e.g. [here for `(.)`](https://hackage.haskell.org/package/base-4.14.1.0/docs/Prelude.html#v:.).

Function composition, `(.)`, takes a function from type `b` to type `c` and a function from type `a` to type `b`, and returns a function from type `a` to type `c`.
Note that we can also write the type of `(.)` slightly differently, without the final pair of parentheses:

```haskell
(.) :: (b -> c) -> (a -> b) -> (a -> c)
(.) :: (b -> c) -> (a -> b) -> a -> c
```

These both mean the exact same thing.

When at first, we wrote `timestwo . addone` as the definition of `both`, we were really applying `(.)` to two arguments: `timestwo` and `addone`.
Indeed, we could also have written this:

```haskell
both :: Int -> Int
both = (.) timestwo addone
```

which means exactly the same thing as `both = timestwo . addone`.

Let's do some of the compiler's work here and figure out how the types match up.
<small>(If you want to learn more about how you can do type inference by hand, one guide is [this one](http://www.vex.net/~trebla/haskell/type-inference.html).)</small>

- The first argument to `(.)` is `timestwo` of type `Int -> Int`.
  From the type signature of `(.)` we read that the first argument to `(.)` _should_ have type `b -> c` for certain `b` and `c`; here we get that `b` and `c` are both `Int`.
- The second argument to `(.)` is `addone`, again of type `Int -> Int`.
  This means that `a -> b` (which, from the previous point, we know to be `a -> Int`) is really `Int -> Int`, so `a` is also `Int`.

We don't apply any more arguments to `(.)`, so the return type is `a -> c`; since we know that both `a` and `c` are `Int`, this is `Int -> Int`, so `both` indeed has the right type signature.

Now let's try to do the same thing with our second attempt in `both2`, namely `timestwo . plus`, which is the same as `(.) timestwo plus`.
Remember the types:

```haskell
timestwo :: Int -> Int
plus :: Int -> Int -> Int
(.) :: (b -> c) -> (a -> b) -> (a -> c)
```

- First argument to `(.)`: given is `timestwo :: Int -> Int`, which should match `b -> c`.
  So: `b` and `c` are both `Int`.
- Second argument to `(.)`: given is `plus :: Int -> Int -> Int`, which should match `a -> b`; since we already know that `b` is `Int`, this is `a -> Int`.
  So `Int -> Int -> Int`, which is the same as `Int -> (Int -> Int)`, should match `a -> Int` for some `a`:

  ```
  Int -> (Int -> Int)
   a  ->     Int
  ```

  But this can never be true!
  We can make `a` equal to `Int`, sure, but `Int -> Int` will never match `Int`.
  Remember the compiler error we got earlier:

  ```
  • Couldn't match type ‘Int -> Int’ with ‘Int’
    Expected type: ...
  ```

  Suspiciously similar, isn't it?


### Seen it coming

Could we have seen this coming?
Of course.
Remember that the following Haskell types mean exactly the same:

```haskell
a -> b -> c -> d -> e
a -> (b -> (c -> (d -> e)))
```

This can be explained (or remembered) in two ways:

1. `->` is right-associative: its parentheses collect to the right.
   This is exactly like you may have learned that subtraction is left-associative (`1 - 2 - 3 = (1 - 2) - 3`) and exponentiation is right-associative (`1 ^ 2 ^ 3 = 1 ^ (2 ^ 3)` = 1<sup>2<sup>3</sup></sup>).
2. Multi-argument functions in Haskell aren't multi-argument functions; instead, they're functions that take one argument and return a function taking the rest.

The _second_ interpretation is the one that you should use, because it is by far the most useful one to really understand what is going on in Haskell with functions -- and since Haskell is a functional programming language, (almost) everything is functions, so understanding them is important!

Now look again at the type of `(.)` (all three mean the same):

```haskell
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) :: (b -> c) -> (a -> b) -> (a -> c)
(.) :: (b -> c) -> ((a -> b) -> (a -> c))
```

<small>(Why can't we remove the parentheses around the `b -> c` and `a -> b` in there? That is because it the type would then suddenly mean something different. Try to work that out for yourself!)</small>

We can read this as: `(.)` takes a function `g :: b -> c`, a function `f :: a -> b` and a value `x :: a`.
It applies `f` to `x`, applies `g` to the result, and returns the result of that.
Indeed, `(.)` can be implemented like this:

```haskell
(.) :: (b -> c) -> (a -> b) -> a -> c
(.) g f x = g (f x)
```

Finally, now look again at our failed attempt to write `both2` as `timestwo . plus`.
`plus` really takes only a single argument (an `Int`eger), and it returns a _function_ that takes a second integer and returns the sum of those two integers.
So the value `f x` in the definition of `(.)` is a function in the case of `both2`!
And `timestwo` takes an `Int` as input, not a function.
So this is not going to work, as we've seen in a different way above.


### Conclusion

We figured out why `timestwo . plus` doesn't work in two different ways: by following what the compiler does, and by reasoning on a somewhat more abstract level about what `(.)` does.
Both arrive at the same conclusion, which is always good: if you have two different ways of deriving something and they give the same result, then the chances are already smaller that _both_ are wrong.

Is there a way to _fix_ the error, and write the composition of `timestwo` and `plus` somehow?
Yes; there are multiple ways, in fact.
In this case, the best one is probably to just write it out:

```haskell
both2 a b = timestwo (plus a b)
```

Arguably, this makes it the clearest what's going on.

If you really want a [_pointfree_](https://wiki.haskell.org/Pointfree) (also jokingly called "pointless") version that doesn't mention any variable names, try one of the following:

```haskell
both2 = (timestwo .) . plus    -- same as: (\f -> timestwo . f) . plus
both2 = curry (timestwo . uncurry plus)
```

But that's only for fun.
Don't actually do that.