From 32617e00f49c2259bf99767c1f658691bc7fa20e Mon Sep 17 00:00:00 2001 From: Tom Smeding Date: Sun, 15 Aug 2021 23:10:16 +0200 Subject: Add haskell/composition post --- haskell/composition.html | 151 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 151 insertions(+) create mode 100644 haskell/composition.html (limited to 'haskell/composition.html') diff --git a/haskell/composition.html b/haskell/composition.html new file mode 100644 index 0000000..27d8648 --- /dev/null +++ b/haskell/composition.html @@ -0,0 +1,151 @@ +

Function composition in Haskell

+

In Haskell, the dot operator (.), written infix like f . g, is function composition. +For example, suppose you have two functions:

+
addone :: Int -> Int
+addone n = n + 1
+
+timestwo :: Int -> Int
+timestwo n = n * 2
+
+

then you can compose those two functions like timestwo . addone:

+
-- computes 2 * (n + 1), which is 2 * n + 2
+both :: Int -> Int
+both = timestwo . addone
+
+

And we can try it out in ghci:

+
Prelude> both 10
+22
+Prelude> timestwo (addone 10)
+22
+Prelude> 2 * (10 + 1)
+22
+
+

This works well, like we expect. +These functions are intentionally very simple; the point of this post is to explain how function composition works, not how to write complex Haskell functions.

+

Now suppose we have an additional, two-argument function to play with:

+
plus :: Int -> Int -> Int
+plus a b = a + b
+
+

If we give plus some arguments, we can apply timestwo to the result:

+
Prelude> timestwo (plus 3 4)             -- 2 * (3 + 4) = 14
+14
+
+

Getting inspiration from the first example with timestwo and addone, we might want to try writing a function like both that composes timestwo and plus using (.).

+
both2 = timestwo . plus
+
+

But the compiler won't accept this:

+
    • Couldn't match type ‘Int -> Int’ with ‘Int’
+      Expected type: Int -> Int
+        Actual type: Int -> Int -> Int
+    • Probable cause: ‘plus’ is applied to too few arguments
+      In the second argument of ‘(.)’, namely ‘plus’
+      In the expression: timestwo . plus
+      In an equation for ‘both2’: both2 = timestwo . plus
+   |
+11 | both2 = timestwo . plus
+   |                    ^^^^
+
+

Why is this case different than the first example with both? +The difference is that plus takes two arguments, not one. +As always in Haskell, let's take a look at the types to see what's going on here.

+

Types

+

What are the types of the functions we're using?

+
addone :: Int -> Int
+timestwo :: Int -> Int
+plus :: Int -> Int -> Int
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+
+

Remember that you can look up the type of standard library functions online: e.g. here for (.).

+

Function composition, (.), takes a function from type b to type c and a function from type a to type b, and returns a function from type a to type c. +Note that we can also write the type of (.) slightly differently, without the final pair of parentheses:

+
(.) :: (b -> c) -> (a -> b) -> (a -> c)
+(.) :: (b -> c) -> (a -> b) -> a -> c
+
+

These both mean the exact same thing.

+

When at first, we wrote timestwo . addone as the definition of both, we were really applying (.) to two arguments: timestwo and addone. +Indeed, we could also have written this:

+
both :: Int -> Int
+both = (.) timestwo addone
+
+

which means exactly the same thing as both = timestwo . addone.

+

Let's do some of the compiler's work here and figure out how the types match up. +(If you want to learn more about how you can do type inference by hand, one guide is this one.)

+ +

We don't apply any more arguments to (.), so the return type is a -> c; since we know that both a and c are Int, this is Int -> Int, so both indeed has the right type signature.

+

Now let's try to do the same thing with our second attempt in both2, namely timestwo . plus, which is the same as (.) timestwo plus. +Remember the types:

+
timestwo :: Int -> Int
+plus :: Int -> Int -> Int
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+
+ +

Seen it coming

+

Could we have seen this coming? +Of course. +Remember that the following Haskell types mean exactly the same:

+
a -> b -> c -> d -> e
+a -> (b -> (c -> (d -> e)))
+
+

This can be explained (or remembered) in two ways:

+
    +
  1. -> is right-associative: its parentheses collect to the right. +This is exactly like you may have learned that subtraction is left-associative (1 - 2 - 3 = (1 - 2) - 3) and exponentiation is right-associative (1 ^ 2 ^ 3 = 1 ^ (2 ^ 3) = 123).
  2. +
  3. Multi-argument functions in Haskell aren't multi-argument functions; instead, they're functions that take one argument and return a function taking the rest.
  4. +
+

The second interpretation is the one that you should use, because it is by far the most useful one to really understand what is going on in Haskell with functions -- and since Haskell is a functional programming language, (almost) everything is functions, so understanding them is important!

+

Now look again at the type of (.) (all three mean the same):

+
(.) :: (b -> c) -> (a -> b) -> a -> c
+(.) :: (b -> c) -> (a -> b) -> (a -> c)
+(.) :: (b -> c) -> ((a -> b) -> (a -> c))
+
+

(Why can't we remove the parentheses around the b -> c and a -> c in there? That is because it the type would then suddenly mean something different. Try to work that out for yourself!)

+

We can read this as: (.) takes a function g :: b -> c, a function f :: a -> b and a value x :: a. +It applies f to x, applies g to the result, and returns the result of that. +Indeed, (.) can be implemented like this:

+
(.) :: (b -> c) -> (a -> b) -> a -> c
+(.) g f x = g (f x)
+
+

Finally, now look again at our failed attempt to write both2 as timestwo . plus. +plus really takes only a single argument (an Integer), and it returns a function that takes a second integer and returns the sum of those two integers. +So the value f x in the definition of (.) is a function in the case of both2! +And timestwo takes an Int as input, not a function. +So this is not going to work, as we've seen in a different way above.

+

Conclusion

+

We figured out why timestwo . plus doesn't work in two different ways: by following what the compiler does, and by reasoning on a somewhat more abstract level about what (.) does. +Both arrive at the same conclusion, which is always good: if you have two different ways of deriving something and they give the same result, then the chances are already smaller that both are wrong.

+

Is there a way to fix the error, and write the composition of timestwo and plus somehow? +Yes; there are multiple ways, in fact. +In this case, the best one is probably to just write it out:

+
both2 a b = timestwo (plus a b)
+
+

Arguably, this makes it the clearest what's going on.

+

If you really want a pointfree (also jokingly called "pointless") version that doesn't mention any variable names, try one of the following:

+
both2 = (timestwo .) . plus    -- same as: (\f -> timestwo . f) . plus
+both2 = curry (timestwo . uncurry plus)
+
+

But that's only for fun. +Don't actually do that.

-- cgit v1.2.3-54-g00ecf