aboutsummaryrefslogtreecommitdiff
path: root/README.md
blob: 13eb9fdb07c10d848b7589442348dd4f4647b96d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
## ox-arrays

ox-arrays is a wrapper library around `orthotope` that defines nested arrays,
including tuples, of (eventually) unboxed values. The arrays are represented in
struct-of-arrays form via the `Data.Vector.Unboxed` data family trick.

Because of the struct-of-arrays representation, nested arrays are not fully
general: indeed, arrays are not actually nested under the hood, so if one has an
array of arrays, those element arrays must all have the same shape (length,
width, etc.). If one has an array of tuples of arrays, then all the `fst`
components must have the same shape and all the `snd` components must have the
same shape, but the two pair components themselves can be different.

However, the nesting functionality of ox-arrays can be completely ignored if you
only care about other parts of its API, or the vectorised arithmetic operations
(using hand-written C code). Nesting support mostly does not get in the way, and
has essentially no overhead (both when it's used and when it's not used).

ox-arrays defines three array types: `Ranked`, `Shaped` and `Mixed`.
- `Ranked` corresponds to `orthotope`'s
  [RankedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-RankedS.html)
  and has the _rank_ of the array (its number of dimensions) on the type level.
  For example, `Ranked 2 Float` is a two-dimensional array of `Float`s, i.e. a
  matrix.
- `Shaped` corresponds to `orthotope`'s
  [ShapedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html).
  and has the full _shape_ of the array (its dimension sizes) on the type level
  as a type-level list of `Nat`s. For example, `Shaped [2,3] Float` is a 2-by-3
  matrix. The innermost dimension correspond to the right-most element in the
  list.
- `Mixed` is halfway between the two: it has a type parameter of kind
  `[Maybe Nat]` whose length is the rank of the array; `Nothing` elements have
  unknown size, whereas `Just` elements have the indicated size. The type
  `Mixed [Nothing, Nothing] a` is equivalent to `Ranked 2 a`; the type
  `Mixed [Just n, Just m] a` is equivalent to `Shaped [n, m] a`.

In various places in the API of a library like ox-arrays, one can make a
decision between 1. requiring a type class constraint providing certain
information (e.g.
[KnownNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:KnownNat)
or `orthotope`'s
[Shape](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html#t:Shape)),
or 2. taking singleton _values_ that encode said information in a way that is
linked to the type level (e.g.
[SNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:SNat)).
`orthotope` chooses the type class approach; ox-arrays chooses the singleton
approach. Singletons are more verbose at times, but give the programmer more
insight in what data is flowing where, and more importantly, more control: type
class inference is very nice and implicit, but if it's not powerful enough for
the trickery you're doing, you're out of luck. Singletons allow you to explain
as precisely as you want to GHC what exactly you're doing.

Below the surface layer, there is a more low-level wrapper (`XArray`) around
`orthotope` that defines a non-nested `Mixed`-style array type.

Here is a little taster of the API, to get a sense for the design:

```haskell
import GHC.TypeLits (Nat)

data Ranked (n :: Nat) a             {- e.g. -}  Ranked 3 Float
data Shaped (sh :: '[Nat]) a         {- e.g. -}  Shaped [2,3,4] Float
data Mixed (xsh :: '[Maybe Nat]) a   {- e.g. -}  Mixed [Just 2, Nothing, Just 4] Float

-- Shape types are written Sh{R,S,X}. The 'I' prefix denotes a Int-filled shape;
-- ShR and ShX are more general containers. ShS is a singleton.
rshape :: Elt a => Ranked n a  -> IShR n
sshape :: Elt a => Shaped sh a -> ShS sh
mshape :: Elt a => Mixed xsh a -> IShX xsh

-- Index types are written Ix{R,S,X}.
rindex :: Elt a => Ranked n a  -> IIxR n   -> a
sindex :: Elt a => Shaped sh a -> IIxS sh  -> a
mindex :: Elt a => Mixed xsh a -> IIxX xsh -> a

-- The index types can be used as if they were defined as follows; pattern
-- synonyms are provided to construct the illusion. (The actual definitions are
-- a bit more general and indirect.)
data IIxR n where
  ZIR   :: IIxR 0
  (:.:) :: Int -> IIxR n -> IIxR (n + 1)

data IIxS sh where
  ZIS   :: IIxS '[]
  (:.$) :: Int -> IIxS sh -> IIxS (n : sh)

data IIxX xsh where
  ZIX   :: IIxX '[]
  (:.%) :: Int -> IIxX xsh -> IIxX (mn : xsh)

-- Similarly, the shape types can be used as if they were defined as follows.
data IShR n where
  ZSR   :: IShR 0
  (:$:) :: Int -> IShR n -> IShR (n + 1)

data ShS sh where
  ZSS   :: ShS '[]
  (:$$) :: SNat n -> ShS sh -> ShS (n : sh)

data IShX xsh where
  ZSX   :: IShX '[]
  (:$%) :: SMayNat Int SNat mn -> IShX xsh -> IShX (mn : xsh)
-- where:
data SMayNat i f n where
  SUnknown :: i   -> SMayNat i f Nothing
  SKnown   :: f n -> SMayNat i f (Just n)

-- Occasionally one needs a singleton for only the _known_ dimensions of a mixed
-- shape -- that is to say, only the statically-known part of a mixed shape.
-- StaticShX provides for this need. It can be used as if defined as follows:
data StaticShX xsh where
  ZKX   :: StaticShX '[]
  (:!%) :: SMayNat () SNat mn -> StaticShX xsh -> StaticShX (mn : xsh)

-- The Elt class describes types that can be used as elements of an array. While
-- it is technically possible to define new instances of this class, typical
-- usage should regard Elt as closed. The user-relevant instances are the
-- following:
class Elt a
instance                   Elt ()
instance                   Elt Bool
instance                   Elt Float
instance                   Elt Double
instance                   Elt Int
instance (Elt a, Elt b) => Elt (a, b)
instance Elt a          => Elt (Ranked n a)
instance Elt a          => Elt (Shaped sh a)
instance Elt a          => Elt (Mixed xsh a)

-- Essentially all functions that ox-arrays offers on arrays are first-order:
-- add two arrays elementwise, transpose an array, append arrays, compute
-- minima/maxima, zip/unzip, nest/unnest, etc. The first-order approach allows
-- operations to be vectorised (using hand-written C code) without needing any
-- sort of JIT compilation.
-- Exceptionally, also one higher-order function is provided per array type:
-- 'generate'. These functions have the caveat that regularity of arrays must be
-- preserved: all returned 'a's must have equal shape. See the documentation of
-- 'mgenerate'.
-- Warning: because the invocations of the function you pass cannot be
-- vectorised, 'generate' is rather slow if 'a' is small.
-- The 'KnownElt' class captures an API infelicity where constraint-based shape
-- passing is the only practical option.
rgenerate :: KnownElt a => IShR n   -> (IxR n   -> a) -> Ranked n a
sgenerate :: KnownElt a => ShS sh   -> (IxS sh  -> a) -> Shaped sh a
mgenerate :: KnownElt a => IShX xsh -> (IxX xsh -> a) -> Mixed xsh a

-- Under the hood, Ranked and Shaped are both newtypes over Mixed. Mixed itself
-- is a data family over XArray, which is a newtype over orthotope's RankedS.
newtype Ranked n  a = Ranked (Mixed (Replicate n Nothing) a)
newtype Shaped sh a = Shaped (Mixed (MapJust sh)          a)
```

About the name: when importing `orthotope` array modules, a possible naming
convention is to use qualified imports as `OR` for "orthotope ranked" arrays and
`OS` for "orthotope shaped" arrays. ox-arrays was started to fill the `OX` gap,
then grew out of proportion.