aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorTom Smeding <tom@tomsmeding.com>2025-05-27 23:06:09 +0200
committerTom Smeding <tom@tomsmeding.com>2025-05-27 23:06:09 +0200
commit29ea7b8772d3aeb25e520753346cf68d7bbb8174 (patch)
tree21b7da51b56a48fce5af1d0a2ef1824f401c40e2
parent95e6eb45844b2318b1b7f0c9f8e7ce14f5262dd8 (diff)
Update README
-rw-r--r--README.md181
1 files changed, 140 insertions, 41 deletions
diff --git a/README.md b/README.md
index 9b8d543..a9d9599 100644
--- a/README.md
+++ b/README.md
@@ -1,56 +1,155 @@
-Wrapper library around `orthotope` that defines nested arrays, including
-tuples, of (eventually) unboxed values. The arrays are represented in
-struct-of-arrays form via the `Data.Vector.Unboxed` data family trick. Below
-the surface layer, there is a more low-level wrapper around `orthotope` that
-defines an array type type-indexed by `[Maybe Nat]`: some dimensions are
-shape-typed (i.e. have their size statically known), and some not.
+## ox-arrays
-An overview of the API:
+ox-arrays is a wrapper library around `orthotope` that defines nested arrays,
+including tuples, of (eventually) unboxed values. The arrays are represented in
+struct-of-arrays form via the `Data.Vector.Unboxed` data family trick.
+
+Because of the struct-of-arrays representation, nested arrays are not fully
+general: indeed, arrays are not actually nested under the hood, so if one has an
+array of arrays, those element arrays must all have the same shape (length,
+width, etc.). If one has an array of tuples of arrays, then all the `fst`
+components must have the same shape and all the `snd` components must have the
+same shape, but the two pair components themselves can be different.
+
+However, the nesting functionality of ox-arrays can be completely ignored if you
+only care about other parts of its API, or the vectorised arithmetic operations
+(using hand-written C code). Nesting support mostly does not get in the way, and
+has essentially no overhead (both when it's used and when it's not used).
+
+ox-arrays defines three array types: `Ranked`, `Shaped` and `Mixed`.
+- `Ranked` corresponds to `orthotope`'s
+ [RankedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-RankedS.html)
+ and has the _rank_ of the array (its number of dimensions) on the type level.
+ For example, `Ranked 2 Float` is a two-dimensional array of `Float`s, i.e. a
+ matrix.
+- `Shaped` corresponds to `orthotope`'s
+ [ShapedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html).
+ and has the full _shape_ of the array (its dimension sizes) on the type level
+ as a type-level list of `Nat`s. For example, `Shaped [2,3] Float` is a 2-by-3
+ matrix. The innermost dimension correspond to the right-most element in the
+ list.
+- `Mixed` is halfway between the two: it has a type parameter of kind
+ `[Maybe Nat]` whose length is the rank of the array; `Nothing` elements have
+ unknown size, whereas `Just` elements have the indicated size. The type
+ `Mixed [Nothing, Nothing] a` is equivalent to `Ranked 2 a`; the type
+ `Mixed [Just n, Just m] a` is equivalent to `Shaped [n, m] a`.
+
+In various places in the API of a library like ox-arrays, one can make a
+decision between 1. requiring a type class constraint providing certain
+information (e.g.
+[KnownNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:KnownNat)
+or `orthotope`'s
+[Shape](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html#t:Shape)),
+or 2. taking singleton _values_ that encode said information in a way that is
+linked to the type level (e.g.
+[SNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:SNat)).
+`orthotope` chooses the type class approach; ox-arrays chooses the singleton
+approach. Singletons are more verbose at times, but give the programmer more
+insight in what data is flowing where, and more importantly, more control: type
+class inference is very nice and implicit, but if it's not powerful enough for
+the trickery you're doing, you're out of luck. Singletons allow you to explain
+as precisely as you want to GHC what exactly you're doing.
+
+Below the surface layer, there is a more low-level wrapper (`XArray`) around
+`orthotope` that defines a non-nested `Mixed`-style array type.
+
+Here is a little taster of the API, to get a sense for the design:
```haskell
-data Ranked (n :: INat) a {- e.g. -} Ranked 3 Float
-data Shaped (sh :: '[Nat]) a {- e.g. -} Shaped [2,3,4] Float
-data Mixed (xsh :: '[Maybe Nat]) a {- e.g. -} Mixed [Just 2, Nothing, Just 4] Float
-
-Ranked I0 a = Ranked Z a ~~= Acc.Array Z a = Acc.Scalar a
-Ranked I1 a = Ranked (S Z) a ~~= Acc.Array (Z :. Int) a = Acc.Vector a
-Ranked I2 a = Ranked (S (S Z)) a ~~= Acc.Array (Z :. Int :. Int) a = Acc.Matrix a
+import GHC.TypeLits (Nat)
+data Ranked (n :: Nat) a {- e.g. -} Ranked 3 Float
+data Shaped (sh :: '[Nat]) a {- e.g. -} Shaped [2,3,4] Float
+data Mixed (xsh :: '[Maybe Nat]) a {- e.g. -} Mixed [Just 2, Nothing, Just 4] Float
-rshape :: (Elt a, KnownINat n) => Ranked n a -> IxR n
-sshape :: (Elt a, KnownShape sh) => Shaped sh a -> IxS sh
-mshape :: (Elt a, KnownShapeX xsh) => Mixed xsh a -> IxX xsh
+-- Shape types are written Sh{R,S,X}. The 'I' prefix denotes a Int-filled shape;
+-- ShR and ShX are more general containers. ShS is a singleton.
+rshape :: Elt a => Ranked n a -> IShR n
+sshape :: Elt a => Shaped sh a -> ShS sh
+mshape :: Elt a => Mixed xsh a -> IShX xsh
-rindex :: Elt a => Ranked n a -> IxR n -> a
-sindex :: Elt a => Shaped sh a -> IxS sh -> a
-mindex :: Elt a => Mixed xsh a -> IxX xsh -> a
+-- Index types are written Ix{R,S,X}.
+rindex :: Elt a => Ranked n a -> IIxR n -> a
+sindex :: Elt a => Shaped sh a -> IIxS sh -> a
+mindex :: Elt a => Mixed xsh a -> IIxX xsh -> a
-data IxR n where
- IZR :: IxR Z
- (:::) :: Int -> IxR n -> IxR (S n)
+-- The index types can be used as if they were defined as follows; pattern
+-- synonyms are provided to construct the illusion. (The actual definitions are
+-- a bit more general and indirect.)
+data IIxR n where
+ ZIR :: IIxR 0
+ (:.:) :: Int -> IIxR n -> IIxR (n + 1)
-data IxS sh where
- IZS :: IxS '[]
- (::$) :: Int -> IxS sh -> IxS (n : sh)
+data IIxS sh where
+ ZIS :: IIxS '[]
+ (:.$) :: Int -> IIxS sh -> IIxS (n : sh)
-data IxX sh where
- IZX :: IxX '[]
- (::@) :: Int -> IxX sh -> IxX (Just n : sh)
- (::?) :: Int -> IxX sh -> IxX (Nothing : sh)
+data IIxX xsh where
+ ZIX :: IIxX '[]
+ (:.%) :: Int -> IIxX xsh -> IIxX (mn : xsh)
+-- Similarly, the shape types can be used as if they were defined as follows.
+data IShR n where
+ ZSR :: IShR 0
+ (:$:) :: Int -> IShR n -> IShR (n + 1)
+
+data ShS sh where
+ ZSS :: ShS '[]
+ (:$$) :: SNat n -> ShS sh -> ShS (n : sh)
+
+data IShX xsh where
+ ZSX :: IShX '[]
+ (:$%) :: SMayNat Int SNat mn -> IShX xsh -> IShX (mn : xsh)
+-- where:
+data SMayNat i f n where
+ SUnknown :: i -> SMayNat i f Nothing
+ SKnown :: f n -> SMayNat i f (Just n)
+
+-- Occasionally one needs a singleton for only the _known_ dimensions of a mixed
+-- shape -- that is to say, only the statically-known part of a mixed shape.
+-- StaticShX provides for this need. It can be used as if defined as follows:
+data StaticShX xsh where
+ ZKX :: StaticShX '[]
+ (:!%) :: SMayNat () SNat mn -> StaticShX xsh -> StaticShX (mn : xsh)
+
+-- The Elt class describes types that can be used as elements of an array. While
+-- it is technically possible to define new instances of this class, typical
+-- usage should regard Elt as closed. The user-relevant instances are the
+-- following:
class Elt a
-instance Elt ()
-instance Elt Double
-instance Elt Int
-instance (Elt a, Elt b) => Elt (a, b)
-instance (Elt a, KnownINat n) => Elt (Ranked n a)
-instance (Elt a, KnownShape sh) => Elt (Shaped sh a)
-instance (Elt a, KnownShapeX xsh) => Elt (Mixed xsh a)
-
-rgenerate :: Elt a => IxR n -> (IxR n -> a) -> Ranked n a
-sgenerate :: (Elt a, KnownShape sh) => (IxS sh -> a) -> Shaped sh a
-mgenerate :: (Elt a, KnownShapeX xsh) => IxX xsh -> (IxX xsh -> a) -> Mixed xsh a
+instance Elt ()
+instance Elt Bool
+instance Elt Float
+instance Elt Double
+instance Elt Int
+instance (Elt a, Elt b) => Elt (a, b)
+instance Elt a => Elt (Ranked n a)
+instance Elt a => Elt (Shaped sh a)
+instance Elt a => Elt (Mixed xsh a)
+
+-- Essentially all functions that ox-arrays offers on arrays are first-order:
+-- add two arrays elementwise, transpose an array, append arrays, compute
+-- minima/maxima, zip/unzip, nest/unnest, etc. The first-order approach allows
+-- operations to be vectorised (using hand-written C code) without needing any
+-- sort of JIT compilation.
+-- Exceptionally, also one higher-order function is provided per array type:
+-- 'generate'. These functions have the caveat that regularity of arrays must be
+-- preserved: all returned 'a's must have equal shape. See the documentation of
+-- 'mgenerate'.
+-- Warning: because the invocations of the function you pass cannot be
+-- vectorised, 'generate' is rather slow if 'a' is small.
+-- The 'KnownElt' class captures an API infelicity where constraint-based shape
+-- passing is the only practical option.
+rgenerate :: KnownElt a => IShR n -> (IxR n -> a) -> Ranked n a
+sgenerate :: KnownElt a => ShS sh -> (IxS sh -> a) -> Shaped sh a
+mgenerate :: KnownElt a => IShX xsh -> (IxX xsh -> a) -> Mixed xsh a
+-- Under the hood, Ranked and Shaped are both newtypes over Mixed. Mixed itself
+-- is a data family over XArray, which is a newtype over orthotope's RankedS.
newtype Ranked n a = Ranked (Mixed (Replicate n Nothing) a)
newtype Shaped sh a = Shaped (Mixed (MapJust sh) a)
```
+
+About the name: when importing `orthotope` array modules, a possible naming
+convention is to use qualified imports as `OR` for "orthotope ranked" arrays and
+`OS` for "orthotope shaped" arrays. ox-arrays exists to fill the `OX` gap.