diff options
author | Tom Smeding <tom@tomsmeding.com> | 2025-05-27 23:06:09 +0200 |
---|---|---|
committer | Tom Smeding <tom@tomsmeding.com> | 2025-05-27 23:06:09 +0200 |
commit | 29ea7b8772d3aeb25e520753346cf68d7bbb8174 (patch) | |
tree | 21b7da51b56a48fce5af1d0a2ef1824f401c40e2 | |
parent | 95e6eb45844b2318b1b7f0c9f8e7ce14f5262dd8 (diff) |
Update README
-rw-r--r-- | README.md | 181 |
1 files changed, 140 insertions, 41 deletions
@@ -1,56 +1,155 @@ -Wrapper library around `orthotope` that defines nested arrays, including -tuples, of (eventually) unboxed values. The arrays are represented in -struct-of-arrays form via the `Data.Vector.Unboxed` data family trick. Below -the surface layer, there is a more low-level wrapper around `orthotope` that -defines an array type type-indexed by `[Maybe Nat]`: some dimensions are -shape-typed (i.e. have their size statically known), and some not. +## ox-arrays -An overview of the API: +ox-arrays is a wrapper library around `orthotope` that defines nested arrays, +including tuples, of (eventually) unboxed values. The arrays are represented in +struct-of-arrays form via the `Data.Vector.Unboxed` data family trick. + +Because of the struct-of-arrays representation, nested arrays are not fully +general: indeed, arrays are not actually nested under the hood, so if one has an +array of arrays, those element arrays must all have the same shape (length, +width, etc.). If one has an array of tuples of arrays, then all the `fst` +components must have the same shape and all the `snd` components must have the +same shape, but the two pair components themselves can be different. + +However, the nesting functionality of ox-arrays can be completely ignored if you +only care about other parts of its API, or the vectorised arithmetic operations +(using hand-written C code). Nesting support mostly does not get in the way, and +has essentially no overhead (both when it's used and when it's not used). + +ox-arrays defines three array types: `Ranked`, `Shaped` and `Mixed`. +- `Ranked` corresponds to `orthotope`'s + [RankedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-RankedS.html) + and has the _rank_ of the array (its number of dimensions) on the type level. + For example, `Ranked 2 Float` is a two-dimensional array of `Float`s, i.e. a + matrix. +- `Shaped` corresponds to `orthotope`'s + [ShapedS](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html). + and has the full _shape_ of the array (its dimension sizes) on the type level + as a type-level list of `Nat`s. For example, `Shaped [2,3] Float` is a 2-by-3 + matrix. The innermost dimension correspond to the right-most element in the + list. +- `Mixed` is halfway between the two: it has a type parameter of kind + `[Maybe Nat]` whose length is the rank of the array; `Nothing` elements have + unknown size, whereas `Just` elements have the indicated size. The type + `Mixed [Nothing, Nothing] a` is equivalent to `Ranked 2 a`; the type + `Mixed [Just n, Just m] a` is equivalent to `Shaped [n, m] a`. + +In various places in the API of a library like ox-arrays, one can make a +decision between 1. requiring a type class constraint providing certain +information (e.g. +[KnownNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:KnownNat) +or `orthotope`'s +[Shape](https://hackage.haskell.org/package/orthotope-0.1.7.0/docs/Data-Array-ShapedS.html#t:Shape)), +or 2. taking singleton _values_ that encode said information in a way that is +linked to the type level (e.g. +[SNat](https://hackage.haskell.org/package/base-4.21.0.0/docs/GHC-TypeLits.html#t:SNat)). +`orthotope` chooses the type class approach; ox-arrays chooses the singleton +approach. Singletons are more verbose at times, but give the programmer more +insight in what data is flowing where, and more importantly, more control: type +class inference is very nice and implicit, but if it's not powerful enough for +the trickery you're doing, you're out of luck. Singletons allow you to explain +as precisely as you want to GHC what exactly you're doing. + +Below the surface layer, there is a more low-level wrapper (`XArray`) around +`orthotope` that defines a non-nested `Mixed`-style array type. + +Here is a little taster of the API, to get a sense for the design: ```haskell -data Ranked (n :: INat) a {- e.g. -} Ranked 3 Float -data Shaped (sh :: '[Nat]) a {- e.g. -} Shaped [2,3,4] Float -data Mixed (xsh :: '[Maybe Nat]) a {- e.g. -} Mixed [Just 2, Nothing, Just 4] Float - -Ranked I0 a = Ranked Z a ~~= Acc.Array Z a = Acc.Scalar a -Ranked I1 a = Ranked (S Z) a ~~= Acc.Array (Z :. Int) a = Acc.Vector a -Ranked I2 a = Ranked (S (S Z)) a ~~= Acc.Array (Z :. Int :. Int) a = Acc.Matrix a +import GHC.TypeLits (Nat) +data Ranked (n :: Nat) a {- e.g. -} Ranked 3 Float +data Shaped (sh :: '[Nat]) a {- e.g. -} Shaped [2,3,4] Float +data Mixed (xsh :: '[Maybe Nat]) a {- e.g. -} Mixed [Just 2, Nothing, Just 4] Float -rshape :: (Elt a, KnownINat n) => Ranked n a -> IxR n -sshape :: (Elt a, KnownShape sh) => Shaped sh a -> IxS sh -mshape :: (Elt a, KnownShapeX xsh) => Mixed xsh a -> IxX xsh +-- Shape types are written Sh{R,S,X}. The 'I' prefix denotes a Int-filled shape; +-- ShR and ShX are more general containers. ShS is a singleton. +rshape :: Elt a => Ranked n a -> IShR n +sshape :: Elt a => Shaped sh a -> ShS sh +mshape :: Elt a => Mixed xsh a -> IShX xsh -rindex :: Elt a => Ranked n a -> IxR n -> a -sindex :: Elt a => Shaped sh a -> IxS sh -> a -mindex :: Elt a => Mixed xsh a -> IxX xsh -> a +-- Index types are written Ix{R,S,X}. +rindex :: Elt a => Ranked n a -> IIxR n -> a +sindex :: Elt a => Shaped sh a -> IIxS sh -> a +mindex :: Elt a => Mixed xsh a -> IIxX xsh -> a -data IxR n where - IZR :: IxR Z - (:::) :: Int -> IxR n -> IxR (S n) +-- The index types can be used as if they were defined as follows; pattern +-- synonyms are provided to construct the illusion. (The actual definitions are +-- a bit more general and indirect.) +data IIxR n where + ZIR :: IIxR 0 + (:.:) :: Int -> IIxR n -> IIxR (n + 1) -data IxS sh where - IZS :: IxS '[] - (::$) :: Int -> IxS sh -> IxS (n : sh) +data IIxS sh where + ZIS :: IIxS '[] + (:.$) :: Int -> IIxS sh -> IIxS (n : sh) -data IxX sh where - IZX :: IxX '[] - (::@) :: Int -> IxX sh -> IxX (Just n : sh) - (::?) :: Int -> IxX sh -> IxX (Nothing : sh) +data IIxX xsh where + ZIX :: IIxX '[] + (:.%) :: Int -> IIxX xsh -> IIxX (mn : xsh) +-- Similarly, the shape types can be used as if they were defined as follows. +data IShR n where + ZSR :: IShR 0 + (:$:) :: Int -> IShR n -> IShR (n + 1) + +data ShS sh where + ZSS :: ShS '[] + (:$$) :: SNat n -> ShS sh -> ShS (n : sh) + +data IShX xsh where + ZSX :: IShX '[] + (:$%) :: SMayNat Int SNat mn -> IShX xsh -> IShX (mn : xsh) +-- where: +data SMayNat i f n where + SUnknown :: i -> SMayNat i f Nothing + SKnown :: f n -> SMayNat i f (Just n) + +-- Occasionally one needs a singleton for only the _known_ dimensions of a mixed +-- shape -- that is to say, only the statically-known part of a mixed shape. +-- StaticShX provides for this need. It can be used as if defined as follows: +data StaticShX xsh where + ZKX :: StaticShX '[] + (:!%) :: SMayNat () SNat mn -> StaticShX xsh -> StaticShX (mn : xsh) + +-- The Elt class describes types that can be used as elements of an array. While +-- it is technically possible to define new instances of this class, typical +-- usage should regard Elt as closed. The user-relevant instances are the +-- following: class Elt a -instance Elt () -instance Elt Double -instance Elt Int -instance (Elt a, Elt b) => Elt (a, b) -instance (Elt a, KnownINat n) => Elt (Ranked n a) -instance (Elt a, KnownShape sh) => Elt (Shaped sh a) -instance (Elt a, KnownShapeX xsh) => Elt (Mixed xsh a) - -rgenerate :: Elt a => IxR n -> (IxR n -> a) -> Ranked n a -sgenerate :: (Elt a, KnownShape sh) => (IxS sh -> a) -> Shaped sh a -mgenerate :: (Elt a, KnownShapeX xsh) => IxX xsh -> (IxX xsh -> a) -> Mixed xsh a +instance Elt () +instance Elt Bool +instance Elt Float +instance Elt Double +instance Elt Int +instance (Elt a, Elt b) => Elt (a, b) +instance Elt a => Elt (Ranked n a) +instance Elt a => Elt (Shaped sh a) +instance Elt a => Elt (Mixed xsh a) + +-- Essentially all functions that ox-arrays offers on arrays are first-order: +-- add two arrays elementwise, transpose an array, append arrays, compute +-- minima/maxima, zip/unzip, nest/unnest, etc. The first-order approach allows +-- operations to be vectorised (using hand-written C code) without needing any +-- sort of JIT compilation. +-- Exceptionally, also one higher-order function is provided per array type: +-- 'generate'. These functions have the caveat that regularity of arrays must be +-- preserved: all returned 'a's must have equal shape. See the documentation of +-- 'mgenerate'. +-- Warning: because the invocations of the function you pass cannot be +-- vectorised, 'generate' is rather slow if 'a' is small. +-- The 'KnownElt' class captures an API infelicity where constraint-based shape +-- passing is the only practical option. +rgenerate :: KnownElt a => IShR n -> (IxR n -> a) -> Ranked n a +sgenerate :: KnownElt a => ShS sh -> (IxS sh -> a) -> Shaped sh a +mgenerate :: KnownElt a => IShX xsh -> (IxX xsh -> a) -> Mixed xsh a +-- Under the hood, Ranked and Shaped are both newtypes over Mixed. Mixed itself +-- is a data family over XArray, which is a newtype over orthotope's RankedS. newtype Ranked n a = Ranked (Mixed (Replicate n Nothing) a) newtype Shaped sh a = Shaped (Mixed (MapJust sh) a) ``` + +About the name: when importing `orthotope` array modules, a possible naming +convention is to use qualified imports as `OR` for "orthotope ranked" arrays and +`OS` for "orthotope shaped" arrays. ox-arrays exists to fill the `OX` gap. |