author | Tom Smeding <tom@tomsmeding.com> | 2025-03-14 21:57:56 +0100 |
---|---|---|
committer | Tom Smeding <tom@tomsmeding.com> | 2025-03-14 21:58:51 +0100 |
commit | 6276ed3c7bcd20c8b860e1275386ecd068671bcc | |
tree | b2710f261d12a7a1b73962691c187752663543f6 /test/Tests/C.hs | |
parent | 308ca9fac150cd28d62afef852f26ae4c40fa5a0 | |
Optimise reductions and dotprod with more vectorisation

Turns out that if you don't supply -ffast-math, the C compiler will
faithfully reproduce your linear reduction order, which is rather
disastrous for parallelisation with vector units.

This changes the summation order, so numerical results might differ
slightly. To wit: the test suite needed adjustment.
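
The effect described above can be illustrated with a minimal C sketch. This is not the code from this commit: the function names `sum_linear` and `sum_4acc` and the choice of four accumulators are assumptions made for illustration. The first function has a single dependency chain, which a compiler following strict IEEE semantics has to preserve; the second splits the work over independent partial sums, which is the kind of shape that vector units can process in parallel.

```c
#include <stddef.h>

/* Strict left-to-right sum. Every addition depends on the previous one,
   and since floating-point addition is not associative, the compiler may
   not reorder it unless told to (e.g. with -ffast-math). */
double sum_linear(const double *xs, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        acc += xs[i];
    }
    return acc;
}

/* Hypothetical reordered sum: four independent partial sums, combined
   at the end. The summation order differs from sum_linear, so the
   result may differ in the last few bits. */
double sum_4acc(const double *xs, size_t n) {
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += xs[i];
        a1 += xs[i + 1];
        a2 += xs[i + 2];
        a3 += xs[i + 3];
    }
    double acc = (a0 + a1) + (a2 + a3);
    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        acc += xs[i];
    }
    return acc;
}
```

On inputs where rounding never occurs (small integers, say) both functions agree exactly. On general inputs they can disagree slightly, which is why an exact equality check against a reference sum is too strict once the order changes.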
Diffstat (limited to 'test/Tests/C.hs')
-rw-r--r-- | test/Tests/C.hs | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/test/Tests/C.hs b/test/Tests/C.hs
index bc8e0de..a0f103d 100644
--- a/test/Tests/C.hs
+++ b/test/Tests/C.hs
@@ -35,6 +35,10 @@ import Gen
 import Util
 
 
+-- | Appropriate for simple different summation orders
+fineTol :: Double
+fineTol = 1e-8
+
 prop_sum_nonempty :: Property
 prop_sum_nonempty = property $ genRank $ \outrank@(SNat @n) -> do
   -- Test nonempty _results_. The first dimension of the input is allowed to be 0, because then OR.rerank doesn't fail yet.
@@ -46,7 +50,7 @@ prop_sum_nonempty = property $ genRank $ \outrank@(SNat @n) -> do
           genStorables (Range.singleton (product sh))
                        (\w -> fromIntegral w / fromIntegral (maxBound :: Word64))
   let rarr = rfromOrthotope inrank arr
-  rtoOrthotope (rsumOuter1 rarr) === orSumOuter1 outrank arr
+  almostEq fineTol (rtoOrthotope (rsumOuter1 rarr)) (orSumOuter1 outrank arr)
 
 prop_sum_empty :: Property
 prop_sum_empty = property $ genRank $ \outrankm1@(SNat @nm1) -> do
@@ -74,7 +78,7 @@ prop_sum_lasteq1 = property $ genRank $ \outrank@(SNat @n) -> do
           genStorables (Range.singleton (product insh))
                        (\w -> fromIntegral w / fromIntegral (maxBound :: Word64))
   let rarr = rfromOrthotope inrank arr
-  rtoOrthotope (rsumOuter1 rarr) === orSumOuter1 outrank arr
+  almostEq fineTol (rtoOrthotope (rsumOuter1 rarr)) (orSumOuter1 outrank arr)
 
 prop_sum_replicated :: Bool -> Property
 prop_sum_replicated doTranspose = property $
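
The diff replaces exact equality (`===`) with `almostEq fineTol`, but the definition of `almostEq` is not part of this diff. As a rough sketch of what such a check might look like, assuming an element-wise absolute-difference comparison (the real helper may well use a different criterion), here is a hypothetical C analogue; the name `almost_eq` and its signature are made up for illustration.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if every pair of corresponding
   elements differs by at most tol in absolute value. The negated
   comparison makes a NaN in either input fail the check, because
   fabs(NaN) <= tol is false. */
bool almost_eq(double tol, const double *a, const double *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (!(fabs(a[i] - b[i]) <= tol)) {
            return false;
        }
    }
    return true;
}
```

With a tolerance such as the `1e-8` used for `fineTol`, two sums of the same values that differ only by rounding from a changed summation order would pass, while a genuinely wrong result would still fail. An absolute tolerance is a plausible fit here because the generated test inputs lie between 0 and 1; for values of widely varying magnitude a relative tolerance would be the more usual choice.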