<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ox-arrays/src/Data/Array/Mixed/Internal/Arith, branch repro-9.14-branch</title>
<subtitle>Nested, compositional struct-of-arrays orthotope arrays
</subtitle>
<id>https://git.tomsmeding.com/ox-arrays/atom?h=repro-9.14-branch</id>
<link rel='self' href='https://git.tomsmeding.com/ox-arrays/atom?h=repro-9.14-branch'/>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/'/>
<updated>2025-03-20T12:01:24Z</updated>
<entry>
<title>Separate arith routines into a library</title>
<updated>2025-03-20T12:01:24Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-20T12:01:24Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=55036a5ea4a6e590d0404638b2823c6a4aec3fba'/>
<id>urn:sha1:55036a5ea4a6e590d0404638b2823c6a4aec3fba</id>
<content type='text'>
The point is that this separate library does not depend on orthotope.
</content>
</entry>
<entry>
<title>arith: Don't FFI-import unused dotprod_*_strided ops</title>
<updated>2025-03-18T20:32:49Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-18T20:32:49Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=27c2823387b21e8ed801e4d8eeb0b3e5588a2920'/>
<id>urn:sha1:27c2823387b21e8ed801e4d8eeb0b3e5588a2920</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Optimise reductions and dotprod with more vectorisation</title>
<updated>2025-03-14T20:58:51Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-14T20:57:56Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=6276ed3c7bcd20c8b860e1275386ecd068671bcc'/>
<id>urn:sha1:6276ed3c7bcd20c8b860e1275386ecd068671bcc</id>
<content type='text'>
Turns out that if you don't supply -ffast-math, the C compiler will
faithfully reproduce your linear reduction order, which is rather
disastrous for parallelisation with vector units.

This changes the summation order, so numerical results might differ
slightly; the test suite needed adjusting accordingly.
</content>
</entry>
<entry>
<title>Implement quot/rem</title>
<updated>2025-03-13T08:27:51Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-12T22:20:13Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=ed6acbe5f409aba2fb222693da567ce04b7c4e01'/>
<id>urn:sha1:ed6acbe5f409aba2fb222693da567ce04b7c4e01</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Binary ops without normalisation</title>
<updated>2025-03-12T21:25:35Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-05T23:08:40Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=766a925698a97cac03e972bdaa2500085be17c65'/>
<id>urn:sha1:766a925698a97cac03e972bdaa2500085be17c65</id>
<content type='text'>
Before:
&gt; sum(*) Double [1e6] stride 1; -1:  OK
&gt;   68.9 ms ± 4.7 ms
After:
&gt; sum(*) Double [1e6] stride 1; -1:  OK
&gt;   1.44 ms ±  50 μs
</content>
</entry>
<entry>
<title>arith: Unary float ops on strided arrays without normalisation</title>
<updated>2025-03-05T21:09:50Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-05T21:09:50Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=984e5315768dd190a97069167daf970c17c3c867'/>
<id>urn:sha1:984e5315768dd190a97069167daf970c17c3c867</id>
<content type='text'>
</content>
</entry>
<entry>
<title>arith: Only strided unary int ops</title>
<updated>2025-02-16T22:50:07Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-02-16T22:49:56Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=71908c23307952fac26a4e24066e064d9cbb71c0'/>
<id>urn:sha1:71908c23307952fac26a4e24066e064d9cbb71c0</id>
<content type='text'>
Keeping only the strided variants should have negligible overhead and will
save a whole bunch of C code duplication when the FUnops are also converted
to strided form.
</content>
</entry>
<entry>
<title>arith: Unary int ops on strided arrays without normalisation</title>
<updated>2025-02-15T23:30:25Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-02-15T23:30:25Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=c14017f4bc28951be7e298d01769b5b49384a7c3'/>
<id>urn:sha1:c14017f4bc28951be7e298d01769b5b49384a7c3</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Add {m,r,s}dot1Inner</title>
<updated>2024-06-19T13:57:43Z</updated>
<author>
<name>Tom Smeding</name>
<email>t.j.smeding@uu.nl</email>
</author>
<published>2024-06-19T13:57:43Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=aafe5f6b5fa772d0e2e9f9b4f91bc3e7cf696840'/>
<id>urn:sha1:aafe5f6b5fa772d0e2e9f9b4f91bc3e7cf696840</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Clean up Foreign.hs</title>
<updated>2024-06-18T19:55:35Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2024-06-18T19:55:35Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=97ab8502b9cd3f7d908160d13c7d85d23c99e203'/>
<id>urn:sha1:97ab8502b9cd3f7d908160d13c7d85d23c99e203</id>
<content type='text'>
</content>
</entry>
</feed>
