<feed xmlns='http://www.w3.org/2005/Atom'>
<title>ox-arrays/cbits, branch repro-9.14-branch</title>
<subtitle>Nested, compositional struct-of-arrays orthotope arrays
</subtitle>
<id>https://git.tomsmeding.com/ox-arrays/atom?h=repro-9.14-branch</id>
<link rel='self' href='https://git.tomsmeding.com/ox-arrays/atom?h=repro-9.14-branch'/>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/'/>
<updated>2025-04-15T15:18:34Z</updated>
<entry>
<title>arith: Better typing of enum stats_binary_id</title>
<updated>2025-04-15T15:18:34Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-04-15T15:18:34Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=9106993eeb2036f1dc5165535e1f2be77c273f59'/>
<id>urn:sha1:9106993eeb2036f1dc5165535e1f2be77c273f59</id>
<content type='text'>
</content>
</entry>
<entry>
<title>arith: Don't use C23 features</title>
<updated>2025-04-15T15:18:18Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-04-15T15:18:18Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=8210378510b92f8ec224c6adcda3ecc77625f1a0'/>
<id>urn:sha1:8210378510b92f8ec224c6adcda3ecc77625f1a0</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Dotprod: Optimise reversed and replicated dimensions</title>
<updated>2025-03-25T16:09:20Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-25T16:09:20Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=a78ddeaa5d34fa8b6fa52eee42977cc46e8c36a5'/>
<id>urn:sha1:a78ddeaa5d34fa8b6fa52eee42977cc46e8c36a5</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Separate arith routines into a library</title>
<updated>2025-03-20T12:01:24Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-20T12:01:24Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=55036a5ea4a6e590d0404638b2823c6a4aec3fba'/>
<id>urn:sha1:55036a5ea4a6e590d0404638b2823c6a4aec3fba</id>
<content type='text'>
The point is that this separate library does not depend on orthotope.
</content>
</entry>
<entry>
<title>arith stats: Print timings with 3 digits precision</title>
<updated>2025-03-18T22:51:51Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-18T22:51:51Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=5414434df62b2b196354b9748b265093c168601b'/>
<id>urn:sha1:5414434df62b2b196354b9748b265093c168601b</id>
<content type='text'>
If you render microsecond timings as milliseconds, you _have_ only 3
digits behind the decimal point.
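
As a rough illustration of the point above (not the actual C code in this
commit), a microsecond count splits exactly into whole milliseconds plus three
fractional digits. The function name `format_ms` and the " ms" suffix are
assumptions made for this sketch only.

```python
def format_ms(microseconds):
    # An integer microsecond count divided by 1000 leaves a remainder in
    # 0..999, so exactly three fractional digits carry real information.
    whole, frac = divmod(microseconds, 1000)
    return f"{whole}.{frac:03d} ms"


print(format_ms(1234567))  # 1234.567 ms
print(format_ms(42))       # 0.042 ms
```

Printing more digits than that would only append zeros.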
</content>
</entry>
<entry>
<title>arith stats: Improve output format</title>
<updated>2025-03-18T22:37:56Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-18T22:37:56Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=7883bed5997f430219077202c84af7bf80ada2b7'/>
<id>urn:sha1:7883bed5997f430219077202c84af7bf80ada2b7</id>
<content type='text'>
This makes it nicer to process using unix tools. Try:

$ sed -n '/ox-arrays-arith-stats start/,/ox-arrays-arith-stats end/ !d; /===/ !p' | sort -n -k4,4 -k6,6
</content>
</entry>
<entry>
<title>Arith statistics collection from C</title>
<updated>2025-03-18T21:32:16Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-18T21:32:16Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=cb758277b3fa2d74551c45340b8ff0539713078c'/>
<id>urn:sha1:cb758277b3fa2d74551c45340b8ff0539713078c</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Optimise reductions and dotprod with more vectorisation</title>
<updated>2025-03-14T20:58:51Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-14T20:57:56Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=6276ed3c7bcd20c8b860e1275386ecd068671bcc'/>
<id>urn:sha1:6276ed3c7bcd20c8b860e1275386ecd068671bcc</id>
<content type='text'>
Turns out that if you don't supply -ffast-math, the C compiler will
faithfully reproduce your linear reduction order, which is rather
disastrous for parallelisation with vector units.

This changes the summation order, so numerical results might differ
slightly. To wit: the test suite needed adjustment.
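
A minimal sketch of why the order matters, in Python rather than the C of this
commit: floating-point addition is not associative, so summing into several
independent accumulators (as a vector unit would) can round differently from a
strict left-to-right sum. The names `linear_sum` and `lane_sum` and the lane
count of 2 are hypothetical, chosen only for this example.

```python
def linear_sum(xs):
    # Strict left-to-right reduction: the order a compiler must keep
    # when it is not allowed to reassociate floating-point additions.
    total = 0.0
    for x in xs:
        total += x
    return total


def lane_sum(xs, lanes=2):
    # Each lane accumulates every lanes-th element independently,
    # then the lane totals are combined at the end.
    acc = [0.0] * lanes
    for i, x in enumerate(xs):
        acc[i % lanes] += x
    return linear_sum(acc)


xs = [1e16, 1.0, -1e16, 1.0]
print(linear_sum(xs))  # 1.0  (the first 1.0 is lost when added to 1e16)
print(lane_sum(xs))    # 2.0  (the two 1.0 values meet in the same lane)
```

The input here is deliberately extreme; on typical data the difference is
usually confined to the last few bits, which is still enough to break exact
comparisons in a test suite.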
</content>
</entry>
<entry>
<title>arith: Remove CASE1, add restrict</title>
<updated>2025-03-14T13:40:02Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-14T13:40:02Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=639acb0abed995400d984203684e178a11d91fa1'/>
<id>urn:sha1:639acb0abed995400d984203684e178a11d91fa1</id>
<content type='text'>
Turns out that GCC already generates separate code for an inner
stride of 1 automatically, so no need to do fancy stuff in C.

Also, GCC generated a whole bunch of superfluous code to correctly
handle the case where output and input arrays overlap; since this never
happens in our case, let's add `restrict` and save some binary size.
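
To illustrate why overlap matters at all, here is a small Python sketch (not
the C code of this commit; the function names and the shared-buffer layout are
assumptions for the example). When the output region overlaps the input, an
element-by-element loop and a "read everything, then write everything" loop
give different results, so a compiler that cannot rule out overlap presumably
has to keep code that preserves the element-by-element behaviour.

```python
def add_one(buf, src, dst, n):
    # Element by element, like a scalar loop: each write is visible
    # to the reads of later iterations.
    for i in range(n):
        buf[dst + i] = buf[src + i] + 1


def add_one_batched(buf, src, dst, n):
    # All reads happen before any write, like a vectorised loop.
    vals = [buf[src + i] + 1 for i in range(n)]
    buf[dst:dst + n] = vals


a = [0, 0, 0, 0]
add_one(a, 0, 1, 3)          # output overlaps input: dst = src + 1
b = [0, 0, 0, 0]
add_one_batched(b, 0, 1, 3)  # same call, batched semantics

print(a)  # [0, 1, 2, 3]
print(b)  # [0, 1, 1, 1]
```

With non-overlapping regions the two functions agree, which is the guarantee
that `restrict` gives the C compiler.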
</content>
</entry>
<entry>
<title>Add atan2</title>
<updated>2025-03-13T08:28:24Z</updated>
<author>
<name>Tom Smeding</name>
<email>tom@tomsmeding.com</email>
</author>
<published>2025-03-13T08:26:20Z</published>
<link rel='alternate' type='text/html' href='https://git.tomsmeding.com/ox-arrays/commit/?id=a87c80b1fbaa826142605d0846479c94d6ee2bcc'/>
<id>urn:sha1:a87c80b1fbaa826142605d0846479c94d6ee2bcc</id>
<content type='text'>
</content>
</entry>
</feed>
