| | Commit message | Author | Age |
|---|---|---|---|
| * | arith: Better typing of enum stats_binary_id | Tom Smeding | 2025-04-15 | 
| * | arith: Don't use C23 features | Tom Smeding | 2025-04-15 | 
| * | Dotprod: Optimise reversed and replicated dimensions | Tom Smeding | 2025-03-25 | 
| * | Separate arith routines into a library | Tom Smeding | 2025-03-20 | 
| | The point is that this separate library does not depend on orthotope. | | |
| * | arith stats: Print timings with 3 digits precision | Tom Smeding | 2025-03-18 | 
| | If you render microsecond timings as milliseconds, you _have_ only 3 digits behind the decimal point. | | |
| * | arith stats: Improve output format | Tom Smeding | 2025-03-18 | 
| | This makes it nicer to process using unix tools. Try: `$ sed -n '/ox-arrays-arith-stats start/,/ox-arrays-arith-stats end/ !d; /===/ !p' \| sort -n -k4,4 -k6,6` | | |
| * | Arith statistics collection from C | Tom Smeding | 2025-03-18 | 
| * | Optimise reductions and dotprod with more vectorisation | Tom Smeding | 2025-03-14 | 
| | Turns out that if you don't supply `-ffast-math`, the C compiler will faithfully reproduce your linear reduction order, which is rather disastrous for parallelisation with vector units. This changes the summation order, so numerical results might differ slightly. To wit: the test suite needed adjustment. | | |
| * | arith: Remove CASE1, add restrict | Tom Smeding | 2025-03-14 | 
| | Turns out that GCC already generates separate code for an inner stride of 1 automatically, so no need to do fancy stuff in C. Also, GCC generated a whole bunch of superfluous code to correctly handle the case where output and input arrays overlap; since this never happens in our case, let's add `restrict` and save some binary size. | | |
| * | Add atan2 | Tom Smeding | 2025-03-13 | 
| * | arith: Fix enum typing typos | Tom Smeding | 2025-03-13 | 
| * | Implement quot/rem | Tom Smeding | 2025-03-13 | 
| * | Binary ops without normalisation | Tom Smeding | 2025-03-12 | 
| | Before: `sum(*) Double [1e6] stride 1; -1: OK`, `68.9 ms ± 4.7 ms`. After: `sum(*) Double [1e6] stride 1; -1: OK`, `1.44 ms ± 50 μs`. | | |
| * | C: Simplify DOTPROD_STRIDED_OP signature | Tom Smeding | 2025-03-05 | 
| * | arith: Unary float ops on strided arrays without normalisation | Tom Smeding | 2025-03-05 | 
| * | arith: Only strided unary int ops | Tom Smeding | 2025-02-16 | 
| | This should have negligible overhead and will save a whole bunch of C code duplication when the FUnops are also converted to strided form. | | |
| * | arith: Unary int ops on strided arrays without normalisation | Tom Smeding | 2025-02-16 | 
| * | Add {m,r,s}dot1Inner | Tom Smeding | 2024-06-19 | 
| * | More sensible argument order to reduce1 C op | Tom Smeding | 2024-06-18 | 
| * | C cleanup: abstract strides[rank-1] case into macro | Tom Smeding | 2024-06-18 | 
| * | sumAllPrim | Tom Smeding | 2024-06-17 | 
| * | Only use intel SIMD on intel platforms | Tom Smeding | 2024-06-12 | 
| * | Fix SIMD code to allow for unaligned arrays | Tom Smeding | 2024-06-11 | 
| * | Manual vectorisation of dot product for floating points | Tom Smeding | 2024-06-10 | 
| * | Dot product | Tom Smeding | 2024-06-10 | 
| * | Rename arg{min,max} to {min,max}Index | Tom Smeding | 2024-06-10 | 
| * | argmin and argmax | Tom Smeding | 2024-06-09 | 
| * | Fast (C) Floating ops | Tom Smeding | 2024-05-27 | 
| * | Fast Fractional ops via C code | Tom Smeding | 2024-05-26 | 
| * | Refactor C interface to pass operation as enum | Tom Smeding | 2024-05-26 | 
| | This is hmatrix style: less proliferation of functions as the number of ops increases. | | |
| * | Add more const in C arith ops | Tom Smeding | 2024-05-24 | 
| * | Better naming in C code | Tom Smeding | 2024-05-23 | 
| * | Fast sum | Tom Smeding | 2024-05-23 | 
| | Also fast product, but that's currently unused. | | |
| * | Fast numeric operations for Num | Tom Smeding | 2024-05-23 | 
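
The reasoning in the "Optimise reductions and dotprod with more vectorisation" entry above can be sketched in C. This is only an illustration under assumptions, not the code from that commit: `sum_linear` and `sum_4acc` are hypothetical names, and the choice of four accumulators is arbitrary. The first function has a single dependency chain, so a compiler that must preserve floating-point semantics has to add the elements in order. The second splits the work over independent partial sums, which is the kind of reassociation that lets vector units help, at the cost of a different summation order.

```c
#include <assert.h>
#include <stddef.h>

/* Strictly left-to-right sum. Every addition depends on the previous one,
 * so without -ffast-math the compiler has to keep exactly this order. */
double sum_linear(size_t n, const double *xs) {
    assert(xs != NULL || n == 0);
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) acc += xs[i];
    return acc;
}

/* Hypothetical variant with four independent partial sums (not the
 * library's actual code). The additions are reassociated, so the result
 * may differ from sum_linear in the last bits for general inputs. */
double sum_4acc(size_t n, const double *xs) {
    assert(xs != NULL || n == 0);
    double acc[4] = {0.0, 0.0, 0.0, 0.0};
    size_t i = 0;
    /* Main loop: four independent chains that can map onto vector lanes. */
    for (; i + 4 <= n; i += 4) {
        acc[0] += xs[i];
        acc[1] += xs[i + 1];
        acc[2] += xs[i + 2];
        acc[3] += xs[i + 3];
    }
    /* Combine the partial sums, then handle the leftover elements. */
    double total = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; i++) total += xs[i];
    return total;
}
```

For inputs whose partial sums are all exactly representable (small integers, for instance) both functions agree. For general inputs they may differ slightly, which matches the note that the test suite needed adjustment.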
