Dot product can be auto-vectorized only with fast-math or a similar compiler option. fast-math allows the compiler to apply associative transformations to floating-point expressions. Other math functions, such as the exponential, can be damaged as a consequence.
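To make the reassociation concrete, here is a minimal sketch in C. It is not the code from this post, and the function names dot_sequential and dot_reassociated are made up for illustration. The first function adds the products in strict source order, which the compiler has to preserve without fast-math. The second regroups the sum into four independent partial sums, which is roughly the shape a vectorized loop has and which the compiler may only produce on its own when reassociation is allowed.

```c
#include <stddef.h>

/* Strict left-to-right summation: each addition depends on the previous
   one, so without fast-math the compiler must keep this order. */
float dot_sequential(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* The same sum regrouped into four independent partial sums, the shape
   of a 4-lane SIMD loop. The result may differ from dot_sequential in
   the last bits because rounding happens in a different order. */
float dot_reassociated(const float *a, const float *b, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    float sum = (s0 + s1) + (s2 + s3);
    for (; i < n; ++i)          /* leftover elements */
        sum += a[i] * b[i];
    return sum;
}
```

For small integer inputs both functions return the same value. For inputs whose magnitudes differ widely they can disagree, which is why the compiler refuses to make this change without permission.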
Possible solutions:
- Compile the fast_math code separately from the rest of the program and then link it. This is the easy solution, but it is a step back to C.
- Introduce a @fast_math attribute. This is hard to implement, but I hope future compilers will support it.
- Do the vectorization yourself. In that case you need to implement SIMD helper functions such as unaligned load.
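As a rough illustration of the third option, the following C sketch uses the GCC/Clang vector extensions rather than the D code from this post. The names v4f, load_unaligned and dot_simd are hypothetical, and the 4-lane width is an assumption chosen so the sketch builds without extra flags (an AVX register holds 8 floats). The unaligned load is written with memcpy, which places no alignment requirement on the pointer.

```c
#include <stddef.h>
#include <string.h>

/* Four packed floats; vector_size is a GCC/Clang extension. */
typedef float v4f __attribute__((vector_size(16)));

/* Unaligned load helper: memcpy works for any alignment of p. */
static inline v4f load_unaligned(const float *p)
{
    v4f v;
    memcpy(&v, p, sizeof v);
    return v;
}

float dot_simd(const float *a, const float *b, size_t n)
{
    v4f acc = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    /* Main loop: four multiply-adds per iteration, one per lane. */
    for (; i + 4 <= n; i += 4)
        acc += load_unaligned(a + i) * load_unaligned(b + i);
    /* Horizontal sum of the four lanes. */
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}
```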
So I wrote implementations of dot product for real and complex numbers.
Code
Dot product for real numbers:

Complex version is similar. The main loop:
There are gdc and ldc implementations, but the ldc implementation was not tested. The source code is available on GitHub.
Benchmarks
Processor | Intel i5-4570, Haswell |
Instruction set | AVX2 |
System | Ubuntu 13.10 |
Compiler | GDC-4.8.1 |
Compiler flags | -march=native -fno-bounds-check -frename-registers -frelease -O3 |
Apparently, for short vectors (like those used in 3D math), the trivial one performs best in all tests.