Compiler optimizations

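We should consider enabling compiler optimizations. I am no expert on this, but I believe there is a lot to be gained from either -O3 or -Ofast. One caveat: -Ofast also turns on -ffast-math, which relaxes the IEEE floating-point rules and can change numerical results, so -O3 is the safer starting point. Beyond this, there are a few other optimizations we could experiment with: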

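  • Link-time optimization (LTO), -flto. Lets the compiler inline and optimize across translation units. No clue how much this impacts performance in our case.
  • CPU-specific instructions, -march=native. This won't work if we ever want to distribute an executable, since the binary may use instructions that the user's CPU does not support.
  • Profile-guided optimization (PGO), -fprofile-generate followed by -fprofile-use. Should be able to give us a nice performance boost, but it needs a representative profiling run before the final build, so it only really makes sense if we distribute prebuilt executables.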
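
On the -O3 versus -Ofast choice: as far as I understand, -Ofast implies -ffast-math, which allows the compiler to reorder floating-point arithmetic. A minimal sketch of why that can matter is below. The function names are made up for illustration, and the values are chosen to exaggerate the effect.

```cpp
// Floating-point addition is not associative, so the grouping changes the result.
// Under -ffast-math (implied by -Ofast) the compiler may pick either grouping.

// Hypothetical helper: adds strictly left to right, (a + b) + c.
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// Hypothetical helper: adds the last two terms first, a + (b + c).
double sum_right_to_left(double a, double b, double c) {
    return a + (b + c);
}

// Without fast-math:
//   sum_left_to_right(1e30, -1e30, 1.0) == 1.0   (the large terms cancel first)
//   sum_right_to_left(1e30, -1e30, 1.0) == 0.0   (the 1.0 is absorbed by -1e30)
```

Whether this matters for us depends on how sensitive our numerics are to rounding, so it is probably worth comparing results between the two flags before committing to -Ofast.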

I don't know if the latter three are available in clang. Your thoughts, @ext-olki ?
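
While experimenting, something like the sketch below could help confirm which flags actually took effect in a given build. It relies on the predefined macros __OPTIMIZE__, __FAST_MATH__ and __AVX2__, which both GCC and Clang set. The function name build_summary is hypothetical, and checking AVX2 is just one example of a CPU-specific feature that -march=native may enable. LTO and PGO leave no such macro behind, so they are not covered here.

```cpp
#include <string>

// Hypothetical helper: summarizes the optimization-related macros
// that the compiler defined for this translation unit.
std::string build_summary() {
    std::string summary;
#ifdef __OPTIMIZE__
    summary += "optimized ";    // some -O level above -O0 was used
#else
    summary += "unoptimized ";
#endif
#ifdef __FAST_MATH__
    summary += "fast-math ";    // -ffast-math is active, e.g. through -Ofast
#endif
#ifdef __AVX2__
    summary += "avx2 ";         // AVX2 code generation enabled, e.g. through -march=native
#endif
    return summary;
}
```

Printing the returned string once at startup would be enough to compare builds made with different flag combinations.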

Edited by Kristian Lytje