Well. Fast Python loops in CPython have been tried before, and those attempts failed. I seriously don't see a way of getting a working JIT that really optimizes a lot of the code out there while also natively supporting CPython extensions. One side or the other will suffer. Also, numpy is fairly special: it has good potential to be optimized by the JIT in ways that are not quite possible using C or Cython.
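To make the numpy point a bit more concrete, here is a rough sketch of one kind of optimization I have in mind; treat it as my own illustration rather than a description of any particular JIT. The function names `eager` and `fused` are made up for the example. Each numpy operation is a separately compiled routine, so an expression like `a + b * c` is evaluated one step at a time with a temporary array in between. A JIT that sees the whole expression could, in principle, emit a single loop instead.

```python
import numpy as np

def eager(a, b, c):
    # NumPy evaluates this one operation at a time: b * c is written to a
    # temporary array, and then a + temporary allocates the result.
    return a + b * c

def fused(a, b, c):
    # The shape of code a fusing JIT could in principle generate: one pass,
    # no intermediate array. Written as a plain Python loop here, so it is
    # slow under CPython; it only shows what the optimized loop computes.
    out = np.empty_like(a)
    for i in range(a.shape[0]):
        out[i] = a[i] + b[i] * c[i]
    return out

a = np.arange(5, dtype=np.float64)
b = np.full(5, 2.0)
c = np.full(5, 3.0)
assert np.array_equal(eager(a, b, c), fused(a, b, c))
```

Both functions compute the same values; the difference is only in how many passes over memory and how many temporaries are needed, which is the part a precompiled C or Cython routine cannot change from inside a single operation.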