It is hard to do relevant tests of which language is the fastest.
Really, writing fast code is mostly down to the programmer. For example, C is widely recognized as the fastest non-assembly language simply because it leaves so much to the programmer; C won't magically make your terrible code fast, unless you are using time-to-segfault as a metric. Assembly is the fastest if you know what you are doing, but very few people know what they are doing.
So, what kind of code are you going to use for your benchmark? Highly optimized code written by experts with way too much time on their hands? The "most idiomatic" code? Code written by an average programmer picked at random? Code extracted from a big open source project? The choice can drastically change the ranking, so which one is the most relevant? If you go with the "most idiomatic", for instance, you miss the fact that parts can be optimized when needed, and that in real life, programmers aren't perfect and write suboptimal code by mistake.
There is also a cultural aspect to languages that benchmarks may not capture. For example, C programmers tend to have a culture of performance: they tend to know their hardware, try to save memory, make data structures efficient, etc. Python programmers, not so much; they tend to value readability and development time instead.
You can't test languages the way you test CPUs, for instance. With CPUs, you just run the same code on each and time it. You can't do that with languages, for obvious reasons: your C compiler won't accept your Python code, so it is necessarily an apples-to-oranges comparison.
> Assembly is the fastest if you know what you are doing, very few know what they are doing.
Just a nitpick, but for any reasonably sized code, no. While some people can indeed do impressive optimizations on small segments of assembly, they are human, and they will fail to apply trivial optimizations that compilers perform reliably.
If garbage collection happens on a separate thread and makes allocation much faster, is it really “slower”? With manual management you have to call malloc, which may try to defragment your memory, and later you have to call free. Both block the calling thread; if anything, for certain problems they are slower than GC.
Garbage collection itself isn't really slow, per se. But allocating a lot of short-lived objects on the heap still means that a lot of objects have to be reclaimed fairly frequently, and at least .NET's garbage collector can't do all that without pauses. Those pauses add up.
But even if the GC is really concurrent: If there's a way of not doing that work it's still better, IMHO.
One interesting profile I've seen at work recently spent about 30% of its time creating objects and another 35% in garbage collection (of pretty much the same objects that were being created all along). So if there were a way to allocate less, or to not allocate on the heap, the algorithm could be about twice as fast.
But comparing it to not doing that work at all is somewhat dishonest. The fair comparison is a malloc on each call and a free (via the destructor) at the end of the scope, and surprisingly, malloc will often do much worse than a good GC implementation, what with trying to defragment a bit, etc.
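To make the "not allocating that much" option concrete, here is a hedged Python sketch (the names, numbers, and workload are all made up; the actual code behind that profile is unknown). Both functions compute the same result, but one creates a short-lived object on every iteration while the other reuses a single mutable object:

```python
class Point:
    # a tiny throwaway object standing in for whatever the real code allocates
    __slots__ = ("x", "y")

    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y

def total_alloc(n):
    # allocates a fresh short-lived Point on every iteration
    s = 0.0
    for i in range(n):
        p = Point(float(i), float(i))
        s += p.x + p.y
    return s

def total_reuse(n):
    # reuses a single Point for the whole loop; same result, far fewer allocations
    p = Point()
    s = 0.0
    for i in range(n):
        p.x = p.y = float(i)
        s += p.x + p.y
    return s
```

The second version trades a bit of clarity for pressure taken off the allocator and the collector; whether that is worth it depends entirely on how hot the loop is.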
Also, Java often accumulates big heaps because it only runs the GC when it absolutely must; as you mentioned, it would be unnecessary work otherwise. It might be interesting to note that OpenJDK comes out as the “greenest” of the managed languages because of that.
This is so true. The fastest implementation of a priority queue I've ever seen in PHP looks nothing like a priority queue; it takes advantage of PHP's sparse hash maps (a.k.a. arrays). Any "standard" implementation will be slower. I imagine this is true of most optimized algorithms in most other languages.
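I haven't seen that PHP code, but the trick presumably resembles a bucket queue: a sparse map keyed by priority instead of a heap. A rough Python sketch of the idea (dict standing in for PHP's sparse arrays; all names here are mine, not from the original implementation):

```python
class BucketQueue:
    """Priority queue backed by a sparse map of priority -> bucket.

    Instead of maintaining heap order on every insert, pop() scans the
    distinct priorities, which is fast when there are few of them.
    """

    def __init__(self):
        self.buckets = {}  # priority -> list of items, FIFO within a priority

    def push(self, priority, item):
        self.buckets.setdefault(priority, []).append(item)

    def pop(self):
        # find the smallest priority currently present
        p = min(self.buckets)
        bucket = self.buckets[p]
        item = bucket.pop(0)
        if not bucket:
            del self.buckets[p]
        return item
```

Whether something like this actually beats a binary heap depends on the distribution of priorities; the point is just that the winning structure need not resemble the textbook one.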