> OdinMonkey is a hammer that will kill incentive to optimize JavaScript that humans write.
Eh, I'm not worried about this. Many apps will have no reason to migrate to asm.js, and for as long as some big and important apps are written in JavaScript, there will be incentive to optimize plain JavaScript.
JavaScript and other dynamically-typed languages exist because in many cases they are the best and most convenient way to write apps. But suppose this weren't true; suppose that in the long-term developers started favoring statically-typed languages for web apps, either because of performance or because of usability (Eclipse-like IDE convenience; hard to provide in a dynamic language). Is the author saying that we should artificially prop up usage of dynamic languages by taking away some of the inherent performance benefits of static languages? That doesn't sound like the best way to achieve technical excellence in the long term.
In the marketplace of ideas and technologies, let things succeed or fail based on their demonstrated merit, rather than trying to pick winners and losers based on our preconceptions.
> Somebody might say that [my proposed bytecode] does not run everywhere. Nope. It does run everywhere: just take JavaScript and write a simple one pass translator from this bytecode to JavaScript.
You can think of asm.js as just that: a one-pass translator of an implicitly-defined bytecode to JavaScript. If you want to view or edit it as a more traditional-looking bytecode, you can easily implement a YourBytecode<->asm.js compiler. It just so happens that asm.js is a backward-compatible representation, so it works more conveniently as the standardized wire encoding.
(I have not studied the asm.js spec in detail, but I've seen pcwalton describe it as an alternate encoding of LLVM bitcode, so I suspect that the idea of implementing a YourBytecode<->asm.js compiler is actually reasonable given the asm.js definition).
* asm.js has a very primitive type system, unlike LLVM bitcode. I think this is better for a distribution format; the complex types make sense for compiler IR optimizations but not so much for distribution after the optimizations are already done.
* asm.js doesn't have goto, unlike LLVM bitcode. I'm told that this doesn't seem to matter much in practice, as the Relooper is quite refined now. However, I'm certainly willing to believe that there are some oddly-shaped CFGs where this will hurt. Perhaps JS will need goto.
* asm.js doesn't have 64-bit types. This is unquestionably unfortunate, as it will need 64-bit types to achieve native parity. (Of course, JS needs 64-bit types anyway—it's important on the server, for instance.)
* asm.js doesn't have SIMD, unlike LLVM, which supports SIMD with its vector types. This will need to be added to be competitive on some workloads. This is a good opportunity for collaboration between Dart and JS, as Brendan pointed out.
Regarding the original post (and speaking only for myself, not my employer): I actually agree with mraleph to some degree. From an aesthetic point of view, I dislike the subset approach as much as anyone. But if V8 doesn't implement asm.js AOT compilation, ignores the "use asm" directive, and still achieves the same performance as Odin, then that's actually a good outcome in my mind. At least asm.js will have fulfilled the role of an informal standard that engines and compiler writers alike can agree to target for maximum performance.
Is that really true? The way I read the spec was that asm.js was all about adding type annotations to JavaScript in a backwards-compatible way.
And just thinking logically, it's hard to imagine asm.js optimizing anything beyond what JS is already doing without explicit type annotations – and we already know asm.js gives insane speed increases with a compiler that understands the type annotations.
There are degrees of typed-ness. :) What I mean to say is that asm.js has a much weaker type system—in particular, asm.js does not have aggregate or structural types. The only types are numerics. This is in contrast to LLVM's type system, which features structs, arrays, and so forth. This gives the LLVM compiler much more information for important optimizations such as scalar replacement of aggregates, but those optimizations are typically done before asm.js emission.
This is a feature, not a drawback. Aggregate types for C-like languages tend to bloat the code for no real optimization benefit. See the comparison between LLVM IR and Emscripten-generated JavaScript here: http://mozakai.blogspot.com/2011/11/code-size-when-compiling...
(Of course, for GC'd languages, we will need aggregate types.)
Actually, aggregate types help optimization a lot in C-like languages, because they make it easy to disambiguate accesses based on offsets. I.e., two random int pointers are not helpful; two accesses to fields through a pointer to a structure are helpful.
In non-pointer C-like languages, yes, they are mostly a burden.
The reason the .bc in the example given is large is that the toolchain does not try to optimize for intermediate .bc size at all. Final .bc size should be relatively sane.
I don't see that I can download the .bc files from that blog post, but I'm mildly curious whether they are from before or after running opt, because if I had to guess, based on the fact that it gzips well, I'd guess it's before running opt.
Sorry, I should have been clearer: the types are definitely helpful for optimization. For distribution I'm not so sure. The IR-level optimizations where types really help (scalar replacement of aggregates, etc) are already run before the code hits the wire.
It depends.
For distribution where you know you will never optimize at runtime, yes, it's completely worthless.
But part of the reason you may find it worthless is that none of these VMs really perform the strong alias analysis and load/store redundancy removal techniques you could perform at runtime.
Martin Richards' compiler is pretty easy to add a backend to. I got the distribution he covers in his BCPL for young people (Raspberry Pi) guide. I targeted VideoCore for some tinkering I was doing; asm.js should be fairly easy - just the relooping stuff would be the main effort.
It would be interesting to see Martin Richards' classic benchmark compiled to asm.js from BCPL and compared with the port to JS.
I'm not sure why you think oddly shaped CFGs that have gotos would hurt it.
At worst, you can always turn oddly shaped CFGs into sanely shaped ones at the cost of code duplication.
Very early on in the days of the tree-ssa project, Sebastian actually implemented goto elimination for GCC.
It made zero performance difference in any real code (even those people building goto heavy interpreters).
So it was removed.
In any case, real, sparse, SSA-based optimizations don't care about even fully connected CFGs (there are plenty in GCC's testsuite). Yes, dataflow optimizations care, but if you aren't doing something sparsely, you should fix that. :)
As a complete aside:
I agree with the original poster that asm.js is a leaky abstraction, but disagree with everything else. All sane optimizing compilers do lowering of some sort (I'm aware of V8's direct code generation, as well as Go's; at least Go's normal compiler doesn't claim that this generates amazing native code, only reasonable native code). Even GCC has what is known as "high GIMPLE", which looks like C but is normalized a bit, and then "GIMPLE", which looks like C but is even more normalized.
(LLVM, by comparison, has something a lot less like C, but lately lots of metadata has been added to recover some of the losses)
All asm.js does is expose the "GIMPLE" level.
If the argument is "it's not sane for folks to write asm.js instead of JavaScript" (the equivalent of "it's not sane for folks to write GIMPLE instead of C"), this is kind of a truism. To the degree asm.js folks think people should be writing asm.js-formed code directly, they rightly deserve to be mocked :P.
(Generating it at runtime, of course, is a different thing altogether)
To the degree the original poster's argument is "optimizing asm.js takes away the desire to optimize anything not asm.js", this is well, wrong. Performance benchmarks and apps are still written in normal JS. Just because GCC/OdinMonkey has GIMPLE/asm.js doesn't mean they don't try to optimize regular C/javascript. The whole point of having GIMPLE/asm.js is to make it easier to optimize testcases by normalizing them so you can guarantee that if you perform this optimization, it will work as well no matter how crazy the actual original input code.
Yes, you can almost always extend everything to handle the unbounded set of the real language. You can see ways to directly do better codegen from JS without an intermediate form.
In fact, for a long time there were source-to-source optimizing C and C++ translators.
One of the reasons they basically all died is because they were 100x harder to maintain and improve than all of the standard "high IR/mid IR/low IR" arrangement of optimizing compilers, and over time, the optimizing compilers won.
By a large margin.
It wasn't even close.
Heck, GCC did very well for many years with no normalized version of the high level language: It used to go from AST to very low level IR and work on that. Right up until about 1994, this worked well.
Then it started losing to compilers with a normalized high level IR (ICC, PGI, XLC, everyone). By factors of 2-10.
If the world was all still fortran, and we were using fortran in the browser, maybe the "i don't see the need for asm.js" would be a reasonable discussion.
> I'm not sure why you think oddly shaped CFG's that have goto's would hurt it. At worst, you can always turn oddly shaped CFG's into sanely shaped ones at the cost of code duplication.
Very true, I hadn't thought about that. That's a great point. I believe that the Relooper falls back to a large switch statement for irreducible control flow (which is very rare), but it could do code duplication instead.
If you search gcc-patches, we quantified the cost of doing this at one point, though I have no recollection of the results.
Back in the day, I talked with Ken Zadeck (who used to be my office mate at IBM) and some others at IBM Research, and it turns out there is a bunch of good unpublished algorithms and research on both eliminating and analyzing irreducible control flow.
Sadly, this seems to be a case where this knowledge won't make it out of IBM :(
This challenge covers much of compilers research. Many of the heuristics and techniques used in practice in compilers (certainly in ours, and in GHC, OCaml, and other research compilers I've talked with) rest on solutions we've only shared verbally with one another. We're all convinced we couldn't get a paper out of them (novel enough, but too hard to meet the "evaluation bar"), so it remains stuff we talk about during breaks at conferences and in "limited" circles on G+...
You can use labeled break and continue for a lot. That said, there is some irreducible control flow that the Relooper falls down on. It's quite rare in practice, though—with if, while, break, continue, labeled break, and labeled continue, the Relooper can reloop nearly all control structures seen in actual code. But, as I mentioned, I'm willing to believe you can come up with a benchmark where you'll need goto.
> there will be incentive to optimize plain JavaScript
I am not that worried about average JavaScript code. I am more concerned about the computational cores that people will start rewriting in asm.js.
Or here is another question: how do you keep incentive to optimize Emscripten generated code so that eventually you can kill "use asm" and go full speed without it?
> You can think of asm.js as just that; a one-pass translator of an implicitly-defined bytecode to JavaScript.
No, when I mentioned a one-pass translator, I meant one that runs on the client. asm.js is not just that.
> I am not that worried about average JavaScript code. I am more concerned about computational cores that people will start rewriting to asm.js.
> Or here is another question: how do you keep incentive to optimize Emscripten generated code so that eventually you can kill "use asm" and go full speed without it?
I hope that some JS engine does no special asm.js optimizations, but at the same time achieves the same speed. That would be both an amazing technical achievement, and a very useful result (since presumably it would apply to more code)! :)
Btw, I actually was pushing for that direction in early asm.js-or-something-like-it talks. So I think I see where you are coming from on this matter, and most of the rest of your post as well - good points.
However, I strongly believe that despite the correctly-pointed-out downsides, asm.js is by far the best option the web has: We need backwards compatibility + near-native speed + something reasonably simple to implement + minimal standardization difficulties. I understand the appeal of the alternatives, but think asm.js is far closer to achieving all of those necessary goals.
> Or here is another question: how do you keep incentive to optimize Emscripten generated code so that eventually you can kill "use asm" and go full speed without it?
Why is that important? I don't get it.
Emscripten -> JS is lossy; it throws away information that the code generator could use to generate more efficient code. A clever JIT can re-discover that information by observing the code in action. And sure, that's a satisfying challenge from a VM implementor's perspective, why is it a priori worse to just tunnel that information through via asm.js rather than throwing it away?
Even if the VM is perfect at this, there is a non-zero runtime cost that, I would expect, scales linearly in the size of the application.
I believe it can be done, and because it can be done I don't see why it shouldn't be. Why keep two front ends (two parsers, even!), two separate IR generators, etc. in the system if you can have one?
I also think that such code can occur, and does occur, in real-world applications. And I want any JS code to run as fast as it can, without requiring people to rewrite anything.
It is true that dynamic compilation incurs certain overhead and requires warm up. But it is also true that AOT compilation is not cheap either (that is why a special API to cache generated code is being suggested).
" If you want to view/edit it as a more traditional-looking byte-code, you can easily implement a YourBytecode<->asm.js compiler."
There are quite a few hurdles, and some of them will be resolved with some stuff Mozilla has proposed in ES6. But as it stands now, you can't use CLR IL bytecode or even JVM bytecode.
Sorry, when I said "your bytecode," I meant whatever bytecode the author had in mind that is trivially one-pass compilable to JavaScript, i.e. one that has compatible semantics. Mapping an arbitrary bytecode onto asm.js is not going to be trivial, since VMs have nontrivial differences in semantics surrounding garbage collection, concurrency, etc.
You couldn't (reasonably) compile the CLR or the JVM to asm.js? I'd assume you could, which would seem to point toward a translation of bytecodes (since that's what the asm.js-hosted CLR/JVM would be doing) being feasible…
You can't, because both are garbage collected, and asm.js code has no access to a garbage collector (unless you wrote your own).
At the end of the day, asm.js uses certain code patterns (like bitwise OR with 0, where the spec mandates that it behaves as if the value were coerced to a 32-bit integer) to optimize the performance of those operations. It is only capable of doing so when you strip away most of the features that make JS a really nice language to write code in.
But at that point the VM-atop-asm.js-atop-the-JavaScript-engine would be the thing doing the GC against its own asm.js-provided heap. (Right?) Perhaps performance would be awful, though I guess I don't immediately see why: if asm.js can get reasonably close to native speed (within 2x, I think I've read?) and .NET/Java can get within a comparable distance of native, then the two together should be slower, but not unusable for many things, I'd think.
Upon seeing your edit: much of the speedup also comes from the lack of dynamism, from my understanding.
"VM-atop-asm.js-atop-the-JavaScript-engine would be the thing doing the GC"
"much of the speedup also comes from the lack of dynamicity, too."
In the language, JS gives you no control over its underlying GC. Every assumption that asm.js leverages depends on that determinism, and introducing a GC into the mix really messes with AOT optimizations.
But the JavaScript engine's GC isn't applicable here, is it? Because everything from the asm.js layer up is using a privately-managed heap implemented as a single (?) JavaScript typed array (ArrayBuffer) with no JavaScript-engine-level visibility of the distinct asm.js-layer values stored in it.
Edit: perhaps the thing I'm missing is that the discussion is about direct translation of CLR IL/JVM bytecodes into asm.js without any vestige of the CLR/JVM still running alongside. I was merely thinking of compiling the CLR or JVM to asm.js then hosting unmodified CLR IL/JVM bytecode atop that. Which seems feasible, though not as performant as a VM-less translation which I agree doesn't seem possible.