
Rust's slow compiles are such a turn off for me. Like why does it take tens of seconds to recompile when I am just changing a single number in a file? Does it really need to waste so much of my time to change a single byte in the output binary?


I once had a similar project to OP that took 30 seconds to compile a one-line change (albeit on a 10-year-old CPU). I split it into about 7 crates, and got compiles down to 3 seconds. Since then I'm always vigilant about keeping crates nice and small. I think as long as you're keeping an eye on your crate sizes, compile times won't get away from you.

This is different from C++ where each individual source file can be compiled in parallel, so it's something I had to re-learn.
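For anyone who hasn't done this split before, a minimal sketch of what it looks like (crate names hypothetical): each workspace member is its own compilation unit, so cargo can build them in parallel and only rebuild the ones whose source actually changed.

```toml
# Top-level Cargo.toml of a hypothetical workspace.
# Each member is a separate crate with its own Cargo.toml;
# cargo builds independent members in parallel.
[workspace]
members = ["core", "parser", "render", "cli"]
```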


Can this be automated? Could a script divide code into small crates upon compilation?


I did it by hand. It's a mostly mechanical transformation, so in theory it could be automated, but I've never seen an automated refactor for any language that can e.g. take a giant file and decide how to split it up into 5 smaller files. But that's something I'd love to see exist


That's a pretty bad incentive though and you end up with very large dependency trees similar to JS. And we know how that went.

OTOH there might be a benefit too as more code becomes re-usable in independent crates.


> That's a pretty bad incentive though and you end up with very large dependency trees similar to JS. And we know how that went.

That's... not the same thing at all.

Rust lets you have multiple crates in a single workspace, which allows parallelising the build (because crates are the concurrency unit of building Rust).

That's got nothing whatsoever to do with pulling random crates, which is a separate issue.


Why doesn't Rust default to one crate per source file then? Wouldn't that make the build massively parallelized :-)


Because you can have circular dependencies within a compilation unit, but not between units.
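A minimal sketch of what the parent means: the two modules below call into each other, which is fine inside one crate, but the same cycle split across two crates would not build, because crate dependency graphs must be acyclic.

```rust
// Mutual recursion across modules *within* one crate compiles fine.
// Splitting `even` and `odd` into two separate crates that depend on
// each other would be rejected by cargo.
mod even {
    pub fn is_even(n: u32) -> bool {
        if n == 0 { true } else { crate::odd::is_odd(n - 1) }
    }
}

mod odd {
    pub fn is_odd(n: u32) -> bool {
        if n == 0 { false } else { crate::even::is_even(n - 1) }
    }
}

fn main() {
    assert!(even::is_even(10));
    assert!(odd::is_odd(7));
}
```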


That actually sounds more appealing to me. In fact it's exactly how OCaml works. Of course in OCaml compilation units are just files already.


I don't think there is a moral justification for circular dependency scopes to always be individual files, especially given that import statements count as circular. The same issue in other languages where circular scopes are individual files leads to some horrible workarounds:

* Python TYPE_CHECKING https://adamj.eu/tech/2021/05/13/python-type-hints-how-to-fi...

* C/C++ header files

But regardless, there is too much Rust code in existence to change this now.


A 'moral' justification is not needed. We are talking about potentially massive build speed improvement here. Sounds like it could just be a build setting.


It would definitely be interesting to run some experiments with automatically splitting up Rust crates into compilation units if possible.


Username checks out :-)


> you end up with very large dependency trees similar to JS. And we know how that went.

The problem in the JS ecosystem is not the number of dependencies, but the number of entities you have to trust. There is no real difference between having one big crate in a repo managed by one entity or fifteen small crates in a repo managed by one entity.


Sounds like the community still has to settle on, and write up, best practices for making the compiler go fast?


What was your heuristic for deciding if a crate is too big or not?


Going by feel, same as when I have a giant function I decide to split up. IIRC the crates ended up between 1k-10k LOC


In OP the biggest crate after the split-up is 1,513 LOC, with most of the others at a few hundred LOC.


You are right. There is no technical reason it should be as slow as it is; it's just that not enough resources have been spent on making it fast. I mean, if this were a higher priority, Rust issue #26600 would have been fixed years ago.

https://github.com/rust-lang/rust/issues/26600


> Currently, GlobalISel is within 1.5 the speed of FastISel according to https://llvm.org/docs/GlobalISel/index.html and they have some ambitions for getting it within 1.1 or 1.2 in time, so it seems likely that GlobalISel will close this issue before FastISel grows the relevant support.

Heh


Looks like it's blocked by lack of required LLVM support.


Again, if this were a priority, it would have happened. But compile time is a low priority. Its priority is lower than that of a fairly niche feature like cross-language LTO, which shipped in 2019.

https://blog.llvm.org/2019/09/closing-gap-cross-language-lto...


I think speed is separate from caching, albeit overlapping. If the compiler can see exactly what changes your code will have in the final binary, then it can do very very little work. However, it would also suffice for the compiler to just be so fast that it can do a complete compile from zero to finished in some acceptable time (tcc and I think Go favor this). Those are different goals, or at least different ways of achieving the desired outcome.


> Like why does it take tens of seconds to recompile when I am just changing a single number in a file?

Impossible to tell without knowing more. Steps to reproduce the issue would help. Just consider how TFA went through many different things to investigate.

> Does it really need to waste so much of my time to change a single byte in the output binary?

Does it actually only change a single byte in the binary? Changing a single value can cascade much further than that.


>Impossible to tell without knowing more.

It was a rhetorical question based on my experience with Rust. I was running into this issue while building a site using warp along with some other dependencies. Those projects have since been deleted from my system; I am not interested in chasing down performance issues with the compiler. Remaking the site in C++, I didn't run into slow compile times. I just want to be able to quickly iterate on the stuff I work on.

>Does it actually only change a single byte in the binary?

It would be possible. I could do it myself using Binary Ninja / Ghidra. I just want to be able to iterate quickly and try out multiple different values. I understand it's not that simple a problem, but I just want it to work, or else I'll gravitate toward making projects in some other language.


I feel the same pain, especially since when using C++ I tend to just use binary libraries for all the code I don't own, so even cold builds are quite fast, even in C++.

Just having cargo support binary libraries would be an improvement for cold builds, but I understand it isn't a priority.

The Rust/WinRT folks have a big binary crate that they use to workaround Rust's compilation times, https://github.com/microsoft/windows-rs/tree/master/.windows


> I just want it to work else I'll gravitate to making projects in some other language

Sounds like Go or C++, or maybe even Zig, are good choices for you then


Mhh, perhaps there is a place for detecting something like an "incremental change that ONLY changed static values".

It is indeed very frustrating trying to play with these kind of "settings" values.


Are you using incremental compilation? I've personally never had problems with incremental compiles. Sure, production builds are slow, but that's rarely a problem.
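For reference, incremental compilation is on by default for dev builds, and a Cargo.toml profile override can enable it for other profiles too (a sketch; the runtime cost of incremental codegen may or may not be acceptable for your release builds):

```toml
# Cargo.toml profile override: opt release builds into incremental
# compilation, trading some runtime performance for faster rebuilds.
[profile.release]
incremental = true
```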


Contrary to most, I actually like the long compile times.

I also like programming in Rust without a linter.

It really makes me think about what it is I'm doing. I do not have the luxury of typing something down, compiling it, and running it after every change to see: "does it work now?"

To be fair, that's only in my personal projects. I understand this is a major issue in the "ship it quickly -- everything else be damned" environment of commercial programming, where one must rely on all these aides to get something reasonably done in a reasonable amount of time.

It was a major headache when I first learned the language, because the syntax is so rigid and exacting; but it did force me to really understand the language, instead of just being able to throw shit at the wall and consult technical docs every time I wanted to do something.


No offense, but I want to hate this opinion, yet I kind of agree with it on some level. There was a time where I could write several pages of compilable code on paper, using the standard library and 3rd party dependencies, without looking at docs or anything. I could do that after a relatively short time after using a language.

Now, I can still do the same, but that's after spending years using a particular language or ecosystem.

I wouldn't want to go back to writing code on paper without docs, but it definitely made you think differently when it came to both writing code and learning. Kind of like how when you had a question about something before the internet, you had to stew with it until you could answer it yourself, or until you met someone who could answer your question, or you did your own research. Now, I can take out my phone and answer most questions within 20 seconds.


I believe incremental mode is still off by default.


Not for debug builds, according to the article.


Use bazel?

Individual files won't be faster but it's been a wonder for whole project compilations


This could be interesting to look into. It looks like there already exists a project that can generate BUILD files for external crates.

My two main gripes with Bazel were that it was a pain having to rewrite the build system for all of my dependencies, and that its claim of being reproducible is weak, in the sense that there are not even warnings when you use resources from the base system (e.g. compilers can include files from the base system which may not be the same, or exist at all, on another machine). This was a problem because I wanted to use a header from the dependency I built with Bazel, not something already on my system.


Are you enabling sandboxing? Use `--spawn_strategy=sandboxed`, or better yet `--spawn_strategy=worker,sandboxed --worker_sandboxing`. That should disallow using files from the base system.

This does disable multiplex workers, which can make it more memory intensive. Working on that.
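If it helps, the flags above can go in a `.bazelrc` so they apply to every build (a sketch, assuming a standard Bazel setup):

```
# .bazelrc: always sandbox actions, including persistent workers
build --spawn_strategy=worker,sandboxed
build --worker_sandboxing
```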


Yes, I was using their sandbox. They intentionally make their sandbox weak so you can use things like gcc from the system without having to bootstrap them.

I don't know exactly what I tried since this was maybe a year and a half ago. I tried asking on their slack, but I think I was told that it was not possible. I don't have the project around anymore to try out your suggestion.


This is controlled by their default toolchain that includes /usr/include and such. You can define your own toolchain with different include directories.


Yep, agree on all fronts, the last point being especially painful when working on software you distribute to end users whose systems aren't in your control.


Does anyone have an example of a Rust Bazel build project?

Or even better, a massive Rust monorepo orchestrated with Bazel build?


Not what you asked for, but maybe still interesting to look into.

Android uses Soong for Rust,

https://source.android.com/setup/build/rust/building-rust-mo...

While Fuchsia uses GN with Rust,

https://fuchsia.dev/fuchsia-src/development/build/concepts/b...

https://fuchsia.dev/fuchsia-src/development/languages/rust


Fantastic resources, thanks!



