To elaborate, we use a tool called Crater to estimate the potential impact of breaking changes. It attempts to compile all of the packages on crates.io with a new version of the compiler to determine any regressions, effectively using Rust's public package repository as an extended test suite.
Here's Crater's regression report for the linked soundness fix: https://gist.github.com/nikomatsakis/2f851e2accfa7ba2830d#ro... . It detected four root regressions, which means that there were four packages that were relying on unsound behavior. This isn't necessarily the full extent of the regressions, however, because if we ship a compiler that breaks those packages, then any other packages that have the formerly-unsound packages as dependencies will obviously also break.
Having a concrete list of regressions also allows us to go through the ecosystem and submit PRs ourselves to bring the affected packages back to building, which is usually quite easy. Crater is a really, really great tool.
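The root-vs-downstream distinction above can be sketched in miniature. The following is a hypothetical simulation, not Crater's actual implementation: given build outcomes for each crate under a baseline and a candidate compiler, plus a made-up dependency graph, it classifies crates into root regressions (the crate itself stops compiling) and downstream breakage (the crate only breaks because a dependency did).

```python
# Hypothetical sketch of Crater-style classification; crate names and the
# dependency graph are invented for illustration.

# True = builds, False = fails, under each compiler version.
baseline = {"a": True, "b": True, "c": True, "d": True}
candidate = {"a": False, "b": True, "c": True, "d": True}
deps = {"b": ["a"], "c": ["b"], "d": []}  # b depends on a, c on b

# Root regressions: crates that built on the baseline but fail on the candidate.
roots = {c for c in baseline if baseline[c] and not candidate[c]}

def transitively_broken(crate, deps, roots, seen=None):
    """Return True if any (transitive) dependency of `crate` is a root regression."""
    seen = seen or set()
    for dep in deps.get(crate, []):
        if dep in roots:
            return True
        if dep not in seen and transitively_broken(dep, deps, roots, seen | {crate}):
            return True
    return False

# Downstream breakage: crates that still compile in isolation but depend,
# directly or transitively, on a root regression.
downstream = {c for c in candidate
              if c not in roots and transitively_broken(c, deps, roots)}

print(sorted(roots))       # → ['a']  (relied on the unsound behavior)
print(sorted(downstream))  # → ['b', 'c']  (broken only via dependencies)
```

This is why the four "root regressions" in the report understate the blast radius: anything in the transitive closure of reverse dependencies breaks too.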
> ...effectively using Rust's public package repository as an extended test suite.
This is one of the coolest and most practical things I've read in a while, seeing what happens to stable(ish) real wild code. So many practical applications and analytics are coming to mind in many areas.
I believe Perl and CPAN pioneered this approach. I'm glad to see from the parent and other comments that it seems to have caught on in other ecosystems as well.
Scala also has something similar, called the community build: https://github.com/scala/community-builds
It does not contain everything, but (open source) authors are encouraged to add their libraries to the mix.
This has been one of the many benefits of quicklisp (a repository for common lisp packages); there is a test-grid maintained by Xach and he reports any breakages when there is a release candidate of a new sbcl.
Does Crater ever find unstable tests that don't pass consistently on a good baseline? If so, do you just blacklist those packages and/or feed improvements back upstream?
NM: after reading more carefully, it's clear that at least in the case where Rust causes the "regression" you feed the improvements back upstream, so likely the same for unstable tests.
Well, the "tests" here aren't actually tests—they're whether the package compiles or not. If the compiler is being inconsistent, that's a pretty big bug ;)
Ah, I thought it was doing "cargo test" to see if the compiled code had a functional regression. But that would make regressions challenging to investigate (especially in the face of unstable tests).