- The compare script is fragile. It often refuses to compare two runs that used the exact same config and differ only in the code under test.
- It has no good mechanism for storing auxiliary information. We end up faking errors to carry that information, which looks ugly and makes it hard to distinguish a correct run from a bad one.