
As best I can tell, the author understands that the async write-ahead fails to be a guarantee where the sync one does… then turns their async write into two async writes… but there’s still no guarantee comparable to the synchronous version.

So I fail to see how the two async writes are any guarantee at all. It sounds like they just happen to provide better consistency than the one async write because it forces an arbitrary amount of time to pass.



Yeah, I feel like I’m missing the point of this. The original purpose of the WAL was for recovery, so WAL entries are supposed to be flushed to disk.

Seems like OP’s async approach removes that, so there’s no durability guarantee, so why even maintain a WAL to begin with?
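For reference, the synchronous behavior being contrasted here can be sketched roughly like this. It is a minimal illustration, not the article's code; `append_durably` and the newline-delimited record format are made up for the example.

```python
import os

def append_durably(path: str, record: bytes) -> None:
    """Append one record and return only after fsync has been called on it."""
    with open(path, "ab") as log:
        log.write(record + b"\n")   # hypothetical newline-delimited format
        log.flush()                 # push Python's userspace buffer to the OS
        os.fsync(log.fileno())      # ask the OS to push its cache to the device
    # Only at this point would a synchronous WAL acknowledge the write.
```

An async variant that acknowledges before the `os.fsync` call returns can lose the record on a crash, which is the durability gap being asked about.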


Reading through the article it’s explained in the recovery process. He reads the intent log entries and the completion entries and only applies them if they both exist.

So there is no guarantee that an operation is committed, since the writes are asynchronous, but the recovery replay will be consistent.

I could see it being problematic for any data where the order of operations is important, but that’s the trade-off for performance. This does seem to be an improvement to ensure asynchronous IO will always result in a consistent recovery.
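The recovery rule described at the top of this comment (apply an operation only if both its intent entry and its completion entry exist) could look something like the sketch below. The in-memory shapes are assumptions for illustration: `intent_log` is a list of `(txn_id, op)` pairs and `completion_log` is a list of transaction ids; the article's actual on-disk format may differ.

```python
def recover(intent_log, completion_log):
    """Return the operations to replay: those present in BOTH logs.

    intent_log:     list of (txn_id, op) pairs, in the order they were read
    completion_log: list of txn_ids
    """
    completed = set(completion_log)
    seen = set()
    replay = []
    for txn_id, op in intent_log:
        # Skip intents with no completion, and duplicate intent entries.
        if txn_id in completed and txn_id not in seen:
            replay.append((txn_id, op))
            seen.add(txn_id)
    return replay


intents = [(1, "set a=1"), (2, "set b=2")]
completions = [1, 3]
print(recover(intents, completions))  # [(1, 'set a=1')]
```

Transaction 2 (intent without completion) and transaction 3 (completion without intent) are both dropped, so whatever is replayed is consistent, but nothing here says the dropped operations were never acknowledged to a client.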


There's not even a guarantee that the intent log flushes to disk before the completion log. You can get entries in the completion log whose intent entries were lost. So, no, there's no guarantee of consistent recovery.

You'd be better off with a single log.
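To make the single-log idea concrete: one common way to let a single log validate itself is to frame each record with its length and a checksum, and have recovery stop at the first record that fails the check. This is only a sketch under that assumption; the framing, `encode_record`, and `read_valid_prefix` are hypothetical and not taken from the article.

```python
import struct
import zlib

# Hypothetical framing: 4-byte payload length + 4-byte CRC32, little-endian.
HEADER = struct.Struct("<II")

def encode_record(payload: bytes) -> bytes:
    """Frame a payload with its length and CRC32."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def read_valid_prefix(data: bytes) -> list:
    """Return payloads up to (not including) the first torn or corrupt record."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(data):
        length, crc = HEADER.unpack_from(data, offset)
        start = offset + HEADER.size
        payload = data[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn tail or corruption: stop replay here
        records.append(payload)
        offset = start + length
    return records
```

With one log, a torn write can only cut off the tail, so there is no second file that might disagree with the first.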


I think he says he checks for both.

It's interesting as a weaker safety guarantee. He is guaranteeing write integrity, so you get a valid WAL view on restart by throwing out mismatched writes. But from the outside, it looks like premature signaling of completion, which would mean data loss, since a client may have moved on without retrying because it thought the data was safely saved. (I was a bit confused about what completion means here, so I'm not confident.)

We hit some similar scenarios in Graphistry, where we treat the receiving server's disk/RAM during browser uploads as write-through caches in front of our cloud storage persistence tiers. The choice of when to signal success to the uploader (after disk/RAM vs. after cloud storage) is funny, and the timing difference is fairly observable to the web user.


The first part is correct, which is why during recovery a transaction needs to exist in both places to be applied; otherwise it is discarded (from either log). If it works as stated on paper, then it would give the C for consistency in recovery, but of course it fails at durability.


There's no guarantee of ordering of writes within the two logs either.

This seems nightmarish to recover from.



