The problem with all of these examples is that it’s using a model of Snapshot Isolation which guarantees writes don’t conflict, but doesn’t provide a similar guarantee for reads. What you really want is the pair of constraints:
- Anything I write hasn’t been concurrently written to by another transaction
- (added): Anything I read also hasn’t been concurrently written to by another transaction
Adding “conflicting reads” fixes every single example in this post. The linked list example can be fixed in any number of ways. The most efficient might be to “spread out” the set of conflicting keys to also include the next / prev pointers of the deleted items b and c. This makes the transactions conflict. Or maybe better - include all the next pointers of all items before the modified item. This adds an implicit guarantee that the visited / modified item must be in the list for the transaction to be successfully committed.
Now, adding reads to the conflict set might change SI into a different concurrency model. The blog post does use the same definition as Wikipedia. And that definition only mentions writes. But I think any SI system can be made to work this way by setting X = X for any value read during each transaction.
The “two threads” example becomes this:
beginTxn();
if (x == 0 && y == 0) x = 1;
x = x; y = y; // implicit
endTxn();
beginTxn();
if (x == 0 && y == 0) y = 1;
x = x; y = y;
endTxn();
And the two transactions will function correctly.
Having the database handle this for you is best because:
- Humans will forget
- If two transactions read some value but neither transaction writes to it, the two transactions don’t actually need to conflict. This is too strict.
Foundationdb in my opinion has the best API for this I’ve seen. Reads and writes within a transaction are tracked like this automatically, and you can manually add, remove or change the set of conflicting keys using an explicit api if you want to. But I’m sure it’s not alone.
- Anything I write hasn’t been concurrently written to by another transaction
- (added): Anything I read also hasn’t been concurrently written to by another transaction
Adding “conflicting reads” fixes every single example in this post. The linked list example can be fixed in any number of ways. The most efficient might be to “spread out” the set of conflicting keys to also include the next / prev pointers of the deleted items b and c. This makes the transactions conflict. Or maybe better - include all the next pointers of all items before the modified item. This adds an implicit guarantee that the visited / modified item must be in the list for the transaction to be successfully committed.
Now, adding reads to the conflict set might change SI into a different concurrency model. The blog post does use the same definition as Wikipedia. And that definition only mentions writes. But I think any SI system can be made to work this way by setting X = X for any value read during each transaction.
The “two threads” example becomes this:
And the two transactions will function correctly.Having the database handle this for you is best because:
- Humans will forget
- If two transactions read some value but neither transaction writes to it, the two transactions don’t actually need to conflict. This is too strict.
Foundationdb in my opinion has the best API for this I’ve seen. Reads and writes within a transaction are tracked like this automatically, and you can manually add, remove or change the set of conflicting keys using an explicit api if you want to. But I’m sure it’s not alone.