For those interested in the subject: Cliff Click of Azul Systems had a presentation on his design for a hash table meant for heavy multithreading. One part I remember was that he stored the hashes of keys alongside the keys, allowing for faster resolution of the "not match" case (you compare the hashes before doing the full key comparison).
But this trashes the cache line by half. Only half buckets are now in the cache, a miss costs 50 insn. It's a typical hashtable trade-off.
It pays off with bigger tables, as you have much more misses there.
I have no idea what you think we’re disagreeing about. I’ve watched Cliff’s presentation a handful of times over the past decade and implemented something similar (albeit simpler) myself. Yes, the particular FSM in his hash table is interesting and novel, but just as insightful to me was his argument that data structures implemented as state machines built on atomic CAS can save you much of the trouble of proving you’ve put memory barriers in the right places. “All” you have to do is demonstrate that every possible state is valid.
https://web.stanford.edu/class/ee380/Abstracts/070221_LockFr...
While searching for the above, I also found this: http://preshing.com/20130605/the-worlds-simplest-lock-free-h...