In practice we use an image size of 17x16, which results in a hash size of 256 bits (16 comparisons per row across 16 rows), and so far it seems to work pretty well. I ran the algorithm over the whole dataset (more than 330,000 icons), and I would say that about 1% of all the duplicate matches were false positives.
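To make the arithmetic concrete, here is a minimal sketch of how a 17x16 grid turns into 256 bits. It assumes the hash compares each pixel with its right-hand neighbour, and that the image has already been converted to grayscale and resized elsewhere (for example with an imaging library). The names `dhash_bits`, `hash_length` and `hamming_distance` are made up for this illustration and are not our production code.

```python
def dhash_bits(pixels):
    """Build a difference hash from a grid of grayscale values.

    `pixels` is a list of rows, each row a list of brightness values.
    Assumption: a bit is 1 when a pixel is brighter than its right-hand
    neighbour, so every row yields len(row) - 1 bits.
    """
    value = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            value = (value << 1) | (1 if left > right else 0)
    return value


def hash_length(width, height):
    """Number of bits produced for a width x height grid."""
    return (width - 1) * height


def hamming_distance(a, b):
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


# A 17x16 grid gives 16 comparisons per row over 16 rows.
print(hash_length(17, 16))  # 256

# Hypothetical grid whose rows get darker from left to right:
# every comparison is true, so all 256 bits are set.
grid = [list(range(17, 0, -1)) for _ in range(16)]
assert dhash_bits(grid) == 2**256 - 1
```

Two icons with identical hashes have a Hamming distance of 0. Whether near matches with a small non-zero distance should also count as duplicates is a tuning decision this sketch leaves open.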
Also, we will be integrating this into the review process for an iconset, where we already do a manual quality check. Possible matches will be shown alongside whatever is currently being uploaded, so skimming over one or two false positives isn't a big deal. We were more interested in the speed of the algorithm.