
Check this out: http://blinkdb.org/

BlinkDB is a research collaboration between UC Berkeley and MIT on approximate (probabilistic) query answering. It lets users specify a trade-off between error/confidence bounds and response time.
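
Roughly, the idea is sampling: answer an aggregate query from a random sample of the data and report an error bound alongside it. A minimal sketch (not BlinkDB's actual API; the 95% normal-approximation bound and the data are made up for illustration):

    # Estimate AVG(x) from a random sample and report a ~95% confidence
    # bound. A smaller sample fraction means less I/O (faster) but a
    # wider error bound -- that's the error-vs-time trade-off.
    import math
    import random

    def approx_avg(data, fraction):
        sample = random.sample(data, max(2, int(len(data) * fraction)))
        n = len(sample)
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        ci = 1.96 * math.sqrt(var / n)  # ~95% normal-approximation bound
        return mean, ci

    data = [random.gauss(100, 15) for _ in range(1_000_000)]
    for f in (0.0001, 0.01):
        mean, ci = approx_avg(data, f)
        print(f"fraction={f}: {mean:.2f} +/- {ci:.2f}")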



"BlinkDB can execute a range of queries over a real-world query trace on up to 17 TB of data and 100 nodes in 2 seconds, with an error of 2–10%"

17 TB on 100 nodes? That's a lot of nodes to hold 17 TB; on average only ~170 GB of data each.

The speed is super impressive, but using 100 nodes makes this look more like a parallel processing achievement than "big data".


Even with parallel processing, assuming you can scan 1 GB of data per node per second (a fairly optimistic estimate for on-disk data):

1 GB/node/sec * 100 nodes * 2 sec = 200 GB.

17 TB is 85 times that number, so the system can't be scanning everything; it can only be touching roughly 1/85th of the data, which is exactly what sampling buys you.
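
The same back-of-the-envelope in Python (the per-node throughput is the assumption above):

    # How much data can 100 nodes scan in 2 s at ~1 GB/s per node,
    # and what fraction of 17 TB is that?
    per_node_gb_per_sec = 1  # assumed on-disk scan rate per node
    nodes = 100
    seconds = 2

    scannable_gb = per_node_gb_per_sec * nodes * seconds  # 200 GB
    total_gb = 17 * 1000                                  # 17 TB

    print(scannable_gb)                 # 200
    print(total_gb / scannable_gb)      # 85.0 -> ~1.2% sample fraction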



