The Secrets of Building Realtime Big Data Systems (slideshare.net)
61 points by nathanmarz on March 25, 2011 | 6 comments


I love that the slide after "[Data systems must be] 1. Robust to machine failure" is an error: "Oops! This slide did not convert properly hence cannot be displayed." Oh, the irony.


How does Python hook into all this? Seems very J2EE. Jython or an API? Or maybe they just use it instead of sh to set up the environment?

BackType looks cool btw, got an API key a while back and started playing with it.


We use Python for all our data collecting, general scripting, and some of our stream processing (though stream processing is moving to a more generalized framework that is Java/Clojure based).

Also, our underlying API (that serves results to all our products/API/etc) is a collection of Python servers written with Twisted and Thrift.


Clever. Does the system also support non-aggregate queries? For example, what if you wanted to show a user individual pieces of data for their site?


Thanks for the great slideshow. I've found another slideshow that is really great: "The Secrets of Building Realtime Big Data Systems"


The stated 200-machine cluster for 300 reqs per sec seemed quite high to me... Anyone else with RT DM experience care to comment?



