
The app was a C# frontend (https://github.com/kg/HeapProfiler) that drove Windows OS services to take heap snapshots and capture stack traces, so I ended up writing a custom key/value store in C# to avoid cross-language interop, marshaling, etc. (the cost of sending blobs to SQLite and running queries was adding up). It's hard to beat the best-in-class optimized databases on their own turf, but if you can just grab a spot to dump your data into, you end up being a lot faster.

By the end it ran fast enough to saturate the kernel's paging infrastructure and make my mouse cursor stutter, and I was able to take 1-2 snapshots per second of a full running Firefox process with real webpages in it, so it was satisfactory. SQLite couldn't process the amount of data I was pumping in at that rate (but it still performed pretty well - maybe a few snapshots per minute).

At the time I did investigate other data stores and the only good candidates I ran across used incompatible open source licenses, so I was stuck doing it myself. Fun excuse to learn how to write and optimize btrees for throughput :-)



Yeah, most databases will probably have pathological behaviors against your requirements (especially on tail latency, which you would care about). Many people implementing similar tools put lightweight compression on top, just dump the snapshots to disk, and then run a post-processing step for queries. Dumping snapshots is also preferred because you can insert checksums and checkpoints for partial data recovery if there are failures.
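
For what it's worth, here is a rough sketch of that dump-then-post-process idea in Python. It is not how HeapProfiler does it. The record layout (length + CRC32 + zlib payload) and the names append_snapshot / read_snapshots are made up for illustration. The reader stops at the first truncated or corrupt record, so whatever was written intact before a failure is still recoverable.

```python
import io
import struct
import zlib

# Hypothetical record layout (an assumption, not from any real tool):
# 4-byte little-endian payload length, 4-byte CRC32 of the payload, payload.
HEADER = struct.Struct("<II")


def append_snapshot(out, raw: bytes) -> None:
    """Compress one snapshot cheaply and append it as a checksummed record."""
    payload = zlib.compress(raw, 1)  # level 1: favor speed over ratio
    out.write(HEADER.pack(len(payload), zlib.crc32(payload)))
    out.write(payload)


def read_snapshots(src):
    """Yield snapshots in order, stopping at the first damaged record."""
    while True:
        header = src.read(HEADER.size)
        if len(header) < HEADER.size:
            return  # clean end of file, or a header cut off mid-write
        length, crc = HEADER.unpack(header)
        payload = src.read(length)
        if len(payload) < length or zlib.crc32(payload) != crc:
            return  # truncated or corrupt record: keep only what came before
        yield zlib.decompress(payload)
```

Usage would be something like opening a file in "ab" mode, calling append_snapshot for each capture, then running read_snapshots over the file later to feed whatever query or indexing step you want. Writes stay sequential and cheap, and all the expensive work moves off the capture path.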



