It’s possible to build this around protobuf. Google has a rich internal protobuf ecosystem that does this and supports querying large amounts of protobuf data without specifying schemas. Only parts of it have been open sourced. Have a look at Riegeli, a file format for storing sequences of records (typically serialized protobufs), if you are interested.
https://github.com/google/riegeli
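
To give a rough idea of why querying without a schema is feasible at all: the protobuf wire format tags every field with its field number and wire type, so you can walk a serialized message without the .proto file. The sketch below is only an illustration of that property, not how Google's internal tools or Riegeli work. The names `read_varint` and `walk_fields` are made up for this example, and it only handles top-level fields (no groups, no recursion into nested messages).

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this is the last byte
            return result, pos
        shift += 7


def walk_fields(buf):
    """Yield (field_number, wire_type, raw_value) for each top-level field."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # 64-bit fixed, left as raw bytes
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes, or nested message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # 32-bit fixed, left as raw bytes
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


# Field 1 = varint 150, field 2 = the bytes "testing".
sample = bytes.fromhex("089601120774657374696e67")
fields = list(walk_fields(sample))
print(fields)
```

Without a schema you only get field numbers and raw values, not names or exact types (a length-delimited field could be a string, bytes, or a nested message), which is why real tooling usually stores descriptors alongside the data.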