You point out a good gotcha that most people don't notice: many host-specific errors return faster than a normal response, so a performance-based heuristic will route more traffic to the failing hosts. A well-configured LB would take hosts out of rotation when their responses match known fault conditions, but at scale that's hard for companies to validate.
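To make the gotcha concrete, here is a minimal sketch, assuming the balancer weights hosts by inverse mean latency. That is one simple performance-based heuristic, not any particular product's algorithm. The `Host` type, the host names, and the latency figures are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    latency_ms: float  # observed mean response time (hypothetical figure)
    error_rate: float  # fraction of responses that are errors


def latency_weights(hosts):
    """Weight each host by inverse latency, normalized to sum to 1."""
    inverse = {h.name: 1.0 / h.latency_ms for h in hosts}
    total = sum(inverse.values())
    return {name: w / total for name, w in inverse.items()}


hosts = [
    Host("a", latency_ms=100.0, error_rate=0.0),
    Host("b", latency_ms=100.0, error_rate=0.0),
    # Fails every request, but fails fast, so it looks like the best host.
    Host("broken", latency_ms=5.0, error_rate=1.0),
]

weights = latency_weights(hosts)
# Share of all requests that end in an error under this weighting.
failed_share = sum(weights[h.name] * h.error_rate for h in hosts)

print(f"traffic share to broken host: {weights['broken']:.1%}")
print(f"overall failed requests: {failed_share:.1%}")
```

With these numbers the broken host ends up with about 91% of the traffic, because latency alone says nothing about whether the response was a success.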


At scale you also have systems engineers who work at that scale, so it's not, like, impossibly hard. A sidecar like Envoy can be configured to emit health stats, which the load balancer can then read to decide that a given server is unhealthy. Each team is already responsible for a dashboard with health metrics for its own service, so the load balancer team doesn't have to work out everybody's health metrics, only its own.
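A rough sketch of that idea, assuming each host's sidecar reports an error rate the balancer can read. The `route_weights` function, the `max_error_rate` threshold, and the shape of the inputs are hypothetical; this is not Envoy's actual stats format or API.

```python
def route_weights(latencies_ms, error_rates, max_error_rate=0.5):
    """Eject hosts that report too many errors, then weight the rest
    by inverse latency.

    latencies_ms: host name -> observed mean latency in ms
    error_rates:  host name -> error rate reported by that host's sidecar
    max_error_rate: hypothetical ejection threshold
    """
    healthy = {
        host: latency
        for host, latency in latencies_ms.items()
        # Assumption: a host with no report is treated as healthy.
        if error_rates.get(host, 0.0) <= max_error_rate
    }
    if not healthy:
        # Fail open: if every host looks unhealthy, keep routing to all
        # of them rather than dropping all traffic.
        healthy = dict(latencies_ms)

    inverse = {host: 1.0 / latency for host, latency in healthy.items()}
    total = sum(inverse.values())
    return {host: w / total for host, w in inverse.items()}


example = route_weights(
    latencies_ms={"a": 100.0, "b": 100.0, "broken": 5.0},
    error_rates={"a": 0.0, "b": 0.01, "broken": 1.0},
)
print(example)
```

The point of the design is that the health signal comes from each service, so the balancer only applies a threshold and never has to understand what "healthy" means for a given team. Failing open when every host is ejected is a judgment call, and some setups would rather shed load instead.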



