deanward81's comments | Hacker News

Stack Overflow runs on 9 web servers, each with (iirc) 48 logical cores (2 x 12-core Xeons) and 64GB RAM. Those servers are shared by a few apps (Talent/Job, Ads, Chat, Stack Exchange/Overflow itself) but the main app uses, on average, ~5% CPU. Those machines handle roughly 5000 requests/sec and were running .NET 5 as of September 2021 (when I moved on). That’s backed by 2 very large SQL clusters (each consisting of a primary read/write, a secondary read-only in the primary DC and a secondary in the failover DC). Most traffic to a question page hits SQL directly - the cache hit ratio tends to be low, so caching those hits in Redis tends not to be useful. As somebody mentioned below, being just a single network hop away yields really low latency (~0.016ms in this case) - that certainly helps with scaling on little hardware. Typically only 10-20 concurrent requests would be running on any instance at any one time because the overall request end-to-end would take < 10ms to run.
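The concurrency figure follows from Little's law (requests in flight = arrival rate x time in system). A back-of-the-envelope sketch - the numbers come from the figures above, but the arithmetic is mine, not Stack Overflow's:

```python
def in_flight(req_per_sec: float, latency_sec: float) -> float:
    """Little's law: mean requests in flight = arrival rate * time in system."""
    return req_per_sec * latency_sec

# ~5000 req/s spread across 9 web servers, each request finishing in < 10ms:
per_server = in_flight(5000 / 9, 0.010)
print(round(per_server, 1))  # ~5.6 requests in flight per server at the 10ms bound
```

Cheap requests are what keep the concurrency (and therefore thread pressure and memory) so low on each box.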

Back in full framework days we had to do a fair bit of optimisation to get great performance out of .NET, but as of .NET Core 3.1 the framework _just gets out of the way_ - most memory dumps and profiling since then clearly pinpoint problem areas in your own app rather than being muddied by framework shenanigans.

Source: I used to work on the Platform Engineering team at Stack Overflow :)


That's some great info, thank you!


> cache hit ratio tends to be low

That's surprising to read. Is that because of the sheer volume of question pages? I don't think I've ever been on an SO page that couldn't have been served straight from cache.


Is it? Most people come to SO from Googling their random tech problems/questions. Not sure how much value there is in caching my random Rails questions, etc


I would expect SO usage to follow a distribution like Zipf's — most visits hit a small subset of common Q/A, and there's a ridiculously long tail of random questions getting a few visits each, where caching would do next to nothing. I'm fairly positive I've seen a post showing this was true for at least answer-point distributions.

Though I guess it's possible for a power-law distribution of page popularity to still be useless for caching: I think you could get that distribution even if 99% of hits are on nearly-unique pages. With a long enough tail you'd still have only relatively few pages worth bothering to cache, but by far most visits would be in the tail.
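A toy simulation makes the tension concrete (parameters invented for illustration, Python just as a sketch): with a steep popularity curve, caching a handful of top pages catches most traffic; flatten the exponent and the same cache catches far less.

```python
import random

def simulate_hit_ratio(n_pages: int, n_requests: int, exponent: float,
                       cache_size: int, seed: int = 0) -> float:
    """Draw visits from a Zipf-like popularity distribution and report the
    fraction that land on the `cache_size` most popular pages."""
    rng = random.Random(seed)
    # Pages are indexed by popularity rank; weight ~ 1 / rank^exponent.
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_pages + 1)]
    visits = rng.choices(range(n_pages), weights=weights, k=n_requests)
    return sum(1 for page in visits if page < cache_size) / n_requests

# Steep curve: caching the top 1% of pages catches most visits.
# Flat curve (long tail dominates): the same cache catches far fewer.
steep = simulate_hit_ratio(10_000, 5_000, exponent=1.5, cache_size=100)
flat = simulate_hit_ratio(10_000, 5_000, exponent=0.7, cache_size=100)
```

Whether SO's real traffic looks like the steep or the flat case is exactly the question the low observed hit ratio answers.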


^^ this! No need to install anything on my Apple devices. Long term plan is to have an instance of AirDrop Anywhere running perpetually on a Raspberry Pi and have a web interface accessible over the LAN for non-Apple devices to connect to when sharing files.

That said, this is all just a bit of fun so monetising it isn’t really on the cards anyway :)


It's pretty vulgar that Apple people continue to feel like, well, I have a protocol that works on my devices, so what if the rest of the world can't participate? "Works for me!" has its merits, oh sure, but some willingness to see the faults & deficiencies would be a nice balance.


Is it any different from Linux users pushing software that depends on Linux syscalls to function and scoffing at Microsoft's attempts to achieve platform parity with WSL? I can't tell you how many times I've heard someone in computing say "just use linux!" instead of attempting to solve whatever problem a user is having with their current hardware and OS. We all want to believe that our personal choices are the best ones anyone can make for themselves.


Very little in open source hard-relies on Linux syscalls. Trying to make this a "both sides" argument is farcical.

Saying that a platform shouldn't lock you in to specific proprietary approaches has a real & substantial non-personal advantage to it that mere personal preference doesn't encompass, as it seeks technical ecosystems capable of organic growth. I don't find your comparison quite valid.


This is not correct, you can receive files from non-Apple devices without any Apple keys. I’m still working on sending so haven’t uncovered the demons lurking there…


Well, that’s why I said it’s unlikely :)! Without purchasing additional hardware it’s not practical to be able to run AirDrop directly on non-Apple devices.

There’s no need to extract any Apple keys to be able to receive files, but the public root key appears to be needed to be able to send files to an Apple device.

OWL, OpenDrop and their latest project PrivateDrop (https://github.com/seemoo-lab/privatedrop) are linked heavily throughout the series - their reverse engineering of the protocol has been absolutely invaluable in building something that works, more or less sanely, on non-Apple devices! Huge kudos to them!


Ah, sure. I didn't mean it as criticism, just to say it's possible with the right wifi chipset.


Ouch, good catch, fixing now


It's in the remediations section, but maybe the wording isn't clear:

*> Hardening code paths that allow access into our dev tier. We cannot take our dev tier off of the internet because we have to be able to test integrations with third-party systems that send inbound webhooks, etc. Instead, we made sure that access can only be gained with access keys obtained by employees and that features such as impersonation only allow de-escalation—i.e. it only allows lower or equal privilege users to the currently authenticated user. We also removed functionality that allowed viewing emails, in particular account recovery emails.*

There was no "unauthenticated" access into dev - the access key here is what allows login at all to our dev environment, but the attacker was able to bypass that protection.
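The de-escalation rule boils down to a one-line invariant. A hypothetical sketch (names and privilege levels invented for illustration; not Stack Overflow's actual code):

```python
def can_impersonate(actor_level: int, target_level: int) -> bool:
    """Impersonation may only de-escalate: the impersonated account must
    have lower or equal privilege to the currently authenticated user."""
    return target_level <= actor_level

# A moderator (level 10) may impersonate a regular user (level 3),
# but a regular user may never impersonate a moderator.
```

The point of the invariant is that a compromised low-privilege dev account can't be used as a stepping stone to a higher-privilege one.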


Thanks, yeah I missed that on account of misunderstanding the nature of the access (bug vs token shenanigans)


We’re spread over 9 servers but only due to ephemeral port and handle exhaustion issues. Each server is fronted by 4 HAProxy frontends that each handle ~18k connections.

Since Nick’s post we’ve moved from StackExchange.NetGain to the managed websocket implementation in .NET Core 3 using Kestrel and libuv. That sits at around 2.3GB RAM and 0.4% CPU. Memory could be better (it used to be < 1GB with NetGain) but improvements would likely come from tweaks to how we configure GC under .NET Core which we haven’t really investigated yet.
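Those GC tweaks would live in runtimeconfig.json - the relevant knobs look something like this (values purely illustrative, not what Stack Overflow runs):

```json
{
  "runtimeOptions": {
    "configProperties": {
      "System.GC.Server": true,
      "System.GC.Concurrent": true,
      "System.GC.RetainVM": false
    }
  }
}
```

Server GC trades memory for throughput, so a long-lived socket server is exactly the kind of workload where these settings are worth measuring rather than guessing.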

We could run on far fewer machines but we have 9 sitting there serving traffic for the sites themselves so no harm in spreading the load a little!


Ephemeral port exhaustion is easy to handle if you control HAProxy and the origins.

You'll need the source [1] option on your server lines, and you also need to make room for more connections; any of these will do: have the origin server listen on more ports, add more IPs to the origin server and listen on those too, or add more IPs to the proxy and connect from those as well.
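Concretely, a backend along these lines (addresses and ports invented for illustration) - each distinct (source IP, destination IP, destination port) tuple gets its own ~64k ephemeral port range:

```
backend origins
    # Two origin ports and two proxy-side source IPs = 4x the
    # ephemeral port space toward the same origin host.
    server app1a 10.0.0.10:8080 source 10.0.0.2
    server app1b 10.0.0.10:8081 source 10.0.0.2
    server app2a 10.0.0.10:8080 source 10.0.0.3
    server app2b 10.0.0.10:8081 source 10.0.0.3
```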

I'm not sure about handle exhaustion - I've run into file descriptor limits, but those are usually simple to raise (until you run into a limit enforced in code).

[1] https://cbonte.github.io/haproxy-dconv/1.7/configuration.htm...


I might be missing some details there to be honest; I'm more familiar with the application side of things than the infrastructure :)


We use websockets for pushing things like comments, answers and question edits as they are added as well as for notifications like inbox messages and reputation updates.

Our websocket process currently has ~500k concurrent connections across 9 web servers.
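Combined with the frontend numbers mentioned elsewhere in this thread (4 HAProxy frontends per server at ~18k connections each), the arithmetic checks out with headroom - a rough sanity check, not an official capacity figure:

```python
servers = 9
frontends_per_server = 4
conns_per_frontend = 18_000

capacity = servers * frontends_per_server * conns_per_frontend
print(capacity)  # 648000 total frontend capacity vs ~500k live connections
```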

Nick has some information on his blog (https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...) although it’s a little outdated - we no longer use Netgain, we use the .NET Core websockets bits instead.


Thanks!



Note: The color-coding is likely to change in future Chrome releases. <-- Well, someone eventually found out that some people cannot distinguish red and green, but it was too late :-D

I am a little surprised to find such an issue in Google software as it is a topic for first semester CS undergraduates ;-)


Wait what? CS students learn about color blindness now? I really wasn't in the good classes!


I didn't learn anything graphics-related in my CS degree. Seems like more of a UX or software development topic than a CS topic.


I learned it in both my computer graphics course and my user-interaction design course as part of my CS degree (German university). None of these courses are required, but they are popular enough that the most important lessons were pretty universal knowledge among students.


I taught a ton of a11y stuff in my intro to web classes. Even did screen reader demos. I wasn't the only one. There are lots of folks teaching this stuff. Not a critical mass, but I'm so thankful it's happening. :)


I think it's a joke.


I learned that in my 1st year HCI module at University nearly 20 years ago. It shouldn't be a joke. It should be normal.


I actually learned that as a CS student.


If it was actually part of the curriculum, I sure hope it was in some design-centered elective class. IMO if your CS degree spent time teaching that as part of the core curriculum, they missed an opportunity to put more Math, PL theory, and interesting algorithms in there, because there's more than can sanely be covered in any one curriculum.

Since it's accessibility based, it's more laudable than teaching CS students how to center a div, but it's not like it really requires a mentor of some sort to express the nuances of, right? Or even if it does, it's still design.


It's been some years now, and there were at least two courses it could have been in. One was elective: HCI. The other was not - it was an introduction to graphics and audio, and since you need to understand the basics of human perception to understand compression in that area (jpg, mp3), they talked about stuff like that.

A good CS degree definitely has the space to teach some basics in that area: to mention the Gestaltgesetze (Gestalt principles of perception), to explain human perception a bit, and to give an introduction to usability. You do not get a useful developer in the end otherwise


> You do not get a useful developer in the end otherwise

Not all developers do stuff with UI, and of those that do not all do anything with a UI that is actually graphical beyond a terminal.

> since you need to understand basics of human perception to understand compression in that area (jpg, mp3), they talked about stuff like that.

That is a good reason to teach it, and counters my overly assertive original comment.


Seen the same on maps where green is a good route and red is a dangerous route. Switched to a blue -> light blue -> yellow -> orange -> red -> black scale with great success. I also believe chemistry uses blue to mean safe, not green.


Off-topic, but:

> Full-page screenshots. Take a screenshot of the entire page, from the top of the viewport to the bottom.

I'm surprised to see the Chrome developers not know what 'viewport' means.


Even more off topic but I'm surprised this isn't a consumer feature. There are so many sketchy chrome extensions that try and do this.


Even more off topic, but after Firefox introduced this feature in dev tools (before Chrome did), they also recently released it as an extension (and it might even work on Chrome)


That's interesting! Do you know if it is possible to access this data from an extension? I would love an extension that crawls through the website and combines the data for all pages.


On a minified app it only seems to show line-by-line coverage of the minified files and not the source maps :(


You can pretty-print the file through the Sources panel.


At the time the algorithm was running on one of the servers in the data centre and until recently we didn't have anything in the DC kitted out with GPUs (see http://blog.marcgravell.com/2016/05/how-i-found-cuda-or-rewr... for the reason behind that).

I'm not sure if we're planning to use GPUs for future iterations of algorithm testing; the GPUs are installed in hardware serving production traffic so they're unlikely to be used for experimental stuff like this. That said, a bunch of us just got Dell XPS-15s with the NVIDIA GTX 1050 so we can probably have a go on our local machines to get a feel for performance characteristics... watch this space, I guess!

