Thanks Max! This was a really interesting article and closely matches my own experience with how the agents have been progressing
one of the takeaways I get when reading skilled engineers' experiences with these tools is that they essentially offer leverage, and the more skill someone already has, the higher their ceiling will be
i feel similarly. suppose ai makes people more productive:
1. companies that are not doing well (slow growth, losing to competition, etc.), or are monopolies under pressure to save in the short term, are going to use the added productivity to reduce their opex
2. companies that are doing well (growing, in competitive markets) will get even more work done and still can't hire enough people
my hunch is block is not doing as well as they seem to be
obviously he's going to posture his company as growing and doing well, but clearly not enough for the board and shareholders given their headcount growth from ZIRP
some companies are in a position to go for moonshots, and Block's haven't panned out
They did not; you get the same date range and the same graph shape by going to FRED and pressing the "1Y" option, and the series includes the first two months of 2026, so it's 12 months: https://fred.stlouisfed.org/graph/?g=1SGzm
However, the chart settings were actually modified to hide/deemphasize the earlier decline: the index date was changed. Their graph uses 2025-02-20=100; with the default of 2020-02-01=100, the chart would start at 64 and rise to 71.44.
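To make the effect concrete, here's a minimal sketch of index rebasing with made-up values (these are illustrative, not the actual Indeed job-postings series): the same data reads as partial recovery on one basis and as pure growth on the other.

```python
def rebase(series, base_key):
    """Rescale a {date: value} series so the chosen base date reads 100."""
    base = series[base_key]
    return {d: v / base * 100 for d, v in series.items()}

# Hypothetical values on the default 2020-02-01 = 100 basis:
default_basis = {
    "2020-02-01": 100.0,
    "2025-02-20": 64.0,   # the dip the default basis makes visible
    "2026-02-01": 71.44,
}

# Switch the base date to 2025-02-20 = 100, as in the linked graph.
rebased = rebase(default_basis, "2025-02-20")
# The series now starts at exactly 100 and ends around 111.6,
# so the earlier 36% decline disappears from the picture entirely.
print(rebased["2025-02-20"], round(rebased["2026-02-01"], 1))
```

Nothing about the underlying data changes; only the normalization point does, which is why both charts are "correct" while telling different stories.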
Sure, I assumed status quo everyone is talking about is basically the several years before that graph. I still think it's relatively bad compared to that despite the modest improvement.
What's not shown in a graph of job postings is the demand side. With all the layoffs, out-of-work college grads, people staying put in jobs they are unhappy with, etc., I'd wager that demand per job is still historically high compared to what we have been accustomed to.
That's the most recent time. But I've bounced around all the LLMs - they're all superficially amazing. But if you understand their output, they're often wrong in both subtle and catastrophic ways.
As I said, maybe I'm wrong. I hope you have fun using them.
Yes. And, again, they look amazing and make you feel like you're 10x.
But then I look at the code quality, hideous mistakes, blatant footguns, and misunderstood requirements and realise it is all a sham.
I know, I know. I'm holding it wrong. I need to use another model. I have to write a different Soul.md. I need to have true faith. Just one more pull on the slot machine, that'll fix it.
"CEO Dario Amodei predicted last March that in six months AI would be writing 90% of code, and when that didn’t happen"
I mean, a lot of developers have 90% of their code being written by AI (myself and my friends at the labs included). Obviously YMMV depending on your codebase and individual skill.
"Software engineers will at times overestimate their capabilities, as demonstrated by the METR study that found that developers believed they were 24% faster when using LLMs, when in fact coding models made them 19% slower.
This, naturally, makes them quite defensive of the products they use, and whether or not they’re actually seeing improvements."
I wonder what he thinks about the new METR update that showed a net speedup as a lower bound (participants literally didn't want to even tackle some tasks with AI because of how slow it would be), with the returning devs seeing the greatest speedups?
"for one of Anthropic’s greatest lies: that AI can “work uninterrupted” for periods of time, leaving the reader or listener to fill in the (unsaid) gap of “...and actually create useful stuff.”"
We're probably at the beginning of the S curve for long-running tasks that create useful stuff (https://ladybird.org/posts/adopting-rust/) but it clearly needs hand-holding and a way to self-verify work.
"No amount of DarioMath about how a model “costs this much and makes this much revenue” changes the fact that profitability is when a company makes more money than it spends."
Feels like he's being dishonest here, because the economics of the labs are unique (and precarious). Each model is profitable on its own (revenue minus the cost to train and serve it). Labs invest in the next model to maintain their advantage, because otherwise people will stop using their latest models. This probably doesn't go on in perpetuity (which is what Ed should've analyzed more). To his credit, he's right that CC subscriptions are currently being subsidized.
[Insert quotes of Dario saying models will be smarter than most humans or Nobel laureates]
I mean, he's not wrong in certain definitions of "smart". They're already well above the average human in terms of testable world knowledge, math, coding, science, etc... but obviously fall short in other ways compared to humans.