It's much more than a few million. Being declared a supply chain risk means that no company that wants to do business with the government can buy from Anthropic. And no company that wants to do business with those businesses can buy from Anthropic either. That rules out pretty much all American corporations as customers.
How is it virtue signalling when sticking by these principles risks their entire business being destroyed by either being declared a supply chain risk or nationalized?
Pretty sure Amodei makes noise about mass unemployment because he is very bothered by the technology that the entire industry (of which Anthropic is just one player) is racing to build as fast as possible.
Why do you think he is not bothered at all, when they publish post after post in their newsroom about the economic effects of AI?
They stand to benefit from every one of those effects and already do. They have a bigger stake in the game than any other party, because they sell both the illness and the cure.
Amodei's noise is little more than half-hearted advertising, even if it's not intended to read that way (although who can even tell at this point). His newsroom publishes a report on a mass-scale data breach perpetrated using their model, with conclusions delivered in a demonstrably detached, almost casual tone: yeah, the world is like this now, but it's a good thing we have Claude to protect you from Claude, so you'd better start using Claude before Claude gets you. They released a new, more powerful Claude immediately after that breach. No public discussion, nothing. This is not the behavior of people who are bothered by it.
How did they evil-ize? The new Responsible Scaling Policy is still the most transparent out of all the labs. And there are the separate principles they've stipulated for the Pentagon, under which they're facing the threat of nationalization or being declared a supply chain risk.
Cursor and others have a subagent feature, which sounds like what you wanted. However, there has to be some decision-making around how to divide a prompt into tasks; currently, that is decided by the (parent) model.
The best-of-N feature is a bit like rolling N dice instead of one. But it can be quite useful if you spread the N runs across models with different strengths and weaknesses (e.g. Claude/GPT-5/Gemini), rather than assigning all N to instances of the same model. I like to use this feature in ask mode when diving into a codebase, to get an explanation a few different ways.
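The fan-out itself is simple. Here is a minimal sketch of what such a feature does under the hood, where `ask_model` is a hypothetical stand-in for whatever client call the tool actually makes and the model names are only examples:

```python
from concurrent.futures import ThreadPoolExecutor

def ask_model(model: str, prompt: str) -> str:
    # Hypothetical stub; a real tool would call each provider's API here.
    return f"[{model}] answer to: {prompt!r}"

def best_of_n(prompt: str, models: list[str]) -> list[tuple[str, str]]:
    # Send the same prompt to heterogeneous models in parallel and
    # return all answers; the user (or a judge model) picks the best.
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        answers = pool.map(lambda m: ask_model(m, prompt), models)
        return list(zip(models, answers))

for model, answer in best_of_n("How does routing work in this codebase?",
                               ["claude", "gpt-5", "gemini"]):
    print(model, "->", answer)
```

The point of using different models rather than N copies of one is that their errors are less correlated, so disagreement between answers is itself a useful signal.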
Most attention implementations can work across an arbitrarily long context.
The limiting factors are typically:
1. Model serving has latency/throughput targets that become hard to meet at long context, since attention compute grows quadratically with sequence length and the KV cache grows linearly.
2. The model has to be _trained_ to use the desired context length, and training becomes prohibitively expensive at larger contexts.
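To make point (1) concrete, here is a back-of-the-envelope sketch of why serving cost blows up: the attention score matrix is L x L, so per-layer attention FLOPs grow quadratically with sequence length. The hidden size and lengths below are illustrative, not any particular model's.

```python
def attn_flops(seq_len: int, d_model: int) -> int:
    # Two big matmuls per attention layer (QK^T and scores @ V),
    # each ~2 * L^2 * d multiply-adds -> roughly 4 * L^2 * d total.
    return 4 * seq_len ** 2 * d_model

# Doubling the context quadruples the attention work per layer.
for L in (4_096, 32_768, 131_072):
    print(f"{L:>7} tokens: {attn_flops(L, 4096) / 1e12:.1f} TFLOPs/layer")
```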
(2) is a big enough problem that some popular open source models which claim to support large context lengths are in fact trained on shorter ones, and use "context length extension" hacks like YaRN to trick the model into working on longer contexts at inference time.
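As a sketch of what those extension hacks do: the simplest variant, position interpolation, rescales RoPE positions at inference so a longer sequence is squeezed into the positional range the model was actually trained on (YaRN refines this with per-frequency-band scaling and an attention temperature tweak). Dimensions and lengths below are illustrative.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    # Standard RoPE frequencies: theta_i = base^(-2i/dim).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    # Position interpolation: divide positions by the extension factor
    # so they never exceed the range seen during training.
    return np.outer(np.asarray(positions) / scale, inv_freq)

train_len, target_len = 4096, 16384
scale = target_len / train_len  # 4x context extension

plain = rope_angles(np.arange(target_len), dim=64)
interp = rope_angles(np.arange(target_len), dim=64, scale=scale)

# Interpolated position 4k reproduces the angles of trained position k,
# so the model only ever sees rotary phases it saw during training.
assert np.allclose(interp[::4], plain[:train_len])
```

The trade-off is resolution: neighboring tokens get closer rotary phases than in training, which is exactly the degradation YaRN's band-dependent scaling tries to limit.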