
ChatGPT is really very simple. Imagine you could analyze a million books and identify all the words within them -- not the meanings of the words, just the actual letters they contain.

Now, when someone asks you about the history of France (or why the sky is blue), you could simply pluck out of your library the most common strings of words that seem to follow the words that were in your question!

It's like a kid in the 80's who thinks the answer to an essay question is to copy it from an encyclopedia, only the "encyclopedia" is very large and contains multiple sources.

So, the big takeaway needs to be that there is absolutely no understanding, no cognizance of any kind, no language comprehension going on. The answers look good because they contain all the same words as the most popular answers people have already written, which the system scanned.

So ChatGPT turns out to be great for parsing and summarizing documents, if that's something you need. But, since it doesn't know fact from fiction, cannot apply logic or math, and cannot perform reasoning or analysis, it's not good for finding out facts or discerning truth.

Another great failing of LLM software is that the user being spoken to is generic. The answers are not modeled for you, they're the same models for everyone. But a human teacher does their job by being exactly the opposite of this -- someone who is finely tuned to the needs and understandings of their audience. A good journalist or writer does the same.



> you could simply pluck out of your library the most common strings of words that seem to follow the words that were in your question!

This is not sufficient to explain how LLMs are able to synthesize novel, coherent poems or song lyrics. What you're describing seems closer to a Markov model. So far I have yet to see a good explanation of _why_ transformer models seem to have this emergent behavior as you scale them up. It's easy to see by construction that the stacked attention layers will work well to predict the masked token in "Capital of France is [MASK]", or how you can throw an additional head on top for sentiment analysis (which I recall was implemented by literally adding a dummy token at the start into which information about the entire sentence gets embedded). But it's not obvious (to me at least) that this would somehow generalize into being able to do the things ChatGPT does.
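For contrast, the "pluck out the most common strings that follow your words" idea the parent describes is roughly a word-level Markov chain, which fits in a few lines of Python. This is a sketch of that simpler model, not of how any transformer works; all names here are my own:

```python
import random
from collections import defaultdict

def train_markov(text, order=2):
    """Map each n-gram of words to the list of words observed right after it."""
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        table[tuple(words[i:i + order])].append(words[i + order])
    return table

def generate(table, seed, length=10):
    """Extend the seed by repeatedly sampling a word seen after the last n-gram."""
    out = list(seed)
    for _ in range(length):
        candidates = table.get(tuple(out[-len(seed):]))
        if not candidates:
            break  # this n-gram never appeared in training; a Markov model is stuck
        out.append(random.choice(candidates))
    return " ".join(out)
```

A model like this can only reproduce word sequences it has literally seen, which is exactly why it cannot account for novel, coherent poems.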


I don't really see how it's different from, say, a large convolutional neural network learning progressively higher-order features as you progress through the layers of the network. At the lowest layers it's learning simple edge filters, which get combined into shapes, which get combined into filters that activate on faces, which get combined in ways that can be recognized as "family portrait", and so on. Of course, transformers have some unique advantages in terms of having very large context windows, being very parallelizable, etc.

When ChatGPT or any generative model produces output from a prompt, it's sampling from the (frozen) statistical structure it has learned. It makes sense that as you increase model capacity and the volume of training data, you can capture statistical patterns that occur at a very high level of abstraction. So, instead of just predicting the next token based on token-to-token patterns, it can predict the next token based on something resembling concept-to-concept patterns. (This somewhat demystifies the poetry generation / style transfer stuff that it can do. If you ask for a breach of contract complaint written as a sonnet, it can sample from both the patterns it's learned from legal documents and the patterns it's learned from poetry.)

What I wonder is how far this track of scaling up can take us. At the end of the day, interacting with ChatGPT is not that different from "interacting" with y=3x+7 by plugging in a value for x. It's just a much, much larger function.
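The "frozen function" point can be made concrete with a toy next-token sampler: at inference time nothing is learned, generation is just repeated sampling from a fixed conditional distribution. The table and every number in it are invented for illustration:

```python
import random

# A toy "frozen" model: next-token probabilities fixed at training time.
NEXT_TOKEN_PROBS = {
    ("capital", "of"): {"france": 0.6, "spain": 0.4},
    ("of", "france"): {"is": 1.0},
    ("france", "is"): {"paris": 0.9, "beautiful": 0.1},
}

def sample_next(context):
    """Look up the fixed distribution for the last two tokens and sample from it.
    Nothing updates here: inference is evaluation of a frozen function."""
    dist = NEXT_TOKEN_PROBS[tuple(context[-2:])]
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights)[0]
```

A real LLM differs only in that the "table" is implicit in billions of weights and the context covers thousands of tokens, not two.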


Thanks, this is a nice explanation! Also found a paper related to emergence in LLMs: https://arxiv.org/pdf/2206.07682.pdf


Ah, but in fact it does. Code and poetry seem different to people but tokens are tokens to the computer, and it knows not one from the other.

The reason you get a poetry-style answer when you ask for one is that the mix of words and styles can be taken from different parts of the corpus. This is how you can have it write (bad) poetry about something for which only prose was scanned.


I'd propose that you come up with a prompt to test your hypothesis and try it out. I think seeing is believing here. I don't know exactly how ChatGPT works, but after experimenting with it I am sure it's not simply looking for runs of words similar to the prompt in its training set.

Here's an example to consider: https://news.ycombinator.com/item?id=33874110


I think that the level of associativity between linguistic terms goes beyond a mere "mix of words and styles". When pulling on one part of the prompt causes portions of the output to shift or vanish, or change genre, word choice, composition, focal point, or focal distance,

then that's the very opposite of a "mix". That's "Stable Diffusion" in text2img terms, or a "reversal of entropy" in physics terms.

I'm still blown away that computers have become image "Imaginers", decades after humans first started doing 3D through computers. It's really all quite unbelievable.


I don't think this is an accurate depiction, and I think there's a lot more going on there than you're saying there is. The models also involve Codex, which is able to do pattern matching more effectively the more context you give it, so that it can emit novel code. I've seen this happen numerous times with Copilot, where it works very well the more open documents you have, especially if those documents are related. True, it doesn't have a full semantic understanding of your codebase, but it's good enough to usually generate the right code once it's been "seeded" with something like an updated type definition that you then need to plumb through various parts of a codebase.


Yes, I use Copilot too, for that very reason. But we need to be very careful about words like "semantic" and "understanding", as the method is neither.

We'll get the best use out of these technologies if we don't ascribe to them magical qualities they don't have.

Poke around. You'll find out it's just statistical math with tokens (letters and punctuation). No meaning is ascribed to anything.


I...did say that it doesn't have a full semantic understanding of your code. Copilot uses tree-sitter under the covers to be able to figure out several things that you can get by analyzing syntax alone, which is actually quite far. It lets you identify what kinds of declarations (e.g., a type) and what kind of expressions (e.g., pattern matching on that type) exist, provided there's a grammar for it, and then lets that influence the suggested code. This isn't perfect because indeed, you need a full language service to actually understand things, especially in resolving ambiguities in symbols (like shadowing in some languages), but, like I said...it's good enough.
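This is not Copilot's actual tree-sitter pipeline, but the general point, that syntax alone, with no symbol resolution, already reveals what kinds of declarations a file contains, can be sketched with Python's own parser:

```python
import ast

source = """
class User:
    name: str

def rename(user):
    return user
"""

# Parsing tells us what kinds of declarations exist (a class, a function)
# purely from syntax, with no semantic analysis or symbol resolution.
tree = ast.parse(source)
decls = [(type(node).__name__, node.name)
         for node in ast.walk(tree)
         if isinstance(node, (ast.ClassDef, ast.FunctionDef))]
print(decls)  # [('ClassDef', 'User'), ('FunctionDef', 'rename')]
```

A full language service would additionally resolve what each name refers to, which is where ambiguities like shadowing come in.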


Yes, Copilot has the advantage that programming language syntax is highly constrained and regularized, allowing it to slot-in your own variable names and functions into new code as you write it. This is the best feature of Copilot. It's not part of ChatGPT because that's not meant for writing code. And I wouldn't call it "understanding." But, it is useful.


> So, the big take away needs to be that there is absolutely no understanding, no cognizance of any kind, no language comprehension going on.

Okay, but what is understanding, cognizance, comprehension? And so on for further precision of definitions, until you reach that vague nebulous area that is consciousness. We simply haven't figured that out, period.

So we cannot really say that this AI does not have those qualities, unless its output is obviously showing it. Which ChatGPT does not, most of the time.


I wrote this[0] to better explain my wildly unpopular answer above. Everything is absolutely correct here, but it's not a popular view right now. Often very smart people like those on HN can be fooled by the "magic" of tech that might not really exist in the way they think it does.

[0] https://medium.com/gitconnected/behind-the-curtain-understan...


> Another great failing of LLM software is that the user being spoken to is generic. The answers are not modeled for you, they're the same models for everyone. But a human teacher does their job by being exactly the opposite of this -- someone who is finely tuned to the needs and understandings of their audience. A good journalist or writer does the same.

ChatGPT can do this if you just ask. E.g., explain as if I'm 5 and it will use simple words and phrases. Explain like I'm an expert and it will use technical jargon and expect a deeper shared knowledge from which it can draw.

You shouldn't write so authoritatively when clearly you haven't explored its capabilities.


Alas, I have. You can ask the tool to parrot styles that it has already seen, but it cannot tailor the output to fill in the gaps, nor persuade someone (or a group) who is at a particular level of understanding. In human writing, these are the most common use cases.

The reason that I write in this manner is that I'd like to scrub out the notion that computers or software are magical. It isn't helpful to our line of business -- unless you're pitching VC. Most people don't know how stupidly simple computers really are, and unfortunately that includes many developers and tech folks.


But it literally can. I just tried. Go make up a group with X knowledge and tell chatgpt to explain something in terms only X group can understand. It will do it. Tell it to go in-depth and it will do it.

There are limitations to the model but making wrong statements about its capabilities does not help your point.

This is just one recent example. You may object that it's too common a question, but I've made many others and it responds competently:

> Please explain group theory in terms that children in elementary with rudimentary mathematics knowledge can understand. Go in-depth in your explanations

Sure, I'd be happy to explain group theory to children with elementary-level math knowledge. Group theory is a branch of mathematics that studies the properties of groups, which are sets of objects that can be combined together in certain ways.

One of the key ideas in group theory is that groups have a set of rules, called group axioms, that dictate how the objects in the group can be combined. For example, one group axiom is that when you combine two objects in the group, the result must also be an object in the group. This means that if you have a group of numbers, and you add two numbers together, the result must also be a number in the group.

Another important idea in group theory is the concept of symmetry. Symmetry is when an object looks the same, no matter how you rotate it or flip it. For example, a square has symmetry because it looks the same whether you turn it 45 degrees to the left or right, or flip it over. Groups can have symmetry too, and this can help us understand the patterns and regularities in the way that the objects in the group behave.

To understand group theory, it's helpful to think about examples of groups. One common example is the group of whole numbers, which are the numbers we use to count things. In this group, you can add two numbers together (like 3 + 5), and the result will be a whole number. You can also multiply two whole numbers together (like 3 x 5), and the result will be a whole number. This group has a symmetry property, because if you multiply two numbers together, the result will be the same no matter which order you put the numbers in (for example, 3 x 5 is the same as 5 x 3).

Another example of a group is the set of all possible moves in a game of chess. In chess, you can move pieces like the rook, the knight, and the queen in different ways on the board. These different moves form a group, because you can combine two moves (like moving the rook to one square and then moving the knight to another square) and the result will be a legal move in the game.

Group theory is a fascinating and important area of math, and it has applications in many different fields, from physics and chemistry to computer science and cryptography. While it can be challenging to understand, with a little bit of practice and perseverance, even young students can learn the basics of group theory and appreciate its beauty and power.


This is true but does not refute OP's point. There is an entire subreddit for ELI5 that is probably part of its training data. So if you ask for ELI5, that pattern IS part of the training corpus.


It is more complex than that, it isn't just scraping /r/ELI5 lol

Ask it a question about JavaScript and tell it to respond as a 1920s gangster and it will happily oblige and do a great job. And just to make sure we are on the same page, JavaScript was not available in the 20s


It is just scraping and reorganizing words while including words from your prompt in the equation. I already mentioned that the style (gangster) and vocabulary (JavaScript) can come from different scanned documents. Cool? Yes. Is it learning, or understanding? No.


If it's not learning, then how can you tell it abstractly why its code is wrong and then have it fix it? Without telling it specifically "change this to this".


Within the limits of the prompt length that it can handle, it includes the prior history of the conversation as part of the prompt.

(Arguably, you could say it is “learning” but the learning is limited to the same conversation, with a very sharply limited depth.)
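A minimal sketch of that mechanism: each new request is just the prior turns concatenated in front of the new message, with the oldest turns dropped once the context limit is hit. The function name and the character-based limit are my own; real systems count tokens, not characters:

```python
def build_prompt(history, new_message, max_chars=2000):
    """Concatenate prior turns with the new message, dropping the oldest
    turns once the context limit is exceeded."""
    turns = history + [("User", new_message)]
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    prompt = "\n".join(lines)
    while len(prompt) > max_chars and len(lines) > 1:
        lines.pop(0)  # oldest turn falls out of the window, i.e. is "forgotten"
        prompt = "\n".join(lines)
    return prompt
```

This is why the apparent "learning" only lasts as long as the conversation fits in the window.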


The OP's question mentions specific network types, while the reply uses a vague "million-book library" metaphor, but at least it mentions LLMs? It also arbitrarily sorts humans { teacher, journalist, writer } as "people who do custom work with each client", which makes a bit of sense for a teacher in a small-class environment. But writers and journalists write at SCALE, and while each may speak with gusto to individuals, tailoring custom content to individuals is not their job.

Not to mention that individually tailored model files are very handy with AI, shareable as those files may be.

I call ChatGPT on you!!! And also do not know anything about your question, OP...


I wouldn't deign to have ChatGPT write my comments. They're all me!

Obviously, an author's audience can include more than one person. But every good piece of human writing is tailored for some audience. That was my point. You've heard the expression, "Know your audience?" ChatGPT cannot.


ChatGPT's "audience" can range from "hate speech for anti-bias training" to "Teenage Mutant Ninja Turtles Deliver Impassioned Speech to The World"

I would also like to use the No True Scotsman fallacy, and assert that no Human can ever really "know" or "write for" anybody's perfect preferences.


I wasn't asking for perfect preferences. I was asking for minimal cognizance, which the tool cannot provide. ChatGPT is more like a parrot than tech dreamers like to admit.


I'll grant you that ChatGPT is more Parrot than All-Seeing Oracle, but no Parrot (or even Google) will take nginx errors for input, and give me back chmod and chown commands to fix the permission issues.

AND! The commands it gave me were tailored to match my user and any directories/websites I told it I was working on!


Ask ChatGPT to write a song lyric for your favorite band in the style of your favorite poet, with a specific unusual topic.

Then claim that the model is “really very simple”


At the end of the day, all computing is very simple. What's impressing you here is the size of the dataset and the speed of the retrieval, brought about by advances in hardware. Also, we like the well-formed writing which is created by filtering and massaging the parroted text into a new set of sentences.

It is impressive, and people who don't know how it really works will think it's capable of all kinds of things it's not. Of course, this has been true forever. Just ask anyone who writes code about what non-devs think software can do.


Have you used ChatGPT?

Your understanding of GPTs and neural nets in general is consistently flawed.

You are describing a Markov model or at best an SVM model.

Would you make the same types of claims about Stable Diffusion, that it is “piecing together pieces of existing images?” That is not how these models work. Similar to the temporal memory that ChatGPT will appear to have during a conversation, “inpainting” with Stable Diffusion produces entirely novel output with context. Have you tried it out?


I have and I wrote extensively about it after seeing the unpopularity of my original comment. There's a link to my long Medium story about it in another comment.

Would I say that DALL-E, etc. also piece together existing images? Yes, of course, since that is the only way that technically it could work. All of computing is "input-process-output." There is no "create" step in any of computing.

The pieces that DALL-E pieces together are pixels of a certain color and brightness and placement relative to other pixels. Together, we perceive this as an image. If they weren't arranged that way, we'd just see noise or gray.

The same is true with ChatGPT. Unless they're put in a particular order, words just make word salad. It is the order that gives them their perceived meaning in a sentence. ChatGPT crafts "new" sentences by rearranging words from similar sentences it has already seen, just like DALL-E creates new paintings by rearranging pixels from paintings it has already seen.

A recent video from Noam Chomsky explains this from his perspectives on learning and language:

https://www.youtube.com/watch?v=PBdZi_JtV4c


> it cannot apply logic

Have you tried?


The tool can parrot back to you "logical" arguments which it has already scanned. It cannot figure out the logic behind them, nor see when it's wrong, nor correct any errors, nor draw new logical conclusions.


Are you sure about that?



