I think my point may not have gotten across clearly, so let me clarify. I agree with most of what you said, which IMHO falls under the spirit of the supervised machine learning paradigm, which is the most advanced tool we have pragmatically available to us today.
Looking beyond, if we find a model+learner which can discover a low dimensional latent space (through certain indirectly specified biases) then that low dimensional formulation of the domain can guide us towards interesting questions worth asking. To have a factorized low-dimensional formulation is roughly what it means to "understand" a subject, so such a toolkit would be enormously useful.
Many people (including some experts, rightly or wrongly) believe that neural networks might be that model class, and (clever tweaks of) gradient descent might be an acceptable learner.
All I'm saying is that the current hype about AI fails to separate the potential of the latter class from the currently available successful tools of the former class.
> Looking beyond, if we find a model+learner which can discover a low dimensional latent space (through certain indirectly specified biases) then that low dimensional formulation of the domain can guide us towards interesting questions worth asking. To have a factorized low-dimensional formulation is roughly what it means to "understand" a subject, so such a toolkit would be enormously useful.
In my experience, at least, the value in this kind of activity is relatively low. Sometimes you find something, usually you just find a low-rank version of nothing in particular. Unless some significant developments appear in this space (and they very well might), the best low-dimensional approximations will be the means, variances, and correlations computed by business analysts, which are tremendously valuable quantities to keep in mind at all times.
> In my experience, at least, the value in this kind of activity is relatively low. Sometimes you find something, usually you just find a low-rank version of nothing in particular. Unless some significant developments appear in this space (and they very well might), the best low-dimensional approximations will be the means, variances, and correlations computed by business analysts, which are tremendously valuable quantities to keep in mind at all times.
Definitely. Humans find and reject "low rank" correlations all the time. For example, North America and South America are both part of the "new world", but no sociologist would use that to make general predictions about the people on those continents. People like both steak and cookies, which could conceivably fall into a "low rank" "humans like this" category, but the two couldn't be more dissimilar in many respects.
Conversely, some doctors have an encyclopedic knowledge of various ailments, and either memorize or infer from experience a particular ailment from a huge collection of possible ailments. If the number of possible ailments is large, their "understanding" is not low rank, it's the opposite.
Low rank does not necessarily mean "understanding", it's just a local minimum in some function of a random variable.
> Humans find and reject "low rank" correlations all the time ... steak and cookies ...
Sure, but the appropriateness of a low rank approximation depends on what you want to predict. E.g., for predicting basketball success, {height, wingspan} might be a very useful "low rank" description, while for predicting obesity the relevant low rank description might be the body mass index (BMI) or something like it. Wingspan = chest width + 2 * arm length and BMI = weight/height^2 could be composite features "discovered" by such a model+inference toolkit.
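To make the two composite features concrete, here's a minimal sketch (the function names and units are my own, purely for illustration):

```python
def wingspan(chest_width_m, arm_length_m):
    # Composite feature: wingspan = chest width + 2 * arm length
    return chest_width_m + 2 * arm_length_m

def bmi(weight_kg, height_m):
    # Composite feature: BMI = weight / height^2
    return weight_kg / height_m ** 2
```

The point is that a good toolkit would surface combinations like these from raw measurements, rather than requiring a domain expert to hand-craft them.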
> Conversely, some doctors have an encyclopedic knowledge of various ailments, and either memorize or infer from experience a particular ailment from a huge collection of possible ailments. If the number of possible ailments is large, their "understanding" is not low rank, it's the opposite.
That's exactly why I would not call that understanding, especially if they have to memorize a book full of ailments and symptoms. "Understanding" would entail a (causal) model of underlying physiological problems and the symptoms they give rise to, with a recipe for inferring in reverse.
> Low rank does not necessarily mean "understanding", it's just a local minimum in some function of a random variable.
I don't understand that statement at all.
PS: I don't have any particular affinity to neural networks over other ML models, and don't mean to escalate the hype :-)
> composite features "discovered" by such a model+inference toolkit.
This kind of "feature discovery" is sort of what random forests, gradient boosting, and neural networks already do. You can even do a neutered version of it with full quadratic interactions of all the inputs in a (regularized) linear regression model.
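The "neutered" quadratic-interaction version amounts to expanding the inputs into all degree-2 products before fitting a regularized linear model. A minimal sketch of the expansion step (the function name is mine; in practice something like scikit-learn's `PolynomialFeatures` does this):

```python
from itertools import combinations_with_replacement

def quadratic_features(x):
    # Expand [x1, x2, ...] with all pairwise products (x1*x1, x1*x2, ...),
    # so a regularized linear model fit on the expanded vector can assign
    # weight to interactions such as chest_width * arm_length.
    return list(x) + [a * b for a, b in combinations_with_replacement(x, 2)]
```

A lasso-style penalty on the expanded features would then zero out most of the products, leaving the few interactions that actually help prediction.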
For a linear version of this task as an unsupervised problem, you will want to look at PCA. The problem right now is not that we don't have enough unsupervised data analysis techniques, it's that they are hard to extract useful information out of. Interpreting t-SNE, for example, is notoriously difficult. This is why I said advances in the space are necessary. As it stands, you can run all sorts of low dimensional mappings and embeddings on a given data set, and spend an afternoon finding a whole lot of nothing that your business analysts didn't already know.
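For reference, the core of PCA is small enough to sketch directly. This is a toy power-iteration version that finds only the leading component, purely to show what "linear low-dimensional mapping" means mechanically, not a production implementation:

```python
def first_principal_component(data, iters=200):
    # Toy PCA via power iteration on the sample covariance:
    # center the data, then repeatedly apply C = X^T X / n to a vector
    # until it aligns with the leading eigenvector (the first PC).
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    v = [1.0] * d
    for _ in range(iters):
        proj = [sum(c * vi for c, vi in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered)) / n
             for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v
```

Running this on points along the diagonal recovers a direction proportional to (1, 1)/√2, i.e., the single axis that explains the data. The hard part, as noted above, is not computing such mappings but deciding whether the directions they find mean anything.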
Interesting aside about doctors though. Many people in alternative medicine would argue that there are obvious associations between various ailments, causes, and/or treatments that are under-recognized and under-studied.
Maybe if a machine produced those same associations instead of an alternative medicine practitioner, they might be more interesting to medical researchers.
You have to be careful. If you only feed data that has the associations, both humans and machines will find them. This is the entire premise behind introducing double blind studies. Turns out you have to actively try and falsify claims if you want solid results. Not just try and confirm them. :)
The biggest counter-example is the adversarial "hacks" against AI. If you can change two pixels in an image, in ways imperceptible to a human, and an AI chooses "cat" instead of "dog", it shows there is something terribly different between how an AI operates compared to a human.
Perhaps "bugs" like that can be fixed with more/less neurons, training data, or neuron organization, but it doesn't change the fact that fundamentally, a neural network's definition of understanding is quite different from our own.
Based on this, I suspect that even if neural networks are the long-term answer, we're nowhere close to any sort of general AI that mimics human understanding.
That said, we can certainly make many useful tools in the interim.
Meh, look at how many people see visions in shadows and whatnot. The very constellations can almost be seen this way.
That is, yes, you can fool machines. But you can also fool people. That you fool the two in different ways isn't that surprising. Nor is the fact that you would want to. Just look at camouflage. Basically the same exact thing, just in the physical world. (And, notably, not something monopolized by humans.)
Ah, I think I'm arguing a different point, then. I fully grant they are thinking in different ways. For that matter, though, so do different humans. There is a great Feynman story about how he and a colleague both counted "in their heads" using completely different methods. To surprising results.
So, I am not arguing that they think the same way. I just don't know if that really matters. Regardless of if they think the same or another way, there will be ways to fool them. They could still be "thinking", though.
This is why I mentioned anomaly detection on live data. My understanding is that it is one of the prototypical unsupervised learning fields. I see how I framed it such that I could have meant a labeled dataset to train a detector. My apologies.
I do think there is something possibly there. But it is just that, a possibility. My hunch is it is a low one that will require a lot of cost to reach. Think of it as akin to making money off of interest rates. If you have a ton of capital, it can work. For most people, it won't.