I have recently written a paper on understanding transformer learning through the lens of coinduction and Hopf algebra.
https://arxiv.org/abs/2302.01834
The learning mechanism of transformer models has been poorly understood. However, it turns out that a transformer behaves like a circuit with feedback.
I argue that autodiff can be replaced with what I call in the paper Hopf coherence, which happens within a single layer as opposed to across the whole graph.
Furthermore, if we view transformers as Hopf algebras, we can bring convolutional models, diffusion models and transformers under a single umbrella.
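For readers who have not met Hopf algebras before, here is a minimal sketch of the standard antipode axiom, mu . (S tensor id) . Delta = eta . epsilon, checked on a toy example: the group algebra of Z/3. This is a textbook illustration only, not the construction from the paper, and every function name in it is made up for the sketch.

```python
from collections import defaultdict

N = 3  # toy example: the cyclic group Z/3, written additively

# An element of the group algebra is a dict {group element: coefficient}.
# A tensor is a dict {(g, h): coefficient}.

def coproduct(x):
    # Delta(g) = g tensor g on basis elements, extended linearly
    return {(g, g): c for g, c in x.items()}

def counit(x):
    # epsilon(g) = 1 on basis elements, extended linearly
    return sum(x.values())

def antipode_left(t):
    # apply S tensor id, where S(g) = -g is the group inverse
    out = defaultdict(int)
    for (g, h), c in t.items():
        out[((-g) % N, h)] += c
    return dict(out)

def multiply(t):
    # mu(g tensor h) = g + h (mod N), extended linearly
    out = defaultdict(int)
    for (g, h), c in t.items():
        out[(g + h) % N] += c
    return dict(out)

def antipode_axiom_lhs(x):
    # mu . (S tensor id) . Delta
    return multiply(antipode_left(coproduct(x)))

def antipode_axiom_rhs(x):
    # eta . epsilon: the counit value times the identity element 0
    return {0: counit(x)}

x = {0: 2, 1: -1, 2: 5}
assert antipode_axiom_lhs(x) == antipode_axiom_rhs(x)
```

Both sides collapse every element onto the identity with coefficient equal to the sum of its coefficients, which is exactly what the antipode axiom requires.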
I'm working on a next-gen machine learning framework based on Hopf algebras.
Join my Discord if you want to discuss this further: https://discord.gg/mr9TAhpyBW