>The S is for "stochastic" -- i.e. you get a different 2D projection every time you run it on the same inputs.

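That's not the part that's "stochastic"; getting a different result from each random initialization is just nonconvex optimization in action, since the optimizer settles into whichever local minimum is nearest its starting point. You get the same thing with most other local embeddings.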
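To make that concrete, here's a toy sketch (not t-SNE itself, just a made-up 1D function with two minima) showing that plain gradient descent is fully deterministic given a starting point, yet lands somewhere different when the starting point changes. The function, learning rate, and step count are all arbitrary choices for illustration.

```python
def grad(x):
    # Derivative of the toy objective f(x) = x**4 - 2*x**2,
    # which is nonconvex with two minima, at x = -1 and x = +1.
    return 4 * x**3 - 4 * x


def descend(x0, lr=0.05, steps=500):
    # Plain gradient descent; there is no randomness anywhere in here.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


left = descend(-0.3)   # converges to the minimum near -1
right = descend(0.3)   # converges to the minimum near +1
```

Same objective, same algorithm, different answers, and the only thing that changed was the initial point. Randomize the initialization and you get run-to-run variation without the model being "stochastic" in any deeper sense.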

The stochastic bit is that the model is based on optimizing "the asymmetric probability, p_ij, that i would pick j as its neighbor"[0]. Those probabilities and the associated positions in 2D space are not estimated stochastically (e.g. with Monte Carlo sampling) or anything, though.
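For what it's worth, here's a rough sketch of how those p_ij fall out of the data as I read the SNE paper: a softmax over negative scaled squared distances, excluding j = i. One simplifying assumption on my part: the paper picks a per-point sigma_i via a perplexity target, while this uses a single shared `sigma` to keep it short. The function name is mine, too.

```python
import math


def neighbor_probabilities(points, sigma=1.0):
    # P[i][j] approximates p_ij: the probability that point i picks j as its neighbor.
    # Assumption: one shared sigma for all points (the paper fits a sigma_i per point).
    n = len(points)
    P = []
    for i in range(n):
        weights = []
        for j in range(n):
            if i == j:
                weights.append(0.0)  # a point never picks itself
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights.append(math.exp(-d2 / (2 * sigma**2)))
        total = sum(weights)
        P.append([w / total for w in weights])  # normalize row i to sum to 1
    return P


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
P = neighbor_probabilities(points)
```

Note that this is a closed-form computation: run it twice on the same points and you get identical numbers. It's also asymmetric, because each row is normalized separately: the far-away point at x=5 puts nearly all its mass on the point at x=1 (it has nothing closer), while the point at x=1 puts almost none on x=5.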

[0] https://www.cs.nyu.edu/~roweis/papers/sne_final.pdf