The thing with x87 is that it's easy to write compilable, correct-looking code, and much harder to write correct compilable code, even for trivial sequences of a dozen or so operations.
Whereas in most asm dialects, register AX is always register AX (operand-size aliasing aside), that's not the case for x87: the registers are addressed relative to the top of a stack, and every push or pop renumbers them, so the object/value at ST3 in one operation may be at ST1 or ST5 in a couple of instructions' time.
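To make the renumbering concrete, here is a rough sketch of the register stack as a toy Python model rather than real assembly. The class `X87Stack` and its methods (`fld`, `faddp`, `st`, `index_of`) are hypothetical names made up for illustration; they only mimic what the instructions of the same name do to the stack, and the model leaves out the tag word and stack overflow/underflow checks.

```python
class X87Stack:
    """Toy model of the x87 register stack: 8 physical slots plus a TOP pointer."""

    def __init__(self):
        self.regs = [None] * 8  # physical registers R0..R7 (None = empty)
        self.top = 0            # models the TOP field of the status word

    def st(self, i):
        """Read ST(i), which is physical register (TOP + i) mod 8."""
        return self.regs[(self.top + i) % 8]

    def fld(self, value):
        """Push: decrement TOP, then store the value in the new ST(0)."""
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def faddp(self):
        """ST(1) := ST(1) + ST(0), then pop, so the sum becomes the new ST(0)."""
        total = self.st(1) + self.st(0)
        self.regs[(self.top + 1) % 8] = total
        self.regs[self.top] = None          # old top is now empty
        self.top = (self.top + 1) % 8

    def index_of(self, value):
        """Return i such that ST(i) currently holds value, or None."""
        for i in range(8):
            if self.st(i) == value:
                return i
        return None


def demo():
    fpu = X87Stack()
    for v in (10.0, 20.0, 30.0, 40.0):
        fpu.fld(v)
    before = fpu.index_of(10.0)        # 10.0 was pushed first, so it sits deepest

    fpu.fld(1.0)
    fpu.fld(2.0)
    after_pushes = fpu.index_of(10.0)  # two pushes moved it two slots further down

    for _ in range(4):
        fpu.faddp()
    after_pops = fpu.index_of(10.0)    # four pops moved it four slots back up

    return before, after_pushes, after_pops


if __name__ == "__main__":
    print(demo())  # (3, 5, 1)
```

The value 10.0 never moves between physical registers in this model; only TOP changes, which is why the same value shows up as ST3, then ST5, then ST1.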