I built something similar to Grant's work a couple of months ago and prototyped what this would look like against OpenAI's APIs [1]. TL;DR: depending on how confusing your schema is, you can expect up to 5-10x the token usage for a given prompt, though better prompting can reduce this significantly.
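To give a feel for where the overhead comes from: a sketch of the token accounting, under the assumption that the constrained-decoding loop splits one completion into several API round-trips (one per ambiguous branch point in the schema) and each round-trip re-sends the full prompt. The function name and the specific numbers here are illustrative, not from the repo.

```python
# Hypothetical token accounting for schema-constrained decoding.
# Assumption: each ambiguous branch point in the schema forces a new
# API round-trip, and every round-trip re-sends the full prompt.
def tokens_billed(prompt_len: int, output_len: int, round_trips: int) -> int:
    # Lower-bound estimate: ignore the partially generated output
    # that also gets re-sent on each round-trip.
    return round_trips * prompt_len + output_len

p, out = 500, 100
baseline = tokens_billed(p, out, round_trips=1)   # one normal completion
confusing = tokens_billed(p, out, round_trips=8)  # schema with many branch points
print(f"{confusing / baseline:.1f}x")             # roughly 7x
```

A clearer schema (or better prompting) reduces the number of ambiguous branch points, which is why the multiplier varies so much.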
[1] https://github.com/newhouseb/clownfish#so-how-do-i-use-this-...