By that line of reasoning, GitHub Copilot would have to be GPL. Until somebody fights about this in court, we don't really know. But even in the worst case, CC-BY-SA is one of the easier licenses to comply with, not much worse than the MIT-licensed code contained in the dataset.