>Frontier Math (25% on high compute, previous 2%) This is so insane that I can't...

upghost · on Dec 23, 2024

Nope, makes sense to me. Seems unreasonable to conclude the dataset is not compromised now.

cowl · on Dec 24, 2024

The question is whether that jump to 25% is also due to the compromised first test.

bwfan123 · on Dec 24, 2024

Viewed through a skeptical lens of incentives:

OpenAI and Epoch AI are both startups with every incentive to peddle this narrative while no one else can independently verify it.