mattias.blog
The holdout test
Right now, most of our tests are “public”. The AI can see them, learn from them, and optimize for them. This works for basic functionality. But it creates a risk. The AI might generate code that passes all your tests but doesn’t actually solve the problem. Like writing an if statement for every number between 1 and 2000 instead of using a proper algorithm.