Build for the Model's Ceiling, Not Its Cost
Why squeezing tokens early loses the next cycle, and how maxing out context wins both now and when models get cheaper. Insight from Granola's CTO: AI Agent Optimization Strategy.
I was speaking with the CTO of Granola recently and asked him a simple question: “How do you consistently get the strongest performance out of your agents?” He replied with an answer that reframed a lot for me.
To give a little context, I’ve been building agent workflows to be maximally efficient. For example, I would maximise token efficiency at all costs, trying to use the fewest tokens possible to generate a response. Some of the time this works, but the results are not consistent.
He said that he prefers to max out the model’s context window as much as possible. “It doesn’t have to reach to the full amount,” he said, “but if it was on a scale, you’d look to medium to high context window.” The reason is that you’re prioritising giving the user the best experience the model is capable of.
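As a rough illustration of aiming for the "medium to high" part of the scale, here is a minimal sketch that fills a chosen share of the context window with the most relevant material. Everything in it is an assumption for illustration: the `Snippet` type, the relevance scores, the four-characters-per-token estimate, and the 0.7 default fill are hypothetical and are not Granola's implementation.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    relevance: float  # hypothetical score: higher means more useful to the task


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    # Swap in the tokenizer for your model to get real counts.
    return max(1, len(text) // 4)


def pack_context(snippets, context_window, target_fill=0.7):
    """Greedily add the most relevant snippets until the target share
    of the context window is used. Returns (chosen snippets, tokens used)."""
    budget = int(context_window * target_fill)
    used = 0
    chosen = []
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        cost = estimate_tokens(snippet.text)
        if used + cost > budget:
            continue  # does not fit; a smaller, less relevant snippet still might
        chosen.append(snippet)
        used += cost
    return chosen, used
```

The point of the design is that the budget is set from the model's window rather than from a cost target, so when the window grows, the same code simply carries more context.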
Granola have just raised $60M so that they can give that best user experience to most users at low cost, or even for free. They’re happy to spend that money to improve the experience, the brand, and their reputation while growth is still happening.
There’s also an important point that I came to realise through implementing the strategy. You are not only getting the most out of the current model; you are also building around the ceiling of what models can do. This means that when a new model comes out (which will not be long), you are already getting the most out of the current one and are ready to use the extra capability. The same token usage also gets cheaper over time, because the teams building these models are always looking to make them more efficient, so the context window becomes less expensive.
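To make the cost side of that argument concrete, here is a small arithmetic sketch of what the same context-heavy request costs as prices fall. The starting price and the yearly drop are made-up inputs for illustration, not real pricing for any model.

```python
def request_cost(input_tokens: int, price_per_million: float) -> float:
    """Cost of one request given a price per million input tokens."""
    return input_tokens * price_per_million / 1_000_000


def cost_over_time(input_tokens, starting_price, yearly_price_drop, years):
    """Cost of the same request each year, assuming (hypothetically) that the
    price falls by a fixed share every year. Index 0 is today."""
    costs = []
    price = starting_price
    for _ in range(years + 1):
        costs.append(request_cost(input_tokens, price))
        price *= 1 - yearly_price_drop
    return costs


# Hypothetical inputs: a 100,000-token request, 10.0 per million tokens today,
# and the price halving each year.
projected = cost_over_time(100_000, 10.0, 0.5, 2)
```

Under those assumed numbers the request uses exactly as much context in year two as it does today, but costs a quarter as much, which is the sense in which a context-heavy design becomes more efficient without being changed.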
Making these bold bets in the early stage of this nascent industry, and in the early stages of a company, provides a hidden but powerful pair of benefits:
- Delivering an exceptional user experience immediately.
- Positioning your product to become dramatically more efficient as model costs drop.
The founders who build like this will dominate the next cycle. They will control more of the market, attract better customers, and develop a stronger brand because they made the right bet early. The founders who optimise for minimal token usage will save a little money now and lose a lot of ground later.
AI capability is accelerating. If you want to lead, build for where the curve is going, not where it is today.