METR just released another report, titled “Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.” The pull quote is:

Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower.

I’ve been pondering this one all morning. I generally like their methodology. Obviously you always want ever-more controlled experiments slicing every possible input, but as a real-world trial, I think the experimental design was very good, and the numbers seem fairly robust. Figure 6 is possibly the most informative view of what’s going on.

But the real takeaway isn’t the impact on productivity; it’s the gap between reported and measured productivity. This aligns well with my personal experience. I often feel faster, but if I checked the clock, I’m betting I wasn’t really. It’s easy to discount all the time you sit around waiting, which isn’t quite enough time to do something else (and you can’t anyway, because the AI is messing with your code). That suggests that faster inference may shift these numbers in the future. But it also suggests that we can’t rely on self-reporting to determine if more AI was “worth it,” and that teams that deploy a lot of AI should be prepared for a reduction in productivity, at least in the short term.
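To make the reported-versus-measured gap concrete, here is a minimal sketch of the kind of check I have in mind: log the clock time for each task next to how long it felt, then compare both against the no-AI baseline. Everything in it is an assumption for illustration. The log, its numbers, and the names `task_log`, `percent_change`, and `summarize` are made up; none of it comes from the METR report or reproduces its methodology.

```python
from statistics import mean

# Hypothetical log, invented for illustration; NOT data from the METR report.
# Each entry: (used_ai, clock_minutes, self_estimated_minutes)
task_log = [
    (False, 50, 50),
    (False, 70, 70),
    (True, 66, 45),
    (True, 78, 60),
]


def percent_change(baseline, observed):
    """Percent change of observed relative to baseline (positive = slower)."""
    return 100.0 * (observed - baseline) / baseline


def summarize(log):
    """Compare measured and self-estimated AI task times to the no-AI baseline."""
    base_clock = mean(clock for used_ai, clock, _ in log if not used_ai)
    ai_clock = mean(clock for used_ai, clock, _ in log if used_ai)
    ai_felt = mean(felt for used_ai, _, felt in log if used_ai)
    return {
        # What the clock says about tasks done with AI.
        "measured_change_pct": percent_change(base_clock, ai_clock),
        # What it felt like, judged against the same baseline.
        "perceived_change_pct": percent_change(base_clock, ai_felt),
    }


if __name__ == "__main__":
    print(summarize(task_log))
```

With this made-up log, the clock says the AI tasks took 20% longer while they felt 12.5% shorter. The point is only that the two numbers can have opposite signs, which is why I wouldn’t trust the self-estimate on its own.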

Those who know me know this matches my priors. I think AI is very useful in programming, but that its use is wildly over-sold. So I should be careful about being too quick to accept a finding that matches my biases. But I think it’s additional evidence that “AI all the things” is not a strategy for short-term wins. Integrating AI tools, in my opinion, is a useful investment for the future, and I do think most development groups should be experimenting with the technology so they can make informed decisions. But if you’re banking on big, measurable FY2025 improvements in some metric other than “use AI,” I think you’re kidding yourself.

I believe the code I write with AI assistants is better code.[1] I polish it more. The model helps me find more corner cases. I find and fix bugs earlier. And I hope that leads in the long run to more productivity. But I probably do spend more time on individual changes. So yeah, this finding rings true for me.


  1. This is distinct from “vibe-coding.” I mean code where I mostly write it myself, with AI as one of the tools I use.