This is probably the first article I have read on the economics of using AI for development. Vikram Sreekanti and Joseph E. Gonzalez talk about their experience of using Devin for a month.
When Devin works, the economics of using it are pretty good. You currently pay $500 for 250 ACUs, and the small tasks that Devin succeeded at took 1-5 ACUs ($2-10). Paying a few dollars to fix small bugs and save even just one hour per bug is a great tradeoff, one that we would make any day of the week. The issue is that there's a very narrow set of tasks that are long enough to require an engineer to context switch and short enough to fit in Devin's working window.
When Devin doesn't work, the economics start to look suspect. The 3 bigger tasks we tried averaged about 20 ACUs, and 2 of the 3 didn't yield usable results. While $40 would be extremely cheap for implementing these larger tasks, our (to be fair, limited) sample indicates that they consume a disproportionate number of ACUs; these tasks weren't 5-10x harder than the smaller ones that succeeded. More importantly, they often fail, so you get nothing for your $40.
The last statement is crucial. If you pay a developer $40 and they don’t deliver, you have the option to go back and say, “Hey, this isn’t what I wanted. I expected…”—and still get value for your money.
But with AI, if you spend $40 and it doesn't deliver, that money is gone. Poof!
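To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the figures quoted above ($2 per ACU from $500 for 250 ACUs, roughly 20 ACUs for a larger task) and treats the 1-in-3 success rate from their admittedly small sample as if it were representative, which it may well not be; the function and names are just for illustration.

```python
# Back-of-the-envelope cost model for delegating tasks to an AI agent.
# Assumes $2 per ACU (from $500 / 250 ACUs) and that failed attempts
# produce nothing usable, so their cost is spread over the successes.

COST_PER_ACU = 500 / 250  # $2

def cost_per_success(acus_per_attempt: float, success_rate: float) -> float:
    """Expected dollars spent per *usable* result."""
    cost_per_attempt = acus_per_attempt * COST_PER_ACU
    return cost_per_attempt / success_rate

# Small bug fixes: 1-5 ACUs, and in this sample they generally worked.
print(cost_per_success(acus_per_attempt=5, success_rate=1.0))   # 10.0

# Larger tasks: ~20 ACUs, with 1 of 3 yielding a usable result.
print(cost_per_success(acus_per_attempt=20, success_rate=1/3))  # 120.0
```

On those assumptions, a $40 task effectively costs closer to $120 per usable result once the failed runs are counted, and that is before the time spent reviewing the failures.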
That said, I don't want to get carried away. What if, a year from now, AI actually starts delivering?