Category: Artificial Intelligence

  • Economics of using AI for development

    This is probably the first article I have read on the economics of using AI for development. Vikram Sreekanti and Joseph E. Gonzalez talk about their experience of using Devin for a month.

    When Devin works, the economics of using it are pretty good. You currently pay $500 for 250 ACUs, and the small tasks that Devin succeeded at took 1-5 ACUs ($2-10). Paying a few dollars to fix small bugs and save even just one hour per-bug is a great tradeoff — one that we would make any day of the week. The issue is that there’s a very narrow set of tasks that are long enough to require an engineer to context switch and short enough to be in Devin’s working window.

    When Devin doesn’t work, the economics start to look suspect. The 3 bigger tasks we tried averaged about 20 ACUs and 2 of the 3 didn’t yield usable results. While $40 would be extremely cheap for implementing these larger tasks, our (to be fair, limited) sample indicates that these larger tasks consume a disproportional number of ACUs — these tasks weren’t 5-10x harder than the smaller ones that succeeded. More importantly, they often fail, so you get nothing for your $40.

    The last statement is crucial. If you pay a developer $40 and they don’t deliver, you have the option to go back and say, “Hey, this isn’t what I wanted. I expected…”—and still get value for your money.

    But with AI, if you spend $40 and it doesn’t deliver, then that money is gone. Poof!
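
    To put rough numbers on that, here is a minimal back-of-the-envelope sketch using the figures quoted above: $2 per ACU, roughly 20 ACUs per big task, and 1 of 3 big tasks yielding a usable result. The success rate comes from the authors’ tiny sample, so treat it as an assumption rather than a measurement.

    ```python
    # Expected cost of getting one usable result out of a "big" Devin task.
    # Numbers come from the quote above and the authors' small sample; they
    # are assumptions for illustration, not measured statistics.
    price_per_acu = 500 / 250        # $500 buys 250 ACUs -> $2 per ACU
    acus_per_big_task = 20           # the bigger tasks averaged ~20 ACUs
    success_rate = 1 / 3             # 1 of the 3 big tasks yielded usable results

    cost_per_attempt = price_per_acu * acus_per_big_task
    expected_cost_per_usable_result = cost_per_attempt / success_rate

    print(f"Cost per attempt: ${cost_per_attempt:.0f}")                                   # $40
    print(f"Expected cost per usable result: ${expected_cost_per_usable_result:.0f}")     # $120
    ```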

    That said, I don’t want to get carried away. What if, a year from now, AI actually starts delivering?

  • Model Autophagy Disorder

    An interesting read on Livescu.

    …when AI models generate things—text, images, sound—and then those generated products are used to train a subsequent model, the new model actually gets worse at generating images and texts. Over a few generations it can fail completely, producing only a string of gibberish or a single same image over and over again.

    And this is how AI goes ‘MAD’. Later in the article, the author describes a funny little analogy for how to distinguish rich data from poor data.
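
    A toy way to see the mechanism, as a minimal sketch rather than the study’s actual setup: fit a simple model to some data, sample from the fitted model, refit on those samples, and repeat. Even a one-dimensional Gaussian tends to collapse toward zero spread after enough generations.

    ```python
    # Toy illustration of model autophagy: each "generation" is trained only
    # on samples generated by the previous one. The estimated spread (sigma)
    # tends to drift toward zero, i.e. the model ends up producing the same
    # thing over and over. Purely illustrative; not the study's experiment.
    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_generations = 50, 500

    data = rng.normal(loc=0.0, scale=1.0, size=n_samples)  # the "real" data
    for gen in range(1, n_generations + 1):
        mu, sigma = data.mean(), data.std()        # "train" the next model
        data = rng.normal(mu, sigma, n_samples)    # its outputs become the next training set
        if gen % 100 == 0:
            print(f"generation {gen:3d}: sigma = {sigma:.4f}")
    ```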

  • Comparative advantage

    Noah Smith explains what comparative advantage is while arguing that we will still have jobs when AI takes over the world, with some caveats.

    Comparative advantage actually means “who can do a thing better relative to the other things they can do”. So for example, suppose I’m worse than everyone at everything, but I’m a little less bad at drawing portraits than I am at anything else. I don’t have any competitive advantages at all, but drawing portraits is my comparative advantage. 

    The key difference here is that everyone — every single person, every single AI, everyone — always has a comparative advantage at something!

    To help illustrate this fact, let’s look at a simple example. A couple of years ago, just as generative AI was getting big, I co-authored a blog post about the future of work with an OpenAI engineer named Roon. In that post, we gave an example illustrating how someone can get paid — and paid well — to do a job that the person hiring them would actually be better at doing:

    Imagine a venture capitalist (let’s call him “Marc”) who is an almost inhumanly fast typist. He’ll still hire a secretary to draft letters for him, though, because even if that secretary is a slower typist than him, Marc can generate more value using his time to do something other than drafting letters. So he ends up paying someone else to do something that he’s actually better at

    (In fact, we lifted this example from an econ textbook by Greg Mankiw, who in turn lifted it from Paul Samuelson.) 

    Note that in our example, Marc is better than his secretary at every single task that the company requires. He’s better at doing VC deals. And he’s also better at typing. But even though Marc is better at everything, he doesn’t end up doing everything himself! He ends up doing the thing that’s his comparative advantage — doing VC deals. And the secretary ends up doing the thing that’s his comparative advantage — typing. Each worker ends up doing the thing they’re best at relative to the other things they could be doing, rather than the thing they’re best at relative to other people.
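
    A quick numeric version of the Marc example, with made-up numbers: Marc is better at both tasks in absolute terms, but his opportunity cost per letter is vastly higher, so typing is still the secretary’s comparative advantage.

    ```python
    # Comparative advantage via opportunity cost. All figures are invented
    # purely for illustration.
    marc_deal_value_per_hour = 5000     # $ Marc generates per hour of VC work
    marc_letters_per_hour = 10          # letters Marc could type in that hour
    secretary_letters_per_hour = 5      # slower than Marc in absolute terms
    secretary_wage_per_hour = 40        # what the secretary's time is worth elsewhere

    # Opportunity cost of producing one letter:
    marc_cost_per_letter = marc_deal_value_per_hour / marc_letters_per_hour            # $500
    secretary_cost_per_letter = secretary_wage_per_hour / secretary_letters_per_hour   # $8

    print(f"Marc's cost per letter:      ${marc_cost_per_letter:.0f}")
    print(f"Secretary's cost per letter: ${secretary_cost_per_letter:.0f}")
    # Letters go to whoever gives up less to produce them -- the secretary --
    # even though Marc types faster.
    ```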

  • Better answers and right answers

    Benedict Evans talks about how AI keeps giving better answers, but still lags behind when it comes to giving the right answer.

    Here’s a practical example of the kind of thing that I do quite often, that I’d like to be able to automate. I asked ChatGPT 4o how many people were employed as elevator operators in the USA in 1980. The US Census collected this data and published it: the answer is 21,982

    First, I try the answer cold, and I get an answer that’s specific, unsourced, and wrong. Then I try helping it with the primary source, and I get a different wrong answer with a list of sources, that are indeed the US Census, and the first link goes to the correct PDF… but the number is still wrong. Hmm. Let’s try giving it the actual PDF? Nope. Explaining exactly where in the PDF to look? Nope. Asking it to browse the web? Nope, nope, nope…

    I faced this issue when I asked AI, “What is the composition of the Nifty 500 across large caps, mid caps, and small caps?” ChatGPT came close by finding the right document to refer to, but ended up picking the wrong value. I needed the right answer. As Benedict Evans calls it, this is a deterministic task, not a probabilistic one.

    The useful critique of my ‘elevator operator’ problem is not that I’m prompting it wrong or using the wrong version of the wrong model, but that I am in principle trying to use a non-deterministic system for a deterministic task. I’m trying to use an LLM as though it was SQL: it isn’t, and it’s bad at that.
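
    To make the SQL analogy concrete, here is the kind of deterministic lookup a database gives you and an LLM does not. The table and schema are hypothetical; the 21,982 figure is the Census number quoted above.

    ```python
    # A deterministic retrieval: same query, same answer, every time.
    # Table and schema are made up for illustration.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE occupations (year INTEGER, occupation TEXT, employed INTEGER)")
    conn.execute("INSERT INTO occupations VALUES (1980, 'Elevator operator', 21982)")

    (employed,) = conn.execute(
        "SELECT employed FROM occupations WHERE year = 1980 AND occupation = 'Elevator operator'"
    ).fetchone()
    print(employed)  # 21982, by construction; an LLM answering from its weights offers no such guarantee
    ```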

    But don’t write off AI so soon. Benedict Evans goes on to talk about how disruption happens.

    Part of the concept of ‘Disruption’ is that important new technologies tend to be bad at the things that matter to the previous generation of technology, but they do something else important instead. Asking if an LLM can do very specific and precise information retrieval might be like asking if an Apple II can match the uptime of a mainframe, or asking if you can build Photoshop inside Netscape. No, they can’t really do that, but that’s not the point and doesn’t mean they’re useless. They do something else, and that ‘something else’ matters more and pulls in all of the investment, innovation and company creation. Maybe, 20 years later, they can do the old thing too – maybe you can run a bank on PCs and build graphics software in a browser, eventually – but that’s not what matters at the beginning. They unlock something else.

  • AI ‘may’ not take away software jobs

    Dustin Ewers argues that AI will create more software jobs rather than take them away.

    AI tools create a significant productivity boost for developers. Different folks report different gains, but most people who try AI code generation recognize its ability to increase velocity. Many people think that means we’re going to need fewer developers, and our industry is going to slowly circle the drain.

    This view is based on a misunderstanding of why people pay for software. A business creates software because they think that it will give them some sort of economic advantage. The investment needs to pay for itself with interest. There are many software projects that would help a business, but businesses aren’t going to do them because the return on investment doesn’t make sense.

    When software development becomes more efficient, the ROI of any given software project increases, which unlocks more projects. That legacy modernization project that no one wants to tackle because it’s super costly. Now you can make AI do most of the work. That project now makes sense. That cool new software product idea that might be awesome but might also crash and burn. AI can make it cheaper for a business to roll the dice. Cheaper software means people are going to want more of it. More software means more jobs for increasingly efficient software developers.

    Economists call this Jevons Paradox.
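
    A toy sketch of that ROI argument, with made-up numbers: a project gets built only when its expected value clears the build cost, so cheaper development means more projects clear the bar.

    ```python
    # Jevons-style argument in miniature: lower the cost per project and more
    # candidate projects become worth doing. All numbers are invented.
    project_values = [5_000, 12_000, 30_000, 60_000, 150_000]  # expected value of each candidate ($)
    cost_today = 50_000      # assumed cost to build any one project today ($)
    cost_with_ai = 25_000    # assumed cost if AI roughly halves the effort ($)

    viable_today = [v for v in project_values if v > cost_today]
    viable_with_ai = [v for v in project_values if v > cost_with_ai]

    print(f"Projects worth building today:   {len(viable_today)}")    # 2
    print(f"Projects worth building with AI: {len(viable_with_ai)}")  # 3
    ```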

    This gives me hope.

    Bonus: I first learnt about Jevons Paradox while reading Kim Stanley Robinson’s The Ministry for the Future.

  • There’s a new mistake-maker in town

    An insightful article by Bruce Schneier on how humans have built guardrails to manage mistakes made by humans. But we are not equipped to manage the weird mistakes made by AI.

    Humanity is now rapidly integrating a wholly different kind of mistake-maker into society: AI. Technologies like large language models (LLMs) can perform many cognitive tasks traditionally fulfilled by humans, but they make plenty of mistakes. It seems ridiculous when chatbots tell you to eat rocks or add glue to pizza. But it’s not the frequency or severity of AI systems’ mistakes that differentiates them from human mistakes. It’s their weirdness. AI systems do not make mistakes in the same ways that humans do.

    Much of the friction—and risk—associated with our use of AI arise from that difference. We need to invent new security systems that adapt to these differences and prevent harm from AI mistakes.

  • Agentic AI

    Gary Marcus on AI agents.

    I do genuinely think we will all have our own AI agents, and companies will have armies of them. And they will be worth trillions, since eventually (no time soon) they will do a huge fraction of all human knowledge work, and maybe physical labor too. 

    But not this year (or next, or the one after that, and probably not this decade, except in narrow use cases). All that we will have this year are demos.

    Funny.

    And I am hoping it plays out the way Gary is describing it. I get to keep my job a little longer. And build a retirement corpus.