• Comprehension debt

    Jason Gorman explaining the challenge of comprehension debt with AI-generated code.

    When teams produce code faster than they can understand it, it creates what I’ve been calling “comprehension debt”. If the software gets used, then the odds are high that at some point that generated code will need to change. The “A.I.” boosters will say “We can just get the tool to do that”. And that might work maybe 70% of the time. 

    But those of us who’ve experimented a lot with using LLMs for code generation and modification know that there will be times when the tool just won’t be able to do it. 

    “Doom loops”, when we go round and round in circles trying to get an LLM, or a bunch of different LLMs, to fix a problem that it just doesn’t seem to be able to, are an everyday experience using this technology. Anyone claiming it doesn’t happen to them has either been extremely lucky, or is fibbing.

    It’s pretty much guaranteed that there will be many times when we have to edit the code ourselves. The “comprehension debt” is the extra time it’s going to take us to understand it first.

    And we’re sitting on a rapidly growing mountain of it.

    On a very similar note, Steve Krouse explains how vibe code is legacy code because nobody understands it.

  • Who can build the product?

    There’s an interesting discussion on Hacker News about the news that Boeing has started working on a 737 MAX replacement. My favorite comments follow.

    Comment by scrlk.

    an oil industry proverb: a healthy oil company has a geologist in charge, a mature one has an engineer in charge, a declining one has an accountant in charge, and a dying one has a lawyer in charge.

    Comment by nostrademons.

    It’ll be interesting to see if they still can design and build a new ground-up airplane design. The last all-new design was the 787, initiated in 2003 and launched in 2009, and its design was fraught with problems. Before then was the 777 in the early 90s (pre-McDonnell takeover), and the 757/767 in the early 80s.

    There’s a phenomena that often occurs with large organizations where once their markets mature, everybody who can build a product end-to-end leaves or gets forced out, leaving only people with highly specialized maintenance skillsets. The former group has no work to do, after all, so why should the company keep them around? But then if the market ecosystem shifts, and a new product is necessary, they no longer have the capacity to build ground-up new products. All those people have left, and won’t come anywhere near the company.

    Steve Jobs spoke eloquently about this phenomena in an old interview:

  • AI can complete tasks, not jobs. For now.

    Ethan Mollick reflecting on the recent report by OpenAI, which evaluates AI model performance on real-world, economically valuable tasks.

    Does that mean AI is ready to replace human jobs?

    No (at least not soon), because what was being measured was not jobs but tasks. Our jobs consist of many tasks. My job as a professor is not just one thing, it involves teaching, researching, writing, filling out annual reports, supporting my students, reading, administrative work and more. AI doing one or more of these tasks does not replace my entire job, it shifts what I do. And as long as AI is jagged in its abilities, and cannot substitute for all the complex work of human interaction, it cannot easily replace jobs as a whole…

    …and yet some of the tasks that AI can do right now have incredible value.

  • AI and radiologists

    Deena Mousa explaining why, even though radiology combines digital images, clear benchmarks, and repeatable tasks, replacing humans with AI is harder than it seems.

    First, while models beat humans on benchmarks, the standardized tests designed to measure AI performance, they struggle to replicate this performance in hospital conditions. Most tools can only diagnose abnormalities that are common in training data, and models often don’t work as well outside of their test conditions. Second, attempts to give models more tasks have run into legal hurdles: regulators and medical insurers so far are reluctant to approve or cover fully autonomous radiology models. Third, even when they do diagnose accurately, models replace only a small share of a radiologist’s job. Human radiologists spend a minority of their time on diagnostics and the majority on other activities, like talking to patients and fellow clinicians. 

    Now where have I heard this before? Oh yes, here.

    Coding can be a challenge, but I’ve never spent more than two weeks trying to figure out what is wrong with the code. Once you get the hang of the syntax, logic, and techniques, it’s a pretty straightforward process—most of the time. The real problems are usually centered around what the software is supposed to do. The hardest part about creating software is not writing code—it’s creating the requirements, and those software requirements are still defined by humans.

  • I don’t know

    Ibrahim Diallo sharing tips on how to lead a room full of experts.

    By definition, leading is knowing the way forward. But in reality, in a room full of experts, pretending to know everything makes you look like an idiot.

    Instead, “I don’t know, but let’s figure it out” becomes a superpower. It gives your experts permission to share uncertainty. It models intellectual humility. And it keeps the focus on moving forward rather than defending ego. It’s also an opportunity to let your experts shine.

    Saying “I don’t know” is truly a superpower. Every time I have said it, the person in front of me has excitedly shared all their knowledge with me.

  • AI and junior developers

    I read this post by Can Elma on how AI is helping senior developers but not junior developers. While the post has some interesting takes, there’s an even more interesting discussion of it on Hacker News. Two of my favourite comments follow.

    Comment by kaydub.

    Because juniors don’t know when they’re being taken down a rabbit hole. So they’ll let the LLM go too deep in its hallucinations.

    I have a Jr that was supposed to deploy a terraform module I built. This task has been hanging out for a while so I went to check in on them. They told me the problem they’re having and asked me to take a look.

    Their repo is a disaster, it’s very obvious claude took them down a rabbit hole just from looking. When I asked, “Hey, why is all this python in here? The module has it self contained” and they respond with “I don’t know, claude did that” affirming my assumptions.

    They lack the experience and they’re overly reliant on the LLM tools. Not just in the design and implementation phases but also for troubleshooting. And if you’re troubleshooting with something that’s hallucinating and you don’t know enough to know it’s hallucinating you’re in for a long ride.

    Meanwhile the LLM tools have taken away a lot of the type of work I hated doing. I can quickly tell when the LLM is going down a rabbit hole (in most cases at least) and prevent it from continuing. It’s kinda re-lit my passion for coding and building software. So that’s ended up in me producing more and giving better results.

    Comment by bentt.

    The best code I’ve written with an LLM has been where I architect it, I guide the LLM through the scaffolding and initial proofs of different components, and then I guide it through adding features. Along the way it makes mistakes and I guide it through fixing them. Then when it is slow, I profile and guide it through optimizations.

    So in the end, it’s code that I know very, very well. I could have written it but it would have taken me about 3x longer when all is said and done. Maybe longer. There are usually parts that have difficult functions but the inputs and outputs of those functions are testable so it doesn’t matter so much that you know every detail of the implementation, as long as it is validated.

    This is just not junior stuff.

  • Benefit of the AI bubble

    Faisal Hoque arguing that there are three bubbles in AI. He concludes his post by explaining the benefits of bubbles.

    Far from being a threat, the AI bubble might be the best thing that could happen to pragmatic adopters. Consider what speculative excess delivers: billions in venture capital funding R&D you’d never justify to your board; the world’s brightest minds abandoning stable careers to join AI startups, working on tools that you’ll eventually be able to use; infrastructure being built at a scale no rational actor would attempt, driving down future costs through overcapacity.

    While investors bet on which companies will dominate AI, you can cherry-pick proven tools at competitive prices. While speculators debate valuations, you will be implementing solutions with clear ROI. When the correction comes, you’ll also be able to benefit from fire-sale prices on enterprise tools, seasoned talent seeking stability, and battle-tested technologies that survived the shakeout.

    The dotcom bubble gave us broadband infrastructure and trained web developers. The AI bubble will leave behind GPU clusters and ML engineers. The smartest response isn’t to avoid the bubble or try to time investments in it perfectly. It is to let others take the capital risk while you harvest the operational benefits. The bubble isn’t your enemy. If you play your cards strategically, it can be a major benefactor.

  • Workslop

    This post on Harvard Business Review by Kate Niederhoffer, Gabriella Rosen Kellerman, Angela Lee, Alex Liebscher, Kristina Rapuano and Jeffrey T. Hancock explaining workslop.

    We define workslop as AI generated work content that masquerades as good work, but lacks the substance to meaningfully advance a given task.

    Here’s how this happens. As AI tools become more accessible, workers are increasingly able to quickly produce polished output: well-formatted slides, long, structured reports, seemingly articulate summaries of academic papers by non-experts, and usable code. But while some employees are using this ability to polish good work, others use it to create content that is actually unhelpful, incomplete, or missing crucial context about the project at hand. The insidious effect of workslop is that it shifts the burden of the work downstream, requiring the receiver to interpret, correct, or redo the work. In other words, it transfers the effort from creator to receiver.

    If you have ever experienced this, you might recall the feeling of confusion after opening such a document, followed by frustration—Wait, what is this exactly?—before you begin to wonder if the sender simply used AI to generate large blocks of text instead of thinking it through. If this sounds familiar, you have been workslopped.

  • Map

    Joshua Stevens has created the map that, I believe, would have been drawn if human civilisation had started in Australia.

  • AI and the next technological revolution

    Jerry Neumann comparing AI with previous revolutionary technologies like the microprocessor and containers (shipping containers, not software containers), and arguing that money will be made on the applications sitting on top of AI rather than on AI itself.

    This doesn’t mean AI can’t start the next technological revolution. It might, if experimentation becomes cheap, distributed and permissionless—like Wozniak cobbling together computers in his garage, Ford building his first internal combustion engine in his kitchen, or Trevithick building his high-pressure steam engine as soon as James Watt’s patents expired. When any would-be innovator can build and train an LLM on their laptop and put it to use in any way their imagination dictates, it might be the seed of the next big set of changes—something revolutionary rather than evolutionary. But until and unless that happens, there can be no irruption.
