naveegator.in

AGI. Are we there yet? Part 2

August 25, 2025

Dwarkesh Patel arguing why he doesn’t think AGI is right around the corner.

But the fundamental problem is that LLMs don’t get better over time the way a human would. The lack of continual learning is a huge huge problem. The LLM baseline at many tasks might be higher than an average human’s. But there’s no way to give a model high level feedback. You’re stuck with the abilities you get out of the box. You can keep messing around with the system prompt. In practice this just doesn’t produce anything even close to the kind of learning and improvement that human employees experience.

The reason humans are so useful is not mainly their raw intelligence. It’s their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.

How do you teach a kid to play a saxophone? You have her try to blow into one, listen to how it sounds, and adjust. Now imagine teaching saxophone this way instead: A student takes one attempt. The moment they make a mistake, you send them away and write detailed instructions about what went wrong. The next student reads your notes and tries to play Charlie Parker cold. When they fail, you refine the instructions for the next student.

This just wouldn’t work. No matter how well honed your prompt is, no kid is just going to learn how to play saxophone from just reading your instructions. But this is the only modality we as users have to ‘teach’ LLMs anything.
Who’s making money in AI right now?

August 24, 2025

Dave DeGraw ranting about his frustration with vibe-coded PRs and asking the most important question.

Is anyone making money on AI right now? I see a pipeline that looks like this:

- “AI” is applied to some specific, existing area, and a company spins up around it because it’s so much more “efficient”
- AI company gets funding from venture capitalists
- AI company gives funding to AI service providers such as OpenAI in the form of paying for usage credits
- AI company evaporates

This isn’t necessarily all that different than the existing VC pipeline, but the difference is that not even OpenAI is making money right now.

Ha!

Acknowledge and repair

August 23, 2025

Matheus Lima highlighting a lesser known—but important—skill for managers.

Let me tell you something that will happen after you become a manager: you’re going to mess up. A lot. You’ll give feedback that lands wrong and crushes someone’s confidence. You’ll make a decision that seems logical but turns out to be completely misguided. You’ll forget that important thing you promised to do for someone on your team. You’ll lose your temper in a meeting when you should have stayed calm.

The real question isn’t whether you’ll make mistakes; it’s what you do after.

You acknowledge and repair. I can personally vouch for this.
Software engineering vs traditional engineering disciplines

August 23, 2025

A comment from potatolicious on Hacker News about how AI has taken away software's luxury of deterministic expectations.

…I was trained as a classical engineer (mechanical), but pretty much just write code these days. But I did have a past life as a not-SWE.

Most classical engineering fields deal with probabilistic system components all of the time. In fact I’d go as far as to say that inability to deal with probabilistic components is disqualifying from many engineering endeavors.

Process engineers for example have to account for human error rates. On a given production line with humans in a loop, the operators will sometimes screw up. Designing systems to detect these errors (which are highly probabilistic!), mitigate them, and reduce the occurrence rates of such errors is a huge part of the job.

Likewise even for regular mechanical engineers, there are probabilistic variances in manufacturing tolerances. Your specs are always given with confidence intervals (this metal sheet is 1mm thick ± 0.05mm) because of this. All of the designs you work on specifically account for this (hence safety margins!). The ways in which these probabilities combine and interact is a serious field of study.

Software engineering is unlike traditional engineering disciplines in that for most of its lifetime it’s had the luxury of purely deterministic expectations. This is not true in nearly every other type of engineering.

If anything the advent of ML has introduced this element to software, and the ability to actually work with probabilistic outcomes is what separates those who are serious about this stuff vs. demoware hot air blowers.
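
To make the tolerance remark concrete, here is a minimal sketch of how per-part tolerances might combine when parts are stacked. It assumes four of the 1 mm ± 0.05 mm sheets from the comment laid on top of each other (the stack of four is my assumption, not something the comment says) and compares a worst-case sum with a root-sum-square estimate, which is only reasonable when the deviations are independent. The function name `stack_up` is made up for illustration.

```python
import math


def stack_up(nominals, tolerances):
    """Combine per-part tolerances for a simple linear stack.

    Returns (nominal total, worst-case tolerance, root-sum-square tolerance).
    The RSS figure assumes the part deviations are independent.
    """
    nominal = sum(nominals)
    worst_case = sum(tolerances)                        # every part off in the same direction
    rss = math.sqrt(sum(t * t for t in tolerances))     # statistical estimate
    return nominal, worst_case, rss


# Hypothetical stack: four sheets, each 1 mm thick +/- 0.05 mm.
nominal, worst, rss = stack_up([1.0] * 4, [0.05] * 4)
print(f"nominal {nominal:.2f} mm, worst case ±{worst:.2f} mm, RSS ±{rss:.2f} mm")
```

The worst-case number is what you design to when failure is unacceptable; the statistical number is tighter, and the gap between the two is where the safety-margin decisions mentioned in the comment come in.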
Reverse and forward engineering with AI

August 23, 2025

Birgitta Böckeler explaining how AI has changed reverse and forward engineering.

I say AI-accelerated reverse engineering and then AI-accelerated forward engineering, where the reverse engineering only includes a description of the application, not actually building it, so the forensics kind of. You do forensics on the existing application and the existing code to recreate a good description of what it does. Because we can now use generative AI in the forward engineering, there’s a new incentive for us to actually create this description in textual form in a lot of detail.

In the past, maybe we wouldn’t even do it at that level of detail. We would still, in the forward engineering, maybe have stories as the placeholder for a conversation because we want to build a new, fresh application. Now that we can use AI for the forward engineering, there is an incentive there to have very detailed descriptions. It maybe even changes the equation of cost benefit when we think about feature parity. That’s also one of the hypotheses. That with AI-accelerated reverse and forward engineering, feature parity might become less of a sticking point.
Replacing junior programmers

August 22, 2025

Matt Garman, the CEO of AWS, talking about why firing all your junior programmers for AI is a bad idea.

With Kiro, part of what we’ve done with an agentic-coding-first kind of mentality is that you actually start with a spec of the thing that you want to build, and then you work with the tool to go and build parts of that spec. As you’re vibe coding, it can automatically change parts of that spec, but you still have that spec as the core thing you can always go back to and change aspects of it or functions of it or whatever it is. And we have seen the light bulb go on, because one of the cool things about that is that you can actually guide more junior developers as to what great coding practices are and how we think about this.

[…]

I was at a leadership group and people were telling me, “We think that with AI we can replace all of our junior people in our company.” I was like, that’s one of the dumbest things I’ve ever heard. They’re probably the least expensive employees you have. They’re the most leaned into your AI tools. And how’s that going to work when you go 10 years into the future and you have no one that has built up or learned anything? So my view is you absolutely want to keep hiring kids out of college and teaching them the right ways to go build software and decompose problems and think about it, just as much as you ever have.

AGI. Are we there yet?

August 20, 2025

A very pessimistic take by Marcus Hutchins on the current state of AI. The author touches upon a variety of topics that I had previously read about only independently of one another.

A logical problem I previously used to test early LLMs was one called “The Wolf, The Goat, And The Cabbage”. The problem is simple. You’re walking with a wolf, a goat, and a cabbage. You come to a river which you need to cross. There is a small boat which only has enough space for you and one other item. If left unattended, the wolf will eat the goat, and the goat will eat the cabbage. How do you get all 3 safely across?

The correct answer is you take the goat across, leaving behind the wolf and the cabbage. You then return and fetch the cabbage, leaving the goat alone on the other side. Because the goat and cabbage cannot be left alone together, you take the goat back, leaving just the cabbage. Now, you can take the wolf across, leaving the wolf and the cabbage alone on the other side, finally returning to fetch the goat.

Any LLM could effortlessly answer this problem, because it has thousands of instances of the problem and the correct solution in its training data. But it was found that by simply swapping out one item but keeping the same constraints, the LLM would no longer be able to answer. Replacing the wolf with a lion, would result in the LLM going off the rails and just spewing a bunch of nonsense.

This made it clear the LLM was not actually thinking or reasoning through the problem, simply just regurgitating answers and explanations from its training data. Any human, knowing the answer to the original problem, could easily handle the wolf being swapped for a lion, or the cabbage for a lettuce. But LLMs, lacking reasoning, treated this as an entirely new problem.

Over time this issue was fixed. It could be that the LLM developers wrote algorithms to identify variants of the problem. It’s also possible that people posting different variants of the problem allowed the LLM to detect the core pattern, which all variants follow, allowing it to substitute words where needed.

This is when someone found you could just break the problem, and the LLM’s pattern matching along with it. Either by making it so none of the objects could be left unattended, or all of them could. In some variants there was no reason to cross the river, the boat doesn’t fit anyone, was actually a car, or has enough space to carry all the items at once. Humans, having actual logic and reasoning abilities could easily identify the broken versions of the problems and answer accordingly, but the LLMs would just output incoherent gibberish.

But of course, as more and more ways to disprove LLM reasoning were found, the developers just found ways to fix them. I strongly suspect these issues are not being fixed by any introduction of actual logic or reasoning, but by sub-models built to address specific problems. If this is the case, I’d argue we’re moving away from AGI and back towards building problem specific ML models, which is how “AI” has worked for decades.
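
For contrast, the puzzle itself is small enough to solve by plain search, with no pattern matching on the names involved. The sketch below is mine, not from the article: a breadth-first search where the items and the "cannot be left together" pairs are just inputs, so swapping the wolf for a lion or breaking the constraints changes nothing in the code. The function name `solve` and the pair-based encoding of the constraints are assumptions made for illustration, and the boat is fixed at one item per trip.

```python
from collections import deque


def solve(items, conflicts):
    """Breadth-first search over river-crossing states.

    items: names of the things to ferry across.
    conflicts: pairs (a, b) that cannot be left together unattended.
    Returns the shortest list of crossings (item name, or None for an
    empty boat), or None if the puzzle has no solution.
    """
    everything = frozenset(items)

    def safe(bank):
        # An unattended bank is safe if no conflicting pair is on it.
        return not any(a in bank and b in bank for a, b in conflicts)

    # State: (items still on the starting bank, is the ferryman there too?)
    start = (everything, True)
    goal = (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}

    while queue:
        (left, at_start), path = queue.popleft()
        if (left, at_start) == goal:
            return path
        here = left if at_start else everything - left
        # Cross with an empty boat (None) or with one item from this bank.
        for cargo in [None, *sorted(here)]:
            new_left = set(left)
            if cargo is not None:
                if at_start:
                    new_left.remove(cargo)
                else:
                    new_left.add(cargo)
            new_left = frozenset(new_left)
            # The bank the ferryman just left is now unattended.
            unattended = new_left if at_start else everything - new_left
            state = (new_left, not at_start)
            if safe(unattended) and state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo]))
    return None


rules = [("wolf", "goat"), ("goat", "cabbage")]
print(solve(["wolf", "goat", "cabbage"], rules))
# A "broken" variant where every pair conflicts: the search reports no solution.
print(solve(["wolf", "goat", "cabbage"], rules + [("wolf", "cabbage")]))
```

Renaming "wolf" to "lion" in both the item list and the rules gives a solution of the same length, which is the kind of substitution the article says tripped up early LLMs.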

Bonus: Check the Wikipedia page of Marcus Hutchins.

Inevitabilism

August 18, 2025

Tom Renner explaining inevitabilism.

People advancing an inevitabilist world view state that the future they perceive will inevitably come to pass. It follows, relatively straightforwardly, that the only sensible way to respond to this is to prepare as best you can for that future.

This is a fantastic framing method. Anyone who sees the future differently to you can be brushed aside as “ignoring reality”, and the only conversations worth engaging are those that already accept your premise.

“We are entering a world where we will learn to coexist with AI, not as its masters, but as its collaborators.” – Mark Zuckerberg

“AI is the new electricity.” – Andrew Ng

“AI will not replace humans, but those who use AI will replace those who don’t.” – Ginni Rometty

These are some big names in the tech world, all framing the conversation in a very specific way. Rather than “is this the future you want?”, the question is instead “how will you adapt to this inevitable future?”. Note also the threatening tone present, a healthy psychological undercurrent encouraging you to go with the flow, because you’d otherwise be messing with scary powers way beyond your understanding.
Blogging is a superpower

August 18, 2025

Simon Willison talking to Corey Quinn on AI’s Security Crisis. During the podcast Simon touches upon how his frequent blogging is the reason he has become valuable in the AI space. The bold emphasis is added by me.

So I’m a blogger, right? My blog’s like 22 years old now, and having a blog is a superpower because nobody else does it, right?

Those of us who write frequently online are vanishing, right? Everyone else moved to LinkedIn posts or tweets or whatever. And the impact that you can have from a blog entry is so much higher than that. You’ve got more space. It lives on your own domain. You get to stay in complete control of your destiny.

And so at the moment, I’m blogging two or three things a day, and a lot of these are very short form. It’s a link to something and a couple of paragraphs about why I think that thing’s interesting. A couple of times a week, I’ll post a long form blog entry, the amount of influence you can have on the world if you write frequently about it.

I get invited to, like, dinners at weird mansions in Silicon Valley to talk about AI because I have a blog. It doesn’t matter how many people read it, it matters the quality of the people that read it, right? If you are active in a space and you have a hundred readers, but those hundred readers work for the companies that are influential in that space, that’s incredibly valuable.

So yeah, I feel like that’s really my ultimate sort of trick right now. My life hack is I blog and people don’t blog. They should blog. It’s good for you.

Eight years as shareholder of Pidilite Industries

August 17, 2025

Pidilite remains a key investment in my equity portfolio. Slowly and steadily, I have increased my investment in it over the last eight years (Figure 1). New investments are paused for now because I have other financial commitments, but I will resume as soon as those are taken care of.

Figure 1