You can basically hand it a design, one that might take a FE engineer anywhere from a day to a week to complete, and Codex/Claude will have it coded up in 30 seconds. It might need some tweaks, but it's 80% complete on that first try. I remember stumbling over graphing and charting libraries; it could take weeks to become familiar with all the different components and APIs. Now you can seemingly just tell Codex to use this data with this charting library and it'll make it. All you have to do is review the code. Things have certainly changed.
I figure it takes me a week to turn the output of AI into acceptable code. Sure, there's a lot of code in 30 seconds, but it shouldn't pass code review (not even the AI's own review).
For now. Claude is worse than we are at programming, but it's improving much faster than I am. Opus 4.6 is incredible compared to previous models.
How long before those lines cross? Intuitively it feels like we have about 2-3 years before Claude is better at writing code than most, or all, humans.
I keep seeing this: the "for now" comments, the claims of how much better it's getting with each model.
I don't see it in practice though.
The fundamental problem hasn't changed: these things are not reasoning. They aren't problem solving.
They're pattern matching. That gives the illusion of usefulness for coding when your problem is very similar to existing ones, but it falls apart as soon as you need any sort of depth or novelty.
I haven't seen any research or theories on how to address this fundamental limitation.
The pattern matching turns out to be very useful for many classes of problems, such as translating speech into a structured JSON format, or OCR, but it isn't particularly useful for reasoning problems like math or coding (non-trivial problems, of course).
I'm pretty excited about the applications of AI overall and its potential to reduce human drudgery across many fields; I just think generating code in response to prompts is a poor choice of an LLM application.
Have you actually tried the latest agentic coding models?
Yesterday I asked Claude to implement a working web-based email client from scratch in Rust, one that can talk to a JMAP-based mail server. It did. It took about 20 minutes. The first version had a few bugs - it was polling for mail instead of streaming emails in, for example. But after prompting it to fix some obvious bugs, I now have a working email client.
It's missing lots of important features - it doesn't render HTML emails correctly, and the UI looks incredibly basic. But it wrote the whole thing in 2.5k lines of Rust from scratch, and it works.
This wasn't possible at all a couple of years ago. Back then I couldn't get ChatGPT to port a single source file from Rust to TypeScript without it running out of context space and introducing subtle bugs. And it was rubbish at Rust - it would introduce borrow checker errors and then get stuck, trying and failing to get the code to compile. Now Claude can write a whole web-based email client in Rust from scratch, no worries. I did need to manually point out some bugs - Claude didn't test its email client on its own. There's room for improvement for sure. But the progress is shocking.
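For anyone unfamiliar with JMAP, the polling-vs-streaming bug mentioned above is a real distinction in the protocol: a naive client re-sends an `Email/query` on a timer, while a proper one listens on the session's `eventSourceUrl` (Server-Sent Events, per RFC 8620) and only re-queries when the server pushes a state change. A minimal sketch of the query body involved (the helper name `email_query_request` is mine, not from the parent's client):

```python
import json

# Capability URNs from RFC 8620 (JMAP core) and RFC 8621 (JMAP mail).
JMAP_CORE = "urn:ietf:params:jmap:core"
JMAP_MAIL = "urn:ietf:params:jmap:mail"

def email_query_request(account_id: str, limit: int = 20) -> dict:
    """Build a JMAP request body asking for the newest message ids.

    A polling client POSTs this to the session's apiUrl on a timer.
    A streaming client instead subscribes to the session's
    eventSourceUrl and only re-sends this query when the server
    pushes an Email state change.
    """
    return {
        "using": [JMAP_CORE, JMAP_MAIL],
        "methodCalls": [
            # Each method call is a [name, arguments, client-id] triple.
            ["Email/query", {
                "accountId": account_id,
                "sort": [{"property": "receivedAt", "isAscending": False}],
                "limit": limit,
            }, "q0"],
        ],
    }

body = email_query_request("acct-1")
print(json.dumps(body, indent=2))
```

Either way the request is the same; the difference is purely in when the client decides to send it, which is presumably why the model's first draft got it wrong.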
I don't know how anyone who's actually pushed these models can claim they haven't improved much. They're light-years ahead of where they were a few years ago. Have you actually tried them?
Honestly, I really did do this for a while, mostly in response to comments like this, with some degree of excitement.
I've been disappointed every time.
I do use LLMs for summarization and as "a better Google", and I'm constantly confronted with how inaccurate they are.
I haven't tried with code in the past couple months because to be completely honest, I just don't care.
I enjoy my craft, I enjoy puzzling and thinking through better ways of doing things, I like being confronted with a tedious task because it pushes me towards finding more optimal approaches.
I haven't seen any research that justifies the use of LLMs for code generation, even in the short term, and plenty that supports my concerns about mid to long term impact on quality and skills.
It is certainly already better than most humans, even better than most people who occasionally code. The bar is already quite high, I'd say. You have to be decent in your niche to outcompete frontier LLM agents in a meaningful way.
I'm only allowed 4.5 at work where I do this (likely to change soon, but bureaucracy...). Still, the resulting code is not at the level I expect.
I told my boss (not fully seriously) that we should ban anyone with less than 5 years of experience from using the AI, so they learn to write and recognize good code.
Honestly, you could just come up with a basic wireframe in any design software (MS Paint would work) plus a screenshot of a website with a design you like, tell it "apply the aesthetic from the website in this screenshot to the wireframe", and it would probably get 80% (probably more) of the way there. Something that would have taken me more than a day in the past.
I've been in web design since images were first introduced to browsers and modern designs for the majority of sites are more templated than ever. AI can already generate inspiration, prototypes and designs that go a long way to matching these, then juice them with transitions/animations or whatever else you might want.
The other day I tested an AI by giving it a folder of images, each named to describe its content/use/proportions (e.g., drone-overview-hero-landscape.jpg), told it which site it was redesigning, and it did a very serviceable job, at least on par with a cheap designer. On the first run, in a few seconds, with a very basic prompt. Obviously, with a different AI that could understand the image contents, it could skip the naming step easily enough.
I have never once seen this actually work in a way that produces a product I would use. People keep claiming these one-shot (or nearly one-shot) successes, but in the meantime I ask it to modify a simple CSS rule and it rewrites the entire file, breaks the site, and then can't figure out what it did wrong.
It's kind of telling that the number of apps on Apple's App Store has been decreasing in recent years. Same on the Android store. Where are the successful insta-apps? I really don't believe it's happening.
I've recently tried using all of the popular LLMs to generate DSP code in C++ and it's utterly terrible at it, to the point that it almost never even makes it through compilation and linking.
Can you show me the library of apps you've launched in the last few years? Surely you've made at least a few million in revenue with the ease with which you are able to launch products.
There's a really painful Dunning-Kruger process with LLMs, coupled with brutal confirmation bias that seems to have the industry and many intelligent developers totally hoodwinked.
I went through it too. I'm pretty embarrassed at the AI slop I dumped on my team, thinking the whole time how amazingly productive I was being.
I'm back to writing code by hand now. Of course I use tools to accelerate development, but it's classic stuff like macros and good code completion.
Sure, an LLM can vomit up a form faster than I can type (well, sometimes; the devil is always in the details), but it completely falls apart when trying to do anything the least bit interesting or novel.
Absolutely. I also think there's a huge number of wannabe developers who don't have the patience to actually learn development. Those people desperately want this AI development dream to be true so they pretend and convince themselves that it is. They talk about how well it works on internet forums, but you ask for the product and it's crickets. It's all wishful thinking.
The number of non-technical people in my orbit that could successfully pull up Claude code and one shot a basic todo app is zero. They couldn’t do it before and won’t be able to now.
You go to ChatGPT and say "produce a detailed prompt that will create a functioning todo app", then put that output into Claude Code, and you now have a todo app.
This is still a stumbling block for a lot of people. Plenty of people could've found an answer to a problem they had if they had just googled it, but they never did. Or they did, but they googled something weird and gave up. AI use is absolutely going to be similar to that.
Maybe I’m biased working in insurance software, but I don’t get the feeling much programming happens where the code can be completely stochastically generated, never be reviewed, and have that be okay with users/customers/governments/etc.
Even if all sandboxing is done right, programs will be depended on to store data correctly and to show correct outputs.
Insurance is complicated, not frequently discussed online, and all code depends on a ton of domain knowledge and proprietary information.
I'm in a similar domain, and the AI is like a very energetic intern. For me to get a good result requires a prompt so clear and detailed that I could practically write an expression to turn it into code. Even then, after a little back and forth it loses the plot and starts producing gibberish.
But in simpler domains, or ones with lots of examples online (for instance, I had an image recognition problem that looked a lot like a typical machine learning contest), it really can rattle off in seconds work that would take a mid-level engineer weeks or months, and it's often higher quality.
You don't need to draw the line between tech experts and the tech-naive. Plenty of people have the capability but not the time or discipline to execute such a thing by hand.
Not really. What the FE engineer will produce in a week will be vastly different from what the AI will produce. That's like saying restaurants are dead because it takes a minute to heat up a microwave meal.
It does make the lowest common denominator easier to reach though. By which I mean your local takeaway shop can have a professional looking website for next to nothing, where before they just wouldn't have had one at all.
I think exceptional work, AI tools or not, still takes exceptional people with experience and skill. But I do feel like a certain level of access to technology has been unlocked for people smart enough, but without the time or tools to dive into the real industry's tools (Figma, code, data tools, etc.).
The local takeaway shop could have had a professional looking website for years with Wix, Squarespace, etc. There are restaurant specific solutions as well. Any of these would be better than vibe coding for a non-tech person. No-code has existed for years and there hasn't been a flood of bespoke software coming from end users. I find it hard to believe that vibe-coding is easier or more intuitive than GUI tooling designed for non-experts...
I think the idea that LLMs will usher in some new era where everyone and their mom are building software is a fantasy.
I more or less agree, specifically on the angle that no-code has existed for years, yet non-technical people still aren't executing on technical products. But I don't think vibe-coding is where we'll see this happen; it will be in chat interfaces or GUIs. As the "scaffolding" or "harnesses" mature, someone will be able to just type what they want and get a deployed product within the day, after some back and forth.
I am usually a bit of an AI skeptic, but I can already see that this is within the realm of possibility, even if models stopped improving today. I think we underestimate how technical things like Wix or Squarespace are to a non-technical person, but many are skilled business people who could probably work with an LLM agent to get a simple product together.
People keep saying code was never the real skill of an engineer, but rather solving business logic issues and codifying them. Well people running a business can probably do that too, and it would be interesting to see them work with an LLM to produce a product.
> I think we underestimate how technical things like Wix or Squarespace are to a non-technical person, but many are skilled business people who could probably work with an LLM agent to get a simple product together.
In the same vein, I think you underestimate how much "hidden" technical knowledge is needed to actually build software that works most of the time (I'm not asking for a bug-free program). To design such a program with current LLM coding agents you need to be, at the very least, a power user, probably a very powerful one, both in the domain of the program you want to build and in the domain of general software.
Maybe things will improve with LLMs and agents, and "make it work" will be enough for the agent to create tests, exercise the program extensively, find and squash bugs, and do all the extra work needed; who knows. But we are definitely not there today.
Yeah, I've thought for a while that the ideal interface for non-tech users would be these no-code tools but with an AI interface. It's kind of dumb to generate code they can't make sense of, with no guardrails, etc.
Wouldn’t we have more restaurants if there were no microwave ovens? But the microwave oven also gave rise to a large frozen food industry. Overall, more industrialization.