The problem, unfortunately, is the scale. It's always scale. Humans make all the kinds of mistakes that we ascribe to LLMs, but LLMs can make them much faster and at much larger scale.
Models have gotten ridiculously better, they really have, but the scale has increased too, and I don't think we're ready to deal with the onslaught.
Scale is very different, but I wonder if human trust isn't the real issue. As a group, we trust technology too much: we not only expect perfection, we assume it. That might be because the machines output confident-sounding answers and humans default to treating confidence as a proxy for accuracy, but I think there's another level where people just blindly trust machines because they're so used to using them for algorithms that tend to give correct responses.
Even before LLMs were part of the public discourse, I would have businesses ask about using AI instead of building some algorithm manually, and when I asked whether they had considered the failure rate, they would either return blank stares or say failures would count as bugs. To them, AI meant an algorithm just as good as one built to handle every edge case in the business logic, but easier and faster to implement.
We can generally recognize when AI is off in our own area of expertise, but some AI variant of Gell-Mann Amnesia is at play that leads us to go right back to trusting AI when it gives outputs in areas where we are novices.
I find the concept of LLM "brain surgery" fascinating, precisely because of how opaque the network is. One of the first things I did back when llama.cpp first got vision model support was hack the code to zero out (or otherwise modify) random numbers in the image embedding generated by the projector and then ask the LLM to describe the image. It was absolutely fascinating.
It would go from a normal description of the item in the picture to suddenly seeing people clapping in the background that were not there, or making up some other stuff. I kinda stopped after a while, but I should pick that back up and do a more coherent experiment to see if I can find any correlation between vector dimensions and "meaning."
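The actual experiment patched llama.cpp's C++ where the projector emits the image embedding, but the core idea can be sketched in a few lines of Python (the function name and parameters here are my own, purely illustrative):

```python
import random

def perturb_embedding(embedding, fraction=0.1, seed=None):
    """Zero out a random fraction of an embedding vector's dimensions,
    mimicking the llama.cpp hack described above. Illustrative only --
    the real version modified the projector output in-place."""
    rng = random.Random(seed)
    out = list(embedding)
    k = max(1, int(len(out) * fraction))  # how many dims to clobber
    for i in rng.sample(range(len(out)), k):
        out[i] = 0.0
    return out

# Toy 8-dim "embedding"; real projector outputs are far larger.
emb = [0.25, -0.1, 0.9, 0.33, -0.7, 0.05, 0.6, -0.4]
perturbed = perturb_embedding(emb, fraction=0.25, seed=42)
print(sum(1 for a, b in zip(emb, perturbed) if a != b))  # 2 dims zeroed
```

Feeding the perturbed embedding back to the LLM and diffing its description against the unperturbed run is what surfaced the hallucinated details.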
Maybe not, but in the past some here have pointed out that the blog itself is the product being promoted.
Even in this thread alone (https://news.ycombinator.com/item?id=47314929) some commenters are clearly annoyed with the way AI is being shoved into every place they don't want it.
I don't care, but I can see why many here are getting tired of it.
Heck yeah! Love the VisiData shoutout. Echoing other people's desire for a web UI, mostly so I don't have to be the sole Maintainer of the Truth as the only resident household technomancer.
EDIT: alternatively, exposing the data/functionality via MCP or similar would allow me to connect this to an agent using Home Assistant Voice, so anybody in the house could ask for changes or add new information.
This is super interesting. I do have a GitHub issue for LLM-powered data entry: "Add a landscaping project to do the backyard. Still ideating, thinking a budget of $40k."
Funny enough, Saul and I recently hacked on getting visidata's Ibis integration updated, so you can use visidata for poking around databases of any size, really. You might like that, but visidata also has non-Ibis support for SQLite, I believe.
Maybe your point was that, but that's not the point the person you replied to was addressing. Nobody was arguing about the specific definition of "modern."
The original commenter made a very clear claim: that the most recent "peak" system was the Xbox, which was discontinued in 2005, and that everything after that has been a rehash.
If the pace of quality titles is such that people have to fish through multiple decades to assemble what they believe is a convincing list of titles to compete with the OG Xbox, that is an indication that yes, even with more games coming out, fewer good games are coming out.
Also, unlike OpenAI's implicit caching, Anthropic's prompt caching is explicit (you set up to 4 cache "breakpoints"), meaning that if you don't implement caching, you don't benefit from it.
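For anyone who hasn't used it: a breakpoint is set by attaching `cache_control` to a content block in the request, and everything up to that block becomes cacheable. A rough sketch of the payload shape, as a plain dict (the model name and document text are placeholders, and no request is actually sent):

```python
# Stand-in for a large document you reuse across many requests --
# the kind of prefix worth caching.
long_context = "<many thousands of tokens of reference material>"

payload = {
    "model": "claude-model-placeholder",  # placeholder, not a real model id
    "max_tokens": 1024,
    "system": [
        {"type": "text", "text": "You are a helpful assistant."},
        {
            "type": "text",
            "text": long_context,
            # Explicit breakpoint: the prefix up to and including this
            # block is cached for subsequent requests.
            "cache_control": {"type": "ephemeral"},
        },
    ],
    "messages": [{"role": "user", "content": "Summarize the document."}],
}

# Anthropic allows at most four such breakpoints per request.
breakpoints = [b for b in payload["system"] if "cache_control" in b]
assert len(breakpoints) <= 4
```

Forget the `cache_control` markers and the request is valid but nothing gets cached, which is exactly the "no benefit unless you implement it" point above.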