One of my predictions for the year is that there will be an AQ Khan of AIs. But in one way, LLM proliferation is already done. I kind of knew this early last year when I downloaded Vicuna - back when everybody was talking about Vicuna, and I had it answering questions from my Mac. It has been a while and I’m kind of back at it.
I say kinda because I’m horrible at marketing myself and when I halfway know something and get my hands dirty, I say that I know before I think I really know. This is the normal contingent thinking of working with open source software and knowing multiple packages that claim to do the same thing. I guess that’s why we’re called ‘developers’ and not ‘productioners’.
I had a great meeting via Lunchclub, which I haven’t used in 3 years, with Mike L. He’s ebullient and I almost couldn’t get in a word edgewise, but we finally found our rhythm. The biggest idea out of many, or the most sticky idea so far, is that of AI overreach.
The Broligarchs are shooting for domination and lifting themselves and their ilk over the dosh point by exploiting the retail market. We are already accepting, in these AI systems which claim to be reaching for AGI and then scaling to super intelligence, a host of ridiculous hallucinations. The turnaround time for finding the sort of bugs that would get any actual human shamed and cancelled is down to a couple of days. The bias of reinforcement learning is easily detected by academics, especially professors at better universities who can instantly tell when their students cheat. So what if Sora can show me Will Smith eating Cheerios? Is that really what we need these for, as crappy as they are?
I’ve been using ChatGPT and love the app. I hate coding copilots. I’m happy with tools like ruff. I don’t want the machine to tell me what it thinks I meant to say. But I do use Chat like a search engine that speaks English. Most of the time, I am moderately convinced, but I know better than to ask it analytical questions.
Duck Update
Now that I am just now using ollama, I’m ready to do that integration with DuckDB and see what that yields. A simple NLP ability might get me an automation that takes non-human-readable field names or code values and makes it all human readable. This is a huge problem in the database migration world, and just that alone would be very valuable. I definitely want to play in the realm of principal component analysis and make slowly changing dimensions into smartly changing dimensions. I also want to unlock and unleash large customer databases with alternative rollups.
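Here is a minimal sketch of that field-renaming idea, under stated assumptions. It does not call ollama or DuckDB directly. It only builds a standard `CREATE OR REPLACE VIEW` statement that could then be handed to a DuckDB connection. The `GLOSSARY` entries, the `humanize` helper, and the `readable_view_sql` function are all hypothetical names invented for illustration. The `describe` parameter is the spot where a call to a local model could be swapped in for the simple glossary lookup.

```python
import re
from typing import Callable, Dict, Iterable, Optional

# Hypothetical abbreviation glossary. A real one would come from the
# client's data dictionary, or be proposed by a local LLM and reviewed.
GLOSSARY: Dict[str, str] = {
    "cust": "customer",
    "nm": "name",
    "amt": "amount",
    "dt": "date",
    "ord": "order",
    "qty": "quantity",
}


def humanize(field: str, glossary: Dict[str, str] = GLOSSARY) -> str:
    """Expand known abbreviations in a cryptic field name and title-case it."""
    tokens = [t for t in re.split(r"[_\W]+", field.lower()) if t]
    return " ".join(glossary.get(t, t) for t in tokens).title()


def readable_view_sql(
    table: str,
    fields: Iterable[str],
    describe: Callable[[str], str] = humanize,
    view: Optional[str] = None,
) -> str:
    """Build SQL for a view that aliases every field to a readable name.

    `describe` is a stand-in for whatever produces the readable label:
    the glossary lookup above, or an assumed wrapper around a local model.
    """
    view = view or f"{table}_readable"
    cols = ",\n  ".join(f'"{f}" AS "{describe(f)}"' for f in fields)
    return f'CREATE OR REPLACE VIEW "{view}" AS\nSELECT\n  {cols}\nFROM "{table}";'


if __name__ == "__main__":
    print(readable_view_sql("orders", ["CUST_NM", "ORD_DT", "ORD_QTY"]))
```

Keeping the original table untouched and putting the readable names in a view means the mapping can be revised as often as needed without migrating any data.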
I still don’t fully understand the client server model with regard to DuckDB and MotherDuck and the options of materialization available between those two machines and the dynamic state of an Iceberg file. And of course AWS has just thrown a massive curveball with S3 Tables. But the feature set of DuckDB is everything I ever dreamed that Mondrian might have been.
I’m working on a business model that is value-based pricing for DuckDB-based OLAP conversions. I think I can take a big bite out of Snowflake, and just about everything downscale of 10TB analytical applications. I think I can write (or get written) a community adapter for Dodeca, now that I’ve seen what DuckDB can do with gsheets. I’m grinning like Dozer. Still, I have to think about Rill and Evidence and evaluate them. The mass of stuff written for iPython is dauntingly deep and wide.
Small Business
Still, what bugs me is the amount of metadata and content that has been vacuumed up by every Google equivalent, i.e. the industrial scrapers of social media and everything else. They are leapfrogging over SMB, and that’s where all the interesting fun is.
For example, here in SoCal, I have been using Dreamhost for over 20 years. I admit that every year I get a new crazy idea, buy a new domain, and then forget what I was thinking when it comes time to renew it. But what’s wonderful about Dreamhost is that they have some Minio(?) equivalent of S3 object storage now available for a lower cost than AWS S3.
So what I’ve done for the big corporates, I can do for smart small businesses with open source tools that I package well. That doesn’t need AGI or super intelligence, just some level of expertise that levels the ordinary guy up a notch. I think I can do that better than what, say, Tumblr or Wordpress offered when they were the go-to providers of do-it-yourself web publishing.
SMB should be able to work smarter. So I’m thinking - what is an expert system for a chain of restaurants? What is an expert system for optometry? What is an expert system for a car dealership? Let’s remember the Moneyball story. There is a world of higher dimensional analysis that never got its OLAP and we can deliver that without the Broligarchs. Or maybe that’s wishful thinking on my part. I’m still going to build it. We’ll see who comes.
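To make the expert-system question concrete, here is a minimal sketch of forward chaining, the classic rule-engine technique that starts from known facts and keeps firing rules until nothing new can be concluded. The restaurant rules and fact names below are hypothetical, invented purely for illustration. A real system would encode what an experienced operator actually knows.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (set of conditions, conclusion): if every condition is a
# known fact, the conclusion becomes a known fact too.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical rules for a restaurant chain, invented for illustration.
RULES: List[Rule] = [
    (frozenset({"friday", "rain_forecast"}), "expect_delivery_spike"),
    (frozenset({"expect_delivery_spike"}), "add_delivery_driver"),
    (frozenset({"expect_delivery_spike", "low_packaging_stock"}), "reorder_packaging"),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Derive every fact reachable from `facts` by repeatedly applying `rules`."""
    known = set(facts)
    changed = True
    while changed:  # stop once a full pass adds nothing new
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    derived = forward_chain({"friday", "rain_forecast"}, RULES)
    print(sorted(derived))
```

Backward chaining would run the same rules in the other direction: start from a goal such as `reorder_packaging` and work back to the facts that would have to be true for it to hold.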
The other aspect of my direction has to do with the lame quality of the metadata that is vacuumed up in the first place. That’s some tech I’d like to seed and sell. More on that next time…
Prompts for today:
In the 80s, AI research focused on so-called 'expert systems' built on backward chaining and forward chaining. Explain.
How does backward or forward chaining relate to reinforcement learning?