Kermadec is an ocean deep in rough proximity to New Zealand. The aim of the Kermadec project is to make social media, specifically blogging, easier for journalists and researchers who want to publish small datasets for readers to consume online.
The aim is to provide more depth of data than people who are not data scientists generally deal with, in the context of a Small Data paradigm. It’s something I’ve always wanted to do. The anchor of this thought process is outlined in the Small Data Manifesto by the good people over at DuckDB, which is with MotherDuck, my go-to database.
For the first time in a very long time, I am not thinking about big data at scale. I realize this is no longer a place for lowly developers like me. On the other hand, it is very well within my reach to have a workstation with 64GB of RAM and a 2TB SSD. This is as big as ‘big data’ was just before the invention of horizontal scale-out databases. I seem to recall companies working in that space easily getting 5,000 customers worldwide who would pay tens of thousands of dollars for 5 seats and a server, for data volumes easily handled by DuckDB today. Especially in the realm of HUMINT and trusted sources like journalists and researchers, that’s more than enough of a query space. If I’m wrong, then I’m wrong. But isn’t it miraculous how well millions of people around the world are making decisions on spreadsheets, without any AI whatsoever?
I aim to master some sort of secure data delivery system (maybe blockchain-based and immutable) that can be attached to a Substack essay as easily as we now upload video. I want to make it trusted. I’m sure the guys at Bellingcat have suggestions. Basically, I am conceding the text and video part of the information decision-space to social media. But social media doesn’t do data. That’s the problem I’m trying to solve.
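One small piece of a trusted delivery system could be a tamper-evident manifest that travels with the dataset. What follows is only a minimal sketch, assuming Python and nothing more than a SHA-256 digest. It is not a blockchain and not a finished design, and the function names and the sample CSV are made up for illustration.

```python
import hashlib
import json


def build_manifest(name: str, data: bytes) -> dict:
    """Describe a dataset by its size and SHA-256 digest (hypothetical format)."""
    return {
        "name": name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def verify(data: bytes, manifest: dict) -> bool:
    """True only if the bytes match the size and digest recorded in the manifest."""
    return (
        len(data) == manifest["size_bytes"]
        and hashlib.sha256(data).hexdigest() == manifest["sha256"]
    )


# Hypothetical dataset: a tiny CSV of the kind that might accompany an essay.
csv_bytes = b"city,year,count\nAuckland,2023,41\nWellington,2023,17\n"
manifest = build_manifest("example.csv", csv_bytes)

print(json.dumps(manifest, indent=2))

# The untouched file verifies; a copy with one altered figure does not.
print(verify(csv_bytes, manifest))
print(verify(csv_bytes.replace(b"41", b"14"), manifest))
```

A reader who has both the file and the manifest can rerun the check and see whether the numbers were altered after publication. The digest says nothing about who published the data, so a real system would still need signatures or some other way to establish the source.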
I also want to add the following:
This interview with Marc Andreessen confirms something I’ve learned from other sources about the scope of censorship made possible by a confluence of interests and by the technology we talk about as ‘big data at scale’. Just as construction technology in the era of Robert Moses allowed our experts to plan cities and exercise eminent domain in the mid-20th century, digital technology in the era of Mark Zuckerberg has made this decade’s social media all about digital high-rises. Can anyone possibly believe that Zuckerberg gets his political information from reading Facebook? Big data at scale is discount information for the masses, not high-quality information by experts for experts. Big data at scale is designed to be corralled and massaged, but not curated. Kermadec aims to be curated and small.