Southwall Update
Where to begin
I’m starting to see different edges of AI performance. Two recent insights stand out.
Anthropic Token Budgets
The first is that Anthropic’s 4.7 appears to have internal conditionals that trim the amount of inference compute it will give your tasks. So I suspect there will be parametric calls one can make to control this, and it also means prompts need stricter guidelines. More recently I have been making interactive stuff with Claude Code rather than more elaborately planned-out prompts, and I think I’m falling back into a bad habit, because I can see the MCP calls timing out. I’ll figure that out, but for the first time I am watching a model struggle with work it used to finish in under 12 minutes. So finally there is some indication that token budgets are stressed by hosted hardware constraints.
The second is Nate Jones’ assessment of Apple’s AI strategy, which is that they’re going to enable SMBs and devs to run fully capable AI homelabs. This makes perfect sense, and given the real power constraints on the new data centers in the pipeline, my skeptical view is being borne out. The weird thing is that it makes me want to buy IBM mainframe hardware, and I say that because one of my NUCs fell over yesterday. But I don’t have three-phase 400V coming into my house, so I will settle for another Mac Mini. Still, wouldn’t it be great if the new Apple CEO revived Xserve? What I wouldn’t give for a 19-inch rack-mountable chunk of big Apple iron.
MD Util
I’m making progress on my local-first, code-first toolkit that I am co-coding with Claude. I discovered Nimtable last week and I think I like it, even though it is running under Docker.
What mdu can do today
mdu init
Scaffolds a governed project:
```
md-util.yaml           # manifest: namespace, prefix, targets
ontology/manifest.ttl  # namespace + owl:imports
ontology/              # one .ttl per entity
db/ddl/                # generated DDL (per target)
db/sql/                # SQL templates
data/inbox/            # raw assets (read-only to mdu)
data/outbox/           # all emitted artifacts
doc/ log/ .context/    # design docs, conversation logs, status reports
.md-util.db            # DuckDB store: ingest history + lineage
```
It also writes a sane .gitignore (.envrc, data/*, .md-util.db).
mdu ingest
Auto-detects format from extension and produces a draft SHACL Turtle file in ontology/<name>.ttl.
| Source | Notes |
| --- | --- |
| `.csv` | Type inference, null detection, 10k-row sampling |
| `.json` / `.jsonl` | Nested objects flattened with `_` joins |
| `.xml` | Auto-detects repeating elements; flattens nesting |
| `.sql` / `.ddl` | CREATE TABLE parsing, NOT NULL, PRIMARY KEY |
| `.db` / `.duckdb` | Schema via duckdb CLI subprocess |
| postgres | Live schema via PGGOLD_* env vars |
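The `_`-join flattening for nested JSON can be sketched roughly like this. This is my own minimal reconstruction, not mdu’s actual code; the function name and separator handling are assumptions:

```python
def flatten(obj, prefix="", sep="_"):
    """Flatten nested dicts into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the joined path down as the new prefix.
            flat.update(flatten(value, prefix=name, sep=sep))
        else:
            flat[name] = value
    return flat

record = {"id": 7, "address": {"city": "Boston", "geo": {"lat": 42.36}}}
print(flatten(record))
# {'id': 7, 'address_city': 'Boston', 'address_geo_lat': 42.36}
```

Each flattened key then becomes one candidate property in the draft shape.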
Type inference picks the narrowest safe XSD type and falls back to xsd:string with a comment when columns are mixed.
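As a sketch of what “narrowest safe type” means here (my own illustration, not mdu’s inference code; it only tries integer and decimal before giving up):

```python
def infer_xsd_type(values):
    """Pick the narrowest XSD type that fits every sampled non-null value."""
    non_null = [v for v in values if v not in ("", None)]
    if not non_null:
        return "xsd:string"  # nothing to infer from
    # Try narrowest first; one failure anywhere disqualifies the type.
    for xsd, cast in (("xsd:integer", int), ("xsd:decimal", float)):
        try:
            for v in non_null:
                cast(v)
            return xsd
        except ValueError:
            continue
    # Mixed column: fall back to string.
    return "xsd:string"
```

A real pass would also handle booleans, dates, and overflow, but the fallback logic is the point.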
mdu emit
Projects SHACL shapes to one or more declared targets:
| Target | Output |
| --- | --- |
| postgres, bigquery, databricks | `db/ddl/<name>_<dialect>.sql` |
| parquet | `data/outbox/<name>_schema.json` |
| vortex | `data/outbox/<name>_vortex.sql` |
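The projection step is essentially a per-dialect type map. A hedged sketch follows; the mapping values are my assumptions, not what mdu actually emits:

```python
# Per-dialect projections of a few XSD datatypes; the exact choices
# mdu makes may differ -- this just illustrates the shape of the problem.
TYPE_MAP = {
    "postgres":   {"xsd:integer": "BIGINT", "xsd:decimal": "NUMERIC", "xsd:string": "TEXT"},
    "bigquery":   {"xsd:integer": "INT64",  "xsd:decimal": "NUMERIC", "xsd:string": "STRING"},
    "databricks": {"xsd:integer": "BIGINT", "xsd:decimal": "DECIMAL(38,9)", "xsd:string": "STRING"},
}

def emit_ddl(table, columns, dialect):
    """Render a CREATE TABLE for one dialect from (name, xsd_type) pairs."""
    types = TYPE_MAP[dialect]
    cols = ",\n  ".join(f"{name} {types[xsd]}" for name, xsd in columns)
    return f"CREATE TABLE {table} (\n  {cols}\n);"

print(emit_ddl("person", [("id", "xsd:integer"), ("name", "xsd:string")], "postgres"))
```

One SHACL source, N dialect outputs: that is the whole contract of `emit`.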
mdu dict
Renders data/outbox/data_dictionary.md from every .ttl in ontology/ — TOC, anchor links, sortable property tables. This is the artifact you publish to humans.
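The TOC-plus-anchor-links part is standard Markdown mechanics; a rough sketch with GitHub-style slugs assumed, not necessarily what mdu produces:

```python
def slugify(heading):
    """GitHub-style anchor: lowercase, spaces to hyphens, drop punctuation."""
    return "".join(c for c in heading.lower().replace(" ", "-")
                   if c.isalnum() or c == "-")

def toc(shape_names):
    """One TOC line per ontology shape, linking to its section heading."""
    return "\n".join(f"- [{name}](#{slugify(name)})" for name in shape_names)

print(toc(["Source Notes", "Person"]))
# - [Source Notes](#source-notes)
# - [Person](#person)
```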
mdu ice
DuckDB → Iceberg → S3 → Glue export. Subcommands:
- `ice schema` — extract a table’s schema (Iceberg or Glue types)
- `ice export` — write Parquet to `s3://…/warehouse/<table>/` and register the table in Glue
- `ice list` — list tables in DuckDB or in a Glue database
- `ice create-db` — create the Glue catalog database (idempotent)
- `ice add-table` — manually register an existing S3 prefix as an Iceberg table in Glue
Defaults: GLUE_DATABASE=mdcb_iceberg, ICEBERG_WAREHOUSE=s3://mdcb-iceberg/warehouse/.
mdu ducklake
Schema, export, list, attach — for the DuckLake catalog format. This is the “in-Postgres metadata, Parquet on disk” pattern.
mdu nimtable
Reads a Nimtable YAML (with --resolve-env and --validate-only flags) and either validates or builds a target. Useful when the table description lives in a Nimtable spec rather than being inferred from data.
mdu register
Records sources in md-util.yaml and the .md-util.db lineage store.
mdu validate (stub, Phase 2)
Validates sample data against SHACL shapes via DuckDB. Not yet a hard guard for production data.
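Until the SHACL path lands, the spirit of the check can be approximated with plain constraint predicates. This is a stand-in sketch of the idea, not the planned implementation:

```python
# A stand-in for SHACL-style checks: each constraint is (column, predicate, message).
CONSTRAINTS = [
    ("id",   lambda v: v is not None,            "id must not be null"),
    ("name", lambda v: isinstance(v, str) and v, "name must be a non-empty string"),
]

def validate(rows):
    """Return one violation message per failing (row, constraint) pair."""
    violations = []
    for i, row in enumerate(rows):
        for col, pred, msg in CONSTRAINTS:
            if not pred(row.get(col)):
                violations.append(f"row {i}: {msg}")
    return violations

print(validate([{"id": 1, "name": "ok"}, {"id": None, "name": ""}]))
# ['row 1: id must not be null', 'row 1: name must be a non-empty string']
```

The real version would derive the constraints from the `.ttl` shapes and run the scan in DuckDB rather than Python.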
What mdu does not do (today)
These are the seams you have to bridge with shell, DuckDB, or another tool:
- It does not download data. No HTTP client, no Dataverse API, no S3 put. You stage files into `data/inbox/` yourself.
- It does not execute DDL. The output of `emit` is a `.sql` file you apply with `psql`, BigQuery CLI, Databricks SQL, etc.
- It does not move rows. `mdu ice export` is the closest thing to data movement, and even that is a DuckDB-driven Parquet write, not a generic ETL pipeline.
- It does not transform. No projection, filter, or join logic. Bring your own SQL or `pyarrow`/`polars` step.
- It does not version data. Iceberg gives you table-level snapshotting once you `ice export`, but `mdu` itself does not track row-level deltas.
- It does not visualize. Output is Turtle, SQL, JSON, or Markdown — never a chart.
Best Visualization
I’m looking again at viz, trying to figure out whether I should bother with notebooks and their built-ins. Notebooks have always given me a little pain because they don’t git well, so I’m biased toward real, actual code repos. OTOH you can’t beat their instant installation. My favorite is Metabase, so I’m looking at that, plus Rill, Grafana, and the new DuckDB Dives.
More later.

