Southwall Update
Where to begin
I’m starting to see different edges of AI performance. Two recent insights stand out.
Anthropic Token Budgets
The first is that Anthropic’s 4.7 appears to have internal conditionals that trim the amount of inference compute it will give your tasks. So I suspect there will be parametric calls one can make to control this, and it also means prompts need stricter guidelines. More recently I have been making interactive stuff with Claude Code rather than more elaborately planned-out prompts, and I think I’m falling back into a bad habit, because I can see the MCP calls timing out. I’ll figure that out, but for the first time I am watching a model struggle with work it used to finish in under 12 minutes. So finally there is some indication that token budgets are stressed by hosted hardware constraints.
The second is Nate Jones’ assessment of Apple’s AI strategy, which is that they’re going to enable SMBs and devs to run fully capable AI homelabs. This makes perfect sense, and given the real power constraints on the new data centers in the pipeline, my skeptical view is being borne out. The weird thing is that it makes me want to buy IBM mainframe hardware, and I say that because one of my NUCs fell over yesterday. But I don’t have three-phase 400V coming into my house, so I will settle for another Mac Mini. Still, wouldn’t it be great if the new Apple CEO revived Xserve? What I wouldn’t give for a 19-inch rack-mountable chunk of big Apple iron.
MD Util
I’m making progress on my local-first, code-first toolkit that I am co-coding with Claude. I discovered Nimtable last week and I think I like it, even though it is running under Docker.
What mdu can do today
mdu init
Scaffolds a governed project:
```
md-util.yaml           # manifest: namespace, prefix, targets
ontology/manifest.ttl  # namespace + owl:imports
ontology/              # one .ttl per entity
db/ddl/                # generated DDL (per target)
db/sql/                # SQL templates
data/inbox/            # raw assets (read-only to mdu)
data/outbox/           # all emitted artifacts
doc/ log/ .context/    # design docs, conversation logs, status reports
.md-util.db            # DuckDB store: ingest history + lineage
```
It also writes a sane .gitignore (.envrc, data/*, .md-util.db).
mdu ingest
Auto-detects format from extension and produces a draft SHACL Turtle file in ontology/<name>.ttl.
| Source | Notes |
| --- | --- |
| `.csv` | Type inference, null detection, 10k-row sampling |
| `.json` / `.jsonl` | Nested objects flattened with `_` joins |
| `.xml` | Auto-detects repeating elements; flattens nesting |
| `.sql` / `.ddl` | CREATE TABLE parsing, NOT NULL, PRIMARY KEY |
| `.db` / `.duckdb` | Schema via duckdb CLI subprocess |
| postgres | Live schema via PGGOLD_* env vars |
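The `_`-join flattening for nested JSON can be sketched roughly like this. This is my own minimal reconstruction, not mdu’s actual code; the function name and separator handling are assumptions:

```python
def flatten(obj, prefix="", sep="_"):
    """Flatten nested dicts into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse, carrying the joined path down as the new prefix.
            flat.update(flatten(value, prefix=name, sep=sep))
        else:
            flat[name] = value
    return flat

record = {"id": 7, "address": {"city": "Boston", "geo": {"lat": 42.36}}}
print(flatten(record))
# {'id': 7, 'address_city': 'Boston', 'address_geo_lat': 42.36}
```

Each flattened key then becomes one candidate property in the draft shape.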
Type inference picks the narrowest safe XSD type and falls back to xsd:string with a comment when columns are mixed.
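As a sketch of what “narrowest safe type” means here (my own illustration, not mdu’s inference code; it only tries integer and decimal before giving up):

```python
def infer_xsd_type(values):
    """Pick the narrowest XSD type that fits every sampled non-null value."""
    non_null = [v for v in values if v not in ("", None)]
    if not non_null:
        return "xsd:string"  # nothing to infer from
    # Try narrowest first; one failure anywhere disqualifies the type.
    for xsd, cast in (("xsd:integer", int), ("xsd:decimal", float)):
        try:
            for v in non_null:
                cast(v)
            return xsd
        except ValueError:
            continue
    # Mixed column: fall back to string.
    return "xsd:string"
```

A real pass would also handle booleans, dates, and overflow, but the fallback logic is the point.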
mdu emit
Projects SHACL shapes to one or more declared targets:
| Target | Output |
| --- | --- |
| postgres, bigquery, databricks | `db/ddl/<name>_<dialect>.sql` |
| parquet | `data/outbox/<name>_schema.json` |
| vortex | `data/outbox/<name>_vortex.sql` |
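The projection step is essentially a per-dialect type map. A hedged sketch follows; the mapping values are my assumptions, not what mdu actually emits:

```python
# Per-dialect projections of a few XSD datatypes; the exact choices
# mdu makes may differ -- this just illustrates the shape of the problem.
TYPE_MAP = {
    "postgres":   {"xsd:integer": "BIGINT", "xsd:decimal": "NUMERIC", "xsd:string": "TEXT"},
    "bigquery":   {"xsd:integer": "INT64",  "xsd:decimal": "NUMERIC", "xsd:string": "STRING"},
    "databricks": {"xsd:integer": "BIGINT", "xsd:decimal": "DECIMAL(38,9)", "xsd:string": "STRING"},
}

def emit_ddl(table, columns, dialect):
    """Render a CREATE TABLE for one dialect from (name, xsd_type) pairs."""
    types = TYPE_MAP[dialect]
    cols = ",\n  ".join(f"{name} {types[xsd]}" for name, xsd in columns)
    return f"CREATE TABLE {table} (\n  {cols}\n);"

print(emit_ddl("person", [("id", "xsd:integer"), ("name", "xsd:string")], "postgres"))
```

One SHACL source, N dialect outputs: that is the whole contract of `emit`.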
mdu dict
Renders data/outbox/data_dictionary.md from every .ttl in ontology/ — TOC, anchor links, sortable property tables. This is the artifact you publish to humans.
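The TOC-plus-anchor-links part is standard Markdown mechanics; a rough sketch with GitHub-style slugs assumed, not necessarily what mdu produces:

```python
def slugify(heading):
    """GitHub-style anchor: lowercase, spaces to hyphens, drop punctuation."""
    return "".join(c for c in heading.lower().replace(" ", "-")
                   if c.isalnum() or c == "-")

def toc(shape_names):
    """One TOC line per ontology shape, linking to its section heading."""
    return "\n".join(f"- [{name}](#{slugify(name)})" for name in shape_names)

print(toc(["Source Notes", "Person"]))
# - [Source Notes](#source-notes)
# - [Person](#person)
```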
mdu ice
DuckDB → Iceberg → S3 → Glue export. Subcommands:
- `ice schema` — extract a table’s schema (Iceberg or Glue types)
- `ice export` — write Parquet to `s3://…/warehouse/<table>/` and register the table in Glue
- `ice list` — list tables in DuckDB or in a Glue database
- `ice create-db` — create the Glue catalog database (idempotent)
- `ice add-table` — manually register an existing S3 prefix as an Iceberg table in Glue
Defaults: GLUE_DATABASE=mdcb_iceberg, ICEBERG_WAREHOUSE=s3://mdcb-iceberg/warehouse/.
mdu ducklake
Schema, export, list, attach — for the DuckLake catalog format. This is the “in-Postgres metadata, Parquet on disk” pattern.
mdu nimtable
Reads a Nimtable YAML (with --resolve-env and --validate-only flags) and either validates or builds a target. Useful when the table description lives in a Nimtable spec rather than being inferred from data.
mdu register
Records sources in md-util.yaml and the .md-util.db lineage store.
mdu validate (stub, Phase 2)
Validates sample data against SHACL shapes via DuckDB. Not yet a hard guard for production data.
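Until the SHACL path lands, the spirit of the check can be approximated with plain constraint predicates. This is a stand-in sketch of the idea, not the planned implementation:

```python
# A stand-in for SHACL-style checks: each constraint is (column, predicate, message).
CONSTRAINTS = [
    ("id",   lambda v: v is not None,            "id must not be null"),
    ("name", lambda v: isinstance(v, str) and v, "name must be a non-empty string"),
]

def validate(rows):
    """Return one violation message per failing (row, constraint) pair."""
    violations = []
    for i, row in enumerate(rows):
        for col, pred, msg in CONSTRAINTS:
            if not pred(row.get(col)):
                violations.append(f"row {i}: {msg}")
    return violations

print(validate([{"id": 1, "name": "ok"}, {"id": None, "name": ""}]))
# ['row 1: id must not be null', 'row 1: name must be a non-empty string']
```

The real version would derive the constraints from the `.ttl` shapes and run the scan in DuckDB rather than Python.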
What mdu does not do (today)
These are the seams you have to bridge with shell, DuckDB, or another tool:
- It does not download data. No HTTP client, no Dataverse API, no S3 put. You stage files into `data/inbox/` yourself.
- It does not execute DDL. The output of `emit` is a `.sql` file you apply with `psql`, BigQuery CLI, Databricks SQL, etc.
- It does not move rows. `mdu ice export` is the closest thing to data movement, and even that is a DuckDB-driven Parquet write, not a generic ETL pipeline.
- It does not transform. No projection, filter, or join logic. Bring your own SQL or `pyarrow`/`polars` step.
- It does not version data. Iceberg gives you table-level snapshotting once you `ice export`, but `mdu` itself does not track row-level deltas.
- It does not visualize. Output is Turtle, SQL, JSON, or Markdown — never a chart.
Best Visualization
I’m looking again at viz, trying to figure out whether I should bother with notebooks and their built-ins. Notebooks have always given me a little pain because they don’t git well, so I’m biased toward real, actual code repos. OTOH you can’t beat their instant installation. My favorite is Metabase, so I’m looking at that, plus Rill, Grafana, and the new DuckDB Dives.
More later.

