Minecraft as an Abstraction Layer over an RDBMS
Abstract
Claim: Vanilla Minecraft implements a coherent data system whose entities, storage, execution model, and migration pipeline correspond to standard RDBMS constructs. It is not a DBMS and does not expose SQL, but it satisfies the core mapping: world state ↔ relational data; tick loop ↔ serial scheduler; NBT/registries ↔ schema; DataFixerUpper ↔ migrations; redstone/events ↔ triggers; hoppers/golems ↔ ETL jobs. This thesis formalizes the mapping, identifies gaps, and specifies additions required to expose first-class query semantics. The recent addition of the Copper Golem (item sorter) further narrows the gap.
1) Data Model
- Fundamental records
  - `BlockState`: typed cell with properties (e.g., facing, age); domain-constrained attributes.
  - `BlockEntity` (tile entity): key–value payload (NBT) attached to coordinates.
  - `Entity`: dynamic record with UUID primary key; attributes + NBT.
  - `ItemStack`: `{item_id, count, nbt}`; appears in inventories.
- Keys
  - Spatial PK: `(dimension_id, x, y, z)` for blocks/block entities.
  - Entity PK: `uuid`.
  - Inventory FK: `(owner_type, owner_id, slot)` → `ItemStack`.
- Referential relations
  - Container contains items: `Container(owner_id) → Inventory(slot → ItemStack)`.
  - Entity ↔ World: `Entity.position` references a spatial partition (chunk).
  - Recipes/loot tables/trades reference item IDs (registry IDs) as static FK sets.
- Types and enums
- Global registries define canonical IDs for blocks, items, biomes, etc. This is the type catalog.
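The record types above can be sketched as plain data structures. This is an illustrative model only; the field names are invented for this abstraction, not Minecraft's internal ones.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BlockState:
    """Typed cell; properties are domain-constrained (e.g. facing, age)."""
    block_id: str                                  # registry ID from the type catalog
    properties: dict = field(default_factory=dict)

@dataclass
class Entity:
    """Dynamic record keyed by UUID; position references a chunk partition."""
    uuid: str
    dimension_id: int
    position: tuple                                # (x, y, z)
    nbt: dict = field(default_factory=dict)

@dataclass
class ItemStack:
    """{item_id, count, nbt}; appears only inside inventories."""
    item_id: str
    count: int
    nbt: Optional[dict] = None

stack = ItemStack("minecraft:cobblestone", 64)
furnace = BlockState("minecraft:furnace", {"facing": "north", "lit": "false"})
```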
2) Physical Storage
- Partitioning
  - Regions (`.mca`) → Chunks (16×16 columns) → Sections (vertical slices).
  - Partition key: `(dimension_id, chunk_x, chunk_z)`; equivalent to range/hash partitioning.
- Pages/serialization
- Chunk sections serialize block palettes and dense arrays; block entities and entities serialize as sparse NBT blobs.
- Caching and lifecycle
- Loaded chunks = in-memory working set. Unloaded chunks = persisted partitions. Autosave flush = checkpoint.
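The partition keys above are pure functions of the block position: chunks are 16×16 columns and regions are 32×32 chunks, so both keys fall out of arithmetic right shifts (which floor-divide correctly for negative coordinates). A minimal sketch:

```python
def chunk_key(dimension_id: int, x: int, z: int) -> tuple:
    """Partition key for a block position. x >> 4 is floor(x / 16),
    so negative coordinates round toward -infinity, as they must."""
    return (dimension_id, x >> 4, z >> 4)

def region_coords(chunk_x: int, chunk_z: int) -> tuple:
    """A region file (.mca) covers 32x32 chunks, hence shift by 5."""
    return (chunk_x >> 5, chunk_z >> 5)

# Block (100, -35) in the overworld (dimension 0):
print(chunk_key(0, 100, -35))   # (0, 6, -3)
print(region_coords(6, -3))     # (0, -1)
```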
3) Execution and Concurrency
- Scheduler
- Global tick loop provides a single-threaded, deterministic update order for most world mutations. This approximates serial execution.
- Transaction boundaries
- A tick is the atomic step for game rules. Multi-chunk mutations are not atomic across chunk unload/load.
- Isolation
- Effectively single-writer (server thread). Client inputs are queued; plugins/AI act as stored procedures invoked per tick.
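The execution model above can be reduced to a toy: a single-writer loop that snapshots the pending mutation queue each tick and applies it in deterministic FIFO order. This is a sketch of the analogy, not engine code.

```python
from collections import deque

class TickScheduler:
    """Toy model of the single-threaded tick loop: one tick is the
    atomic step; queued writes (client input, AI, 'stored procedures')
    are applied serially in deterministic order."""
    def __init__(self):
        self.pending = deque()   # mutations submitted between ticks
        self.world = {}          # (dim, x, y, z) -> block id
        self.tick_count = 0

    def submit(self, key, value):
        self.pending.append((key, value))

    def tick(self):
        batch, self.pending = self.pending, deque()  # snapshot this tick's work
        for key, value in batch:                     # serial, deterministic order
            self.world[key] = value
        self.tick_count += 1

s = TickScheduler()
s.submit((0, 1, 64, 1), "minecraft:stone")
s.submit((0, 1, 64, 1), "minecraft:dirt")   # later write wins within the tick
s.tick()
print(s.world[(0, 1, 64, 1)])               # minecraft:dirt
```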
4) Constraints and Invariants
- Domain constraints
- Block placement, fluid rules, light updates, entity bounding boxes. Enforced during writes.
- Derived data maintenance
- Lighting graphs, heightmaps, pathfinding caches updated on writes; equivalent to maintaining secondary structures after UPDATE/INSERT.
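The derived-data point can be made concrete with the simplest case, a heightmap: a secondary value recomputed eagerly on every block write, exactly as an index is maintained after INSERT/UPDATE. A hypothetical sketch, not the engine's actual structures:

```python
class Column:
    """One (x, z) column with a derived 'highest non-air block' value,
    maintained on write like a secondary index."""
    def __init__(self):
        self.blocks = {}    # y -> block id (sparse; absent means air)
        self.height = -1    # derived: highest occupied y, -1 if empty

    def set_block(self, y: int, block_id: str):
        if block_id == "minecraft:air":
            self.blocks.pop(y, None)
        else:
            self.blocks[y] = block_id
        # maintain the derived value eagerly, at write time
        self.height = max(self.blocks, default=-1)

c = Column()
c.set_block(64, "minecraft:stone")
c.set_block(70, "minecraft:oak_log")
c.set_block(70, "minecraft:air")    # removal forces recomputation
print(c.height)                     # 64
```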
5) Query, Indexing, and Triggers (Exposed Mechanics)
- Predicate selection
  - Command selectors (`@e`, `@s`, NBT/path predicates) = WHERE over entity tables.
  - `/data` and `/execute store` = projection/assignment operators.
- Indices
- Spatial index: chunk map for entity and block lookup.
- Catalog indices: registries and tags (`#tag`) for item/block class membership.
- Redstone comparators on containers = numeric exposure of inventory state; usable as index signals.
- Triggers
- Block updates, scheduled ticks, game events, sculk sensors, observer blocks = trigger system on write/neighbor change.
- Views/materialization
- Maps, scoreboards, bossbars, advancements = persisted summaries/materialized state driven by triggers/commands.
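The selector-as-WHERE claim can be demonstrated directly against the illustrative `entity` table from section 9. The translation below (a selector roughly like `@e[type=zombie,distance=..10]` evaluated at a fixed origin) is a sketch under that schema, not Minecraft's actual selector implementation:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entity (uuid TEXT PRIMARY KEY, dimension INTEGER,"
           " x REAL, y REAL, z REAL, type TEXT, nbt JSON)")
db.executemany("INSERT INTO entity VALUES (?,?,?,?,?,?,?)", [
    ("a1", 0, 3.0, 64.0, 4.0, "zombie", "{}"),    # distance 5 from origin
    ("b2", 0, 40.0, 64.0, 2.0, "zombie", "{}"),   # too far
    ("c3", 0, 1.0, 64.0, 1.0, "creeper", "{}"),   # wrong type
])

# @e[type=zombie,distance=..10] evaluated at origin (0, 64, 0):
rows = db.execute(
    "SELECT uuid FROM entity WHERE type = ? "
    "AND (x*x + (y - 64.0)*(y - 64.0) + z*z) <= ?",
    ("zombie", 10.0 ** 2),
).fetchall()
print(rows)   # [('a1',)]
```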
6) ETL, Workflows, and Background Jobs
- Pull/push pipelines
- Hoppers, droppers, water streams, minecarts = transport ETL with capacity/throughput limits (tick-bounded).
- Villager work/restock cycles = periodic jobs updating inventories/trade tables.
- Agentic sorters (e.g., golems when available) = autonomous routing jobs.
- Throughput and backpressure
- Hopper transfer rate and chunk loading create deterministic ceilings; unloaded partitions pause jobs (job suspension on partition eviction).
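The throughput ceiling is deterministic and easy to compute from vanilla's constants (20 ticks/second, one item moved per 8-game-tick hopper cooldown); treat the framing, not the constants, as the illustration:

```python
# Deterministic throughput ceiling of a single hopper line.
TICKS_PER_SECOND = 20
HOPPER_COOLDOWN_TICKS = 8           # one item transferred per cooldown

items_per_second = TICKS_PER_SECOND / HOPPER_COOLDOWN_TICKS   # 2.5
items_per_hour = items_per_second * 3600                      # 9000.0

print(items_per_second, items_per_hour)   # 2.5 9000.0

# Backpressure follows directly: a producer faster than 2.5 items/s
# backs up into the source container, and an unloaded chunk suspends
# the job entirely (rate drops to zero until the partition reloads).
```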
7) Schema and Migration
- Schema definition
- Registries + NBT shapes define the effective schema for blocks/entities/items.
- Migrations
- DataFixerUpper transforms old serialized data into current schema on load. This is a versioned migration pipeline.
- Compatibility policy
- Forward write in current version; read-convert legacy on demand. Equivalent to lazy online migration.
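The lazy online migration can be sketched as a version-indexed chain of fixers applied at read time, which is the shape of what DataFixerUpper does. The specific fixers below (renaming `Id`/`Count` fields) are invented for illustration:

```python
CURRENT_VERSION = 3

def fix_1_to_2(nbt: dict) -> dict:
    """Hypothetical fixer: legacy 'Id' becomes a namespaced 'id'."""
    nbt = dict(nbt)
    nbt["id"] = "minecraft:" + nbt.pop("Id").lower()
    return nbt

def fix_2_to_3(nbt: dict) -> dict:
    """Hypothetical fixer: 'Count' renamed to lowercase 'count'."""
    nbt = dict(nbt)
    nbt["count"] = nbt.pop("Count")
    return nbt

FIXERS = {1: fix_1_to_2, 2: fix_2_to_3}

def load(record: dict) -> dict:
    """Read-convert legacy data on demand; writes always use CURRENT_VERSION."""
    nbt, version = dict(record["nbt"]), record["version"]
    while version < CURRENT_VERSION:
        nbt = FIXERS[version](nbt)
        version += 1
    return nbt

legacy = {"version": 1, "nbt": {"Id": "STONE", "Count": 12}}
print(load(legacy))   # {'id': 'minecraft:stone', 'count': 12}
```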
8) Reliability Characteristics
- Durability
- Periodic saves; crash at unsafe points can lose sub-tick state but generally preserves last checkpointed partitions.
- Consistency model
- Strong consistency within the server tick; eventual consistency across load/unload boundaries and asynchronous IO.
- Recovery
- Region-level persistence with partial-chunk locality; no write-ahead log exposed to players; backups externalized.
9) Formal Relational Mapping (Canonical Tables)
Illustrative logical schema:
```sql
-- Spatial partitions
CREATE TABLE chunk (
  dimension INTEGER,
  cx INTEGER, cz INTEGER,
  PRIMARY KEY (dimension, cx, cz)
);

-- Block entities (sparse)
CREATE TABLE block_entity (
  dimension INTEGER, x INTEGER, y INTEGER, z INTEGER,
  type TEXT, nbt JSON,   -- NBT serialized to JSON in this abstraction
  -- conceptual: chunk coordinates derived from the block position
  cx INTEGER GENERATED ALWAYS AS (x >> 4) STORED,
  cz INTEGER GENERATED ALWAYS AS (z >> 4) STORED,
  PRIMARY KEY (dimension, x, y, z),
  FOREIGN KEY (dimension, cx, cz) REFERENCES chunk (dimension, cx, cz)
);

-- Entities
CREATE TABLE entity (
  uuid TEXT PRIMARY KEY,
  dimension INTEGER, x REAL, y REAL, z REAL,
  type TEXT, nbt JSON
);

-- Inventories (containers or entities)
CREATE TABLE inventory (
  owner_type TEXT, owner_id TEXT, slot INTEGER,
  item_id TEXT, count INTEGER, item_nbt JSON,
  PRIMARY KEY (owner_type, owner_id, slot)
);

-- Catalogs
CREATE TABLE registry_item (kind TEXT, id TEXT PRIMARY KEY, props JSON);
CREATE TABLE tag_membership (tag TEXT, id TEXT, PRIMARY KEY (tag, id));
```
Command selectors map to WHERE predicates over the `entity` table; a redstone comparator on a chest corresponds to an aggregate over that container's `inventory` rows, quantized to a 0–15 signal; a hopper move is an `INSERT … ON CONFLICT DO UPDATE` with capacity checks.
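The comparator aggregate can be sketched with the commonly documented container formula (fullness as the mean of per-slot count/max-stack ratios, scaled to 14 and bumped by 1 when non-empty); treat the exact constants as an assumption of this sketch:

```python
import math

def comparator_signal(stacks, num_slots: int) -> int:
    """Comparator output for a container, per the commonly cited formula:
    fullness = sum(count / max_stack_size) / num_slots
    signal   = floor(fullness * 14) + (1 if non-empty else 0), capped at 15.
    `stacks` is a list of (count, max_stack_size) pairs for occupied slots."""
    fullness = sum(c / m for c, m in stacks) / num_slots
    return min(15, math.floor(fullness * 14) + (1 if stacks else 0))

# Single chest (27 slots) holding three full stacks of 64:
print(comparator_signal([(64, 64)] * 3, 27))    # 2
print(comparator_signal([(64, 64)] * 27, 27))   # 15 (full chest)
```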
10) Optimization Surrogates
- Locality: Chunk partitioning enforces spatial locality; mechanics encourage co-locating related records (bases).
- Batching: Tick loop batches writes; hopper cooldowns throttle producers.
- Caching: Client caches chunk meshes/maps; server maintains light/path caches; both reduce recomputation.
- Denormalization: Many mechanics store derived flags (e.g., lit campfires, crop age) for constant-time checks.
11) Missing Capabilities vs RDBMS
- No general query language for items/blocks in survival gameplay.
- No ACID across multi-chunk composite operations.
- No planner, statistics, or cost-based optimization.
- Limited indexing primitives exposed to players (spatial only; tags as static catalogs).
- Persistence lacks WAL/point-in-time recovery primitives at the player layer.
12) Exposing Queries with Minimal Additions (Vanilla-compatible design)
- Addressable containers: Stable container IDs and labels (schema names).
- Request interface: “Request block” with predicate slots (item id/tag, quantity); emits ready signal on fulfillment.
- Catalog block: Global index that enumerates all labeled containers and stock counts; comparator output encodes match count.
- Agent routing tables: Persistent rules per agent for priority and destinations.
- Transaction fence: “Work order” token processed atomically within a chunk set; commits or aborts on failure.
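The catalog and request primitives above can be sketched against a toy SQLite catalog. All table, column, and function names here are invented for this design sketch; nothing below exists in vanilla:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE container (id TEXT PRIMARY KEY, label TEXT);   -- addressable containers
CREATE TABLE stock (container_id TEXT, item_id TEXT, quantity INTEGER,
                    PRIMARY KEY (container_id, item_id));   -- catalog-block index
""")
db.executemany("INSERT INTO container VALUES (?,?)",
               [("c1", "ores"), ("c2", "ores"), ("c3", "food")])
db.executemany("INSERT INTO stock VALUES (?,?,?)",
               [("c1", "minecraft:iron_ore", 40),
                ("c2", "minecraft:iron_ore", 10),
                ("c3", "minecraft:bread", 64)])

def request(item_id: str, qty: int):
    """'Request block' semantics: plan (container, take) pairs greedily;
    the ready signal would fire only when the full quantity is covered."""
    plan, remaining = [], qty
    for cid, have in db.execute(
            "SELECT container_id, quantity FROM stock "
            "WHERE item_id = ? ORDER BY quantity DESC", (item_id,)):
        if remaining <= 0:
            break
        take = min(have, remaining)
        plan.append((cid, take))
        remaining -= take
    return plan

print(request("minecraft:iron_ore", 45))   # [('c1', 40), ('c2', 5)]
```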
13) Practical Consequences
- Current mechanics already support RDBMS-style discipline: schema by convention, partition-aware layout, trigger-driven automation, background jobs, and migration safety.
- The gap to user-level queries is narrow and can be closed with catalog, request, and routing primitives without changing the tick-based execution model.
14) Conclusion
Minecraft is a consistent, operational abstraction of an RDBMS:
- Data = world state structured by registries and NBT.
- Storage = partitioned, paged region/chunk system.
- Execution = serial scheduler with trigger graph.
- Migrations = DataFixerUpper.
- ETL = in-world item transport and agents.
It omits SQL, ACID across multi-chunk scopes, and a cost-based planner. Adding cataloged containers, a request API, and agent routing persistence would expose first-class query semantics while preserving survival gameplay constraints. The Copper Golem's introduction as an item sorter exemplifies this potential.
The following SQL transaction achieves the same result as a Copper Golem sorting items into labeled chests:

```sql
BEGIN IMMEDIATE;

-- Staging for calculated moves
DROP TABLE IF EXISTS temp.tmp_moves;
CREATE TEMP TABLE temp.tmp_moves (
  item_id INTEGER NOT NULL,
  chest_id INTEGER NOT NULL,
  qty INTEGER NOT NULL CHECK (qty > 0)
);

-- Plan moves: allocate inbox quantities into candidate chests without exceeding capacity
INSERT INTO temp.tmp_moves (item_id, chest_id, qty)
WITH
i AS (
  SELECT item_id, quantity AS inbox_qty
  FROM inbox_items
  WHERE quantity > 0
),
cf AS (
  SELECT
    c.chest_id,
    c.capacity - COALESCE(SUM(ci.quantity), 0) AS free,
    CASE WHEN COUNT(ci.item_id) = 0 THEN 1 ELSE 0 END AS is_empty
  FROM chests c
  LEFT JOIN chest_items ci ON ci.chest_id = c.chest_id
  GROUP BY c.chest_id
  HAVING free > 0
),
-- chests that already hold the item (preferred)
has_item AS (
  SELECT DISTINCT
    i.item_id,
    cf.chest_id,
    cf.free,
    0 AS grp
  FROM i
  JOIN chest_items ci ON ci.item_id = i.item_id
  JOIN cf ON cf.chest_id = ci.chest_id
),
-- if NONE hold the item, fall back to truly empty chests
fallback AS (
  SELECT
    i.item_id,
    cf.chest_id,
    cf.free,
    1 AS grp
  FROM i
  JOIN cf ON cf.is_empty = 1
  WHERE NOT EXISTS (
    SELECT 1 FROM chest_items x
    WHERE x.item_id = i.item_id
  )
),
candidates AS (
  SELECT * FROM has_item
  UNION ALL
  SELECT * FROM fallback
),
alloc AS (
  SELECT
    c.item_id,
    c.chest_id,
    c.free,
    it.inbox_qty,
    SUM(c.free) OVER (
      PARTITION BY c.item_id
      ORDER BY c.grp, c.chest_id
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS cum_prev
  FROM candidates c
  JOIN i it USING (item_id)
),
moves AS (
  SELECT
    item_id,
    chest_id,
    CASE
      WHEN inbox_qty - COALESCE(cum_prev, 0) <= 0 THEN 0
      WHEN inbox_qty - COALESCE(cum_prev, 0) >= free THEN free
      ELSE inbox_qty - COALESCE(cum_prev, 0)
    END AS qty
  FROM alloc
)
SELECT item_id, chest_id, qty
FROM moves
WHERE qty > 0;

-- Apply moves to destination chests (UPSERT)
INSERT INTO chest_items (chest_id, item_id, quantity)
SELECT chest_id, item_id, qty
FROM temp.tmp_moves
ON CONFLICT (chest_id, item_id) DO UPDATE
SET quantity = quantity + excluded.quantity;

-- Deduct from inbox
UPDATE inbox_items AS ib
SET quantity = quantity - (
  SELECT COALESCE(SUM(m.qty), 0)
  FROM temp.tmp_moves m
  WHERE m.item_id = ib.item_id
)
WHERE ib.item_id IN (SELECT item_id FROM temp.tmp_moves);

-- Cleanup empty inbox rows
DELETE FROM inbox_items WHERE quantity <= 0;

DROP TABLE IF EXISTS temp.tmp_moves;
COMMIT;
```
Supplement: “If u put a dropper that connects both chests he just works forever” --kittenself 9/25
1. Phenomenon (game-level)
- Source chest (INBOX) → Copper Golem moves items → Destination chest.
- A dropper/hopper loop pushes items from Destination back to Source.
- Net: a closed item-circuit. Throughput bounded by the slower of [golem move rate, dropper ejection rate]. Work never terminates while at least one item remains in the loop and the loop is powered.
2. RDBMS mapping
- Tables:
- A = inbox(items)
- B = storage(items)
- Jobs/Triggers:
- J1: “Sorter” job: move rows from A to B (INSERT … ON CONFLICT DO UPDATE; DELETE from A).
- T1: Trigger on B INSERT: INSERT the same item back into A (feedback).
- Result: A cyclic dependency A ↔ B. This is equivalent to unguarded trigger recursion / feedback loop in ETL.
3. Dynamics (formal)
Let r_g = average items/tick moved by J1 (A→B).
Let r_d = average items/tick injected by T1 (B→A).
- If r_d > 0, the system does not converge. Workload is perpetual.
- Steady-state oscillation occurs with stock shuttling between A and B.
- Queueing implication: server utilization ρ ≈ min(1, r_d / r_g) for J1; if r_d ≥ r_g then A never drains below a floor; if r_d ≪ r_g you still get periodic work bursts tied to the dropper pulse.
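The non-convergence claim is easy to exhibit with a toy simulation of the loop: J1 moves up to r_g items per tick from A to B, then T1 feeds up to r_d items back. Stock is conserved, A never drains, and cumulative work grows without bound:

```python
def simulate(a0: int, r_g: int, r_d: int, ticks: int):
    """Simulate the A <-> B feedback loop for a fixed number of ticks.
    Returns final (a, b, total_items_moved_by_J1)."""
    a, b, moved = a0, 0, 0
    for _ in range(ticks):
        m = min(a, r_g); a -= m; b += m; moved += m   # J1: sorter A -> B
        f = min(b, r_d); b -= f; a += f               # T1: dropper B -> A
    return a, b, moved

a, b, moved = simulate(a0=16, r_g=2, r_d=1, ticks=1000)
# Stock shuttles forever: A settles at a floor, B holds the rest,
# and `moved` keeps growing ~1 item/tick at steady state.
print(a, b, moved)
```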
4. Bug or feature?
- RDBMS perspective: footgun. Unbounded recursion via triggers is a design error unless explicitly intended for streaming/feedback. You normally enforce acyclicity or recursion guards.
- Minecraft/system-simulation perspective: feature. The world is a general-purpose process graph. Cycles are legal (like redstone clocks). The engine won’t infer your intent or break cycles.
5. Operational risks (both worlds)
- Waste: perpetual CPU/tick budget spent with zero net progress.
- Head-of-line blocking: the sorter’s capacity is reserved for the loop, delaying real work.
- Observability noise: comparator/metrics stay “hot,” masking true backlogs.
- Failure sensitivity: any stall (chunk unload, power loss) will resume into the same loop; no natural quiescence.
6. Control patterns (how to make it sane)
- Acyclicity (preferred): never connect Destination → Source. Treat the storage graph as a DAG.
- Recursion guard (DB analogy: trigger depth guard): only allow T1 when A is empty below a threshold.
- Game: comparator on Source chest; if signal ≥ k for T ticks, cut power to the dropper.
- Quotas/rate limiting: gate the dropper with a slow clock so r_d ≪ r_g. You still get infinite work in principle, but tiny duty cycle.
- Idempotency flag:
- DB: add processed_flag and have J1 ignore processed=1; T1 sets processed=1 on reinserts.
- Game: not natively possible for item stacks; approximate by filtering on a different item type/token that the golem won’t re-move.
- Completion handshake:
- DB: deferred constraint requiring A=0 before T1 can fire.
- Game: wire a logic interlock—dropper only enabled when Source is empty (comparator=0) AND a one-shot request pulse is present.
7. When to keep the loop (legitimate uses)
- Heartbeat/monitor: intentional, low-frequency pulse to verify sorter liveness and alert on failure (comparator edge as “alive”).
- Benchmarking: measure r_g under load without consuming real inventory (use a fixed N dummy items).
- Visual/aesthetic: keep the agent animated; accept the cost.
8. When to break the loop (most cases)
- Any production storage where latency and throughput matter.
- Any base where tick budget is tight or chunk boundaries cause periodic stalls.
9. Minimal SQL sketch (for contrast):

```sql
-- BAD: feedback trigger (B -> A) creates infinite work for J1
CREATE TRIGGER b_to_a_feedback
AFTER INSERT ON B
BEGIN
  INSERT INTO A(item_id, qty)
  VALUES (NEW.item_id, NEW.qty);
END;

-- BETTER: guard the recursion (fire only once A has fully drained)
CREATE TRIGGER b_to_a_feedback_guarded
AFTER INSERT ON B
WHEN (SELECT COALESCE(SUM(qty), 0) FROM A) = 0
BEGIN
  INSERT INTO A(item_id, qty) VALUES (NEW.item_id, NEW.qty);
END;
```