Minecraft as an Abstraction Layer over an RDBMS
Abstract
Claim: Vanilla Minecraft implements a coherent data system whose entities, storage, execution model, and migration pipeline correspond to standard RDBMS constructs. It is not a DBMS and does not expose SQL, but it satisfies the core mapping: world state ↔ relational data; tick loop ↔ serial scheduler; NBT/registries ↔ schema; DataFixerUpper ↔ migrations; redstone/events ↔ triggers; hoppers/golems ↔ ETL jobs. This thesis formalizes the mapping, identifies gaps, and specifies additions required to expose first-class query semantics. The recent addition of the Copper Golem (item sorter) further narrows the gap.
1) Data Model
- Fundamental records
  - `BlockState`: typed cell with properties (e.g., facing, age); domain-constrained attributes.
  - `BlockEntity` (tile entity): key–value payload (NBT) attached to coordinates.
  - `Entity`: dynamic record with UUID primary key; attributes + NBT.
  - `ItemStack`: `{item_id, count, nbt}`; appears in inventories.
- Keys
  - Spatial PK: `(dimension_id, x, y, z)` for blocks/block entities.
  - Entity PK: `uuid`.
  - Inventory FK: `(owner_type, owner_id, slot)` → `ItemStack`.
- Referential relations
  - Container contains items: `Container(owner_id) → Inventory(slot → ItemStack)`.
  - Entity ↔ World: `Entity.position` references a spatial partition (chunk).
  - Recipes/loot tables/trades reference item IDs (registry IDs) as static FK sets.
- Types and enums
- Global registries define canonical IDs for blocks, items, biomes, etc. This is the type catalog.
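The record types above can be sketched as plain data structures. This is an illustrative model only; the field names are invented for this abstraction, not Minecraft's internal ones.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BlockState:
    """Typed cell; properties are domain-constrained (e.g. facing, age)."""
    block_id: str                                  # registry ID from the type catalog
    properties: dict = field(default_factory=dict)

@dataclass
class Entity:
    """Dynamic record keyed by UUID; position references a chunk partition."""
    uuid: str
    dimension_id: int
    position: tuple                                # (x, y, z)
    nbt: dict = field(default_factory=dict)

@dataclass
class ItemStack:
    """{item_id, count, nbt}; appears only inside inventories."""
    item_id: str
    count: int
    nbt: Optional[dict] = None

stack = ItemStack("minecraft:cobblestone", 64)
furnace = BlockState("minecraft:furnace", {"facing": "north", "lit": "false"})
```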
2) Physical Storage
- Partitioning
  - Regions (`.mca`) → Chunks (16×16 columns) → Sections (vertical slices).
  - Partition key: `(dimension_id, chunk_x, chunk_z)`; equivalent to range/hash partitioning.
- Pages/serialization
- Chunk sections serialize block palettes and dense arrays; block entities and entities serialize as sparse NBT blobs.
- Caching and lifecycle
- Loaded chunks = in-memory working set. Unloaded chunks = persisted partitions. Autosave flush = checkpoint.
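The partition keys above are pure functions of the block position: chunks are 16×16 columns and regions are 32×32 chunks, so both keys fall out of arithmetic right shifts (which floor-divide correctly for negative coordinates). A minimal sketch:

```python
def chunk_key(dimension_id: int, x: int, z: int) -> tuple:
    """Partition key for a block position. x >> 4 is floor(x / 16),
    so negative coordinates round toward -infinity, as they must."""
    return (dimension_id, x >> 4, z >> 4)

def region_coords(chunk_x: int, chunk_z: int) -> tuple:
    """A region file (.mca) covers 32x32 chunks, hence shift by 5."""
    return (chunk_x >> 5, chunk_z >> 5)

# Block (100, -35) in the overworld (dimension 0):
print(chunk_key(0, 100, -35))   # (0, 6, -3)
print(region_coords(6, -3))     # (0, -1)
```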
3) Execution and Concurrency
- Scheduler
- Global tick loop provides a single-threaded, deterministic update order for most world mutations. This approximates serial execution.
- Transaction boundaries
- A tick is the atomic step for game rules. Multi-chunk mutations are not atomic across chunk unload/load.
- Isolation
- Effectively single-writer (server thread). Client inputs are queued; plugins/AI act as stored procedures invoked per tick.
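The execution model above can be reduced to a toy: a single-writer loop that snapshots the pending mutation queue each tick and applies it in deterministic FIFO order. This is a sketch of the analogy, not engine code.

```python
from collections import deque

class TickScheduler:
    """Toy model of the single-threaded tick loop: one tick is the
    atomic step; queued writes (client input, AI, 'stored procedures')
    are applied serially in deterministic order."""
    def __init__(self):
        self.pending = deque()   # mutations submitted between ticks
        self.world = {}          # (dim, x, y, z) -> block id
        self.tick_count = 0

    def submit(self, key, value):
        self.pending.append((key, value))

    def tick(self):
        batch, self.pending = self.pending, deque()  # snapshot this tick's work
        for key, value in batch:                     # serial, deterministic order
            self.world[key] = value
        self.tick_count += 1

s = TickScheduler()
s.submit((0, 1, 64, 1), "minecraft:stone")
s.submit((0, 1, 64, 1), "minecraft:dirt")   # later write wins within the tick
s.tick()
print(s.world[(0, 1, 64, 1)])               # minecraft:dirt
```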
4) Constraints and Invariants
- Domain constraints
- Block placement, fluid rules, light updates, entity bounding boxes. Enforced during writes.
- Derived data maintenance
- Lighting graphs, heightmaps, pathfinding caches updated on writes; equivalent to maintaining secondary structures after UPDATE/INSERT.
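The derived-data point can be made concrete with the simplest case, a heightmap: a secondary value recomputed eagerly on every block write, exactly as an index is maintained after INSERT/UPDATE. A hypothetical sketch, not the engine's actual structures:

```python
class Column:
    """One (x, z) column with a derived 'highest non-air block' value,
    maintained on write like a secondary index."""
    def __init__(self):
        self.blocks = {}    # y -> block id (sparse; absent means air)
        self.height = -1    # derived: highest occupied y, -1 if empty

    def set_block(self, y: int, block_id: str):
        if block_id == "minecraft:air":
            self.blocks.pop(y, None)
        else:
            self.blocks[y] = block_id
        # maintain the derived value eagerly, at write time
        self.height = max(self.blocks, default=-1)

c = Column()
c.set_block(64, "minecraft:stone")
c.set_block(70, "minecraft:oak_log")
c.set_block(70, "minecraft:air")    # removal forces recomputation
print(c.height)                     # 64
```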
5) Query, Indexing, and Triggers (Exposed Mechanics)
- Predicate selection
  - Command selectors (`@e`, `@s`, NBT/path predicates) = WHERE over entity tables.
  - `/data` and `/execute store` = projection/assignment operators.
- Indices
- Spatial index: chunk map for entity and block lookup.
- Catalog indices: registries and tags (`#tag`) for item/block class membership.
- Redstone comparators on containers = numeric exposure of inventory state; usable as index signals.
- Triggers
- Block updates, scheduled ticks, game events, sculk sensors, observer blocks = trigger system on write/neighbor change.
- Views/materialization
- Maps, scoreboards, bossbars, advancements = persisted summaries/materialized state driven by triggers/commands.
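The selector-as-WHERE claim can be demonstrated directly against the illustrative `entity` table from section 9. The translation below (a selector roughly like `@e[type=zombie,distance=..10]` evaluated at a fixed origin) is a sketch under that schema, not Minecraft's actual selector implementation:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE entity (uuid TEXT PRIMARY KEY, dimension INTEGER,"
           " x REAL, y REAL, z REAL, type TEXT, nbt JSON)")
db.executemany("INSERT INTO entity VALUES (?,?,?,?,?,?,?)", [
    ("a1", 0, 3.0, 64.0, 4.0, "zombie", "{}"),    # distance 5 from origin
    ("b2", 0, 40.0, 64.0, 2.0, "zombie", "{}"),   # too far
    ("c3", 0, 1.0, 64.0, 1.0, "creeper", "{}"),   # wrong type
])

# @e[type=zombie,distance=..10] evaluated at origin (0, 64, 0):
rows = db.execute(
    "SELECT uuid FROM entity WHERE type = ? "
    "AND (x*x + (y - 64.0)*(y - 64.0) + z*z) <= ?",
    ("zombie", 10.0 ** 2),
).fetchall()
print(rows)   # [('a1',)]
```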
6) ETL, Workflows, and Background Jobs
- Pull/push pipelines
- Hoppers, droppers, water streams, minecarts = transport ETL with capacity/throughput limits (tick-bounded).
- Villager work/restock cycles = periodic jobs updating inventories/trade tables.
- Agentic sorters (e.g., golems when available) = autonomous routing jobs.
- Throughput and backpressure
- Hopper transfer rate and chunk loading create deterministic ceilings; unloaded partitions pause jobs (job suspension on partition eviction).
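The throughput ceiling is deterministic and easy to compute from vanilla's constants (20 ticks/second, one item moved per 8-game-tick hopper cooldown); treat the framing, not the constants, as the illustration:

```python
# Deterministic throughput ceiling of a single hopper line.
TICKS_PER_SECOND = 20
HOPPER_COOLDOWN_TICKS = 8           # one item transferred per cooldown

items_per_second = TICKS_PER_SECOND / HOPPER_COOLDOWN_TICKS   # 2.5
items_per_hour = items_per_second * 3600                      # 9000.0

print(items_per_second, items_per_hour)   # 2.5 9000.0

# Backpressure follows directly: a producer faster than 2.5 items/s
# backs up into the source container, and an unloaded chunk suspends
# the job entirely (rate drops to zero until the partition reloads).
```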
7) Schema and Migration
- Schema definition
- Registries + NBT shapes define the effective schema for blocks/entities/items.
- Migrations
- DataFixerUpper transforms old serialized data into current schema on load. This is a versioned migration pipeline.
- Compatibility policy
- Forward write in current version; read-convert legacy on demand. Equivalent to lazy online migration.
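The lazy online migration can be sketched as a version-indexed chain of fixers applied at read time, which is the shape of what DataFixerUpper does. The specific fixers below (renaming `Id`/`Count` fields) are invented for illustration:

```python
CURRENT_VERSION = 3

def fix_1_to_2(nbt: dict) -> dict:
    """Hypothetical fixer: legacy 'Id' becomes a namespaced 'id'."""
    nbt = dict(nbt)
    nbt["id"] = "minecraft:" + nbt.pop("Id").lower()
    return nbt

def fix_2_to_3(nbt: dict) -> dict:
    """Hypothetical fixer: 'Count' renamed to lowercase 'count'."""
    nbt = dict(nbt)
    nbt["count"] = nbt.pop("Count")
    return nbt

FIXERS = {1: fix_1_to_2, 2: fix_2_to_3}

def load(record: dict) -> dict:
    """Read-convert legacy data on demand; writes always use CURRENT_VERSION."""
    nbt, version = dict(record["nbt"]), record["version"]
    while version < CURRENT_VERSION:
        nbt = FIXERS[version](nbt)
        version += 1
    return nbt

legacy = {"version": 1, "nbt": {"Id": "STONE", "Count": 12}}
print(load(legacy))   # {'id': 'minecraft:stone', 'count': 12}
```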
8) Reliability Characteristics
- Durability
- Periodic saves; crash at unsafe points can lose sub-tick state but generally preserves last checkpointed partitions.
- Consistency model
- Strong consistency within the server tick; eventual consistency across load/unload boundaries and asynchronous IO.
- Recovery
- Region-level persistence with partial-chunk locality; no write-ahead log exposed to players; backups externalized.
9) Formal Relational Mapping (Canonical Tables)
Illustrative logical schema:
```sql
-- Spatial partitions
CREATE TABLE chunk (
  dimension INTEGER,
  cx INTEGER, cz INTEGER,
  PRIMARY KEY (dimension, cx, cz)
);

-- Block entities (sparse)
CREATE TABLE block_entity (
  dimension INTEGER, x INTEGER, y INTEGER, z INTEGER,
  type TEXT, nbt JSON,   -- NBT serialized to JSON in this abstraction
  -- conceptual: chunk coordinates derived from the block position
  cx INTEGER GENERATED ALWAYS AS (x >> 4) STORED,
  cz INTEGER GENERATED ALWAYS AS (z >> 4) STORED,
  PRIMARY KEY (dimension, x, y, z),
  FOREIGN KEY (dimension, cx, cz) REFERENCES chunk (dimension, cx, cz)
);

-- Entities
CREATE TABLE entity (
  uuid TEXT PRIMARY KEY,
  dimension INTEGER, x REAL, y REAL, z REAL,
  type TEXT, nbt JSON
);

-- Inventories (containers or entities)
CREATE TABLE inventory (
  owner_type TEXT, owner_id TEXT, slot INTEGER,
  item_id TEXT, count INTEGER, item_nbt JSON,
  PRIMARY KEY (owner_type, owner_id, slot)
);

-- Catalogs
CREATE TABLE registry_item (kind TEXT, id TEXT PRIMARY KEY, props JSON);
CREATE TABLE tag_membership (tag TEXT, id TEXT, PRIMARY KEY (tag, id));
```
Command selectors map to WHERE predicates over the `entity` table; a redstone comparator on a chest corresponds to an aggregate over that container's `inventory` rows, quantized to a 0–15 signal; a hopper move is an `INSERT … ON CONFLICT DO UPDATE` with capacity checks.
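The comparator aggregate can be sketched with the commonly documented container formula (fullness as the mean of per-slot count/max-stack ratios, scaled to 14 and bumped by 1 when non-empty); treat the exact constants as an assumption of this sketch:

```python
import math

def comparator_signal(stacks, num_slots: int) -> int:
    """Comparator output for a container, per the commonly cited formula:
    fullness = sum(count / max_stack_size) / num_slots
    signal   = floor(fullness * 14) + (1 if non-empty else 0), capped at 15.
    `stacks` is a list of (count, max_stack_size) pairs for occupied slots."""
    fullness = sum(c / m for c, m in stacks) / num_slots
    return min(15, math.floor(fullness * 14) + (1 if stacks else 0))

# Single chest (27 slots) holding three full stacks of 64:
print(comparator_signal([(64, 64)] * 3, 27))    # 2
print(comparator_signal([(64, 64)] * 27, 27))   # 15 (full chest)
```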
10) Optimization Surrogates
- Locality: Chunk partitioning enforces spatial locality; mechanics encourage co-locating related records (bases).
- Batching: Tick loop batches writes; hopper cooldowns throttle producers.
- Caching: Client caches chunk meshes/maps; server maintains light/path caches; both reduce recomputation.
- Denormalization: Many mechanics store derived flags (e.g., lit campfires, crop age) for constant-time checks.
11) Missing Capabilities vs RDBMS
- No general query language for items/blocks in survival gameplay.
- No ACID across multi-chunk composite operations.
- No planner, statistics, or cost-based optimization.
- Limited indexing primitives exposed to players (spatial only; tags as static catalogs).
- Persistence lacks WAL/point-in-time recovery primitives at the player layer.
12) Exposing Queries with Minimal Additions (Vanilla-compatible design)
- Addressable containers: Stable container IDs and labels (schema names).
- Request interface: “Request block” with predicate slots (item id/tag, quantity); emits ready signal on fulfillment.
- Catalog block: Global index that enumerates all labeled containers and stock counts; comparator output encodes match count.
- Agent routing tables: Persistent rules per agent for priority and destinations.
- Transaction fence: “Work order” token processed atomically within a chunk set; commits or aborts on failure.
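The catalog and request primitives above can be sketched against a toy SQLite catalog. All table, column, and function names here are invented for this design sketch; nothing below exists in vanilla:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE container (id TEXT PRIMARY KEY, label TEXT);   -- addressable containers
CREATE TABLE stock (container_id TEXT, item_id TEXT, quantity INTEGER,
                    PRIMARY KEY (container_id, item_id));   -- catalog-block index
""")
db.executemany("INSERT INTO container VALUES (?,?)",
               [("c1", "ores"), ("c2", "ores"), ("c3", "food")])
db.executemany("INSERT INTO stock VALUES (?,?,?)",
               [("c1", "minecraft:iron_ore", 40),
                ("c2", "minecraft:iron_ore", 10),
                ("c3", "minecraft:bread", 64)])

def request(item_id: str, qty: int):
    """'Request block' semantics: plan (container, take) pairs greedily;
    the ready signal would fire only when the full quantity is covered."""
    plan, remaining = [], qty
    for cid, have in db.execute(
            "SELECT container_id, quantity FROM stock "
            "WHERE item_id = ? ORDER BY quantity DESC", (item_id,)):
        if remaining <= 0:
            break
        take = min(have, remaining)
        plan.append((cid, take))
        remaining -= take
    return plan

print(request("minecraft:iron_ore", 45))   # [('c1', 40), ('c2', 5)]
```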
13) Practical Consequences
- Current mechanics already support RDBMS-style discipline: schema by convention, partition-aware layout, trigger-driven automation, background jobs, and migration safety.
- The gap to user-level queries is narrow and can be closed with catalog, request, and routing primitives without changing the tick-based execution model.
14) Conclusion
Minecraft is a consistent, operational abstraction of an RDBMS:
- Data = world state structured by registries and NBT.
- Storage = partitioned, paged region/chunk system.
- Execution = serial scheduler with trigger graph.
- Migrations = DataFixerUpper.
- ETL = in-world item transport and agents.
It omits SQL, ACID across multi-chunk scopes, and a cost-based planner. Adding cataloged containers, a request API, and agent routing persistence would expose first-class query semantics while preserving survival gameplay constraints. The Copper Golem's introduction as an item sorter exemplifies this potential.
The following SQL transaction achieves the same result as a Copper Golem sorting items into labeled chests:

```sql
BEGIN IMMEDIATE;

-- Staging for calculated moves
DROP TABLE IF EXISTS temp.tmp_moves;
CREATE TEMP TABLE temp.tmp_moves (
  item_id INTEGER NOT NULL,
  chest_id INTEGER NOT NULL,
  qty INTEGER NOT NULL CHECK (qty > 0)
);

-- Plan moves: allocate inbox quantities into candidate chests without exceeding capacity
INSERT INTO temp.tmp_moves (item_id, chest_id, qty)
WITH
i AS (
  SELECT item_id, quantity AS inbox_qty
  FROM inbox_items
  WHERE quantity > 0
),
cf AS (
  SELECT
    c.chest_id,
    c.capacity - COALESCE(SUM(ci.quantity), 0) AS free,
    CASE WHEN COUNT(ci.item_id) = 0 THEN 1 ELSE 0 END AS is_empty
  FROM chests c
  LEFT JOIN chest_items ci ON ci.chest_id = c.chest_id
  GROUP BY c.chest_id
  HAVING free > 0
),
-- chests that already hold the item (preferred)
has_item AS (
  SELECT DISTINCT
    i.item_id,
    cf.chest_id,
    cf.free,
    0 AS grp
  FROM i
  JOIN chest_items ci ON ci.item_id = i.item_id
  JOIN cf ON cf.chest_id = ci.chest_id
),
-- if NONE hold the item, fall back to truly empty chests
fallback AS (
  SELECT
    i.item_id,
    cf.chest_id,
    cf.free,
    1 AS grp
  FROM i
  JOIN cf ON cf.is_empty = 1
  WHERE NOT EXISTS (
    SELECT 1 FROM chest_items x
    WHERE x.item_id = i.item_id
  )
),
candidates AS (
  SELECT * FROM has_item
  UNION ALL
  SELECT * FROM fallback
),
alloc AS (
  SELECT
    c.item_id,
    c.chest_id,
    c.free,
    it.inbox_qty,
    SUM(c.free) OVER (
      PARTITION BY c.item_id
      ORDER BY c.grp, c.chest_id
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS cum_prev
  FROM candidates c
  JOIN i it USING (item_id)
),
moves AS (
  SELECT
    item_id,
    chest_id,
    CASE
      WHEN inbox_qty - COALESCE(cum_prev, 0) <= 0 THEN 0
      WHEN inbox_qty - COALESCE(cum_prev, 0) >= free THEN free
      ELSE inbox_qty - COALESCE(cum_prev, 0)
    END AS qty
  FROM alloc
)
SELECT item_id, chest_id, qty
FROM moves
WHERE qty > 0;

-- Apply moves to destination chests (UPSERT)
INSERT INTO chest_items (chest_id, item_id, quantity)
SELECT chest_id, item_id, qty
FROM temp.tmp_moves
ON CONFLICT (chest_id, item_id) DO UPDATE
SET quantity = quantity + excluded.quantity;

-- Deduct from inbox
UPDATE inbox_items AS ib
SET quantity = quantity - (
  SELECT COALESCE(SUM(m.qty), 0)
  FROM temp.tmp_moves m
  WHERE m.item_id = ib.item_id
)
WHERE ib.item_id IN (SELECT item_id FROM temp.tmp_moves);

-- Cleanup empty inbox rows
DELETE FROM inbox_items WHERE quantity <= 0;

DROP TABLE IF EXISTS temp.tmp_moves;
COMMIT;
```
Supplement: “If u put a dropper that connects both chests he just works forever” --kittenself 9/25
1. Phenomenon (game-level)
- Source chest (INBOX) → Copper Golem moves items → Destination chest.
- A dropper/hopper loop pushes items from Destination back to Source.
- Net: a closed item-circuit. Throughput bounded by the slower of [golem move rate, dropper ejection rate]. Work never terminates while at least one item remains in the loop and the loop is powered.
2. RDBMS mapping
- Tables:
- A = inbox(items)
- B = storage(items)
- Jobs/Triggers:
- J1: “Sorter” job: move rows from A to B (INSERT … ON CONFLICT DO UPDATE; DELETE from A).
- T1: Trigger on B INSERT: INSERT the same item back into A (feedback).
- Result: A cyclic dependency A ↔ B. This is equivalent to unguarded trigger recursion / feedback loop in ETL.
3. Dynamics (formal)
Let r_g = average items/tick moved by J1 (A→B).
Let r_d = average items/tick injected by T1 (B→A).
- If r_d > 0, the system does not converge. Workload is perpetual.
- Steady-state oscillation occurs with stock shuttling between A and B.
- Queueing implication: server utilization ρ ≈ min(1, r_d / r_g) for J1; if r_d ≥ r_g then A never drains below a floor; if r_d ≪ r_g you still get periodic work bursts tied to the dropper pulse.
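The non-convergence claim is easy to exhibit with a toy simulation of the loop: J1 moves up to r_g items per tick from A to B, then T1 feeds up to r_d items back. Stock is conserved, A never drains, and cumulative work grows without bound:

```python
def simulate(a0: int, r_g: int, r_d: int, ticks: int):
    """Simulate the A <-> B feedback loop for a fixed number of ticks.
    Returns final (a, b, total_items_moved_by_J1)."""
    a, b, moved = a0, 0, 0
    for _ in range(ticks):
        m = min(a, r_g); a -= m; b += m; moved += m   # J1: sorter A -> B
        f = min(b, r_d); b -= f; a += f               # T1: dropper B -> A
    return a, b, moved

a, b, moved = simulate(a0=16, r_g=2, r_d=1, ticks=1000)
# Stock shuttles forever: A settles at a floor, B holds the rest,
# and `moved` keeps growing ~1 item/tick at steady state.
print(a, b, moved)
```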
4. Bug or feature?
- RDBMS perspective: footgun. Unbounded recursion via triggers is a design error unless explicitly intended for streaming/feedback. You normally enforce acyclicity or recursion guards.
- Minecraft/system-simulation perspective: feature. The world is a general-purpose process graph. Cycles are legal (like redstone clocks). The engine won’t infer your intent or break cycles.
5. Operational risks (both worlds)
- Waste: perpetual CPU/tick budget spent with zero net progress.
- Head-of-line blocking: the sorter’s capacity is reserved for the loop, delaying real work.
- Observability noise: comparator/metrics stay “hot,” masking true backlogs.
- Failure sensitivity: any stall (chunk unload, power loss) will resume into the same loop; no natural quiescence.
6. Control patterns (how to make it sane)
- Acyclicity (preferred): never connect Destination → Source. Treat the storage graph as a DAG.
- Recursion guard (DB analogy: trigger depth guard): only allow T1 when A is empty below a threshold.
- Game: comparator on Source chest; if signal ≥ k for T ticks, cut power to the dropper.
- Quotas/rate limiting: gate the dropper with a slow clock so r_d ≪ r_g. You still get infinite work in principle, but tiny duty cycle.
- Idempotency flag:
- DB: add processed_flag and have J1 ignore processed=1; T1 sets processed=1 on reinserts.
- Game: not natively possible for item stacks; approximate by filtering on a different item type/token that the golem won’t re-move.
- Completion handshake:
- DB: deferred constraint requiring A=0 before T1 can fire.
- Game: wire a logic interlock—dropper only enabled when Source is empty (comparator=0) AND a one-shot request pulse is present.
7. When to keep the loop (legitimate uses)
- Heartbeat/monitor: intentional, low-frequency pulse to verify sorter liveness and alert on failure (comparator edge as “alive”).
- Benchmarking: measure r_g under load without consuming real inventory (use a fixed N dummy items).
- Visual/aesthetic: keep the agent animated; accept the cost.
8. When to break the loop (most cases)
- Any production storage where latency and throughput matter.
- Any base where tick budget is tight or chunk boundaries cause periodic stalls.
9. Minimal SQL sketch (for contrast):

```sql
-- BAD: feedback trigger (B -> A) creates infinite work for J1
CREATE TRIGGER b_to_a_feedback
AFTER INSERT ON B
BEGIN
  INSERT INTO A(item_id, qty)
  VALUES (NEW.item_id, NEW.qty);
END;

-- BETTER: guard the recursion (fire only once A has fully drained)
CREATE TRIGGER b_to_a_feedback_guarded
AFTER INSERT ON B
WHEN (SELECT COALESCE(SUM(qty), 0) FROM A) = 0
BEGIN
  INSERT INTO A(item_id, qty) VALUES (NEW.item_id, NEW.qty);
END;
```