picogame declarative scene/map format — design proposal

Goal: describe a scene/stage/map as data, so one source of truth feeds the device game, the desktop simulator, and a future editor — and so the tedious, editor-able parts (asset wiring, sprite placement, map painting, tile semantics, layer/z-order, HUD, camera) stop being hand-coded Python.

The decisions (and why)

1. Two tiers: authoring JSON → baked runtime module

A single format can’t be both editor-friendly AND device-cheap. So:

Authoring = JSON (*.scene.json): neutral, diff-able, round-trippable by an editor or a human. Colors as [r,g,b], maps as 2-D grids — readable.
Runtime = a baked Python module (level1_scene.py → .mpy): compact and cheap to import. The tilemap grid becomes a bytes literal (1 byte/tile, ONE allocation), colors are pre-converted to wire RGB565, PNGs are pre-converted to PAL8/RGB565 atlases. Reuses the existing .mpy pipeline (tools/build_mpy.sh).

Why not JSON on device? json.load of a tilemap as an int array allocates a Python list of N boxed ints (~28 B each) — a 28×18 map = ~14 KB for the list alone, plus the JSON text. A bytes grid in a .mpy is ~500 B, one alloc. The two-tier split is non-negotiable for RAM on the RP2040.

A tools/scene_build.py does JSON → runtime module, invoking the existing asset converters (png2picogame, tinyjoypad_extract, cavern_pack) on referenced art. (For very large maps, the baker can emit a binary blob loaded with readinto instead — same loader, different storage — but the .mpy module is the default.)

2. The format describes DATA, not BEHAVIOR

It carries: assets, sprite/entity placement, tile semantics, layer order + fixed/HUD flag, camera setup. It does NOT carry gameplay logic (movement, AI, win conditions) — that stays in plain Python. This matches the ergonomics finding: gameplay-in-Python is comfortable; the boilerplate is scene/asset/map setup.

3. Tile properties are first-class (the big editor win)

Each tileset attaches semantic flags to tile values (solid, coin, hazard, spawn, trigger…). The editor paints tiles AND marks meaning; the loader builds fast lookup tables so the game asks level.is_solid(tx, ty) instead of hardcoding tile == 1. This generalizes the collision logic of pacman / digdug / mario / cavern.

4. One shared loader (device == simulator)

picogame_scene.load(...) builds the Scene from the runtime data using only the public picogame API, so the SAME loader runs on hardware and in the simulator — the format is validated by both automatically.

Authoring schema (JSON)

{
  "format": "picogame-scene", "version": 1,
  "size": [320, 240],
  "background": [8, 10, 24],            // -> wire rgb565 at bake time

  "assets": {                            // reusable, referenced by id
    "hero":  { "type": "bitmap",  "src": "hero.png", "format": "pal8",
               "frames": 6, "transparent": 0 },
    "tiles": { "type": "tileset", "src": "tiles.png", "tile": [16, 16], "frames": 5,
               "props": { "1": {"solid": true},
                          "2": {"coin": true},
                          "3": {"goal": true} } },
    "goomba":{ "type": "bitmap",  "src": "goomba.png", "frames": 2 }
  },

  "layers": [                            // ordered bottom -> top
    { "kind": "tilemap", "asset": "tiles", "cols": 80, "rows": 15, "pos": [0, 0],
      "map": "rle:..." },                // 2-D grid; RLE/base64 in authoring
    { "kind": "sprite", "asset": "hero", "name": "player",
      "pos": [40, 208], "anchor": [0.5, 1.0], "frame": 0,
      "data": { "lives": 3 } },
    { "kind": "group", "asset": "goomba", "anchor": [0.5, 1.0],
      "instances": [[224, 208], [480, 208], [704, 208]],
      "tag": "enemies" },
    { "kind": "particles", "capacity": 64, "size": 2, "gravity": 0.5, "fade": true,
      "name": "fx" },
    { "kind": "hudlabel", "name": "score", "pos": [4, 4],
      "fg": [255,255,255], "bg": [0,0,0] }   // fixed implied for hud kinds
  ],

  "camera": { "mode": "follow", "target": "player", "axis": "x",
              "bounds": [0, 0, 1280, 240] },

  "meta": { "editor": { "grid": 16, "name": "World 1-1" } }   // ignored by runtime
}

Notes:

assets keyed by id; layers reference by id (no duplication). src = source art the baker converts; or data for inline editor-drawn art; or blob for prebaked.
layer kinds: tilemap, sprite, group (many of one bitmap), particles, canvas, hudlabel/hud (implies fixed: true). Any layer may set "fixed": true.
name → addressable entity; tag/group → a list. Colors as [r,g,b].
camera is advisory data the game applies via set_view (follow target/axis, world bounds); games can still drive the camera themselves.
meta is free for the editor; the runtime loader ignores unknown keys (forward-compat).

Baked runtime module (what the device imports)

# level1_scene.py  (then -> level1_scene.mpy)
SCENE = {
  "bg": 0x2001,                          # pre-converted wire rgb565
  "assets": {
    "hero":  ("pal8", b"...", 12, 16, 6, 0, (0x0000, 0xF80F, ...)),  # data,w,h,frames,transp,palette
    "tiles": ("pal8", b"...", 16, 16, 5, None, (...)),
  },
  "tileprops": { "tiles": { "solid": b"\x00\x01\x00\x00\x00",        # indexed by tile value
                            "coin":  b"\x00\x00\x01\x00\x00" } },
  "layers": [
    ("tilemap", "tiles", 80, 15, 0, 0, b"\x01\x01..."),               # cols,rows,ox,oy,grid bytes
    ("sprite", "hero", "player", 40, 208, 128, 256, 0, {"lives":3}),  # anchor in 1/256
    ("group", "goomba", "enemies", 128, 256, ((224,208),(480,208),(704,208))),
    ("particles", "fx", 64, 2, 0.5, True),
    ("hudlabel", "score", 4, 4, 0xFFFF, 0x0000),
  ],
  "camera": ("follow", "player", "x", 0, 0, 1280, 240),
}

Tuples (not dicts) for layers/assets keep the .mpy small and parse-free; the loader unpacks positionally. Grid + tileprops are bytes (one alloc each).

Runtime loader API (shared device + sim)

import picogame_scene as pgs, terminalio
view = pgs.load(pg, level1_scene.SCENE, font=terminalio.FONT)
view.scene                  # the picogame.Scene (already populated + layered)
view.named["player"]        # the Sprite
view.group("enemies")       # list of Sprites
view.named["fx"]            # the particles layer
view.is_solid(tx, ty)          # tile-property query (from tileprops)
view.tile_has(tx, ty, "coin")
view.camera                 # (mode, target, axis, bounds) for the game to apply
# game then runs its own loop: read input, move view.named[...], view.scene.refresh()

What this buys

The editor reads/writes the JSON; the baker produces the .mpy; device + sim load the same data through one loader. Round-trip + live preview both work.
Map painting, entity placement, tile semantics, layering, HUD, camera setup leave the game code — only gameplay logic remains in Python.
Reuses everything already built: asset converters, .mpy pipeline, the fixed HUD layer, anchors, the simulator.

Decided: single baked module, Python loader (no multi-file / no frozen)

We considered a multi-file format (Python holds references, a separate .bin / C-owned buffer holds the “sharp” data) for RAM. Rejected after analysis:

On the filesystem (the only realistic deploy path — drag-drop to SD, no reflash), the asset bytes land in RAM either way (inline bytes, a bytearray from a .bin, or a C buffer are the same GC heap, same size). RP2040 has no memory-mapped filesystem, so “read in place” isn’t possible. Multi-file / C-owns- data gives no RAM win here.
The only thing that moves data OUT of RAM is freezing into flash — and in practice nobody compiles firmware per game, so that lever is off the table.
Therefore the multi-file split is a complication that doesn’t pay off. Dropped.

Decision: scene = ONE baked module (data inline) + the pure-Python loader. Simple, one file, drag-drop, sim-compatible. RAM is fine in practice (games sit at ~10–54 KB assets + ~20–30 KB strip buffers in ~140–190 KB heap). The ONE reason to ever split a .bin out is to avoid a giant Python source literal for a big asset (the squest 84 KB lesson) + a single clean allocation — keep that as an as-needed tactic for big single assets (cavern does it), NOT a format rule.

Remaining minor choices: authoring grid as rows+legend (current) vs 2-D int array; how much the loader enforces the camera vs. leaving it advisory (current: advisory).

Implemented field set (v2)

Authoring (editor exports these; tools/scene_build.py bakes them):

assets (shared bank): sprite/tileset/bitmap (src PNG + frame/tile, frames, transparent), rect, tileset_color; optional per-tile props (solid/coin/goal/hazard) and animations: {name:{frames,fps,loop}}.
sounds (shared bank): {id:{src}} wav references.
layers per level: multiple tilemap layers (a fg:true one draws over sprites); sprite (with name/anchor/frame/anim/data); group (tagged instances); hudlabel (camera-independent); particles.
per level also: zones:[{tag,x,y,w,h}], points:[{name,x,y}], camera, music.

Two top-level shapes: a single picogame-scene (assets inline) baked to one _scene module, or a picogame-project (assets bank + levels[]) baked to one _bank module + one _level module per level.

Runtime loader picogame_scene exposes: view.scene, view.named[...], view.group(tag), view.tick(dt) (animations), view.is_solid(tx,ty)/view.tile_has(...), view.in_zone(x,y,tag), view.point(name), view.sounds/view.play(id), view.camera; plus load_bank(pg,BANK)

load(...,bank=) for shared-bank level-by-level loading. All validated headless in the simulator (editor core.js → bake → load → render).