
The firmware build

How the official PicoPad firmware is configured and tuned. This is the build that ships the native picogame engine. The page covers what’s compiled in, plus two RP2040-specific build choices: the compiler optimization level and the display SPI clock.

You don’t need any of this to write games — it’s for building or understanding the firmware itself.

The firmware is a general-purpose CircuitPython build with the engine added — the goal is one image where everything the device can do works, not a stripped game-only ROM. So ulab, synthio, the full audio family, displayio, bitmaptools, vectorio, Wi-Fi, keypad and the rest stay on. A few modules are turned off — but only ones this device can’t use or that are provided another way:

| Module | State | Why |
| --- | --- | --- |
| picogame (+ fast DMA display) | on | the native 2D engine |
| Wi-Fi / CYW43 stack | on | status LED + future multiplayer |
| ulab, synthio, audio, displayio, bitmaptools, vectorio, … | on | general-purpose, they all fit |
| native _stage | off | ugame/stage compatibility is provided in Python by picogame-stage, so the C module is redundant |
| picodvi, _eve | off | no HDMI/DVI or FT8xx hardware on this device |
| qrio | off | QR decode needs a camera the PicoPad doesn’t have; this also drops its ~32 KB quirc backend (QR generation via adafruit_miniqr is unaffected) |

With all of the above the image lands around 88% of the 1.5 MB firmware region.

Compiler optimization: tuned -O2, not -O3


CircuitPython’s rp2 port defaults to -O3. The PicoPad’s RP2040 has Cortex-M0+ cores with no SIMD and no FPU, and only a small 16 KB XIP flash cache, so most of what -O3 adds over -O2 is dead weight for this code:

  • the auto-vectorizer (-ftree-loop-vectorize / -ftree-slp-vectorize) — there are no SIMD instructions to target, so it can’t help;
  • -fipa-cp-clone (function cloning) and the heavy loop unroll/peel/version passes — they grow flash without speeding up the engine’s tight scalar pixel loops.

On the M0+, -O2 plus five cheap loop passes gives the same engine speed as -O3 (within about 1%) while using ~150 KB less flash:

OPTIMIZATION_FLAGS = -O2 -funswitch-loops -fpredictive-commoning -fgcse-after-reload \
-ftree-partial-pre -fsplit-paths

The MicroPython interpreter core (gc.o, vm.o) stays at -O3 regardless via CircuitPython’s SUPEROPT_* settings, so Python execution speed is unaffected.

On top of that, the engine’s hottest loop — the plain sprite blit, the most common operation — carries a #pragma GCC unroll 4: ~6% faster on the M0+ for +0.6 KB. It’s applied to that one loop, not globally (-funroll-loops across the whole firmware would overflow the flash region).
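As an illustration of the per-loop unroll described above, here is a minimal sketch. The function name `blit_row_rgb565`, the RGB565 pixel type and the loop body are assumptions made for this example; the engine's actual blit loop is not shown on this page. The only point is where the pragma goes: directly before the one loop it should affect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opaque row blit: copies n RGB565 pixels from src to dst.
 * This is an illustrative stand-in, not the engine's real loop. */
void blit_row_rgb565(uint16_t *dst, const uint16_t *src, size_t n) {
    assert(dst != NULL && src != NULL);

    /* Ask GCC to unroll only the loop that follows, by a factor of 4.
     * The pragma must sit immediately before the loop statement. */
    #pragma GCC unroll 4
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

Because the pragma is scoped to a single loop, the rest of the firmware keeps the size-friendly code generation of the global flags.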

The board requests 62.5 MHz for the ST7789, not 60. The RP2040’s PL022 SPI peripheral can only divide clk_peri (125 MHz) by even integers, and the requested baud rate is treated as a ceiling: the driver picks the fastest achievable rate that does not exceed it. Asking for 60 MHz therefore silently gives 125/4 = 31.25 MHz, half speed, while 62.5 MHz is exactly 125/2 and runs at the full rate (~2× the display throughput). The full derivation, the overclocking story and the RP2350 case are in Clocks, SPI & display limits.
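The divider arithmetic can be checked with a small sketch. `pl022_best_rate` is a hypothetical helper written for this page, not a Pico SDK or CircuitPython function. It assumes the PL022 rule that the bit rate is clk_peri divided by an even prescaler (2 to 254) times a post-divider (1 to 256), and it brute-forces that space to find the fastest rate that does not exceed the request.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an SDK call): fastest PL022 bit rate that does
 * not exceed requested_hz, for a peripheral clock of clk_peri_hz.
 * Returns 0 if even the largest divider is still too fast. */
uint32_t pl022_best_rate(uint32_t clk_peri_hz, uint32_t requested_hz) {
    assert(requested_hz > 0);
    uint32_t best = 0;

    /* Prescaler is even, 2..254; post-divider is 1..256. */
    for (uint32_t prescale = 2; prescale <= 254; prescale += 2) {
        for (uint32_t postdiv = 1; postdiv <= 256; ++postdiv) {
            uint32_t div = prescale * postdiv;
            /* clk/div <= requested, tested without integer truncation. */
            if ((uint64_t)requested_hz * div >= clk_peri_hz) {
                uint32_t rate = clk_peri_hz / div;
                if (rate > best) {
                    best = rate;
                }
            }
        }
    }
    return best;
}
```

Under these assumptions, `pl022_best_rate(125000000, 60000000)` evaluates to 31250000 and `pl022_best_rate(125000000, 62500000)` to 62500000, which matches the two figures quoted above.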

The engine lives in a CircuitPython fork; build the pajenicko_picopad board from it and flash the resulting .uf2 over BOOTSEL like any CircuitPython firmware. See Run on hardware for the device side and Fit it in RAM for the RAM budget.