Static arena planner
Every buffer is sized and placed at compile time. Zero heap, zero fragmentation, deterministic worst-case RAM you can print on the datasheet.
embedded edge-AI · runtime + compiler + board
Fluxcore takes a trained model and lowers it to bare-metal C that runs on parts you can hold under a loupe — Cortex-M0+, RISC-V, ESP32. No Linux, no accelerator, no cloud round-trip. Kilobytes of RAM. Milliwatts of power. Inference in microseconds.
Hand it a quantized model; get back a static .a and a header. Fluxcore plans every tensor into a fixed arena at build time, so no allocator ever runs on the device.
$ flux compile kws.tflite --target cortex-m4f --arena 16k
› parsing graph ............ 14 ops, 9 tensors
› quantize ................. int8, per-channel
› planning arena ........... 11.0 KB / 16 KB ok
› emitting kws_model.c ..... 41 KB flash
› emitting kws_model.h
✓ built in 0.42s — 312 µs / inference @ 64 MHz
$ flux flash --board flx-01 --port /dev/ttyACM0
› erasing ... writing 41216 B ... verified
✓ running. say "fluxcore" to wake.
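To make the "planning arena" step in the build log concrete, here is a minimal sketch of lifetime-based placement. It uses a greedy largest-first strategy, which is a simpler stand-in and not necessarily what Fluxcore's planner does. The names `tensor_t`, `plan_arena` and `MAX_TENSORS` are hypothetical, and alignment is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TENSORS 32  /* hypothetical cap so the planner itself needs no heap */

/* Hypothetical tensor record: byte size plus the first and last op that touch it. */
typedef struct {
    size_t size;
    int first_op;
    int last_op;
    size_t offset;  /* filled in by plan_arena() */
} tensor_t;

/* Two tensors may share memory only if their live ranges never intersect. */
static int lifetimes_overlap(const tensor_t *a, const tensor_t *b) {
    return a->first_op <= b->last_op && b->first_op <= a->last_op;
}

/* Greedy placement: largest tensor first, each at the lowest offset that does
   not collide with an already-placed tensor that is live at the same time.
   Returns the peak arena size in bytes. */
size_t plan_arena(tensor_t *t, size_t n) {
    size_t order[MAX_TENSORS];
    size_t peak = 0;
    assert(n <= MAX_TENSORS);

    /* Sort indices by size, descending (insertion sort; n is small). */
    for (size_t i = 0; i < n; i++) {
        size_t j = i;
        while (j > 0 && t[order[j - 1]].size < t[i].size) {
            order[j] = order[j - 1];
            j--;
        }
        order[j] = i;
    }

    for (size_t k = 0; k < n; k++) {
        tensor_t *cur = &t[order[k]];
        size_t offset = 0;
        int moved = 1;
        /* Bump past every conflicting block until the offset is stable. */
        while (moved) {
            moved = 0;
            for (size_t p = 0; p < k; p++) {
                const tensor_t *other = &t[order[p]];
                if (!lifetimes_overlap(cur, other)) continue;
                if (offset < other->offset + other->size &&
                    other->offset < offset + cur->size) {
                    offset = other->offset + other->size;
                    moved = 1;
                }
            }
        }
        cur->offset = offset;
        if (offset + cur->size > peak) peak = offset + cur->size;
    }
    return peak;
}
```

For a four-tensor chain of 4096, 2048, 4096 and 1024 bytes, where each tensor lives only until the next op has consumed it, this sketch packs everything into 6144 bytes instead of the 11264 bytes that one buffer per tensor would need. Because the result is computed on the host, the device only ever sees fixed offsets into one static array.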
Post-training quantization or quantization-aware training (QAT). Bring a .tflite, ONNX, or our own .flx graph. Fixed-point kernels hand-tuned for CMSIS-NN and the RISC-V P-ext.
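As a rough illustration of what "int8, per-channel" means, the sketch below applies the common affine mapping q = round(x / scale) + zero_point and derives one symmetric scale per output channel. It assumes the widely used convention of zero_point 0 and a ±127 range for weights; the function names are hypothetical and this is not taken from Fluxcore's kernels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Affine quantization of one value to int8, rounding half away from zero. */
int8_t quantize_i8(float x, float scale, int32_t zero_point) {
    assert(scale > 0.0f);
    float v = x / scale + (float)zero_point;
    if (v < -128.0f) v = -128.0f;  /* saturate before converting to integer */
    if (v > 127.0f) v = 127.0f;
    return (int8_t)(int32_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

/* Inverse mapping: recovers an approximation of the original real value. */
float dequantize_i8(int8_t q, float scale, int32_t zero_point) {
    return scale * (float)((int32_t)q - zero_point);
}

/* Per-channel symmetric weight quantization (assumed convention):
   each channel gets scale = max|w| / 127 and zero_point = 0.
   w holds `channels` rows of `per_channel` weights each. */
void quantize_weights_per_channel(const float *w, size_t channels,
                                  size_t per_channel, int8_t *out,
                                  float *scales) {
    for (size_t c = 0; c < channels; c++) {
        const float *row = w + c * per_channel;
        float max_abs = 0.0f;
        for (size_t i = 0; i < per_channel; i++) {
            float a = row[i] < 0.0f ? -row[i] : row[i];
            if (a > max_abs) max_abs = a;
        }
        /* An all-zero channel gets a dummy scale to avoid dividing by zero. */
        scales[c] = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;
        for (size_t i = 0; i < per_channel; i++) {
            out[c * per_channel + i] = quantize_i8(row[i], scales[c], 0);
        }
    }
}
```

Giving each channel its own scale keeps a channel with small weights from being crushed into a handful of integer levels by a neighbour with large ones, which is the usual argument for per-channel over per-tensor quantization.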
Include the generated header (kws_model.h for the build above) and call flx_invoke(). No RTOS dependency, MISRA-clean output, reproducible builds keyed by graph hash.
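The integration pattern might look roughly like the sketch below. Only the name flx_invoke() comes from the text above; its signature, the buffer sizes, and the helper `kws_classify` are assumptions made for illustration, and a host-side stub stands in for the generated library so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- Stand-in for the generated header: everything here is assumed ---- */
#define KWS_INPUT_LEN   490  /* hypothetical number of int8 input features */
#define KWS_NUM_CLASSES 4    /* hypothetical number of output classes */

/* Assumed signature: returns 0 on success. The real declaration would come
   from the generated header and may differ. */
int flx_invoke(const int8_t *input, int8_t *output);

/* Host-side stub replacing the generated static library in this sketch.
   It always reports class 2 as the highest score. */
int flx_invoke(const int8_t *input, int8_t *output) {
    if (input == NULL || output == NULL) return -1;
    for (size_t i = 0; i < KWS_NUM_CLASSES; i++) output[i] = -128;
    output[2] = 100;
    return 0;
}

/* ---- Application side: statically allocated buffers, no heap, no RTOS ---- */
int8_t kws_features[KWS_INPUT_LEN];
static int8_t kws_scores[KWS_NUM_CLASSES];

/* Runs one inference and returns the index of the best-scoring class,
   or -1 if the (assumed) status code signals failure. */
int kws_classify(const int8_t *features) {
    assert(features != NULL);
    if (flx_invoke(features, kws_scores) != 0) return -1;
    int best = 0;
    for (int i = 1; i < KWS_NUM_CLASSES; i++) {
        if (kws_scores[i] > kws_scores[best]) best = i;
    }
    return best;
}
```

On a real target the stub would be dropped in favour of the linked .a, and `kws_classify(kws_features)` would be called from the main loop or an audio-frame interrupt handler once the feature buffer is full.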
Forty-plus parts across three ISAs in the support matrix. Kernels fall back to portable C so an unlisted part still runs — just slower.
| MCU | Core | SRAM | Clock | KWS latency | Accel |
|---|---|---|---|---|---|
| STM32G0B1 | Cortex-M0+ | 144 KB | 64 MHz | 1.9 ms | — |
| STM32U585 | Cortex-M33 | 786 KB | 160 MHz | 148 µs | DSP |
| nRF52840 | Cortex-M4F | 256 KB | 64 MHz | 312 µs | DSP |
| ESP32-S3 | Xtensa LX7 | 512 KB | 240 MHz | 96 µs | SIMD |
| CH32V307 | RISC-V RV32 | 64 KB | 144 MHz | 540 µs | P-ext |
| RP2040 | dual M0+ | 264 KB | 133 MHz | 880 µs | PIO |
An open-hardware dev board built to be probed. Castellated edges, every rail on a labelled test point, a mic + IMU + camera header so you can flash a model and watch it fire in one sitting.
RISC-V P-extension kernels. int8 conv now uses packed-SIMD on CH32V307 — KWS drops 540→210 µs.
Helium (MVE) backend. Cortex-M55/M85 vector path. Person-detect 96×96 under 12 ms.
Arena planner v2. Tensor lifetime graph-colouring cuts peak SRAM 18% on transformer-lite models.
FLX-01 rev C. Added 6-axis IMU + camera header; QSPI bumped to 4 MB. KiCad sources pushed.
“I had wake-word running on a coin cell over a weekend. The arena planner told me exactly how much RAM I had left — no guessing, no crashes at 3am.”
“We retrofitted 400 motors with bearing-anomaly detection on M0+ parts we already stocked. No gateway, no cloud bill. Latency is 140 µs and the audit liked the determinism.”
“The FLX-01 is the first dev board I didn't want to hide in a project box. Test points everywhere. It belongs on the bench, under the loupe.”