Futureproof is a live editor for GPU shaders, built on Zig, Neovim, and WebGPU.
(original shader by S. Guillitte)
It's designed for a quick feedback loop, recompiling shaders and marking errors live!
The name is tongue-in-cheek, because it builds on as many unproven new technologies as possible:
- It is written in Zig, which is a "better C"
- The editor is an embedded Neovim, which is a modernized Vim
- Graphics are done with WebGPU, which is an in-development next-gen API
The system also uses FreeType for font rasterization and GLFW for windowing, but those are both relatively mature, so I don't get any points for them.
Embedding Neovim
Neovim is run as a subprocess (nvim --embed),
which communicates on stdin and stdout
using the msgpack-rpc format.
I wrote msgpack and msgpack-rpc libraries in Zig from the ground up,
which ended up being < 1 KLOC.
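To give a sense of why these libraries can stay small, here is a minimal sketch in Python (not Futureproof's Zig code) of encoding a msgpack-rpc request by hand. It only handles nil, booleans, non-negative integers up to 32 bits, short strings, and short arrays; the pack and request helper names are made up for illustration. The request layout [0, msgid, method, params] is the one defined by msgpack-rpc, and nvim_get_current_line is a real Neovim API function used as an example call.

```python
import struct


def pack(obj):
    """Encode a small subset of msgpack: nil, bool, uint, str, array."""
    if obj is None:
        return b"\xc0"
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int):
        if obj < 0 or obj >= 2**32:
            raise ValueError("this sketch only handles 0 <= n < 2**32")
        if obj < 128:
            return bytes([obj])                    # positive fixint
        if obj < 256:
            return b"\xcc" + bytes([obj])          # uint 8
        if obj < 65536:
            return b"\xcd" + struct.pack(">H", obj)  # uint 16
        return b"\xce" + struct.pack(">I", obj)      # uint 32
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) < 32:
            return bytes([0xA0 | len(data)]) + data  # fixstr
        if len(data) < 256:
            return b"\xd9" + bytes([len(data)]) + data  # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, (list, tuple)):
        if len(obj) >= 16:
            raise ValueError("array too long for this sketch")
        return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def request(msgid, method, params):
    """Build a msgpack-rpc request: [type=0, msgid, method, params]."""
    return pack([0, msgid, method, params])


encoded = request(1, "nvim_get_current_line", [])
print(encoded.hex())
```

A real implementation also needs a decoder, signed integers, maps, binary data, and extension types, which is where most of the remaining lines go.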
A listener thread monitors the subprocess's stdout,
decoding messages, then passing events and responses into a pair of queues.
The main loop claims and handles events before drawing each frame.
RPC calls are blocking: the call is encoded and passed to the subprocess's stdin,
then we wait for the listener thread to return the response on the queue.
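The listener-plus-queues arrangement can be sketched roughly as follows. This is a Python illustration under simplifying assumptions, not the actual Zig code: messages are already decoded, the Neovim subprocess is replaced by a hypothetical fake_send function that answers every request, and the names listener and call are invented for the example.

```python
import queue
import threading

RESPONSE, NOTIFICATION = 1, 2


def listener(incoming, responses, events):
    """Route decoded messages from the subprocess into two queues."""
    while True:
        msg = incoming.get()
        if msg is None:               # sentinel: the subprocess closed stdout
            break
        if msg[0] == RESPONSE:        # [1, msgid, error, result]
            responses.put(msg)
        elif msg[0] == NOTIFICATION:  # [2, method, params]
            events.put(msg)


def call(send, responses, msgid, method, params):
    """Blocking RPC: send the request, then wait for the matching response."""
    send([0, msgid, method, params])
    _, reply_id, error, result = responses.get()  # blocks until the listener delivers
    if reply_id != msgid:
        raise RuntimeError("response id does not match request id")
    if error is not None:
        raise RuntimeError(error)
    return result


incoming, responses, events = queue.Queue(), queue.Queue(), queue.Queue()


def fake_send(req):
    """Stand-in for Neovim (hypothetical): emit one event, then the reply."""
    _, msgid, _method, _params = req
    incoming.put([NOTIFICATION, "redraw", []])
    incoming.put([RESPONSE, msgid, None, "hello"])


thread = threading.Thread(target=listener, args=(incoming, responses, events), daemon=True)
thread.start()

result = call(fake_send, responses, 1, "nvim_get_current_line", [])

incoming.put(None)
thread.join()

# The main loop would drain this queue before drawing each frame.
pending_events = []
while not events.empty():
    pending_events.append(events.get())
print(result, pending_events)
```

Keeping events and responses in separate queues means a blocking call never has to skip past redraw events to find its reply.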
Drawing the character grid
Neovim's abstract UI model is a monospaced grid of a particular width and height. Each cell in the grid has a character and attribute ID; the latter is a lookup into a table of colors, bold, underlines, etc.
The grid and attribute table are modified through API messages
(grid_line, grid_cursor_goto, hl_attr_define, mode_change, etc.),
so our UI has to keep track of this state.
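As a rough illustration of the state tracking involved, here is a Python sketch of applying one grid_line update to a CPU-side grid. It follows the cell format described in Neovim's UI documentation, where each cell is [text, hl_id, repeat] with the last two optional, but the grid representation and the apply_grid_line name are assumptions for this example rather than Futureproof's actual data structures.

```python
def make_grid(width, height):
    """A grid of (character, attribute id) pairs, initially blank."""
    return [[(" ", 0) for _ in range(width)] for _ in range(height)]


def apply_grid_line(grid, row, col_start, cells):
    """Apply the cells of one grid_line update to a single row.

    Each cell is [text], [text, hl_id], or [text, hl_id, repeat].
    A missing hl_id reuses the most recent one seen in this call;
    a missing repeat count means the cell is written once.
    """
    col = col_start
    hl_id = 0
    for cell in cells:
        text = cell[0]
        if len(cell) > 1:
            hl_id = cell[1]
        repeat = cell[2] if len(cell) > 2 else 1
        for _ in range(repeat):
            grid[row][col] = (text, hl_id)
            col += 1


grid = make_grid(width=5, height=2)
apply_grid_line(grid, row=0, col_start=1, cells=[["a", 3], ["b"], ["-", 7, 2]])
print(grid[0])
```

The attribute id stored in each cell is only a key; the colors and styles it refers to arrive separately through hl_attr_define.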
These API calls are evaluated by the main loop
and edit state in CPU memory,
but we want rendering to be done by the GPU.
As it turns out,
both Zig and GLSL can import C headers, so Futureproof uses a fun trick:
we define a set of C structs,
populate them on the CPU (in Zig),
then copy them to the GPU and use them directly (in GLSL).
(This requires a GLSL extension to properly pack the structs.)
We split the state into two pieces:
- Meta-data tables (attributes, font atlas, etc.) are in a uniform buffer, as they're relatively small (11 KB in total)
- The character grid is larger (1 MB), and so is passed as a storage buffer
The font is rasterized and packed into a texture on the CPU side. As part of the meta-data, we include the positions and bounding boxes of each character. The font ends up looking something like this:
The grid stores character and attribute indices packed into a uint32_t per cell.
This limits us to 65535 characters, but that's acceptable.
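The packing itself is a couple of shifts and masks. The sketch below is in Python and assumes the character index lives in the low 16 bits and the attribute ID in the high 16 bits; the post doesn't say which half is which, so treat that layout (and the pack_cell / unpack_cell names) as illustrative.

```python
CHAR_BITS = 16
CHAR_MASK = (1 << CHAR_BITS) - 1  # 0xFFFF


def pack_cell(char_index, attr_id):
    """Pack a character index and attribute id into one 32-bit value."""
    if not 0 <= char_index <= CHAR_MASK:
        raise ValueError("character index does not fit in 16 bits")
    if not 0 <= attr_id <= 0xFFFF:
        raise ValueError("attribute id does not fit in 16 bits")
    return (attr_id << CHAR_BITS) | char_index


def unpack_cell(cell):
    """Recover (char_index, attr_id); the shader side does the same with bit ops."""
    return cell & CHAR_MASK, cell >> CHAR_BITS


cell = pack_cell(char_index=0x41, attr_id=3)
print(hex(cell), unpack_cell(cell))
```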
In this architecture, the CPU side doesn't think about vertex positions at all!
It simply passes the character grid to a vertex shader and tells it to draw total_tiles * 6 elements.
The vertex shader then calculates vertex positions, lays out tiles in the character grid, samples from the font texture, and so on.
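The index arithmetic a vertex shader needs for this is small. Here it is sketched in Python rather than GLSL, assuming two triangles per tile with a made-up corner ordering, row-major tile layout, and positions in pixels; the real shader's conventions may differ.

```python
# Corner offsets for the two triangles that make up one tile (assumed ordering).
CORNERS = [(0, 0), (1, 0), (0, 1), (1, 0), (1, 1), (0, 1)]


def vertex_position(vertex_index, grid_width, cell_w, cell_h):
    """Map a bare vertex index to a pixel position, with no vertex buffer."""
    tile, corner = divmod(vertex_index, 6)  # six vertices per tile
    row, col = divmod(tile, grid_width)     # tiles are laid out row by row
    dx, dy = CORNERS[corner]
    return ((col + dx) * cell_w, (row + dy) * cell_h)


# Tile 81 on an 80-column grid is row 1, column 1; its second vertex is the top-right corner.
print(vertex_position(6 * 81 + 1, grid_width=80, cell_w=8, cell_h=16))
```

The same tile index is then used to look up the packed cell in the storage buffer, which gives the character and attribute for that quad.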
Getting pixel-perfect font rendering was a comedy of (off-by-one) errors, including this particularly cursed GUI:
Platform integration
I developed Futureproof on macOS, and there were a few cases where I needed to hook into native APIs.
The wgpu-native demo simply writes Objective-C in main.c,
then compiles with -x objective-c,
meaning the input source file is secretly Objective-C, rather than plain C.
This doesn't quite work with Zig, despite my best efforts:
the pass-through C compiler doesn't accept the -x flag.
Instead, we can use the Objective-C Runtime API directly,
inspired by OS X app in Plain C.
In practice, this is surprisingly clean:
We're not defining our own classes, so we just use
sel_getUid, objc_lookUpClass, and objc_msgSend.
Here's how we use it
to get a string from the pasteboard.
(Of course, this is fundamentally terrifying and shouldn't be used for production code)
Live shader previews
Futureproof attaches itself to the Neovim buffer, so it receives messages whenever the buffer changes. This lets us keep a mirror of the text on the Futureproof side of the RPC barrier.
If the text hasn't changed in some amount of time (200 ms by default),
we pass it to shaderc
and attempt to compile it from GLSL to SPIR-V.
Errors are parsed out of the return value and passed back into Neovim,
showing up in the location list and as signs in the left column.
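The "wait until the text has been quiet for 200 ms" step is a plain debounce. A minimal Python sketch follows, with timestamps passed in explicitly so the behavior is easy to follow; the Debouncer class and its method names are invented for this example, and the real implementation would call shaderc where this one just returns the text.

```python
class Debouncer:
    """Hold the latest buffer text until it has been unchanged for `delay` seconds."""

    def __init__(self, delay=0.2):
        self.delay = delay
        self.pending = None
        self.last_change = None

    def on_change(self, text, now):
        """Called whenever Neovim reports a buffer change."""
        self.pending = text
        self.last_change = now

    def poll(self, now):
        """Called once per frame; returns text to compile, or None."""
        if self.pending is None:
            return None
        if now - self.last_change < self.delay:
            return None  # still typing, keep waiting
        text, self.pending = self.pending, None
        return text


d = Debouncer(delay=0.2)
d.on_change("void main() {", now=0.00)
d.on_change("void main() {}", now=0.10)  # a second edit restarts the timer
early = d.poll(now=0.25)                 # only 150 ms since the last edit
ready = d.poll(now=0.31)                 # 210 ms of quiet: time to compile
again = d.poll(now=0.50)                 # nothing new since the last compile
print(early, ready, again)
```

Clearing the pending text after handing it off avoids recompiling an unchanged shader on every frame.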
After a shader is successfully compiled, the SPIR-V bytecode is passed into WebGPU and rendered with the rest of the GUI.
One unexpected challenge was keeping the GUI performant while rendering a shader in the same GPU queue. For challenging shaders like this seascape, the shader takes hundreds of milliseconds to render.
There's got to be a good way to handle this, but for Futureproof, I used a hack: when a shader takes too long, the preview image is split into tiles, each of which updates once per GUI frame. The tiles are stored in a separate texture, which is copied into the main texture once all tiles have been rendered.
This effectively slows down the shader by a factor of n^2,
where n is the number of tiles per side.
We pick n to keep the GUI responsive,
with a median filter to stop single slow frames from messing things up.
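One way to pick n is sketched below in Python. This is a guess at a reasonable policy, not the exact rule Futureproof uses: it assumes a per-frame time budget (16 ms here), a five-sample window, and that each of the n^2 tiles costs roughly the same to render. The TilePicker name and its parameters are hypothetical.

```python
from collections import deque
from statistics import median


class TilePicker:
    """Choose tiles-per-side from the median of recent full-frame shader timings."""

    def __init__(self, budget_ms=16.0, window=5):
        self.budget_ms = budget_ms
        self.samples = deque(maxlen=window)  # oldest samples fall off automatically

    def record(self, full_frame_ms):
        """Record an estimate of how long the whole preview takes to render."""
        self.samples.append(full_frame_ms)

    def tiles_per_side(self):
        if not self.samples:
            return 1
        typical = median(self.samples)  # ignores isolated spikes
        n = 1
        # One tile per GUI frame costs about typical / n^2.
        while typical / (n * n) > self.budget_ms:
            n += 1
        return n


slow = TilePicker()
for ms in [300, 310, 2000, 305, 295]:  # consistently slow shader, one huge spike
    slow.record(ms)

fast = TilePicker()
for ms in [10, 12, 500, 11, 9]:        # fast shader with a single hiccup
    fast.record(ms)

print(slow.tiles_per_side(), fast.tiles_per_side())
```

With the median, the single 500 ms hiccup in the second series doesn't push the preview into tiled mode, whereas a mean would.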
I'd love to hear the correct way to handle this with WebGPU, but suspect that it will require support for multiple queues, which isn't yet in the standard.
Are these technologies ready for use?
Neovim
Yes, it's relatively stable.
I found one bug early in development, which received no feedback, so that's a minor red flag.
More recently, I discovered a more serious issue which makes it easy to deadlock Futureproof. This is more concerning, but it may be a problem with how I architected the GUI and live error-marking system; I'm waiting to hear back from the Neovim development team about this issue.
Zig
Absolutely not! This is to be expected: it's at 0.7.0, so it's not supposed to be stable or bug-free.
I discovered multiple bugs, and understanding the standard library requires reading the source.
In addition, it's changing very quickly; updating to the latest nightly build breaks Futureproof regularly.
Still, Zig is fun!
Things that I particularly like:
- The general-purpose allocator, which can print memory leaks on program exit. This is like running in Valgrind all the time, and makes writing leak-free code part of normal development.
- C library interop is fantastic: you can natively import header files, then call into C libraries, and it all just works!
- The philosophy of passing explicit allocators around; particularly how arena allocators can then be used for easy memory management.
I want to like comptime evaluation,
but it's a little too shaky right now:
I tried to use it for a generic msgpack struct packer,
and often ended up confused about whether my code was wrong or the compiler was broken.
More generally, I fear that comptime is too powerful:
without some kind of concept or trait system, it can be a free-for-all.
For example, using comptime type variables to do generics is extremely clever,
but it means that generating documentation will be a challenge:
after all, it's powerful enough to return entirely different APIs depending on the type!
(Of course, this is also true for C++ templates, but they're hard enough to use that most people don't get too weird with them)
Similarly, error handling is a little rough: there's no way to attach data (e.g. a message) to an error while using the language-level error handling.
Finally, I'm a little iffy on the "strings are simply arrays of u8" philosophy.
Though static strings are guaranteed to be UTF-8,
the onus is on the programmer (rather than the type system)
to enforce that strings from other sources be correctly encoded.
For a detailed look at where this can break,
see the discussion of std::fs::metadata in this post.
WebGPU
WebGPU is fine, but the documentation is lacking. I had to reverse-engineer a lot of behavior from examples, reading the source code, and pre-existing knowledge of modern graphics APIs. As discussed above, the lack of multi-queue rendering was the only real frustration.
I was using the wgpu-native bindings,
which seem a bit unloved compared to wgpu-rs.
There's one obvious bug which I encountered within 5 minutes,
and it's unclear how often it's synced with wgpu-rs's releases.
It's frustrating needing shaderc to go from text to SPIR-V,
particularly because a full download unzips to 1.9 GB (!!).
It looks like naga aims to replace it, so I'm optimistic about the future!
Future plans
This was a fun exercise, but I'm not planning to develop it any further.
(It's likely that I'll build a similar system in Rust, next time I want a framework for semi-interactive graphics programming)
The code is on GitHub, and forks are welcome; if one achieves critical momentum, I'd be happy to link it here.