Futureproof

Futureproof is a live editor for GPU shaders, built on Zig, Neovim, and WebGPU.

Marble shader
(original shader by S. Guillitte)

It's designed for a quick feedback loop, recompiling shaders and marking errors live!

The name is tongue-in-cheek, because it builds on as many unproven new technologies as possible:

It is written in Zig, which is a "better C"
The editor is an embedded Neovim, which is a modernized Vim
Graphics are done with WebGPU, which is an in-development next-gen API

The system also uses FreeType for font rasterization and GLFW for windowing, but those are both relatively mature, so I don't get any points for them.

Embedding Neovim

Neovim is run as a subprocess (nvim --embed), which communicates on stdin and stdout using the msgpack-rpc format.

I wrote msgpack and msgpack-rpc libraries in Zig from the ground up, which ended up being < 1 KLOC.
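
To give a feel for what travels over the pipe, here's a minimal sketch in Python (not Futureproof's Zig code) that hand-encodes a msgpack-rpc request. It only covers the tiny subset of msgpack needed for the example (small non-negative integers, short strings, short arrays). The names `pack` and `encode_request` are made up for illustration, and `nvim_get_current_line` is simply used as a sample method name.

```python
def pack(obj):
    """Encode a tiny subset of msgpack: non-negative ints, short strings, short lists."""
    if isinstance(obj, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])                      # positive fixint
        if 0 <= obj <= 0xFFFFFFFF:
            # Always uses uint 32 for simplicity (valid, but not the shortest form)
            return b"\xce" + obj.to_bytes(4, "big")
        raise ValueError("integer out of range for this sketch")
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data  # fixstr
        if len(data) <= 0xFF:
            return b"\xd9" + bytes([len(data)]) + data  # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, (list, tuple)):
        if len(obj) <= 15:
            return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)  # fixarray
        raise ValueError("array too long for this sketch")
    raise TypeError(f"unsupported type: {type(obj).__name__}")


def encode_request(msgid, method, params):
    """A msgpack-rpc request is the array [type=0, msgid, method, params]."""
    return pack([0, msgid, method, params])


request = encode_request(1, "nvim_get_current_line", [])
```

A real implementation also needs the decoding side, plus maps, nil, booleans, negative integers, and the longer string and array forms.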

A listener thread monitors the subprocess's stdout, decoding messages then passing events and responses into a pair of queues. The main loop claims and handles events before drawing each frame.

RPC calls are blocking: the call is encoded and passed to the subprocess's stdin, then we wait for the listener thread to return the response on the queue.
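
The listener-plus-queues arrangement can be sketched roughly as follows. This is Python rather than Zig, the subprocess is replaced by a fake in-memory "wire", and every name here (`RpcClient`, `fake_send`, `drain_events`) is hypothetical. Messages are assumed to be already decoded into msgpack-rpc arrays: `[1, msgid, error, result]` for responses and `[2, method, params]` for notifications.

```python
import queue
import threading


class RpcClient:
    def __init__(self, send, recv):
        # send(msg) delivers a request to the child process;
        # recv() blocks until the next decoded message (None means EOF).
        self._send = send
        self._recv = recv
        self.events = queue.Queue()
        self.responses = queue.Queue()
        self._next_id = 0
        threading.Thread(target=self._listen, daemon=True).start()

    def _listen(self):
        """Listener thread: sort incoming messages into the two queues."""
        while True:
            msg = self._recv()
            if msg is None:
                break
            if msg[0] == 1:
                self.responses.put(msg)
            elif msg[0] == 2:
                self.events.put(msg)

    def call(self, method, params):
        """Blocking RPC call: send the request, then wait for its response."""
        msgid = self._next_id
        self._next_id += 1
        self._send([0, msgid, method, params])
        _, got_id, error, result = self.responses.get()  # blocks here
        assert got_id == msgid
        if error is not None:
            raise RuntimeError(error)
        return result

    def drain_events(self):
        """Called by the main loop before each frame; never blocks."""
        out = []
        while True:
            try:
                out.append(self.events.get_nowait())
            except queue.Empty:
                return out


# Fake child process: emits one event, then answers with the method name uppercased.
wire = queue.Queue()

def fake_send(request):
    _, msgid, method, _params = request
    wire.put([2, "redraw", []])
    wire.put([1, msgid, None, method.upper()])

client = RpcClient(fake_send, wire.get)
reply = client.call("ping", [])
pending = client.drain_events()
```

Because only one call is in flight at a time, the response queue never holds more than one message, which is what makes the simple `got_id == msgid` check sufficient in this sketch.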

Drawing the character grid

Neovim's abstract UI model is a monospaced grid of a particular width and height. Each cell in the grid has a character and attribute ID; the latter is a lookup into a table of colors, bold, underlines, etc.

The grid and attribute table are modified through API messages (grid_line, grid_cursor_goto, hl_attr_define, mode_change, etc.), so our UI has to keep track of this state.
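
As an illustration of that state tracking, here's a small Python sketch of applying a grid_line update to an in-memory grid. It follows my understanding of Neovim's line-grid UI events, where each cell is `[text]`, `[text, hl_id]`, or `[text, hl_id, repeat]`, and a missing hl_id reuses the previous one from the same call. The `Grid` class is hypothetical, and details such as double-width characters are ignored.

```python
class Grid:
    def __init__(self, width, height):
        self.width = width
        self.height = height
        # Each cell holds (character, attribute ID); attribute 0 is the default.
        self.cells = [[(" ", 0)] * width for _ in range(height)]

    def grid_line(self, row, col_start, cells):
        """Apply one grid_line update, starting at (row, col_start)."""
        col = col_start
        hl_id = 0
        for cell in cells:
            text = cell[0]
            if len(cell) > 1:
                hl_id = cell[1]          # otherwise keep the last hl_id seen
            repeat = cell[2] if len(cell) > 2 else 1
            for _ in range(repeat):
                self.cells[row][col] = (text, hl_id)
                col += 1

    def row_text(self, row):
        return "".join(ch for ch, _ in self.cells[row])


grid = Grid(8, 2)
grid.grid_line(0, 1, [["h", 3], ["i"], ["!", 0, 2]])
```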

These API calls are evaluated by the main loop and edit state in CPU memory, but we want rendering to be done by the GPU. As it turns out, both Zig and GLSL can import C headers, so Futureproof uses a fun trick: we define a set of C structs, populate them on the CPU (in Zig), then copy them to the GPU and use them directly (in GLSL).

(This requires a GLSL extension to properly pack the structs)

We split the state into two pieces:

Meta-data tables (attributes, font atlas, etc.) are in a uniform buffer, as they're relatively small (11 KB in total)
The character grid is larger (1 MB), and so is passed as a storage buffer

The font is rasterized and packed into a texture on the CPU side. As part of the meta-data, we include the positions and bounding boxes of each character. The font ends up looking something like this:

Font texture

The grid stores character and attribute indices packed into a uint32_t per cell. This limits us to 65535 characters, but that's acceptable.
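
A sketch of that packing, in Python for brevity. The exact bit layout is an assumption on my part: 16 bits for the character index in the low half and 16 bits for the attribute ID in the high half. The helper names are hypothetical.

```python
CHAR_BITS = 16                      # assumed split: low 16 bits hold the character
CHAR_MASK = (1 << CHAR_BITS) - 1
ATTR_MAX = 0xFFFF                   # assumed: high 16 bits hold the attribute ID


def pack_cell(char_index, attr_id):
    """Pack a character index and an attribute ID into one 32-bit value."""
    if not 0 <= char_index <= CHAR_MASK:
        raise ValueError("character index does not fit in 16 bits")
    if not 0 <= attr_id <= ATTR_MAX:
        raise ValueError("attribute ID does not fit in 16 bits")
    return (attr_id << CHAR_BITS) | char_index


def unpack_cell(cell):
    """Inverse of pack_cell: returns (char_index, attr_id)."""
    return cell & CHAR_MASK, cell >> CHAR_BITS


cell = pack_cell(65, 3)
```

The shader side would do the same mask-and-shift to recover the two indices before looking them up in the meta-data tables.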

In this architecture, the CPU side doesn't think about vertex positions at all! It simply passes the character grid to a vertex shader and tells it to draw total_tiles * 6 elements.

The vertex shader then calculates vertex positions, lays out tiles in the character grid, samples from the font texture, and so on.
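
The index arithmetic behind this can be sketched on the CPU. The Python below is only a model of what a vertex shader might compute from its vertex index, not the actual GLSL: the corner ordering, the pixel units, and the top-left origin are all assumptions, and a real shader would go on to convert to clip space and compute texture coordinates.

```python
# Six corners per tile: two triangles covering a unit square (assumed ordering).
CORNERS = [(0, 0), (1, 0), (0, 1), (1, 0), (1, 1), (0, 1)]


def vertex_position(vertex_index, grid_width, tile_w, tile_h):
    """Map a flat vertex index to a pixel position on the character grid."""
    tile = vertex_index // 6                 # which grid cell this vertex belongs to
    corner_x, corner_y = CORNERS[vertex_index % 6]
    tile_x = tile % grid_width               # column in the grid
    tile_y = tile // grid_width              # row in the grid
    return ((tile_x + corner_x) * tile_w, (tile_y + corner_y) * tile_h)


# An 80-column grid of 8x16 pixel tiles
first = vertex_position(0, 80, 8, 16)
second_row = vertex_position(80 * 6, 80, 8, 16)
```

Since everything derives from the vertex index and the grid dimensions, no vertex buffer is needed at all.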

Getting pixel-perfect font rendering was a comedy of (off-by-one) errors, including this particularly cursed GUI:

Haunted Neovim GUI

Platform integration

I developed Futureproof on macOS, and there were a few cases where I needed to hook into native APIs.

The wgpu-native demo simply writes Objective-C in main.c, then compiles with -x objective-c, meaning the input source file is secretly Objective-C, rather than plain C.

This doesn't quite work with Zig, despite my best efforts: the pass-through C compiler doesn't accept the -x flag. Instead, we can use the Objective-C Runtime API directly, inspired by OS X app in Plain C.

In practice, this is surprisingly clean: We're not defining our own classes, so we just use sel_getUid, objc_lookUpClass, and objc_msgSend. Here's how we use it to get a string from the pasteboard.

(Of course, this is fundamentally terrifying and shouldn't be used for production code)

Live shader previews

Futureproof attaches itself to the Neovim buffer, so it receives messages whenever the buffer changes. This lets us keep a mirror of the text on the Futureproof side of the RPC barrier.

If the text hasn't changed in some amount of time (200 ms by default), we pass it to shaderc and attempt to compile it from GLSL to SPIR-V. Errors are parsed out of the return value and passed back into Neovim, showing up in the location list and as signs in the left column.
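
The "hasn't changed in some amount of time" check is a classic debounce. Here's a minimal Python sketch, with timestamps passed in explicitly (as integer milliseconds) so it's easy to test. The `Debouncer` class and its method names are hypothetical; only the 200 ms default comes from the description above.

```python
class Debouncer:
    def __init__(self, delay_ms=200):
        self.delay_ms = delay_ms
        self.last_change_ms = None
        self.dirty = False           # True if there are edits we haven't compiled yet

    def on_change(self, now_ms):
        """Call whenever the buffer changes; restarts the quiet period."""
        self.last_change_ms = now_ms
        self.dirty = True

    def should_compile(self, now_ms):
        """Call once per frame; returns True once per burst of edits."""
        if self.dirty and now_ms - self.last_change_ms >= self.delay_ms:
            self.dirty = False
            return True
        return False


debounce = Debouncer()
debounce.on_change(0)
early = debounce.should_compile(100)     # still inside the quiet period
debounce.on_change(150)                  # another keystroke resets the timer
still_typing = debounce.should_compile(300)
settled = debounce.should_compile(350)   # 200 ms after the last change
repeat = debounce.should_compile(400)    # nothing new to compile
```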

Error

After a shader is successfully compiled, the SPIR-V bytecode is passed into WebGPU and rendered with the rest of the GUI.

One unexpected challenge was keeping the GUI performant while rendering a shader in the same GPU queue. For demanding shaders like this seascape, a single frame of the preview takes hundreds of milliseconds to render.

Seascape

There's got to be a good way to handle this, but for Futureproof, I used a hack: when a shader takes too long, the preview image is split into tiles, each of which updates once per GUI frame. The tiles are stored in a separate texture, which is copied into the main texture once all tiles have been rendered.

This effectively slows down the shader by a factor of n^2, where n is the number of tiles per side. We pick n to keep the GUI responsive, with a median filter to stop single slow frames from messing things up.
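
One way to pick n could look like the Python sketch below. It's a guess at the general shape rather than Futureproof's actual logic: it assumes the cost of a tile scales with its area, so a preview that takes t milliseconds needs roughly n >= sqrt(t / budget) to fit one tile into each frame. The `TileScheduler` name, the 16 ms budget, and the 5-sample window are all made up for the example.

```python
import math
from collections import deque
from statistics import median


class TileScheduler:
    def __init__(self, budget_ms=16.0, window=5):
        self.budget_ms = budget_ms
        self.samples = deque(maxlen=window)   # recent full-preview render times

    def record(self, full_preview_ms):
        """Record an estimate of how long the whole preview takes to render."""
        self.samples.append(full_preview_ms)

    def tiles_per_side(self):
        """Pick n so that one tile (about 1/n^2 of the work) fits the budget."""
        if not self.samples:
            return 1
        typical = median(self.samples)        # a single outlier barely moves this
        return max(1, math.ceil(math.sqrt(typical / self.budget_ms)))


fast = TileScheduler()
for ms in (10, 10, 10, 400):                  # one slow frame among fast ones
    fast.record(ms)

slow = TileScheduler()
for ms in (300, 300, 300):                    # a consistently slow shader
    slow.record(ms)
```

With the median in place, the single 400 ms spike doesn't push the fast shader into tiled mode, while the consistently slow one gets split up.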

I'd love to hear the correct way to handle this with WebGPU, but suspect that it will require support for multiple queues, which isn't yet in the standard.

Are these technologies ready for use?

Neovim

Yes, it's relatively stable.

I found one bug early in development, which received no feedback, so that's a minor red flag.

More recently, I discovered a more serious issue which makes it easy to deadlock Futureproof. This is more concerning, but it may be a problem with how I architected the GUI and live error-marking system; I'm waiting to hear back from the Neovim development team about this issue.

Zig

Absolutely not! This is to be expected: it's at 0.7.0, so it's not supposed to be stable or bug-free.

I discovered multiple bugs, and understanding the standard library requires reading the source.

In addition, it's changing very quickly; updating to the latest nightly build breaks Futureproof regularly.

Still, Zig is fun!

Things that I particularly like:

The general-purpose allocator, which can print memory leaks on program exit. This is like running in Valgrind all the time, and makes writing leak-free code part of normal development.
C library interop is fantastic: you can natively import header files, then call into C libraries, and it all just works!
The philosophy of passing explicit allocators around; particularly how arena allocators can then be used for easy memory management.

I want to like comptime evaluation, but it's a little too shaky right now: I tried to use it for a generic msgpack struct packer, and often ended up confused whether my code was wrong or the compiler was broken.

More generally, I fear that comptime is too powerful: without some kind of concept or trait system, it can be a free-for-all. For example, using comptime type variables to do generics is extremely clever, but it means that generating documentation will be a challenge: after all, it's powerful enough to return entirely different APIs depending on the type!

(Of course, this is also true for C++ templates, but they're hard enough to use that most people don't get too weird with them)

Similarly, error handling is a little rough: there's no way to attach data (e.g. a message) to an error while using the language-level error handling.

Finally, I'm a little iffy on the "strings are simply arrays of u8" philosophy. Though static strings are guaranteed to be UTF-8, the onus is on the programmer (rather than the type system) to enforce that strings from other sources be correctly encoded. For a detailed look at where this can break, see the discussion of std::fs::metadata in this post.

WebGPU

WebGPU is fine, but the documentation is lacking. I had to reverse-engineer a lot of behavior from examples, reading the source code, and pre-existing knowledge of modern graphics APIs. As discussed above, the lack of multi-queue rendering was the only real frustration.

I was using the wgpu-native bindings, which seem a bit unloved compared to wgpu-rs. There's one obvious bug which I encountered within 5 minutes, and it's unclear how often wgpu-native is synced with wgpu-rs's releases.

It's frustrating to need shaderc just to go from text to SPIR-V, particularly because a full download unzips to 1.9 GB (!!). It looks like naga aims to replace it, so I'm optimistic about the future!

Future plans

This was a fun exercise, but I'm not planning to develop it any further.

(It's likely that I'll build a similar system in Rust, next time I want a framework for semi-interactive graphics programming)

The code is on GitHub, and forks are welcome; if one achieves critical momentum, I'd be happy to link it here.

Bonus Shader

Here's a very bad Cornell Box Raytracer that I threw together, basically so I could have a project thumbnail:

Cornell box

Matt Keeter // Futureproof