AiFormParser

Super early, highly experimental project. This is a work in progress and nothing about it is stable: APIs, the YAML schema, the UI, and the processing pipeline can all change without notice, and things are expected to break. Do not rely on it for anything important yet. A live demo runs at aiformparser.olicorne.org, but it is very experimental and often down, slow, or mid-redeploy. If it does not load, that is expected; try again later or run it locally.

About

Version 0.2.0

AiFormParser converts paper clinical surveys into CSV by running OCR and a multimodal LLM entirely in the researcher's browser. The server only stores blank survey templates. While processing runs, the page holds a screen wake lock (capped at 1 hour) so your machine does not suspend mid-job; the screen may sleep normally once it finishes. See the project README and CLAUDE.md for the full specification.

Browser compatibility

The app needs a modern browser with WebAssembly SIMD and threads. Threads require the page to be cross-origin isolated, which the server already configures. WebGPU is optional but gives a meaningful per-box speedup when the local GPU stack supports it. If the browser-compatibility list on this page and the README's install section disagree, treat this page as authoritative for browser/OS specifics and the README as authoritative for server-side setup.
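To check cross-origin isolation from outside the browser, one option is to inspect the response headers. The sketch below is a minimal, hedged example: the function names are hypothetical, and the exact header values this server sends are an assumption. It accepts Cross-Origin-Opener-Policy: same-origin combined with Cross-Origin-Embedder-Policy set to either require-corp or credentialless, which are the standard combinations that enable isolation.

```shell
#!/usr/bin/env bash
# Sketch: decide from raw HTTP response headers whether a page can be
# cross-origin isolated. Function names here are hypothetical.

# Print the value of one header (name given in lower case) from
# lower-cased header text on stdin.
header_value() {
  awk -F': *' -v name="$1" '$1 == name { sub(/[ \t]+$/, "", $2); print $2; exit }'
}

check_isolation_headers() {
  local headers coop coep
  # Normalise: drop carriage returns, lower-case names and values.
  headers=$(tr -d '\r' | tr '[:upper:]' '[:lower:]')
  coop=$(printf '%s\n' "$headers" | header_value cross-origin-opener-policy)
  coep=$(printf '%s\n' "$headers" | header_value cross-origin-embedder-policy)
  if [ "$coop" = "same-origin" ] &&
     { [ "$coep" = "require-corp" ] || [ "$coep" = "credentialless" ]; }; then
    echo "isolated"
  else
    echo "not isolated (COOP='$coop', COEP='$coep')"
    return 1
  fi
}
```

Possible usage, assuming the demo is served over HTTPS and answers HEAD requests with the same headers as GET: curl -sI https://aiformparser.olicorne.org/ | check_isolation_headers. Inside the browser, evaluating crossOriginIsolated in the devtools console remains the definitive check.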

Use a Chromium-based browser. Per the wllama compatibility notes, Firefox and Safari are not supported. Firefox cannot use WebGPU in wllama's default mode, and Safari requires a slow compat build that we deliberately disable via wllama.setCompat(null). A dismissible banner shows at the top of the page if you load it in a non-Chromium browser. Recommended browsers: Chromium, Chrome, Brave, Edge, Opera.

Checking what your browser exposes

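Before debugging slow inference, confirm what your browser actually reports for GPU acceleration and WebGPU. In Chromium-based browsers, the chrome://gpu page lists the status of each graphics feature, for example "Software only" or "Hardware accelerated".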

Linux

Ubuntu 22.04, integrated GPU only (no acceleration out of the box)

Tested on a Lenovo laptop with an Intel UHD Graphics 620 (Kaby Lake Refresh). Out of the box, Chromium reported every graphics feature as "Software only" and the LLM was painfully slow (vision encoder warmup around 70 seconds, per-box inference around 30 seconds). The cause was twofold: the userspace Vulkan driver was not installed, and the user account was not in the render group, so Chromium could not open the GPU device node /dev/dri/renderD128.
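To see whether your machine has the same problem before changing anything, a diagnostic along these lines may help. It is a sketch only: both helper names are hypothetical, and the default device path is the one from this laptop, which may differ on machines with more than one GPU.

```shell
#!/usr/bin/env bash
# Sketch: diagnose the two access problems described above.
# Helper names are hypothetical.

# Print "ok" if the space-separated group list in $1 contains "render".
has_render_group() {
  case " $1 " in
    *" render "*) echo "ok" ;;
    *) echo "missing"; return 1 ;;
  esac
}

# Check that a DRM render node exists and is readable and writable
# by the current user. Defaults to the node seen on the test laptop.
check_render_node() {
  local node="${1:-/dev/dri/renderD128}"
  if [ ! -e "$node" ]; then
    echo "absent: $node"
    return 1
  elif [ -r "$node" ] && [ -w "$node" ]; then
    echo "accessible: $node"
  else
    echo "permission denied: $node"
    return 1
  fi
}

# Example invocation:
#   has_render_group "$(id -nG)"
#   check_render_node /dev/dri/renderD128
```

Calling id -nG without a user name reports the groups of the current session, which is what matters here: a group added with usermod does not reach sessions that were already running.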

What fixed it (run on the host, not inside the container):

sudo apt install mesa-vulkan-drivers vulkan-tools libvulkan1
sudo usermod -aG render $USER
# fully log out and log back in so the group membership reaches new sessions
groups          # should now include "render"
vulkaninfo --summary | head -20   # should list the Intel UHD device

Then enable the relevant flags in chrome://flags, including #enable-unsafe-webgpu.

Restart Chromium and check chrome://gpu: WebGPU should now read "Hardware accelerated". A caveat we hit: even with all of the above green, llama.cpp's WebGPU backend currently fails to load on Intel Gen9 hardware with an "Entry-point uses workgroup_size(288, 1, 1) that exceeds the maximum allowed (256, 256, 64)" error in the memset compute pipeline. The shader's workgroup size of 288 along X is sized for newer GPUs and exceeds the limit of 256 that the Intel UHD 620 reports, which is also the minimum the WebGPU spec guarantees. Until that is fixed upstream, turn the #enable-unsafe-webgpu flag back off and accept the CPU-only path on this hardware.

Ubuntu 24.04, external NVIDIA GPU

TODO: fill in once tested end to end on the 24.04 + NVIDIA box (which proprietary driver version, whether nvidia-vulkan needed installing separately, any chrome://flags changes, observed WebGPU performance vs CPU baseline).

macOS

Not tested yet.

Windows

Not tested yet.

Browsers tested

Brave v1.90.124 (Chromium 148.1.90.124)

Works end to end on the Ubuntu 22.04 host above, in CPU-only mode. WebGPU was not enabled in this test.

Chromium 148.0.7778.167 (snap install on Ubuntu 22.04)

Application loads and runs in CPU-only mode. We could not get WebGPU working on this install: even after the flags and driver setup described above, the WebGPU compute pipeline failed with the Gen9 workgroup-size error from llama.cpp's memset shader. CPU fallback still works.

Vendored libraries

sha256 fingerprints of what the server is currently serving from app/static/vendor/. Upstream version pins are in app/static/vendor/VERSIONS.md; scripts/update-vendor.sh refreshes prebuilt packages from npm/CDN, and scripts/build-wllama.sh rebuilds wllama (and llama.cpp) from source. After a local rebuild, the wllama (wasm) row should match the sha printed by scripts/build-wllama.sh.

| Library | File | Size | Modified | sha256 |
| --- | --- | --- | --- | --- |
| pdf.js | pdfjs/pdf.min.mjs | 437.3 KiB | 2026-06-01 11:36 UTC | 71ab6d3ace0b |
| pdf.js worker | pdfjs/pdf.worker.min.mjs | 1.19 MiB | 2026-06-01 11:36 UTC | 9e3b3b7da076 |
| tesseract.js | tesseract/tesseract.esm.min.js | 61.7 KiB | 2026-06-01 11:36 UTC | bc4227410670 |
| tesseract.js worker | tesseract/worker.min.js | 108.7 KiB | 2026-06-01 11:36 UTC | f113187fae22 |
| tesseract-core relaxed-simd | tesseract/tesseract-core-relaxedsimd-lstm.wasm | 2.73 MiB | 2026-06-01 11:36 UTC | 7985c92d4c64 |
| tesseract-core simd | tesseract/tesseract-core-simd-lstm.wasm | 2.73 MiB | 2026-06-01 11:36 UTC | 34e8d50cac21 |
| tesseract-core plain | tesseract/tesseract-core-lstm.wasm | 2.72 MiB | 2026-06-01 11:36 UTC | 66b17df6e20c |
| tessdata eng | tesseract-lang/eng.traineddata.gz | 1.89 MiB | 2026-06-01 11:36 UTC | 18c1ac52b75e |
| tessdata fra | tesseract-lang/fra.traineddata.gz | 595.1 KiB | 2026-06-01 11:36 UTC | 9800c70d1db2 |
| js-yaml | js-yaml/js-yaml.mjs | 90.2 KiB | 2026-06-01 11:36 UTC | 5b4536e72a22 |
| xlsx (SheetJS) | xlsx/xlsx.mini.min.js | 273.0 KiB | 2026-06-01 11:36 UTC | 0cb353f830d7 |
| wllama (JS) | wllama/index.min.js | 339.2 KiB | 2026-06-01 11:36 UTC | 5a6b7722f900 |
| wllama (wasm) | wllama/multi-thread/wllama.wasm | 6.97 MiB | 2026-06-01 11:36 UTC | a3e827b9fc35 |
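
To compare a local checkout against the fingerprints listed here, something like the following could work. It is a sketch under two assumptions: that the 12-character values shown are the leading hex characters of the full sha256 digest, and that sha256sum (GNU coreutils) is available. The function names are hypothetical and are not part of the project's scripts.

```shell
#!/usr/bin/env bash
# Sketch: compare a vendored file against a 12-character sha256 prefix.
# Function names are hypothetical; requires sha256sum (GNU coreutils).

# Print the first 12 hex characters of a file's sha256 digest.
sha256_prefix() {
  sha256sum "$1" | cut -c1-12
}

# Report whether the file's digest prefix equals the expected value.
# Returns 0 on match, 1 on mismatch, 2 if the file is missing.
verify_vendor_file() {
  local file="$1" expected="$2" actual
  if [ ! -f "$file" ]; then
    echo "MISSING  $file"
    return 2
  fi
  actual=$(sha256_prefix "$file")
  if [ "$actual" = "$expected" ]; then
    echo "OK  $file"
  else
    echo "MISMATCH  $file (expected $expected, got $actual)"
    return 1
  fi
}

# Example invocation, run from the repository root:
#   verify_vendor_file app/static/vendor/wllama/multi-thread/wllama.wasm a3e827b9fc35
```

A mismatch on the wllama rows is expected after a local rebuild from source; for the prebuilt packages it more likely means the vendored copy and this page are out of sync.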