parsing JustHTML’s success
After trying, failing and sharing my doomed efforts to port a Perl library over to TypeScript using AI tools, I enviously eye Emil Stenström’s account (via) of writing an HTML5 parser, a 17-point summary of a journey that concluded with a working implementation.
Some thoughts on why he succeeded where I didn’t:
- The models likely understand HTML5 better than bespoke, arcane routing algorithms tuned for ASCII diagrams.
- Stenström found a reliable way to keep the agents running in a loop, something I struggled with.
- He cites Gemini 3 Pro as pivotal for speed improvements, arriving at about the right time to give him a boost. This is similar to where I leaned into GPT-Codex-High.
- Building custom tools for fuzzing, profiling and scraping contributed to Stenström’s success. His 8,500+ passing tests are an order of magnitude higher than the piddly ~100 load-bearing tests that I assumed would suffice.