A focused pipeline to parse medical guidelines (PDF/HTML) into structured JSON for downstream clinical RAG or summarization. This implements models, parsers, normalization utils, and a CLI to ingest ...
The program takes 0.03 seconds, which is nearly a 10x performance improvement compared to the JavaScript implementation. The test file contains approximately 15,000 lines of code.