sobiajulu 2 hours ago
I built safe-docx at a legal AI startup where we use it in production for contract editing. One of our law firm customers asked for more transparency into how the tool works, so we open-sourced it. We ported it from Python to TypeScript so it runs anywhere JS runs with no native dependencies. It lets coding agents make format-preserving edits to existing Word documents: read and search without blowing up your context window, apply edits that preserve formatting, and export a clean copy or a tracked-changes version.
Why not just have the agent unzip the .docx and edit the raw XML? I tried that. Across 25 Common Paper and Bonterms templates, the XML is a median of 12x the size of the actual text. An agent could write regex or DOM parsing to extract the text, but that's code it generates each session — and sometimes there's a bug it has to debug. safe-docx gives the agent a compact view that only shows formatting where it's changing, stable paragraph IDs that don't drift, and a JSON edit format you can tweak and reapply. MIT licensed — fork it and customize as you like.
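To make the XML-vs-text comparison concrete, here is a minimal sketch of how the ratio could be measured for one document.xml string. This is not safe-docx's API and not the reproduction script; `extractText`, `byteLength`, and `xmlToTextRatio` are hypothetical helpers, and the sample paragraph is made up for illustration. It assumes visible text lives in `<w:t>` elements and does not decode XML entities.

```typescript
// Hypothetical helper (not part of safe-docx): collect visible text by
// concatenating the contents of <w:t> elements in WordprocessingML.
// The pattern requires whitespace or ">" right after "<w:t", so it does
// not match <w:tab/> or <w:tbl>. XML entities are left undecoded.
function extractText(documentXml: string): string {
  const runs: string[] = [];
  const textRun = /<w:t(?:\s[^>]*)?>([^<]*)<\/w:t>/g;
  let m: RegExpExecArray | null;
  while ((m = textRun.exec(documentXml)) !== null) {
    runs.push(m[1]);
  }
  return runs.join("");
}

// UTF-8 byte length of a string.
function byteLength(s: string): number {
  return new TextEncoder().encode(s).length;
}

// Ratio of raw XML bytes to extracted text bytes.
function xmlToTextRatio(documentXml: string): number {
  const textBytes = byteLength(extractText(documentXml));
  return textBytes === 0 ? Infinity : byteLength(documentXml) / textBytes;
}

// Made-up two-run paragraph: the formatting markup dwarfs the ten
// characters of actual text.
const sample =
  '<w:p><w:pPr><w:jc w:val="both"/></w:pPr>' +
  '<w:r><w:rPr><w:b/></w:rPr><w:t xml:space="preserve">Mutual </w:t></w:r>' +
  '<w:r><w:t>NDA</w:t></w:r></w:p>';

console.log(extractText(sample));
console.log(xmlToTextRatio(sample).toFixed(1) + "x");
```

Even this toy version is the kind of throwaway code an agent would otherwise have to regenerate and debug each session.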
I use it for NDAs, order forms, equity docs, and the SOC 2 policy templates auditors hand you in .docx: one-off documents you just need to fill out once and move on from.
Install:
gemini extensions install https://github.com/UseJunior/safe-docx
claude mcp add safe-docx -- npx -y @usejunior/safe-docx
npm install @usejunior/safe-docx
No .NET / Python / LibreOffice dependencies.
What .docx edge cases should I prioritize next?
sobiajulu 2 hours ago
Results (template: doc.xml bytes / text bytes = ratio):
bonterms-mutual-nda: 50,759 / 1,077 = 47.1x
bonterms-professional-services-agreement: 60,276 / 1,627 = 37.0x
common-paper-ai-addendum-in-app: 91,790 / 9,000 = 10.2x
common-paper-ai-addendum: 114,349 / 9,478 = 12.1x
common-paper-amendment: 35,389 / 537 = 65.9x
common-paper-business-associate-agreement: 197,553 / 12,716 = 15.5x
common-paper-cloud-service-agreement: 417,787 / 37,112 = 11.3x
common-paper-csa-click-through: 128,682 / 13,580 = 9.5x
common-paper-csa-with-ai: 690,102 / 54,237 = 12.7x
common-paper-csa-with-sla: 643,237 / 53,086 = 12.1x
common-paper-csa-without-sla: 529,720 / 46,116 = 11.5x
common-paper-data-processing-agreement: 350,375 / 30,530 = 11.5x
common-paper-design-partner-agreement: 160,847 / 14,585 = 11.0x
common-paper-independent-contractor: 161,157 / 19,519 = 8.3x
common-paper-letter-of-intent: 59,796 / 4,176 = 14.3x
common-paper-mutual-nda: 47,828 / 1,426 = 33.5x
common-paper-one-way-nda: 86,211 / 7,710 = 11.2x
common-paper-order-form-with-sla: 164,494 / 7,524 = 21.9x
common-paper-order-form: 122,607 / 5,751 = 21.3x
common-paper-partnership-agreement: 255,661 / 27,562 = 9.3x
common-paper-pilot-agreement: 220,683 / 20,371 = 10.8x
common-paper-professional-services: 472,579 / 39,360 = 12.0x
common-paper-software-license: 300,638 / 33,659 = 8.9x
common-paper-statement-of-work: 99,102 / 3,582 = 27.7x
common-paper-term-sheet: 36,410 / 1,627 = 22.4x
MEDIAN: 12.1x
Reproduction script and templates are in the open-agreements repo: https://github.com/open-agreements/open-agreements/blob/main...
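For anyone who wants to sanity-check the median without cloning the repo, a small sketch follows. It is not the reproduction script; it just takes the 25 rounded ratios from the list above and computes their median with a hypothetical `median` helper.

```typescript
// Rounded ratios copied from the results list above, in the same order.
const ratios: number[] = [
  47.1, 37.0, 10.2, 12.1, 65.9, 15.5, 11.3, 9.5, 12.7, 12.1,
  11.5, 11.5, 11.0, 8.3, 14.3, 33.5, 11.2, 21.9, 21.3, 9.3,
  10.8, 12.0, 8.9, 27.7, 22.4,
];

// Hypothetical helper: median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = values.slice().sort((a, b) => a - b); // numeric sort on a copy
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

console.log(median(ratios).toFixed(1) + "x"); // prints "12.1x"
```

With 25 entries the median is the 13th value after sorting, which lands on 12.1x and matches the figure reported above.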