Show HN: do your startup paperwork with coding agents in .docx files

Posted by sobiajulu |2 hours ago |2 comments

sobiajulu 2 hours ago

Coding agents have made development dramatically faster, but paperwork is still manual. Every time I need to update a dollar amount in a SAFE, fill in company details on a cloud services agreement, or negotiate an NDA, I'm back in Word doing it by hand. For anyone at a startup, a larger share of your time now goes to .docx tasks precisely because the coding got faster.

I built safe-docx at a legal AI startup where we use it in production for contract editing. One of our law firm customers asked for more transparency into how the tool works, so we open-sourced it. We ported it from Python to TypeScript so it runs anywhere JS runs with no native dependencies. It lets coding agents make format-preserving edits to existing Word documents: read and search without blowing up your context window, apply edits that preserve formatting, and export a clean copy or a tracked-changes version.

Why not just have the agent unzip the .docx and edit the raw XML? I tried that. Across 25 Common Paper and Bonterms templates, the XML is a median of 12x the size of the actual text. An agent could write regex or DOM parsing to extract the text, but that's code it generates each session — and sometimes there's a bug it has to debug. safe-docx gives the agent a compact view that only shows formatting where it's changing, stable paragraph IDs that don't drift, and a JSON edit format you can tweak and reapply. MIT licensed — fork it and customize as you like.
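To make the "regex to extract the text" route concrete, here is a minimal sketch of the kind of throwaway code an agent ends up writing against raw WordprocessingML. It is not part of safe-docx: `extractText` and `sizeRatio` are hypothetical helper names, the sample XML is invented, and it assumes a runtime with a global `TextEncoder` (Node or a browser).

```typescript
// Hypothetical helpers, NOT the safe-docx API: a naive regex pass over the
// contents of word/document.xml, assumed to be already unzipped into a string.

// Decode the five predefined XML entities (numeric character references are ignored here).
function decodeEntities(s: string): string {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&"); // last, so "&amp;lt;" is not decoded twice
}

// Collect the <w:t> text runs of each <w:p> paragraph, one output line per paragraph.
// Known gap: paragraphs nested inside text boxes will confuse the lazy paragraph match.
function extractText(documentXml: string): string {
  const paragraphs = documentXml.match(/<w:p[ >][\s\S]*?<\/w:p>/g) || [];
  const lines: string[] = [];
  for (const p of paragraphs) {
    const run = /<w:t(?: [^>]*)?>([^<]*)<\/w:t>/g; // skips <w:tab/>, <w:tbl>, <w:tc>
    let text = "";
    let m: RegExpExecArray | null;
    while ((m = run.exec(p)) !== null) {
      text += m[1];
    }
    lines.push(decodeEntities(text));
  }
  return lines.join("\n");
}

// XML bytes divided by extracted-text bytes, both measured as UTF-8.
function sizeRatio(documentXml: string): number {
  const enc = new TextEncoder();
  return enc.encode(documentXml).length / enc.encode(extractText(documentXml)).length;
}

// Invented sample: one centered paragraph with a bold run and a plain run.
const sample =
  '<w:p><w:pPr><w:jc w:val="center"/></w:pPr>' +
  '<w:r><w:rPr><w:b/></w:rPr><w:t>Mutual </w:t></w:r>' +
  '<w:r><w:t xml:space="preserve">NDA &amp; Terms</w:t></w:r></w:p>';

console.log(extractText(sample)); // Mutual NDA & Terms
console.log(sizeRatio(sample).toFixed(1) + "x");
```

Even this short version has to know which `w:` tags to skip and how to decode entities, and it still mishandles nested paragraphs, which is the per-session debugging cost described above.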

I use it for NDAs, order forms, equity docs, and the SOC 2 policy templates auditors hand you in .docx: one-off documents you just need to fill out once and move on.

Install:

  gemini extensions install https://github.com/UseJunior/safe-docx
  claude mcp add safe-docx -- npx -y @usejunior/safe-docx
  npm install @usejunior/safe-docx

No .NET / Python / LibreOffice dependencies.

What .docx edge cases should I prioritize next?

sobiajulu 2 hours ago

Size ratio methodology: I measured 25 .docx templates from Common Paper (https://commonpaper.com) and Bonterms (https://bonterms.com) — open-source standard agreement templates anyone can download. I excluded our own templates to avoid cherry-picking. Short-form templates (amendments, term sheets) skew the high end because they're mostly formatting with little text; the median is representative of typical multi-page agreements.

Results (template: doc.xml bytes / text bytes = ratio):

  bonterms-mutual-nda:                      50,759 /   1,077 = 47.1x
  bonterms-professional-services-agreement:  60,276 /   1,627 = 37.0x
  common-paper-ai-addendum-in-app:           91,790 /   9,000 = 10.2x
  common-paper-ai-addendum:                114,349 /   9,478 = 12.1x
  common-paper-amendment:                    35,389 /     537 = 65.9x
  common-paper-business-associate-agreement:197,553 /  12,716 = 15.5x
  common-paper-cloud-service-agreement:     417,787 /  37,112 = 11.3x
  common-paper-csa-click-through:           128,682 /  13,580 =  9.5x
  common-paper-csa-with-ai:                690,102 /  54,237 = 12.7x
  common-paper-csa-with-sla:               643,237 /  53,086 = 12.1x
  common-paper-csa-without-sla:            529,720 /  46,116 = 11.5x
  common-paper-data-processing-agreement:  350,375 /  30,530 = 11.5x
  common-paper-design-partner-agreement:   160,847 /  14,585 = 11.0x
  common-paper-independent-contractor:     161,157 /  19,519 =  8.3x
  common-paper-letter-of-intent:            59,796 /   4,176 = 14.3x
  common-paper-mutual-nda:                  47,828 /   1,426 = 33.5x
  common-paper-one-way-nda:                 86,211 /   7,710 = 11.2x
  common-paper-order-form-with-sla:        164,494 /   7,524 = 21.9x
  common-paper-order-form:                 122,607 /   5,751 = 21.3x
  common-paper-partnership-agreement:      255,661 /  27,562 =  9.3x
  common-paper-pilot-agreement:            220,683 /  20,371 = 10.8x
  common-paper-professional-services:      472,579 /  39,360 = 12.0x
  common-paper-software-license:           300,638 /  33,659 =  8.9x
  common-paper-statement-of-work:           99,102 /   3,582 = 27.7x
  common-paper-term-sheet:                  36,410 /   1,627 = 22.4x
  MEDIAN:                                                       12.1x

Reproduction script and templates are in the open-agreements repo: https://github.com/open-agreements/open-agreements/blob/main...
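For anyone who wants to sanity-check the arithmetic without downloading the templates, the ratios and the median can be recomputed from the byte counts in the table above. This is a minimal sketch, not the reproduction script from the repo; the `sizes`, `ratio`, and `median` names are illustrative only.

```typescript
// [document XML bytes, extracted text bytes] for the 25 templates, in table order.
const sizes: Array<[number, number]> = [
  [50759, 1077], [60276, 1627], [91790, 9000], [114349, 9478], [35389, 537],
  [197553, 12716], [417787, 37112], [128682, 13580], [690102, 54237], [643237, 53086],
  [529720, 46116], [350375, 30530], [160847, 14585], [161157, 19519], [59796, 4176],
  [47828, 1426], [86211, 7710], [164494, 7524], [122607, 5751], [255661, 27562],
  [220683, 20371], [472579, 39360], [300638, 33659], [99102, 3582], [36410, 1627],
];

// Size ratio as defined in the post: XML bytes divided by text bytes.
function ratio(xmlBytes: number, textBytes: number): number {
  return xmlBytes / textBytes;
}

// Median of a non-empty list; averages the two middle values for even lengths.
function median(values: number[]): number {
  if (values.length === 0) throw new Error("median of empty list");
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const ratios = sizes.map(([xml, text]) => ratio(xml, text));

console.log(`templates: ${ratios.length}`);
console.log(`median: ${median(ratios).toFixed(1)}x`); // median: 12.1x
console.log(`min: ${Math.min(...ratios).toFixed(1)}x, max: ${Math.max(...ratios).toFixed(1)}x`);
```

With 25 entries the median is the 13th ratio after sorting, which is why the short-form outliers (amendment, mutual NDAs) at the high end do not move it.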