Take a broken-named PDF form and make it programmatically fillable in three commands. Heuristic discovery, human-reviewable mapping, deterministic rename.
$ acroforge bootstrap form.pdf
schema.yml 14 entries
mapping.yml 98 entries
$ $EDITOR mapping.yml
$ acroforge relabel apply form.pdf mapping.yml
96 renamed
2 disambiguated
Each command does one job and produces an artifact you can inspect before the next step touches the PDF. Pick the ones you need.
prepare: Resolves PDF-internal conflicts (multiple fields sharing one name) using the heuristic's proposals. Optional: it skips itself when there is nothing to resolve.
bootstrap: Infers a starter schema and proposes a per-field mapping in one compile pass. The output is two YAML files for you to review.
annotate: Renders a copy of the PDF with each field labeled inline, colour-coded against the mapping. A visual reference while you edit.
relabel apply: Reads your edited mapping and permanently rewrites the AcroForm field names. Collisions auto-disambiguate. The PDF is now usable.
Same PDF. Two starting points. One is a three-hour spelunk through page0_fieldN in a PDF viewer.
$ ruby fill_form.rb
# open PDF in a viewer
# click each of 98 fields
# note position → meaning
# transcribe to ruby hash
# repeat per vendor

$ acroforge prepare form.pdf
$ acroforge bootstrap form.pdf
$ acroforge annotate form.pdf --mapping mapping.yml
$ $EDITOR mapping.yml # reviewing against annotated.pdf
$ acroforge relabel apply form.pdf mapping.yml
done.
Discovery, review, rename, fill, plus the details that make those steps reliable on real-world PDFs.
For each cryptic field, the engine scans surrounding text with mode-aware weighted scoring across Grid-Lock, Inline Paragraph, and Standard Label layouts. The label finds you.
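To make mode-aware weighted scoring concrete, here is a minimal sketch. Every name in it (`Candidate`, `WEIGHTS`, `score`, `best_label`), the weight values, and the distance formula are illustrative assumptions, not AcroForge's actual internals. It only shows how the same nearby text can win or lose depending on the layout mode.

```ruby
# Hypothetical sketch: names, weights, and formula are assumptions for
# illustration and do not reflect AcroForge's real scoring engine.
Candidate = Struct.new(:text, :direction, :distance)

# Per-layout weights for where a label tends to sit relative to its field.
WEIGHTS = {
  standard_label:   { left: 1.0, above: 0.6, right: 0.2, below: 0.1 },
  grid_lock:        { left: 0.3, above: 1.0, right: 0.1, below: 0.1 },
  inline_paragraph: { left: 0.8, above: 0.2, right: 0.7, below: 0.1 }
}.freeze

# Closer text scores higher; the layout mode decides which direction counts most.
def score(candidate, mode)
  WEIGHTS.fetch(mode).fetch(candidate.direction) / (1.0 + candidate.distance)
end

# Pick the highest-scoring nearby text as the proposed label.
def best_label(candidates, mode)
  candidates.max_by { |c| score(c, mode) }
end

candidates = [
  Candidate.new("Invoice date", :left, 12.0),
  Candidate.new("Section 2", :above, 8.0)
]

best_label(candidates, :standard_label).text # => "Invoice date"
best_label(candidates, :grid_lock).text      # => "Section 2"
```

In this toy version the text to the left wins in a standard label layout, while the text above wins in a grid, where column headers usually sit over their cells.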
Every stage produces a YAML you can open in an editor. The mapping documents what was guessed, what was confident, and what you decided. Re-running propose preserves your edits.
annotate renders a copy of the PDF with each field labeled inline, colour-coded against your mapping. Green for confident proposals, amber for review-needed, gray for missing. Open it next to your editor.
Some PDFs ship with three fields all literally named date. prepare spots them and rewrites each to a unique heuristic-proposed name before the mapping is generated, so the YAML stays clean.
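The duplicate-name problem can be illustrated with a small stand-in. The helper below, `disambiguate`, is hypothetical and uses plain numeric suffixes. The real prepare step is described as using heuristic-proposed names, which this sketch does not attempt to reproduce.

```ruby
# Hypothetical helper: give every repeated name a numeric suffix and
# leave names that are already unique untouched.
def disambiguate(names)
  totals = names.tally          # how many times each name occurs
  seen = Hash.new(0)            # running counter per duplicated name
  names.map do |name|
    next name if totals[name] == 1
    seen[name] += 1
    "#{name}_#{seen[name]}"
  end
end

disambiguate(%w[date name date date])
# => ["date_1", "name", "date_2", "date_3"]
```

A suffix scheme like this can itself collide if the form already contains a field called `date_1`, which is one reason to prefer names derived from the surrounding labels.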
Ligatures (fi fl ff), curly quotes, en/em dashes, zero-width chars: NFKC plus a small substitution table normalize every extracted label so grep, search, and your mapping reviews all work on plain ASCII.
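A short sketch of why both steps are needed: Ruby's built-in `String#unicode_normalize(:nfkc)` expands ligatures, but it leaves curly quotes, dashes, and zero-width characters unchanged, so a substitution table has to handle those. The table entries and the `normalize_label` name below are assumptions, since the source does not list what AcroForge actually ships.

```ruby
# Assumed substitution table; AcroForge's real entries are not listed in the source.
SUBSTITUTIONS = {
  "\u2018" => "'", "\u2019" => "'",  # curly single quotes
  "\u201C" => '"', "\u201D" => '"',  # curly double quotes
  "\u2013" => "-", "\u2014" => "-",  # en and em dashes
  "\u200B" => "",  "\u200C" => "",   # zero-width space, zero-width non-joiner
  "\u200D" => "",  "\uFEFF" => ""    # zero-width joiner, BOM / zero-width no-break space
}.freeze
SUBSTITUTION_PATTERN = Regexp.union(SUBSTITUTIONS.keys)

# Hypothetical helper: NFKC first (handles ligatures), then the table.
def normalize_label(text)
  text.unicode_normalize(:nfkc).gsub(SUBSTITUTION_PATTERN, SUBSTITUTIONS)
end

normalize_label("Of\uFB01ce \u2013 \u201CA\u201D") # => "Office - \"A\""
```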
Declare canonical fields once. The engine canonicalises vendor variations into one key set. Validator enforces type contracts. apply! validates the whole mapping before touching the PDF, so it never half-renames anything.
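The "never half-renames" guarantee comes from a validate-then-apply ordering, sketched below on a plain Hash instead of a PDF. `MappingError`, `validate!`, and `apply_mapping` are hypothetical names for illustration and are not the real `AcroForge::Relabeler` internals.

```ruby
# Hypothetical sketch of validate-then-apply on an in-memory Hash;
# not AcroForge's actual Relabeler implementation.
class MappingError < StandardError; end

# Collect every problem first so the caller sees all of them at once.
def validate!(mapping, existing_names)
  errors = []
  mapping.each do |old_name, new_name|
    errors << "unknown field: #{old_name}" unless existing_names.include?(old_name)
    errors << "empty target for: #{old_name}" if new_name.to_s.strip.empty?
  end
  raise MappingError, errors.join("; ") unless errors.empty?
end

# Rename keys only after the whole mapping has passed validation.
def apply_mapping(fields, mapping)
  validate!(mapping, fields.keys) # raises before any rename happens
  fields.transform_keys { |name| mapping.fetch(name, name) }
end

fields = { "page0_field1" => "2024-01-01", "page0_field2" => "ACME" }
apply_mapping(fields, { "page0_field1" => "invoice_date" })
# => {"invoice_date"=>"2024-01-01", "page0_field2"=>"ACME"}
```

Because validation runs to completion before the first rename, a bad entry anywhere in the mapping leaves the input untouched.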
The CLI is a thin shell over the public Ruby API. Choose the surface that fits your workflow.
# Resolve duplicate AcroForm field names, if any
acroforge prepare form.pdf
# Infer schema + propose mapping in one compile pass
acroforge bootstrap form.pdf
# Visual review: open annotated.pdf alongside mapping.yml
acroforge annotate form.pdf --mapping mapping.yml
# Apply your edited mapping to the PDF in place
acroforge relabel apply form.pdf mapping.yml

require "acroforge"
# Infer schema + propose mapping in one pass
schema = AcroForge::Schema.infer("form.pdf")
AcroForge::Schema.dump(schema, "schema.yml")
# Visual review file colour-coded against the mapping
AcroForge::Annotator.annotate("form.pdf",
  mapping: "mapping.yml", out: "annotated.pdf")
# Apply mapping in place after review
AcroForge::Relabeler.apply!("form.pdf", "mapping.yml")

Add AcroForge to your Gemfile, then walk through the quick start. Five minutes to a fillable PDF.