AcroForge / Ruby PDF toolkit / v0.1.0

AcroForms,
forged clean.

Take a PDF form with cryptic field names and make it programmatically fillable in three commands. Heuristic discovery, human-reviewable mapping, deterministic rename.

$ acroforge bootstrap form.pdf
      schema.yml   14 entries
      mapping.yml  98 entries
$ $EDITOR mapping.yml
$ acroforge relabel apply form.pdf mapping.yml
      96 renamed
      2  disambiguated

01 / Pipeline

Small, focused commands. Each one reviewable.

Each command does one job and produces an artifact you can inspect before the next step touches the PDF. Pick the ones you need.

01

Prepare

prepare

Resolves PDF-internal conflicts (multiple fields sharing one name) using the heuristic's proposals. Optional: skips itself when there's nothing to resolve.

02

Bootstrap

bootstrap

Infers a starter schema and proposes a per-field mapping in one compile pass. Output is two YAML files you review.

03

Annotate

annotate

Renders a copy of the PDF with each field labeled inline, colour-coded against the mapping. Visual reference while you edit.

04

Apply

relabel apply

Reads your edited mapping and permanently rewrites the AcroForm field names. Collisions auto-disambiguate. PDF is now usable.

02 / Before / After

What you stop writing.

Same PDF. Two starting points. One is a three-hour spelunk through page0_fieldN in a PDF viewer.

Without AcroForge

$ ruby fill_form.rb
# open PDF in a viewer
# click each of 98 fields
# note position → meaning
# transcribe to ruby hash
# repeat per vendor

With AcroForge

$ acroforge prepare form.pdf
$ acroforge bootstrap form.pdf
$ acroforge annotate form.pdf --mapping mapping.yml
$ $EDITOR mapping.yml  # reviewing against annotated.pdf
$ acroforge relabel apply form.pdf mapping.yml
      done.

03 / Capabilities

What the engine does well.

Discovery, review, rename, fill, plus the details that make those steps reliable on real-world PDFs.

01

Spatial label discovery

For each cryptic field, the engine scans surrounding text with mode-aware weighted scoring across Grid-Lock, Inline Paragraph, and Standard Label layouts. The label finds you.
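
To make the idea of mode-aware weighted scoring concrete, here is a minimal sketch. It is not AcroForge's actual algorithm: the `Box` struct, the `MODE_WEIGHTS` table, and `best_label` are hypothetical names, and the weights are invented for illustration. Only the three layout-mode names come from the description above. Each candidate text box is scored by its distance to the field, multiplied by a penalty that depends on where it sits relative to the field and on the layout mode; the lowest score wins.

```ruby
# Hypothetical sketch, not the AcroForge engine.
# Coordinates assume a top-left origin, so y grows downward.
Box = Struct.new(:text, :x, :y, :w, :h) do
  def cx
    x + w / 2.0
  end

  def cy
    y + h / 2.0
  end
end

# Illustrative penalties per layout mode: a lower multiplier marks the
# direction in which a label is expected for that mode.
MODE_WEIGHTS = {
  standard_label:   { left: 1.0, above: 2.0, other: 4.0 },
  grid_lock:        { left: 3.0, above: 1.0, other: 4.0 },
  inline_paragraph: { left: 1.0, above: 4.0, other: 2.0 }
}.freeze

# Classify where the candidate sits relative to the field.
def direction(field, candidate)
  left_of = candidate.x + candidate.w <= field.x &&
            (candidate.cy - field.cy).abs <= field.h
  return :left if left_of

  above = candidate.y + candidate.h <= field.y &&
          (candidate.cx - field.cx).abs <= field.w
  above ? :above : :other
end

# Centre-to-centre distance scaled by the mode's directional penalty.
def score(field, candidate, mode)
  distance = Math.hypot(candidate.cx - field.cx, candidate.cy - field.cy)
  distance * MODE_WEIGHTS.fetch(mode).fetch(direction(field, candidate))
end

def best_label(field, candidates, mode: :standard_label)
  candidates.min_by { |candidate| score(field, candidate, mode) }
end
```

With a field at (100, 100), a "Name" box to its left, and a "Date" box above it, the same candidates can produce different winners depending on the mode, which is the point of making the weights mode-aware.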

02

Human-readable artifacts

Every stage produces a YAML file you can open in an editor. The mapping documents what was guessed, what was confident, and what you decided. Re-running propose preserves your edits.
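
One way edits could survive a re-run is a merge in which reviewed entries always win over fresh guesses. The sketch below assumes a mapping keyed by the original field name, with `name`, `confidence`, and `reviewed` keys. That shape and the `merge_proposals` helper are hypothetical, since the real mapping.yml layout is not shown here.

```ruby
require "yaml"

# Hypothetical merge: keep human-reviewed entries, take fresh proposals
# for everything else. Fields absent from the fresh run are dropped.
def merge_proposals(existing, fresh)
  fresh.each_with_object({}) do |(field, proposal), merged|
    current = existing[field]
    merged[field] = current && current["reviewed"] ? current : proposal
  end
end

# Assumed mapping.yml shape, for illustration only.
existing = YAML.safe_load(<<~YAML)
  page0_field6:
    name: applicant_name
    confidence: 0.41
    reviewed: true
  page0_field7:
    name: date_1
    confidence: 0.55
    reviewed: false
YAML

fresh = {
  "page0_field6" => { "name" => "full_name", "confidence" => 0.62, "reviewed" => false },
  "page0_field7" => { "name" => "signature_date", "confidence" => 0.88, "reviewed" => false }
}

merged = merge_proposals(existing, fresh)
puts merged.to_yaml
```

Here `page0_field6` keeps the reviewed name `applicant_name`, while the unreviewed `page0_field7` picks up the newer proposal.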

03

Visual review

annotate renders a copy of the PDF with each field labeled inline, colour-coded against your mapping. Green for confident proposals, amber for review-needed, gray for missing. Open it next to your editor.
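
The three-colour scheme can be read as a small classification rule. The sketch below is an assumption-laden illustration: the `review_colour` helper, the `name` and `confidence` keys, and the 0.8 threshold are all hypothetical, and only the green, amber, and gray meanings come from the description above.

```ruby
# Hypothetical threshold; the real cut-off for "confident" is not documented here.
CONFIDENT_THRESHOLD = 0.8

# Map one mapping entry (or its absence) to an annotation colour.
def review_colour(entry)
  return :gray if entry.nil? || entry["name"].to_s.empty?  # missing from the mapping

  entry.fetch("confidence", 0.0) >= CONFIDENT_THRESHOLD ? :green : :amber
end
```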

04

Duplicate-name resolution

Some PDFs ship with three fields all literally named date. prepare spots them and rewrites each to a unique heuristic-proposed name before the mapping is generated, so the YAML stays clean.
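
The core of duplicate resolution is spotting names that occur more than once and giving each occurrence a unique replacement. The sketch below substitutes a simple numeric suffix for the heuristic-proposed names that prepare actually uses, so treat the naming scheme and the `uniquify` helper as hypothetical.

```ruby
require "set"

# Hypothetical sketch: rename every field whose name is shared, leaving
# already-unique names alone and never reusing a name that is taken.
def uniquify(names)
  totals   = names.tally                                   # occurrences per name
  taken    = Set.new(totals.select { |_, n| n == 1 }.keys) # names that must stay free
  counters = Hash.new(0)

  names.map do |name|
    next name if totals[name] == 1

    candidate = nil
    loop do
      counters[name] += 1
      candidate = "#{name}_#{counters[name]}"
      break unless taken.include?(candidate)
    end
    taken << candidate
    candidate
  end
end

p uniquify(%w[date name date date])
# => ["date_1", "name", "date_2", "date_3"]
```

Skipping suffixes that are already taken matters when the PDF also contains a field literally named `date_1`; without that check the rename would create a new collision.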

05

Unicode-clean labels

Ligatures (ﬁ, ﬂ, ﬀ), curly quotes, en/em dashes, and zero-width characters: NFKC normalization plus a small substitution table cleans every extracted label, so grep, search, and your mapping reviews all work on plain ASCII.

06

Schema-driven, deterministic

Declare canonical fields once. The engine canonicalises vendor variations into one key set. Validator enforces type contracts. apply! validates the whole mapping before touching the PDF, so it never half-renames anything.
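
The "never half-renames" guarantee follows from a validate-then-apply pattern: collect every problem first, and only start renaming when the list is empty. The sketch below works on plain arrays and hashes rather than a PDF. `MappingError`, `validate_mapping`, and `apply_mapping` are hypothetical names and not the Relabeler's real interface, and the two checks shown are examples rather than the validator's actual rules.

```ruby
# Hypothetical sketch of validate-then-apply; not the actual Relabeler.
class MappingError < StandardError; end

# Return a list of problems instead of failing on the first one.
def validate_mapping(field_names, mapping)
  errors = []

  unknown = mapping.keys - field_names
  errors << "unknown fields: #{unknown.join(', ')}" unless unknown.empty?

  blank = mapping.select { |_, new_name| new_name.to_s.strip.empty? }.keys
  errors << "blank targets for: #{blank.join(', ')}" unless blank.empty?

  errors
end

def apply_mapping(field_names, mapping)
  errors = validate_mapping(field_names, mapping)
  raise MappingError, errors.join("; ") unless errors.empty?

  # Reached only when every entry is valid, so the rename is all-or-nothing.
  field_names.map { |name| mapping.fetch(name, name) }
end
```

Because validation runs over the whole mapping before any name changes, a single bad entry leaves the input exactly as it was.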

04 / Surface

Shell or Ruby. Same engine.

The CLI is a thin shell over the public Ruby API. Choose the surface that fits your workflow.

Shell

bash
# Resolve duplicate AcroForm field names, if any
acroforge prepare form.pdf

# Infer schema + propose mapping in one compile pass
acroforge bootstrap form.pdf

# Visual review: open annotated.pdf alongside mapping.yml
acroforge annotate form.pdf --mapping mapping.yml

# Apply your edited mapping to the PDF in place
acroforge relabel apply form.pdf mapping.yml

CLI reference →

Ruby

ruby
require "acroforge"

# Infer a starter schema and write it to schema.yml
# (mapping.yml is not created here; this assumes the one written by `acroforge bootstrap`)
schema = AcroForge::Schema.infer("form.pdf")
AcroForge::Schema.dump(schema, "schema.yml")

# Visual review file colour-coded against the mapping
AcroForge::Annotator.annotate("form.pdf",
  mapping: "mapping.yml", out: "annotated.pdf")

# Apply mapping in place after review
AcroForge::Relabeler.apply!("form.pdf", "mapping.yml")

Library API →

Stop hand-mapping page0_field6.

Add AcroForge to your Gemfile, then walk through the quick start. Five minutes to a fillable PDF.

Released under the MIT License.