Question 1

Does it work with Express and Next.js?

Accepted Answer

Yes. PDFDancer is framework-agnostic: it works in Express.js routes, Next.js API routes, Fastify, Hapi, or any other Node.js runtime. Import the package, open a PDF, redact, and save. No special setup is required.
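
As a rough illustration of what "framework-agnostic" can look like in practice, the sketch below builds one `(req, res)` handler that could be mounted as an Express route or exported from a Next.js API route. The `redactPdf` function is a hypothetical stand-in for whatever wraps the SDK's open, redact, and save calls in your project; it is not a PDFDancer API, and `makeRedactHandler` and `readBody` are names invented for this example.

```javascript
// Hypothetical stand-in: `redactPdf(buffer) -> Promise<Buffer>` is NOT a
// PDFDancer API. It represents your own wrapper around the SDK calls.

// Collect a readable request stream into a single Buffer.
async function readBody(stream) {
  const chunks = [];
  for await (const chunk of stream) chunks.push(Buffer.from(chunk));
  return Buffer.concat(chunks);
}

// Build a (req, res) handler that relies only on methods shared by
// Express responses and Next.js API route responses.
function makeRedactHandler(redactPdf) {
  return async function handler(req, res) {
    try {
      // With a raw body parser the body is already a Buffer;
      // otherwise read the request stream directly.
      const input = Buffer.isBuffer(req.body) ? req.body : await readBody(req);
      if (input.length === 0) {
        res.status(400).json({ error: "Empty request body" });
        return;
      }
      const output = await redactPdf(input);
      res.setHeader("Content-Type", "application/pdf");
      res.status(200).send(output);
    } catch (err) {
      res.status(500).json({ error: "Redaction failed" });
    }
  };
}
```

In Express this could be mounted behind `express.raw({ type: "application/pdf" })`. In a Next.js pages-router API route, the built-in body parser would likely need to be disabled so the raw stream is still readable.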

Question 2

Can I run redaction in AWS Lambda?

Accepted Answer

Yes. PDFDancer is stateless and under 5 MB after compression. It runs in Lambda, including S3-triggered workflows, and needs no persistent storage or heap management.
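
An S3-triggered workflow could be sketched roughly as follows. The event shape (`Records[].s3.bucket.name` and `Records[].s3.object.key`, with URL-encoded keys) follows the S3 notification format. Everything else is an assumption for illustration: `getObject`, `putObject`, and `redactPdf` are hypothetical helpers injected by the caller, not AWS SDK or PDFDancer functions, and the `redacted/` output prefix is an arbitrary choice.

```javascript
// Hypothetical helpers, injected so the sketch has no dependencies:
//   getObject(bucket, key)       -> Promise<Buffer>
//   putObject(bucket, key, body) -> Promise<void>
//   redactPdf(buffer)            -> Promise<Buffer>  (stand-in for the SDK calls)
function makeS3RedactionHandler({ getObject, putObject, redactPdf, outputPrefix = "redacted/" }) {
  return async function handler(event) {
    const written = [];
    for (const record of event.Records ?? []) {
      const bucket = record.s3.bucket.name;
      // Keys in S3 events are URL-encoded, with spaces sent as "+".
      const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

      // Skip non-PDF objects, and skip our own output so the function
      // does not retrigger itself when it writes to the same bucket.
      if (key.startsWith(outputPrefix) || !key.toLowerCase().endsWith(".pdf")) continue;

      const original = await getObject(bucket, key);
      const redacted = await redactPdf(original);
      const outKey = outputPrefix + key;
      await putObject(bucket, outKey, redacted);
      written.push(outKey);
    }
    return { written };
  };
}
```

Because the handler keeps no state between invocations, each event is processed independently, which is what makes the stateless design a natural fit for Lambda.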

Question 3

Is redaction permanent or overlay-based?

Accepted Answer

Permanent. Redaction removes content at the binary level: the redacted content is deleted from the PDF file itself, not covered up, overlaid, or hidden. The bytes are gone.
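
One way to spot-check an output file for leftover content is to search its bytes for a string that should have been removed. The sketch below uses only Node's built-in `zlib` module: it looks in the raw bytes and in any stream that inflates as Flate data. The function names are invented for this example. Treat it as a smoke test only. A hit is a definite failure, but a miss is not proof of removal, because PDF text can also be hex-encoded, split across operators, or mapped through font encodings.

```javascript
const zlib = require("node:zlib");

// Try to inflate a stream body; return null when it is not valid Flate data.
function tryInflate(bytes) {
  try {
    return zlib.inflateSync(bytes);
  } catch {
    return null;
  }
}

// True if `needle` occurs in the raw file bytes or inside any stream
// that inflates cleanly.
function pdfBytesContain(pdf, needle) {
  const target = Buffer.from(needle, "latin1");
  if (pdf.includes(target)) return true;

  let pos = 0;
  while (true) {
    const start = pdf.indexOf("stream", pos, "latin1");
    if (start === -1) return false;
    const end = pdf.indexOf("endstream", start + 6, "latin1");
    if (end === -1) return false;

    // Stream data begins after the end-of-line that follows "stream".
    let body = pdf.subarray(start + 6, end);
    if (body[0] === 0x0d && body[1] === 0x0a) body = body.subarray(2);
    else if (body[0] === 0x0a) body = body.subarray(1);

    // Try the body as-is, then with up to two trailing EOL bytes removed.
    for (let i = 0; i < 3; i++) {
      const inflated = tryInflate(body);
      if (inflated && inflated.includes(target)) return true;
      const last = body[body.length - 1];
      if (last !== 0x0a && last !== 0x0d) break;
      body = body.subarray(0, -1);
    }
    pos = end + 9; // continue after "endstream"
  }
}
```

Running this against the original and the redacted file, with a known sensitive value as the needle, gives a quick before-and-after comparison.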

Question 4

Is there an audit trail?

Accepted Answer

Yes. Every redaction is logged with a timestamp, an entity type (SSN, Email, Phone, etc.), and a confidence score, which is useful for compliance reporting and debugging.
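
To show how such a log might feed a compliance report, the sketch below counts redactions per entity type and flags low-confidence entries for manual review. The record shape (`timestamp`, `entityType`, `confidence`) is an assumption based on the three fields named above, not PDFDancer's documented schema. The sample entries are made up and the 0.9 review threshold is arbitrary.

```javascript
// Hypothetical record shape: { timestamp, entityType, confidence }.
// Field names are assumptions, not PDFDancer's documented schema.
function summarizeAudit(entries, reviewThreshold = 0.9) {
  const countsByType = {};
  const needsReview = [];
  for (const entry of entries) {
    countsByType[entry.entityType] = (countsByType[entry.entityType] ?? 0) + 1;
    // Entries below the threshold are candidates for a human check.
    if (entry.confidence < reviewThreshold) needsReview.push(entry);
  }
  return { total: entries.length, countsByType, needsReview };
}

// Made-up sample entries for illustration only.
const sampleAudit = [
  { timestamp: "2025-01-15T10:00:00Z", entityType: "SSN", confidence: 0.98 },
  { timestamp: "2025-01-15T10:00:01Z", entityType: "Email", confidence: 0.99 },
  { timestamp: "2025-01-15T10:00:02Z", entityType: "SSN", confidence: 0.72 },
];

console.log(summarizeAudit(sampleAudit));
```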

Question 5

Is PDFDancer redaction HIPAA-compliant?

Accepted Answer

Yes. PDFDancer performs true content removal (not overlays) and supports audit trails of redacted content. When you redact text with PDFDancer, the original content is permanently deleted from the PDF, so nothing can be recovered. This meets HIPAA's requirement for secure destruction. Paired with a proper deployment (on-prem for data residency if needed), it gives you HIPAA-ready infrastructure.

Question 6

What does redaction cost?

Accepted Answer

ML-based redaction is an add-on for the Pro ($199/month) and Enterprise plans, billed at $0.20 per page. PDFDancer's core SDK has a free tier for development and testing, but ML-based redaction requires a paid plan.
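
For budgeting, the arithmetic can be sketched as below. It assumes the per-page fee is charged on top of the Pro subscription, which the answer above implies but does not spell out, and it covers only the Pro plan because no Enterprise base price is given. The function names are invented for this example.

```javascript
// Prices from the answer above, held in cents to avoid floating-point drift.
const PRO_BASE_CENTS = 19900; // $199/month
const PER_PAGE_CENTS = 20;    // $0.20/page

// Assumption: the per-page fee is added to the monthly subscription.
function estimateProMonthlyCents(pages) {
  if (!Number.isInteger(pages) || pages < 0) {
    throw new RangeError("pages must be a non-negative integer");
  }
  return PRO_BASE_CENTS + PER_PAGE_CENTS * pages;
}

function formatUsd(cents) {
  return `$${(cents / 100).toFixed(2)}`;
}

console.log(formatUsd(estimateProMonthlyCents(1000))); // $399.00
```

Under that assumption, 1,000 redacted pages in a month would come to $199 + 1,000 × $0.20 = $399.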

PDFDancer vs. Apryse, Adobe, pdf-lib

Feature	PDFDancer	Apryse	Adobe PDF Services	pdf-lib
ML-Powered PII Detection	✓ Entity detection with confidence scores	Limited patterns	Cloud-only API	✗ Text-only, no redaction
Permanent Removal	✓ Binary-level deletion	Annotation-based	Requires separate sanitize	✗ No redaction support
Audit Trail	✓ Full logging with timestamps	Limited metadata	Per-API-call logging	✗ No tracking
Express/Lambda Support	✓ Async/await, stateless	Requires heap allocation	HTTP client required	✓ Browser-focused
Self-Hosted	✓ Yes, on-prem available	✓ Yes (expensive)	✗ Cloud-only	✓ Yes (no redaction)
Pricing	Free tier + usage-based	$10K+/year per dev	$$$ per API call	Open source (MIT)

ML-Powered Detection Benchmarks

Category	Precision	Recall	F1 Score
Person	97.43%	96.28%	0.969
Dates of Birth	100.00%	92.57%	0.961
Account Number / SSN	85.27%	93.93%	0.894
Addresses	99.43%	91.22%	0.951
Phone / Fax Numbers	94.12%	96.30%	0.952
Email Addresses	99.58%	99.98%	0.998
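
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R), reported on a 0 to 1 scale while the other two columns are percentages. A short sketch that recomputes it for three of the rows above:

```javascript
// F1 is the harmonic mean of precision and recall.
function f1Score(precision, recall) {
  if (precision + recall === 0) return 0;
  return (2 * precision * recall) / (precision + recall);
}

// Precision and recall from the table, converted from percent to fractions.
const benchmarkRows = [
  ["Person", 0.9743, 0.9628],
  ["Account Number / SSN", 0.8527, 0.9393],
  ["Email Addresses", 0.9958, 0.9998],
];

for (const [category, p, r] of benchmarkRows) {
  console.log(`${category}: ${f1Score(p, r).toFixed(3)}`);
}
```

This prints 0.969, 0.894, and 0.998, matching the table.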

ML-Powered PDF Redaction for Node.js — Remove PII from Any PDF

Why PDF Redaction Is Harder Than It Looks

The Limitations

What PDFDancer Changes

PII Redaction in Node.js

PDFDancer vs. Apryse, Adobe, pdf-lib

ML-Powered Detection Benchmarks

Three Steps to Your First Redaction

Install the Package

Get Your API Key

Run Your First Redaction

Frequently Asked Questions

Let’s Talk About Your Use Case
