
What Is PDF Metadata (Creator/Producer/ModDate) and What It Can Reveal

April 18, 2026 • 5 min read

Every PDF file contains hidden information beyond its visible content. This metadata can reveal details about who created the document, what software was used, and when it was last modified. Understanding PDF metadata is essential for document forensics, privacy, and compliance.

Two Types of PDF Metadata

PDF files can store metadata in two different locations:

[Figure: PDF Metadata Types. Document Info (traditional, key-value, 8 standard fields) compared with XMP Metadata (modern, XML-based, extensible).]

1. Document Info Dictionary

The Document Info dictionary is the traditional place to store metadata in a PDF file. It is a simple key-value structure that typically includes:

  • Title - Document title
  • Author - Name of the document creator
  • Subject - Document topic or subject
  • Keywords - Searchable keywords
  • Creator - Application used to create the original content (e.g., Microsoft Word)
  • Producer - Application that converted/created the PDF (e.g., Adobe PDF Library)
  • CreationDate - When the PDF was first created
  • ModDate - When the PDF was last modified
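
CreationDate and ModDate are stored as PDF date strings of the form D:YYYYMMDDHHmmSSOHH'mm', where everything after the year is optional. Comparing the two reliably means parsing them first. The following is a minimal sketch using only the Python standard library; the name parse_pdf_date is our own, and it does not try to repair the malformed dates that some tools write.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSSOHH'mm' -- the "D:" prefix and every part after the year are optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<tz_hour>\d{2})'?)?(?:(?P<tz_minute>\d{2})'?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string into a datetime (timezone-aware if an offset is present)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    parts = match.groupdict()
    naive = datetime(
        int(parts["year"]),
        int(parts["month"] or 1),   # missing month/day default to 1
        int(parts["day"] or 1),
        int(parts["hour"] or 0),    # missing time parts default to 0
        int(parts["minute"] or 0),
        int(parts["second"] or 0),
    )
    sign = parts["sign"]
    if sign is None:
        return naive  # no offset recorded, so the relation to UTC is unknown
    offset = timedelta(
        hours=int(parts["tz_hour"] or 0),
        minutes=int(parts["tz_minute"] or 0),
    )
    if sign == "-":
        offset = -offset
    return naive.replace(tzinfo=timezone(offset))
```

For example, parse_pdf_date("D:20240303101500+02'00'") gives a timezone-aware datetime for March 3, 2024 at 10:15 with a +02:00 offset, while parse_pdf_date("D:2024") falls back to January 1, 2024 at midnight with no timezone.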

2. XMP Metadata

XMP (Extensible Metadata Platform) is a more modern, XML-based metadata format. It can store all the information from the Document Info dictionary plus:

  • Detailed editing history
  • Rights management information
  • Custom metadata fields
  • Thumbnail images
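
Because XMP is plain XML (RDF), the fields that mirror the Document Info dictionary can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree and the usual XMP namespaces; the function name read_xmp_fields and the sample packet are illustrative assumptions, and real packets can carry many more properties than the handful handled here.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}

# XMP property -> the equivalent Document Info key.
SIMPLE_PROPERTIES = {
    ("xmp", "CreatorTool"): "Creator",
    ("pdf", "Producer"): "Producer",
    ("xmp", "CreateDate"): "CreationDate",
    ("xmp", "ModifyDate"): "ModDate",
}

# Hypothetical packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:xmp="http://ns.adobe.com/xap/1.0/"
        xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
        pdf:Producer="Adobe PDF Library">
      <dc:creator><rdf:Seq><rdf:li>jsmith</rdf:li></rdf:Seq></dc:creator>
      <xmp:CreatorTool>Microsoft Word</xmp:CreatorTool>
      <xmp:ModifyDate>2024-03-03T10:15:00+02:00</xmp:ModifyDate>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def read_xmp_fields(xmp_xml: str) -> dict:
    """Return the Document Info style fields found in an XMP packet."""
    root = ET.fromstring(xmp_xml)
    fields = {}
    for desc in root.iter(f"{{{NS['rdf']}}}Description"):
        for (prefix, name), info_key in SIMPLE_PROPERTIES.items():
            qualified = f"{{{NS[prefix]}}}{name}"
            # A simple property may be written as a child element or as an attribute.
            child = desc.find(qualified)
            value = child.text if child is not None else desc.get(qualified)
            if value and value.strip():
                fields[info_key] = value.strip()
        # dc:creator is an ordered list, so Author is returned as a list of names.
        authors = [
            li.text.strip()
            for li in desc.findall("dc:creator/rdf:Seq/rdf:li", NS)
            if li.text
        ]
        if authors:
            fields["Author"] = authors
    return fields
```

Calling read_xmp_fields(SAMPLE_XMP) returns the Creator, Producer, ModDate, and Author values from the sample, which can then be compared against the Document Info dictionary to spot disagreements between the two stores.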

What Metadata Can Reveal

[Figure: What Metadata Reveals. Author: who created it. Creator/Producer: software used. Dates: when edited. Organization: company info.]

Software and Workflow

The Creator and Producer fields reveal your document creation workflow. For example:

  • Creator: Microsoft Word + Producer: Adobe PDF Library indicates the document was created in Word and converted using Adobe tools
  • Producer: Ghostscript might indicate command-line PDF processing
  • Creator: Scanner XYZ reveals the document was scanned

Authorship

The Author field often contains:

  • Username of the person who created the document
  • Organization name
  • Computer account name (which can be personally identifying)

Modification History

When dates don't match, it can indicate editing:

  • CreationDate ≠ ModDate suggests the file was modified after creation
  • Multiple saves may leave traces in incremental updates
  • Different Producer values in different parts of the file can indicate post-processing
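
These traces can be looked for directly in the raw bytes. Each incremental save appends a new section that ends in its own %%EOF marker, so more than one marker hints at later revisions (linearized, "fast web view" files can also contain two, so treat this as a hint rather than proof). The sketch below is a rough heuristic, not a PDF parser: the name revision_hints is our own, the sample bytes are synthetic rather than a valid PDF, and Producer values stored in compressed object streams or as hex strings will be missed.

```python
import re

# "/Producer (literal string)"; allows backslash escapes but not nested parentheses.
_PRODUCER = re.compile(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)")

# Synthetic example: two revisions, each with its own Producer and %%EOF marker.
SAMPLE_BYTES = (
    b"%PDF-1.7\n1 0 obj\n<< /Producer (Adobe PDF Library) >>\nendobj\n"
    b"trailer\n<< /Info 1 0 R >>\n%%EOF\n"
    b"1 0 obj\n<< /Producer (Ghostscript) >>\nendobj\n"
    b"trailer\n<< /Info 1 0 R >>\n%%EOF\n"
)


def revision_hints(pdf_bytes: bytes) -> dict:
    """Collect simple indicators that a PDF was saved more than once."""
    eof_count = pdf_bytes.count(b"%%EOF")
    producers = []
    for match in _PRODUCER.finditer(pdf_bytes):
        value = match.group(1).decode("latin-1")
        if value not in producers:  # keep first-seen order, drop duplicates
            producers.append(value)
    return {
        "eof_markers": eof_count,
        "producers": producers,
        "likely_incremental": eof_count > 1,
    }
```

On the synthetic sample, revision_hints(SAMPLE_BYTES) reports two %%EOF markers and two distinct Producer values, which is the pattern the list above describes.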

Privacy Concerns

PDF metadata can inadvertently expose:

  1. Personal Information - Your name, username, or email address
  2. Software Versions - Exact version strings can point an attacker to known, unpatched vulnerabilities
  3. Organization Details - Company names, departments
  4. File System Paths - Sometimes embedded in metadata
  5. Editing History - Who modified the document and when
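
Once metadata has been extracted into key-value form, some of these exposures can be flagged automatically. The following sketch scans each value for email addresses and file system paths; the patterns are deliberately simple assumptions and will produce both false positives and misses, and the name find_sensitive_values is hypothetical.

```python
import re

# Deliberately loose patterns; tune them for your own environment.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "windows_path": re.compile(r"[A-Za-z]:\\[^\s\"<>|]+"),
    "unix_home_path": re.compile(r"/(?:home|Users)/[^\s/]+(?:/\S*)?"),
}


def find_sensitive_values(metadata: dict) -> list:
    """Return (field, kind, matched_text) tuples for values that look sensitive."""
    findings = []
    for field, value in metadata.items():
        text = str(value)
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                findings.append((field, kind, match.group(0)))
    return findings
```

For a dictionary such as {"Author": "jsmith@example.com", "Title": "C:\\Users\\jsmith\\Documents\\offer.docx"}, the function flags the Author value as an email address and the Title value as a Windows path.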

Checking Your PDF's Metadata

Before sharing sensitive documents, you should check what metadata they contain. Our Check PDF Edits tool analyzes both Document Info and XMP metadata, showing you exactly what information your PDF contains.
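
For a quick manual look without any tooling, both metadata stores can often be found by searching the file's raw bytes. This is only a rough sketch under strong assumptions: it sees only Info entries written as uncompressed literal strings and an XMP packet stored as plain text, so an empty result does not prove a file is clean. The function name extract_raw_metadata and the sample bytes are our own.

```python
import re

# "/Key (literal string)" for the eight standard Info keys. Handles backslash
# escapes, but not nested parentheses, hex strings, or UTF-16 encoded values.
_INFO_ENTRY = re.compile(
    rb"/(Title|Author|Subject|Keywords|Creator|Producer|CreationDate|ModDate)"
    rb"\s*\(((?:\\.|[^\\()])*)\)"
)
_XMP_PACKET = re.compile(rb"<x:xmpmeta\b.*?</x:xmpmeta>", re.DOTALL)

# Synthetic bytes for illustration; not a complete, valid PDF.
SAMPLE_PDF = (
    b"%PDF-1.4\n1 0 obj\n<< /Author (jsmith) /Creator (Microsoft Word) "
    b"/Producer (Adobe PDF Library) /ModDate (D:20240303101500+02'00') >>\n"
    b"endobj\n%%EOF\n"
)


def extract_raw_metadata(pdf_bytes: bytes) -> dict:
    """Best-effort scan for Document Info entries and an XMP packet."""
    info = {}
    for key, value in _INFO_ENTRY.findall(pdf_bytes):
        # If a key appears more than once, the last occurrence is kept.
        info[key.decode("ascii")] = value.decode("latin-1")
    packet = _XMP_PACKET.search(pdf_bytes)
    xmp = packet.group(0).decode("utf-8", errors="replace") if packet else None
    return {"info": info, "xmp": xmp}
```

To try it on a real document, read the file as bytes (for instance with pathlib.Path("contract.pdf").read_bytes(), where the filename is a placeholder) and pass the result to extract_raw_metadata.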

Removing Metadata

For privacy and compliance, consider removing metadata before sharing documents. Our Sanitize PDF tool removes:

  • All Document Info dictionary fields
  • XMP metadata streams
  • Editing history traces
  • Incremental update signatures

Real-World Examples

Example 1: Resume Leak

A job applicant sends a resume as PDF. The metadata reveals:

  • Author: "John Smith - ABC Corp"
  • Creator: Microsoft Word 2019
  • ModDate: Modified 2 hours before applying

The recruiter can now infer that the candidate is currently employed at ABC Corp, which may be a competitor.

Example 2: Contract Manipulation Detection

A signed contract arrives with:

  • CreationDate: January 15, 2024
  • ModDate: March 3, 2024
  • Producer: Changed from original

This discrepancy suggests the file was modified weeks after it was created, possibly after the original signing date, and warrants a closer look.

Example 3: Anonymous Document Deanonymized

A "whistleblower" document contains:

  • Author: Username matching an employee
  • Creator: Software only used in one department
  • File path in XMP: Reveals department server

The source is identified through metadata alone.

Metadata in Different PDF Types

PDF Type          | Typical Metadata          | Privacy Risk
Scanned documents | Scanner info, scan date   | Low
Word exports      | Author, company, software | High
Web captures      | URL, browser, date        | Medium
Merged PDFs       | Multiple authors, dates   | Variable

How Metadata Gets Added

  1. At creation - Software automatically adds Creator, Producer, dates
  2. From templates - Inherited from document templates
  3. During editing - ModDate updated, possibly new Producer
  4. Through conversion - New Producer added, original Creator preserved

Best Practices

  1. Always check metadata before sharing sensitive documents
  2. Use sanitization for documents going to external parties
  3. Establish metadata policies in your organization
  4. Consider the source when receiving documents - metadata can reveal origins
  5. Use clean templates - Remove metadata from document templates
  6. Automate sanitization - Add to document workflows

Conclusion

PDF metadata is a double-edged sword. It's useful for document management but can expose sensitive information. Understanding what metadata exists in your PDFs is the first step toward better document privacy and security.


Want to check your PDF's metadata? Try our Check PDF Edits tool to see exactly what information your documents contain.
