What Is PDF Metadata (Creator/Producer/ModDate) and What It Can Reveal
Most PDF files carry hidden information beyond their visible content. This metadata can reveal who created the document, what software was used, and when it was last modified. Understanding PDF metadata is essential for document forensics, privacy, and compliance.
Two Types of PDF Metadata
PDF files can store metadata in two different locations:
1. Document Info Dictionary
The traditional metadata storage in PDF files. It's a simple key-value structure that typically includes:
- Title - Document title
- Author - Name of the document creator
- Subject - Document topic or subject
- Keywords - Searchable keywords
- Creator - Application used to create the original content (e.g., Microsoft Word)
- Producer - Application that converted/created the PDF (e.g., Adobe PDF Library)
- CreationDate - When the PDF was first created
- ModDate - When the PDF was last modified
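CreationDate and ModDate are stored as text strings of the form D:YYYYMMDDHHmmSS followed by an optional timezone offset such as +01'00'. Everything after the year is optional. The following is a minimal sketch of how such a string could be turned into a Python datetime using only the standard library. The function name parse_pdf_date is an illustrative choice, and the sketch assumes the string has already been read out of the Info dictionary.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSSOHH'mm' where O is +, - or Z; everything after YYYY is optional.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Zz+-])(?:(\d{2})'?(\d{2})?'?)?)?$"
)


def parse_pdf_date(value):
    """Convert a PDF date string into a datetime (naive if no offset is given)."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = match.groups()

    tz = None
    if sign in ("Z", "z"):
        tz = timezone.utc
    elif sign in ("+", "-"):
        offset = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        tz = timezone(-offset if sign == "-" else offset)

    # Missing month/day default to 1, missing time parts default to 0.
    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


print(parse_pdf_date("D:20240115093000+01'00'").isoformat())
# prints 2024-01-15T09:30:00+01:00
```

Some producers write dates that deviate from this pattern, so a real tool would need to decide how to handle strings the sketch rejects.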
2. XMP Metadata
XMP (Extensible Metadata Platform) is a more modern, XML-based metadata format. It can store all the information from the Document Info dictionary plus:
- Detailed editing history
- Rights management information
- Custom metadata fields
- Thumbnail images
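Because XMP is plain XML, the fields that mirror the Document Info dictionary can be read with any XML parser. The sketch below uses Python's standard library and assumes the XMP packet has already been extracted from the PDF as text. The sample packet, the names in it, and the function name read_xmp_fields are made up for illustration. The property names (xmp:CreatorTool, pdf:Producer, dc:creator, xmp:CreateDate, xmp:ModifyDate) are the standard XMP counterparts of Creator, Producer, Author, CreationDate, and ModDate.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}

# Simple (single-value) properties to look for: label -> (prefix, name).
SIMPLE = {
    "CreatorTool": ("xmp", "CreatorTool"),
    "CreateDate": ("xmp", "CreateDate"),
    "ModifyDate": ("xmp", "ModifyDate"),
    "Producer": ("pdf", "Producer"),
}

# Hypothetical sample packet, for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
   <dc:creator><rdf:Seq><rdf:li>Jane Doe</rdf:li></rdf:Seq></dc:creator>
   <xmp:CreatorTool>Microsoft Word</xmp:CreatorTool>
   <xmp:ModifyDate>2024-03-03T12:00:00Z</xmp:ModifyDate>
   <pdf:Producer>Adobe PDF Library</pdf:Producer>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def read_xmp_fields(xmp_xml):
    """Return the Info-dictionary equivalents found in an XMP packet."""
    root = ET.fromstring(xmp_xml)
    found = {}
    for desc in root.iter(f"{{{NS['rdf']}}}Description"):
        for label, (prefix, name) in SIMPLE.items():
            qname = f"{{{NS[prefix]}}}{name}"
            elem = desc.find(qname)
            if elem is not None and elem.text:
                found[label] = elem.text.strip()   # written as a child element
            elif qname in desc.attrib:
                found[label] = desc.attrib[qname]  # written as an attribute
        # dc:creator is an ordered list, so there can be several authors.
        authors = [li.text.strip()
                   for li in desc.findall("dc:creator/rdf:Seq/rdf:li", NS)
                   if li.text]
        if authors:
            found["Author"] = authors
    return found


print(read_xmp_fields(SAMPLE_XMP))
```

Comparing these values with the Document Info dictionary is worthwhile, since a tool that updates one location does not always update the other.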
What Metadata Can Reveal
Software and Workflow
The Creator and Producer fields reveal your document creation workflow. For example:
- Creator: Microsoft Word + Producer: Adobe PDF Library indicates the document was created in Word and converted using Adobe tools
- Producer: Ghostscript might indicate command-line PDF processing
- Creator: Scanner XYZ reveals the document was scanned
Authorship
The Author field often contains:
- Username of the person who created the document
- Organization name
- Computer account name (which can be personally identifying)
Modification History
When dates don't match, it can indicate editing:
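- CreationDate ≠ ModDate suggests the file was saved again after creation, although a gap of a few seconds can come from a single export and is not evidence of editing on its own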
- Multiple saves may leave traces in incremental updates
- Different Producer values in different parts of the file can indicate post-processing
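The last two signals can be approximated by scanning the raw bytes of a file. Each incremental save appends a new section that ends in its own %%EOF marker, so more than one marker hints at later saves. The sketch below is a rough heuristic, not a PDF parser: it only sees Producer values stored as uncompressed literal strings, it can be fooled by %%EOF inside embedded files, and linearized ("fast web view") PDFs normally contain two markers without having been edited. The function name revision_hints and the producer names "Tool A" and "Tool B" are hypothetical.

```python
import re

# Matches /Producer (literal string), allowing backslash escapes inside it.
_PRODUCER = re.compile(rb"/Producer\s*\(((?:\\.|[^\\()])*)\)")


def revision_hints(pdf_bytes):
    """Collect rough indicators of incremental saves from raw PDF bytes."""
    producers = []
    for match in _PRODUCER.finditer(pdf_bytes):
        value = match.group(1).decode("latin-1")
        if value not in producers:
            producers.append(value)  # keep first-seen order
    return {
        "eof_markers": pdf_bytes.count(b"%%EOF"),
        "linearized": b"/Linearized" in pdf_bytes[:1024],
        "producers": producers,
    }


# Synthetic stand-ins for a file before and after an incremental save.
original = (b"%PDF-1.7\n1 0 obj\n<< /Producer (Tool A 1.0) >>\nendobj\n"
            b"trailer\n<< /Info 1 0 R >>\n%%EOF\n")
updated = original + (b"1 0 obj\n<< /Producer (Tool B 2.3) >>\nendobj\n"
                      b"trailer\n<< /Info 1 0 R >>\n%%EOF\n")

print(revision_hints(original))
print(revision_hints(updated))
```

On a real file the bytes would come from open(path, "rb").read(). A result with several markers or several producers is a reason to look closer, not proof of tampering.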
Privacy Concerns
PDF metadata can inadvertently expose:
- Personal Information - Your name, username, or email address
- Software Versions - Exact application versions, which can point an attacker to known vulnerabilities in outdated software
- Organization Details - Company names, departments
- File System Paths - Sometimes embedded in metadata
- Editing History - Who modified the document and when
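Several of these exposures follow recognizable patterns, so a first-pass check can be automated. The sketch below assumes the metadata has already been extracted into a dictionary of field names and string values, and flags values that look like email addresses, file system paths, or version numbers. The patterns are deliberately simple and will miss cases or raise false alarms. The function name flag_sensitive and the sample values are illustrative.

```python
import re

# Deliberately simple patterns; tune them for your own documents.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "windows_path": re.compile(r"[A-Za-z]:\\[^\s]+|\\\\[^\s\\]+\\[^\s]+"),
    "unix_home_path": re.compile(r"/(?:home|Users)/[^\s/]+"),
    "version": re.compile(r"\b\d+(?:\.\d+)+\b"),
}


def flag_sensitive(metadata):
    """Return (field, finding) pairs for values that match a risky pattern."""
    findings = []
    for field, value in metadata.items():
        for label, pattern in PATTERNS.items():
            if pattern.search(value):
                findings.append((field, label))
    return findings


# Hypothetical metadata, for illustration only.
sample = {
    "Author": "jsmith@example.com",
    "Producer": "Example PDF Library 15.0",
    "Title": "Quarterly report",
    "Source": r"C:\Users\jsmith\Documents\report.docx",
}

for field, finding in flag_sensitive(sample):
    print(f"{field}: looks like {finding}")
```

Names and usernames have no reliable pattern, so fields such as Author still need a manual look.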
Checking Your PDF's Metadata
Before sharing sensitive documents, you should check what metadata they contain. Our Check PDF Edits tool analyzes both Document Info and XMP metadata, showing you exactly what information your PDF contains.
Removing Metadata
For privacy and compliance, consider removing metadata before sharing documents. Our Sanitize PDF tool removes:
- All Document Info dictionary fields
- XMP metadata streams
- Editing history traces
- Incremental update signatures
Real-World Examples
Example 1: Resume Leak
A job applicant sends a resume as PDF. The metadata reveals:
- Author: "John Smith - ABC Corp"
- Creator: Microsoft Word 2019
- ModDate: Modified 2 hours before applying
The recruiter can now infer that the candidate is currently employed at a competitor.
Example 2: Contract Manipulation Detection
A signed contract arrives with:
- CreationDate: January 15, 2024
- ModDate: March 3, 2024
- Producer: Changed from original
This discrepancy suggests the file was changed after it was first created, which warrants checking whether the change happened after the original signing date.
Example 3: Anonymous Document Deanonymized
A "whistleblower" document contains:
- Author: Username matching an employee
- Creator: Software only used in one department
- File path in XMP: Reveals department server
The source is identified through metadata alone.
Metadata in Different PDF Types
| PDF Type | Typical Metadata | Privacy Risk |
|---|---|---|
| Scanned documents | Scanner info, scan date | Low |
| Word exports | Author, company, software | High |
| Web captures | URL, browser, date | Medium |
| Merged PDFs | Multiple authors, dates | Variable |
How Metadata Gets Added
- At creation - Software automatically adds Creator, Producer, dates
- From templates - Inherited from document templates
- During editing - ModDate updated, possibly new Producer
- Through conversion - New Producer added, original Creator preserved
Best Practices
- Always check metadata before sharing sensitive documents
- Use sanitization for documents going to external parties
- Establish metadata policies in your organization
- Consider the source when receiving documents - metadata can reveal origins
- Use clean templates - Remove metadata from document templates
- Automate sanitization - Add to document workflows
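A metadata policy can be as simple as an allowlist of fields that may leave the organization, checked automatically before a document is sent. The sketch below assumes the metadata is already available as a dictionary. The allowlist contents and the function name policy_violations are hypothetical and would be set by your own policy.

```python
# Hypothetical policy: only these fields may be present in outgoing PDFs.
ALLOWED_FIELDS = frozenset({"Title", "CreationDate"})


def policy_violations(metadata, allowed=ALLOWED_FIELDS):
    """Return the non-empty metadata fields that the policy does not allow."""
    return sorted(
        name for name, value in metadata.items()
        if name not in allowed and value
    )


outgoing = {
    "Title": "Report",
    "Author": "jsmith",
    "Producer": "",
    "Creator": "Microsoft Word",
}

violations = policy_violations(outgoing)
if violations:
    print("Sanitize before sending; disallowed fields:", ", ".join(violations))
```

A check like this could run as a step in a document workflow and block sending until the listed fields have been removed.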
Conclusion
PDF metadata is a double-edged sword. It's useful for document management but can expose sensitive information. Understanding what metadata exists in your PDFs is the first step toward better document privacy and security.
Want to check your PDF's metadata? Try our Check PDF Edits tool to see exactly what information your documents contain.