XMP Metadata in PDFs Explained: What It Is and Why It Matters

April 18, 2026 • 5 min read

When people talk about PDF metadata, they often mean the Document Info dictionary. But there's another, often larger, metadata system in PDFs: XMP. Understanding XMP is crucial for anyone concerned about document privacy.

What Is XMP?

XMP (Extensible Metadata Platform) is an XML-based metadata standard developed by Adobe and later published as ISO 16684-1. It's designed to:

  • Store comprehensive document information
  • Be readable by applications that don't otherwise understand the file format
  • Support custom metadata schemas
  • Provide more detailed information than traditional metadata

XMP vs. Document Info

PDFs can contain two separate metadata systems:

Document Info Dictionary

The traditional approach:

  • Simple key-value pairs
  • Limited standard fields (Author, Title, Subject, etc.)
  • Smaller in size
  • Found in most PDFs, though it is optional (and largely deprecated in PDF 2.0 in favor of XMP)

XMP Metadata

The modern approach:

  • XML-based, extensible
  • Much more detailed
  • Can include custom schemas
  • Often larger and more comprehensive

Key difference: Many tools remove Document Info but leave XMP intact. This means your "sanitized" PDF might still contain detailed metadata.

What XMP Can Contain

Standard Information

  • Document title, author, description
  • Creation and modification dates
  • Keywords and subjects
  • Copyright information

Adobe-Specific Data

  • Creation tool and version
  • Document history
  • PDF/A conformance
  • Font information

Custom Schemas

  • Organization-specific metadata
  • Workflow information
  • Rights management data
  • Custom tracking fields

Potentially Sensitive Data

  • Full author names and emails
  • Software license information
  • Document revision history
  • Editing timestamps with precision
  • GPS coordinates (if added)
  • Organization identifiers
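
To make these categories concrete, here is a minimal sketch that flattens an XMP packet into a list of properties using Python's standard library. The sample packet, its values, and the function name list_xmp_properties are all hypothetical; the namespace URIs are the standard ones for Dublin Core, XMP Basic, XMP Media Management, and the Adobe PDF schema. It only handles properties written as child elements, not the attribute shorthand that some tools emit.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the most common XMP schemas
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "xmpMM": "http://ns.adobe.com/xap/1.0/mm/",
    "pdf": "http://ns.adobe.com/pdf/1.3/",
}

# Hypothetical packet body; every value here is made up for illustration
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xmp="http://ns.adobe.com/xap/1.0/"
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
   <dc:creator><rdf:Seq><rdf:li>Jane Doe &lt;jane@example.com&gt;</rdf:li></rdf:Seq></dc:creator>
   <xmp:CreatorTool>ExampleWriter 1.2</xmp:CreatorTool>
   <xmp:CreateDate>2026-04-18T09:41:27+02:00</xmp:CreateDate>
   <xmpMM:DocumentID>uuid:0f8fad5b-d9cb-469f-a165-70867728950e</xmpMM:DocumentID>
   <pdf:Producer>ExamplePDF Library 3.0</pdf:Producer>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""


def list_xmp_properties(xmp_xml: str) -> dict:
    """Flatten element-form XMP properties into {"prefix:name": "value"}."""
    prefix_for = {uri: prefix for prefix, uri in NS.items()}
    props = {}
    root = ET.fromstring(xmp_xml)
    for desc in root.iter(f"{{{NS['rdf']}}}Description"):
        for child in desc:
            # Tags look like "{namespace-uri}localname"
            uri, _, local = child.tag[1:].partition("}")
            # Joining all text also covers rdf:Seq / rdf:Bag / rdf:Alt items
            value = " ".join(t.strip() for t in child.itertext() if t.strip())
            props[f"{prefix_for.get(uri, uri)}:{local}"] = value
    return props


props = list_xmp_properties(SAMPLE_XMP)
for name, value in props.items():
    print(f"{name}: {value}")
```

Even this small sample exposes an email address, a tool version, a timestamp with a time zone offset, and a document UUID.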

Where to Find XMP in a PDF

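The document-level XMP packet is typically stored:

  • As an XML metadata stream referenced from the document catalog
  • Sometimes in additional streams attached to individual objects, such as pages or embedded images
  • With values that often duplicate Document Info
  • With additional information that Document Info does not carry

To view XMP:

  1. Open the PDF in a text editor
  2. Search for "<?xpacket" to find XMP data
  3. The XML between the opening and closing xpacket markers is your XMP metadata

This works when the metadata stream is stored uncompressed and unencrypted, which is usually the case; otherwise, use a metadata viewer.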
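
The manual search above can also be scripted. The following is a rough sketch, not a full PDF parser: it scans raw bytes for xpacket markers with a regular expression, so it only finds packets that are stored uncompressed and unencrypted. The function name find_xmp_packets and the sample bytes are hypothetical.

```python
import re

# An XMP packet starts with an "<?xpacket begin=...?>" processing instruction
# and ends with "<?xpacket end=...?>"; the XML sits between the two.
XPACKET_RE = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packets(pdf_bytes: bytes) -> list:
    """Return the XML body of every uncompressed XMP packet in the bytes."""
    return [
        match.group(1).decode("utf-8", errors="replace").strip()
        for match in XPACKET_RE.finditer(pdf_bytes)
    ]


# Hypothetical stand-in for the bytes of a real PDF file
sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Metadata /Subtype /XML >>\nstream\n"
    b'<?xpacket begin="\xef\xbb\xbf" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"><!-- properties --></x:xmpmeta>\n'
    b'<?xpacket end="w"?>\nendstream\nendobj\n'
)

packets = find_xmp_packets(sample)
print(f"Found {len(packets)} packet(s)")
for packet in packets:
    print(packet)
```

For a real file, the bytes would come from something like open("document.pdf", "rb").read(). Finding more than one packet is not unusual, since individual pages and images can carry their own metadata.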

Why XMP Matters for Privacy

More Information Than Expected

XMP often contains:

  • More detailed timestamps
  • Software version numbers
  • UUID identifiers that can track documents
  • Edit history information

Commonly Overlooked

Many users and even tools:

  • Check Document Info but not XMP
  • Remove one but leave the other
  • Don't realize PDFs have two metadata systems

Synchronization Issues

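When Document Info and XMP disagree, the mismatch:

  • May indicate editing or tampering
  • Can expose earlier values, and with them part of the document's history
  • Often reveals processing by different tools, since some update only one of the two systems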
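
A comparison like this can be sketched in a few lines. The example below assumes both metadata systems have already been extracted into plain dictionaries (the sample values are hypothetical), and it uses the field crosswalk commonly applied between Document Info and XMP, for example Author to dc:creator and Producer to pdf:Producer. PDF dates are converted to ISO 8601 form first so that equal timestamps compare as equal; only full-length dates are handled.

```python
import re

# Commonly used Document Info -> XMP equivalents (an assumption of this sketch)
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
    "CreationDate": "xmp:CreateDate",
    "ModDate": "xmp:ModifyDate",
}

# Full-length PDF date such as D:20260418094127+02'00'
PDF_DATE_RE = re.compile(
    r"D:(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:([+\-Z])(?:(\d{2})'?(\d{2})?'?)?)?"
)


def pdf_date_to_iso(value: str) -> str:
    """Convert a full PDF date to ISO 8601; return other input unchanged."""
    m = PDF_DATE_RE.fullmatch(value)
    if not m:
        return value
    y, mo, d, h, mi, s, sign, off_h, off_m = m.groups()
    iso = f"{y}-{mo}-{d}T{h}:{mi}:{s}"
    if sign == "Z":
        return iso + "Z"
    if sign:
        return f"{iso}{sign}{off_h or '00'}:{off_m or '00'}"
    return iso


def find_mismatches(info: dict, xmp: dict) -> list:
    """Return (info_key, info_value, xmp_value) for fields that disagree."""
    mismatches = []
    for info_key, xmp_key in INFO_TO_XMP.items():
        if info_key not in info or xmp_key not in xmp:
            continue  # present in only one system, so nothing to compare
        info_value = info[info_key]
        if info_key in ("CreationDate", "ModDate"):
            info_value = pdf_date_to_iso(info_value)
        if info_value.strip() != xmp[xmp_key].strip():
            mismatches.append((info_key, info_value, xmp[xmp_key]))
    return mismatches


# Hypothetical extracted values
info = {
    "Author": "Jane Doe",
    "Producer": "ExamplePDF Library 3.0",
    "ModDate": "D:20260418094127+02'00'",
}
xmp = {
    "dc:creator": "Jane Doe",
    "pdf:Producer": "OtherTool 9.1",
    "xmp:ModifyDate": "2026-04-18T09:41:27+02:00",
}
mismatches = find_mismatches(info, xmp)
print(mismatches)
```

Here only the Producer field disagrees, which would suggest that a second tool rewrote one metadata system without touching the other.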

XMP and Document Forensics

What XMP Reveals

Forensic analysts look at XMP for:

  1. Document history - Previous saves and edits
  2. Software tracking - What tools touched the file
  3. Timeline analysis - Detailed timestamps
  4. Authenticity checks - Inconsistencies between metadata systems

Detecting Tampering

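XMP inconsistencies can point to:

  • Metadata manipulation
  • Document modification
  • Processing by multiple tools
  • Possible fraud

An inconsistency on its own is not proof of tampering, since many legitimate tools update only one metadata system.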

Removing XMP Metadata

Why Remove It

  • Privacy protection
  • Security compliance
  • Information leakage prevention
  • Document sanitization

How to Remove It

Adobe Acrobat Pro:

  • Tools → Protect → Sanitize Document (the menu location varies between Acrobat versions)
  • Should remove both XMP and Document Info

CleanPDF:

  • Automatically removes all XMP metadata
  • Verifies removal
  • Shows what was found

ExifTool:

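exiftool -all:all= document.pdf

Note that ExifTool writes PDF changes as an incremental update, so the old metadata stays in the file and can be recovered. It also keeps a backup copy (document.pdf_original) by default. To make the removal permanent, rewrite the file afterwards, for example with qpdf --linearize, and delete the backup.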

Verification

After removal, verify:

  1. Check Document Info is empty
  2. Search for "xpacket" in a text editor
  3. Use a metadata viewer tool
  4. Run through CleanPDF analysis
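
Parts of this checklist can be automated with a byte-level scan. The sketch below is a heuristic under stated assumptions, not a validator: it cannot see inside compressed object streams, keys such as /Title also appear legitimately in bookmarks, and linearized files normally contain two %%EOF markers. A hit is a prompt to look closer rather than proof of a leak. The function name and sample bytes are hypothetical.

```python
import re

# Standard Document Info keys to look for in the raw bytes
INFO_KEYS = (
    b"/Author", b"/Title", b"/Subject", b"/Keywords",
    b"/Creator", b"/Producer", b"/CreationDate", b"/ModDate",
)


def residual_metadata_report(pdf_bytes: bytes) -> dict:
    """Summarize byte-level traces of metadata left in a PDF."""
    return {
        # Uncompressed XMP packets still present anywhere in the file
        "xmp_packets": pdf_bytes.count(b"<?xpacket begin"),
        # Indirect references to a metadata stream, e.g. "/Metadata 7 0 R"
        "metadata_refs": len(re.findall(rb"/Metadata\s+\d+\s+\d+\s+R", pdf_bytes)),
        # Info-style keys that still occur in plain text
        "info_keys": sorted(k.decode() for k in INFO_KEYS if k in pdf_bytes),
        # Several %%EOF markers can mean older revisions remain in the file
        "eof_markers": pdf_bytes.count(b"%%EOF"),
    }


# Hypothetical stand-ins for a clean file and one with leftover metadata
cleaned = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"trailer\n<< /Root 1 0 R >>\n%%EOF\n"
)
leaky = cleaned + (
    b"3 0 obj\n<< /Producer (ExamplePDF Library 3.0) >>\nendobj\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?><?xpacket end="w"?>\n'
    b"%%EOF\n"
)

clean_report = residual_metadata_report(cleaned)
leaky_report = residual_metadata_report(leaky)
print(clean_report)
print(leaky_report)
```

The second sample mimics a file where metadata was "removed" by appending a new revision: the old Producer value and XMP packet are still in the bytes.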

Common XMP Privacy Leaks

Author Email Addresses

XMP often stores full email addresses, not just names.

Organization Identifiers

Company names, department codes, and internal IDs.

Software Licenses

Some software embeds license information in XMP.

Document Identifiers

UUIDs (such as xmpMM:DocumentID and xmpMM:InstanceID) that can track a specific document across systems.
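
As a rough illustration of how such tracking works, the sketch below groups files by their xmpMM:DocumentID. In the XMP model, the DocumentID is meant to stay the same across versions and copies of a document, while the InstanceID changes with each save. The file names, UUIDs, and function name here are hypothetical, and the input is assumed to be XMP properties that were already extracted per file.

```python
from collections import defaultdict


def group_by_document_id(files: dict) -> dict:
    """Map each DocumentID shared by two or more files to those file names."""
    groups = defaultdict(list)
    for name, props in files.items():
        doc_id = props.get("xmpMM:DocumentID")
        if doc_id:
            groups[doc_id].append(name)
    # Keep only IDs that link at least two files together
    return {doc_id: sorted(names) for doc_id, names in groups.items() if len(names) > 1}


# Hypothetical extracted properties for three files
files = {
    "draft_v1.pdf": {
        "xmpMM:DocumentID": "uuid:0f8fad5b-d9cb-469f-a165-70867728950e",
        "xmpMM:InstanceID": "uuid:11111111-1111-4111-8111-111111111111",
    },
    "leaked_copy.pdf": {
        "xmpMM:DocumentID": "uuid:0f8fad5b-d9cb-469f-a165-70867728950e",
        "xmpMM:InstanceID": "uuid:22222222-2222-4222-8222-222222222222",
    },
    "unrelated.pdf": {
        "xmpMM:DocumentID": "uuid:7c9e6679-7425-40de-944b-e07fc1f90ae7",
    },
}

linked = group_by_document_id(files)
print(linked)
```

Two files with different names and different InstanceIDs are still linked to the same origin by their shared DocumentID.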

Detailed Timestamps

Precise edit times, down to the second and usually with a time zone offset, revealing work patterns and approximate location.
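
To show how much a single timestamp gives away, here is a small sketch that breaks an XMP date into the pieces an observer might care about. It assumes a full timestamp in the ISO 8601 style that XMP uses; partial dates such as a bare year are not handled. The function name and the sample value are hypothetical.

```python
from datetime import datetime, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def describe_xmp_timestamp(value: str) -> dict:
    """Extract weekday, local hour, and UTC offset from a full XMP timestamp."""
    # Normalize a trailing "Z", which older Python versions do not parse
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    dt = datetime.fromisoformat(value)
    offset = dt.utcoffset()
    return {
        "weekday": WEEKDAYS[dt.weekday()],
        "local_hour": dt.hour,
        # The offset hints at the author's time zone
        "utc_offset_hours": None if offset is None else offset.total_seconds() / 3600,
        "utc": None if offset is None else dt.astimezone(timezone.utc).isoformat(),
    }


details = describe_xmp_timestamp("2026-04-18T23:41:27+02:00")
print(details)
```

From one value, a reader learns that the file was saved late on a Saturday evening by someone whose clock was set two hours ahead of UTC.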

XMP and Different Software

Adobe Products

Comprehensive XMP support:

  • InDesign, Illustrator, Photoshop add rich XMP
  • Acrobat can edit and remove XMP
  • Adobe Reader shows some XMP info

Microsoft Office

When exporting to PDF:

  • Adds author, title, dates
  • May include organization info
  • Less comprehensive than Adobe

Open Source Tools

Varies by tool:

  • Some add minimal XMP
  • Others include detailed information
  • LibreOffice, for example, adds creator info

Best Practices

Before Sharing Documents

  1. Check both Document Info AND XMP
  2. Use tools that remove both
  3. Verify removal after sanitization
  4. Consider what metadata you actually need

For Document Creators

  1. Configure software to minimize metadata
  2. Use sanitization as a standard step
  3. Establish metadata policies
  4. Train staff on metadata risks

For Document Recipients

  1. Check metadata to understand document history
  2. Be aware that sanitized documents may still contain XMP
  3. Use comprehensive analysis tools

Conclusion

XMP is the "other" metadata system in PDFs: often more detailed than Document Info and frequently overlooked. For proper document privacy:

  • Know it exists - Document Info isn't the whole picture
  • Check both systems - XMP may contain more information
  • Remove thoroughly - Use tools that handle XMP specifically
  • Verify removal - Don't assume sanitization worked

Whether you're protecting privacy or investigating documents, understanding XMP is essential.


Want to see what XMP metadata is in your PDF? Analyze it with CleanPDF to see all metadata types and get a comprehensive report.
