Master PDF Tags, Properties, and Attributes

Learn how to create accessible, structured, and searchable PDFs with our comprehensive guides on PDF tagging, properties, and attributes.

Why Learn PDF Tagging?

Knowledge Base API

Read-only API endpoints for accessing PDF tagging reference data in both human-friendly Markdown and machine-friendly JSON formats

Model Context Protocol (MCP) Server

Access TaggedPDF School's comprehensive database through our MCP server, enabling AI assistants and development tools to query tags, attributes, properties, and Matterhorn Protocol checkpoints programmatically.

API Access

Programmatic access to all PDF tagging reference data via the JSON-RPC protocol

AI Integration

Integrate with AI assistants and other MCP-compatible tools, such as Cursor IDE and Claude Desktop

Powerful Tools

Search, query, and retrieve data from our comprehensive database with built-in tools

Available Tools

The MCP server provides seven powerful tools for querying our database:

  • Get attribute details by name
  • Get tag information and specifications
  • Get property definitions and values
  • Get Matterhorn Protocol checkpoint details
  • Search attributes by keyword
  • Search tags by keyword
  • Get summary of all available databases
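As a rough illustration of what a tool query looks like on the wire, the sketch below builds JSON-RPC 2.0 request objects using the standard MCP methods `tools/list` and `tools/call`. The tool name `get_tag` and its `name` argument are hypothetical placeholders; the server's actual tool names and argument schemas should be read from the `tools/list` response. The transport, the server address, and the `initialize` handshake that opens a real MCP session are deliberately left out.

```python
import json
from itertools import count

_ids = count(1)  # each request in a session needs its own id


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    request = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def list_tools_request():
    # "tools/list" is the standard MCP method for discovering available tools.
    return make_request("tools/list")


def call_tool_request(tool_name, arguments):
    # "tools/call" is the standard MCP method for invoking a tool by name.
    return make_request("tools/call", {"name": tool_name, "arguments": arguments})


# Hypothetical tool name and argument, used only for illustration.
example = call_tool_request("get_tag", {"name": "Figure"})
print(json.dumps(example, indent=2))
```

In practice an MCP client library or an MCP-compatible assistant builds and sends these messages for you; the sketch only shows the shape of the payload.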

Understanding PDF Tags

PDF tags are hidden markers that describe the logical structure of a document's content. They play a crucial role in making PDFs accessible, searchable, and reflowable across different devices. Our courses will teach you:

  • How to properly tag headings, paragraphs, and lists
  • Techniques for tagging tables and complex layouts
  • Best practices for adding alternative text to images
  • Methods for creating a logical reading order
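To make the idea of a tag hierarchy concrete, here is a minimal sketch that models a small structure tree in memory and walks it in logical reading order. The `Tag` class and `reading_order` function are illustrative names, not part of any PDF library, and this is not how tags are stored inside a PDF file. The structure types used (`Document`, `H1`, `P`, `L`, `LI`, `Lbl`, `LBody`) are standard PDF structure types.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """One node of a simplified structure tree (illustrative only)."""
    kind: str                 # structure type, e.g. "H1", "P", "L"
    text: str = ""            # content carried by a leaf element
    children: list = field(default_factory=list)


def reading_order(tag):
    """Yield (kind, text) pairs depth-first, i.e. in logical reading order."""
    if tag.text:
        yield tag.kind, tag.text
    for child in tag.children:
        yield from reading_order(child)


doc = Tag("Document", children=[
    Tag("H1", "Quarterly Report"),
    Tag("P", "Revenue grew in all regions."),
    Tag("L", children=[  # a list holds list items, each with a label and a body
        Tag("LI", children=[Tag("Lbl", "1."), Tag("LBody", "Europe")]),
        Tag("LI", children=[Tag("Lbl", "2."), Tag("LBody", "Asia")]),
    ]),
])

for kind, text in reading_order(doc):
    print(f"{kind}: {text}")
```

The point of the sketch is that reading order follows the tree, not the position of text on the page, which is why a correct hierarchy matters for assistive technology.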

Exploring PDF Properties and Attributes

In addition to tags, understanding PDF properties and attributes is crucial for creating fully accessible and well-structured documents. Our comprehensive guides cover:

PDF Properties

  • Document metadata
  • Security settings
  • Page layout and viewing options
  • Font embedding and subsetting

PDF Attributes

  • Accessibility attributes
  • Alternative text for images
  • Language specifications
  • Reading order attributes

About TaggedPDF School

An educational and exploratory platform for learning and experimenting with PDF logical structure

Our Mission

TaggedPDF School is an interactive learning resource for PDF tagging, analogous to how W3Schools serves foundational web technologies. Our goal is to create a centralized, evolving resource that combines:

  • Structured reference of PDF tags, properties, and attributes
  • Hands-on playground for experimenting with tag hierarchies
  • Validation feedback loop to reinforce correct structural authoring
  • Real PDF generation to see your structures in action

Future Vision

TaggedPDF School continues to evolve with contributions from the PDF Association community. Planned enhancements include:

  • Expanded dataset with authoritative guidance, examples, and anti-patterns
  • Deeper semantic validation rules beyond schema validation
  • Structured learning modules and progressive lessons
  • Community contribution guidelines and feedback channels
  • Enhanced accessibility features and WCAG/PDF/UA compliance checks

Origin & Acknowledgement

This technology was initially developed by Foxit and generously donated to the PDF Association to ensure neutral stewardship and broader community participation.

Roman Toda (Director at PDF Association; Chief Standardization Officer at Foxit) originated the concept after identifying the need for an approachable, interactive, authoritative learning resource for PDF tagging. With Samuel Hrotík as the initial implementer and primary engineer, the prototype evolved from a JSON-based knowledge base into a full-featured educational platform.

Following successful demonstration to PDF Association members, the complete codebase was migrated to the PDF Association GitHub organization, where it continues to evolve as a community resource under the Association's infrastructure.

Key Contributors

Roman Toda

Director at PDF Association; Chief Standardization Officer at Foxit. Originated the concept and co-curated early tag data.

Samuel Hrotík

Initial implementer and primary engineer for the prototype and subsequent platform features.

David Carlisle

Author of the RELAX NG schema and C-based validation tooling used for playground validation.

Peter Wyatt & Duff Johnson

PDF Association leadership providing infrastructure support, guidance, and deployment assistance.

We are grateful to Foxit for their initial development and generous donation of this technology, and to the broader PDF Association community, whose standards work underpins the educational mission of this project.