Master PDF Tags, Properties, and Attributes
Learn how to create accessible, structured, and searchable PDFs with our comprehensive guides on PDF tagging, properties, and attributes.
Why Learn PDF Tagging?
Knowledge Base API
Read-only API endpoints for accessing PDF tagging reference data in both human-friendly Markdown and machine-friendly JSON formats
Model Context Protocol (MCP) Server
Access TaggedPDF School's comprehensive database through our MCP server, enabling AI assistants and development tools to query tags, attributes, properties, and Matterhorn Protocol checkpoints programmatically.
API Access
Programmatic access to all PDF tagging reference data via the JSON-RPC protocol
AI Integration
Integrate with MCP-compatible clients such as Cursor IDE, Claude Desktop, and other AI assistants and development tools
Powerful Tools
Search, query, and retrieve data from our comprehensive database with built-in tools
Available Tools
The MCP server provides seven powerful tools for querying our database:
- Get attribute details by name
- Get tag information and specifications
- Get property definitions and values
- Get Matterhorn Protocol checkpoint details
- Search attributes by keyword
- Search tags by keyword
- Get summary of all available databases
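As a rough illustration of what querying these tools over JSON-RPC might look like, the sketch below builds the request messages an MCP client would send. The `tools/list` and `tools/call` method names come from the Model Context Protocol; the tool name `get_tag` and its `name` argument are hypothetical placeholders, since the actual tool identifiers are not listed here. Call `tools/list` first to discover the real names and input schemas.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC 2.0 only requires that each request id be unique.
_request_ids = count(1)


def make_tools_list() -> str:
    """Serialize a JSON-RPC 2.0 request asking the server which tools it offers."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/list",
    })


def make_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request that invokes one MCP tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# "get_tag" and its "name" argument are assumptions for illustration only.
print(make_tools_list())
print(make_tool_call("get_tag", {"name": "Figure"}))
```

The sketch stops at building the messages. How they are transported (for example over stdio or HTTP) depends on how the server is deployed, and an MCP client library would normally handle that along with the initialization handshake.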
Understanding PDF Tags
PDF tags are markers, invisible on the page, that describe a document's logical structure: which content is a heading, a paragraph, a list, a table, and so on. They play a crucial role in making PDFs accessible, searchable, and reflowable across different devices. Our courses will teach you:
- How to properly tag headings, paragraphs, and lists
- Techniques for tagging tables and complex layouts
- Best practices for adding alternative text to images
- Methods for creating a logical reading order
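To make the topics above concrete, here is a minimal sketch that models a tag hierarchy in memory. It is not a PDF library and does not read or write PDF files. The `StructElem` class and both helper functions are hypothetical names invented for this illustration, while the tag names (`Document`, `H1`, `P`, `L`, `LI`, `Lbl`, `LBody`, `Figure`) are standard PDF structure types. The sketch assumes the logical reading order is a depth-first walk of the tree.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class StructElem:
    """Hypothetical in-memory stand-in for one structure element (tag)."""
    tag: str                      # structure type, e.g. "H1", "P", "L"
    text: str = ""                # content carried by leaf elements
    alt: Optional[str] = None     # alternative text, expected on Figure
    children: List["StructElem"] = field(default_factory=list)


def reading_order(elem: StructElem) -> Iterator[StructElem]:
    """Yield elements depth-first: parent first, then children in order."""
    yield elem
    for child in elem.children:
        yield from reading_order(child)


def figures_missing_alt(root: StructElem) -> List[StructElem]:
    """Return every Figure element that has no alternative text."""
    return [e for e in reading_order(root) if e.tag == "Figure" and not e.alt]


doc = StructElem("Document", children=[
    StructElem("H1", "Quarterly Report"),
    StructElem("P", "Revenue grew in every region."),
    StructElem("L", children=[
        StructElem("LI", children=[StructElem("Lbl", "1."), StructElem("LBody", "North")]),
        StructElem("LI", children=[StructElem("Lbl", "2."), StructElem("LBody", "South")]),
    ]),
    StructElem("Figure", alt="Bar chart of revenue by region"),
    StructElem("Figure"),  # deliberately left without alt text
])

tags = [e.tag for e in reading_order(doc)]
print(" > ".join(tags))
print("Figures missing alt text:", len(figures_missing_alt(doc)))
```

Reordering the `children` lists changes the reading order without touching any content, which is the point of keeping structure separate from page layout.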
Exploring PDF Properties and Attributes
In addition to tags, understanding PDF properties and attributes is crucial for creating fully accessible and well-structured documents. Our comprehensive guides cover:
PDF Properties
- Document metadata
- Security settings
- Page layout and viewing options
- Font embedding and subsetting
PDF Attributes
- Accessibility attributes
- Alternative text for images
- Language specifications
- Reading order attributes
About TaggedPDF School
An educational and exploratory platform for learning and experimenting with PDF logical structure
Our Mission
TaggedPDF School is an interactive learning resource for PDF tagging, analogous to how W3Schools serves foundational web technologies. Our goal is to create a centralized, evolving resource that combines:
- Structured reference of PDF tags, properties, and attributes
- Hands-on playground for experimenting with tag hierarchies
- Validation feedback loop to reinforce correct structural authoring
- Real PDF generation to see your structures in action
Future Vision
TaggedPDF School continues to evolve with contributions from the PDF Association community. Planned enhancements include:
- Expanded dataset with authoritative guidance, examples, and anti-patterns
- Deeper semantic validation rules beyond schema validation
- Structured learning modules and progressive lessons
- Community contribution guidelines and feedback channels
- Enhanced accessibility features and WCAG/PDF/UA compliance checks
Origin & Acknowledgement
This technology was initially developed by Foxit and generously donated to the PDF Association to ensure neutral stewardship and broader community participation.
The concept originated with Roman Toda (Director at PDF Association; Chief Standardization Officer at Foxit), who identified the need for an approachable, interactive, authoritative learning resource for PDF tagging. Working with Samuel Hrotík as the initial implementer and primary engineer, the prototype evolved from a JSON-based knowledge base into a full-featured educational platform.
Following successful demonstration to PDF Association members, the complete codebase was migrated to the PDF Association GitHub organization, where it continues to evolve as a community resource under the Association's infrastructure.
Key Contributors
Roman Toda
Director at PDF Association; Chief Standardization Officer at Foxit. Originated the concept and co-curated early tag data.
Samuel Hrotík
Initial implementer and primary engineer for the prototype and subsequent platform features.
David Carlisle
Author of the RELAX NG schema and C-based validation tooling leveraged for playground validation.
Peter Wyatt & Duff Johnson
PDF Association leadership providing infrastructure support, guidance, and deployment assistance.
Gratitude to Foxit for their initial development and generous donation of this technology, and to the broader PDF Association community whose standards work underpins the educational mission of this project.