Chapter 3: Document Ontology - A Formal Framework
In the previous chapter, we established theoretical foundations drawing from genre theory, information architecture, pattern languages, knowledge representation, and cognitive science. Now we apply these frameworks to create a systematic ontology of documents—a formal classification system that describes what documents are, how they're structured, and how they relate to underlying data.
This ontology serves multiple purposes:
- Analytical: Provides a framework for analyzing any document domain
- Descriptive: Creates shared vocabulary for discussing document structures
- Prescriptive: Guides design of new document types and templates
- Computational: Enables automated reasoning about documents
- Educational: Helps practitioners understand document patterns
We'll develop this ontology in layers, from abstract to concrete:
- Core document dimensions (foundational axes of classification)
- Communicative function taxonomy (what documents do)
- Information architecture patterns (how information is structured)
- Data relationship models (how documents map to data)
- The document pattern catalog (reusable solutions)
3.1 Core Document Dimensions
Every document can be analyzed along four orthogonal dimensions. These are independent axes that together characterize any document type:
Dimension 1: Communicative Function
What does this document DO in the world?
Following Speech Act Theory, we treat documents as performing acts. A document's primary function shapes everything else about it:
Declarative (Assert information)
- State facts about the world
- Report findings or status
- Document what exists
- Examples: Reports, directories, catalogs, dashboards
Performative (Enact change)
- Create legal/social reality
- Certify achievements
- Authorize actions
- Examples: Contracts, certificates, licenses, invoices
Directive (Instruct behavior)
- Specify procedures
- Assign tasks
- Schedule activities
- Examples: Manuals, work orders, schedules, assignments
Commissive (Commit to future actions)
- Promise deliverables
- Pledge support
- Propose solutions
- Examples: Proposals, agreements, project plans, warranties
Expressive (Convey attitudes)
- Express opinions
- Demonstrate satisfaction
- Show appreciation
- Examples: Testimonials, reviews, recommendations, references
Importance: Communicative function determines:
- Required formality and authority
- Validation and approval requirements
- Legal and compliance considerations
- Distribution and retention policies
- Update frequency and versioning needs
Dimension 2: Information Architecture
How is information STRUCTURED within the document?
This describes the organizational logic—how information flows and relates within the document itself:
Atomic Structure
- Single coherent unit
- One record → one document instance
- Self-contained
- Examples: Individual certificate, single property flyer, one invoice
Linear Structure
- Sequential presentation
- Beginning-to-end narrative or logical flow
- Order matters
- Examples: Reports, procedures, timelines, stories
Hierarchical Structure
- Nested sections and subsections
- Tree-like organization
- Parent-child relationships
- Examples: Course catalogs, policy manuals, organizational charts
Tabular Structure
- Row-and-column organization
- Structured data display
- Comparison-friendly
- Examples: Price lists, schedules, comparison matrices
Network Structure
- Non-hierarchical connections
- Cross-references and links
- Multiple navigation paths
- Examples: Hyperlinked documentation, knowledge bases, legal citations
Flowing Structure
- Continuous stream with dynamic breaks
- Content determines boundaries
- Flexible layout
- Examples: Newsletters, magazines, multi-column layouts
Importance: Information architecture determines:
- Layout complexity
- Navigation requirements
- How sections relate
- Ease of modification
- Scalability to large datasets
Dimension 3: Social Context
Who creates and uses this document, and WHY?
Documents exist within social contexts—organizational practices, professional communities, regulatory environments:
Organizational Context
- Internal vs. external audience
- Formal vs. informal tone
- Hierarchical position (executive summary vs. detailed report)
- Frequency (daily, monthly, annually, ad-hoc)
Professional Community
- Domain-specific conventions (legal briefs vs. medical records)
- Industry standards (real estate MLS format, academic transcripts)
- Certification requirements (who can issue what)
- Regulatory compliance (FERPA for education, HIPAA for healthcare)
Stakeholder Relationships
- Power dynamics (manager to employee, vendor to client)
- Legal relationships (parties to contract, plaintiff to defendant)
- Transactional vs. relational
- Accountability and liability
Cultural Context
- Regional variations (date formats, measurement systems)
- Language and translation considerations
- Accessibility requirements
- Privacy and consent norms
Importance: Social context determines:
- Required content and disclosures
- Appropriate tone and formality
- Distribution and access controls
- Audit and compliance requirements
- Signature and approval workflows
Dimension 4: Material Form
What is the document's physical/digital MANIFESTATION?
While content is primary, form matters for usability and function:
Format
- Word processing document (editable)
- PDF (fixed layout, printable)
- Web page (responsive, searchable)
- Spreadsheet (calculable)
- Presentation (sequential slides)
Medium
- Digital only
- Print only
- Digital-first with print option
- Multi-modal (QR codes linking digital content)
Layout Characteristics
- Page size and orientation
- Column structure
- Image density
- Color vs. black-and-white
- Interactive elements
Delivery Mechanism
- Email attachment
- Printed and mailed
- Web portal access
- API delivery
- Embedded in other documents
Importance: Material form determines:
- Authoring tools and workflows
- Distribution infrastructure
- Storage and archiving approach
- Editability and version control
- Accessibility considerations
The Four-Dimensional Classification Space
Every document can be plotted in this four-dimensional space:
Example: Student Report Card
- Communicative Function: Declarative (reports academic performance)
- Information Architecture: Hierarchical (sections by subject) with Tabular elements (grades table)
- Social Context: Internal educational document; regulatory compliance (FERPA); parent-school relationship; quarterly frequency
- Material Form: PDF for distribution; print option; portrait letter-size; contains sensitive data
Example: Real Estate Purchase Agreement
- Communicative Function: Performative (creates binding contract)
- Information Architecture: Hierarchical (articles and clauses) with some Linear elements (recitals)
- Social Context: Legal document; buyer-seller relationship; regulatory compliance (real estate law); one-time transaction
- Material Form: Word document for negotiation, then PDF for execution; requires signatures; legal-size pages
This four-dimensional framework enables:
- Systematic comparison of document types
- Identification of requirements (performative documents need more validation than declarative)
- Pattern recognition (similar positions in space suggest similar solutions)
- Gap analysis (what document types are missing for a domain?)
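To make the classification space concrete, the sketch below encodes the two examples above as records with one field per dimension. The class names, the tag vocabulary, and the `shared_dimensions` overlap count are illustrative assumptions, not part of the ontology itself; in particular, encoding social context and material form as free-text tags is only one possible choice.

```python
from dataclasses import dataclass
from enum import Enum


class Function(Enum):
    DECLARATIVE = "declarative"
    PERFORMATIVE = "performative"
    DIRECTIVE = "directive"
    COMMISSIVE = "commissive"
    EXPRESSIVE = "expressive"


class Architecture(Enum):
    ATOMIC = "atomic"
    LINEAR = "linear"
    HIERARCHICAL = "hierarchical"
    TABULAR = "tabular"
    NETWORK = "network"
    FLOWING = "flowing"


@dataclass(frozen=True)
class DocumentProfile:
    name: str
    function: Function
    architecture: tuple[Architecture, ...]  # primary structure first
    social_context: frozenset[str]          # free-text tags (assumed encoding)
    material_form: frozenset[str]           # free-text tags (assumed encoding)


def shared_dimensions(a: DocumentProfile, b: DocumentProfile) -> int:
    """Count the dimensions (0 to 4) on which two profiles overlap."""
    return sum([
        a.function is b.function,
        a.architecture[0] is b.architecture[0],
        bool(a.social_context & b.social_context),
        bool(a.material_form & b.material_form),
    ])


report_card = DocumentProfile(
    name="Student Report Card",
    function=Function.DECLARATIVE,
    architecture=(Architecture.HIERARCHICAL, Architecture.TABULAR),
    social_context=frozenset({"internal", "ferpa", "parent-school", "quarterly"}),
    material_form=frozenset({"pdf", "print", "portrait-letter"}),
)
purchase_agreement = DocumentProfile(
    name="Real Estate Purchase Agreement",
    function=Function.PERFORMATIVE,
    architecture=(Architecture.HIERARCHICAL, Architecture.LINEAR),
    social_context=frozenset({"legal", "buyer-seller", "real-estate-law", "one-time"}),
    material_form=frozenset({"word", "pdf", "signatures", "legal-size"}),
)

# The two documents share a primary architecture and a PDF form, nothing else.
print(shared_dimensions(report_card, purchase_agreement))  # prints 2
```

A crude overlap count like this is enough to support the "similar positions suggest similar solutions" heuristic; a real system would likely weight the dimensions differently.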
3.2 Communicative Function Taxonomy
Let's examine each communicative function in depth, as this is the primary axis for understanding what documents accomplish.
3.2.1 Declarative Documents: Asserting Information
Purpose: To make claims about the state of the world, report findings, or document what exists.
Characteristics:
- Factual rather than opinion-based
- Can be verified as accurate or inaccurate
- Often include data, measurements, observations
- May aggregate or summarize information
- Updated when facts change
Subtypes:
Reports (Temporal analysis)
- Status reports (current state)
- Progress reports (change over time)
- Performance reports (metrics and analysis)
- Financial reports (accounting data)
- Research reports (findings and interpretation)
Directories (Entity listings)
- Membership directories (who belongs)
- Contact lists (how to reach people)
- Organizational rosters (structure and roles)
- Resource inventories (what's available)
- Product catalogs (what's for sale)
Summaries (Information condensation)
- Executive summaries (key points only)
- Dashboards (metrics at-a-glance)
- Abstracts (paper summaries)
- Meeting minutes (discussion summary)
- Case summaries (legal/medical briefs)
Documentation (Recorded information)
- Medical records (patient history)
- Case files (legal matter details)
- Project documentation (technical specs)
- Meeting notes (discussion capture)
- Observation logs (chronological records)
Data Requirements:
- Underlying data must be current and accurate
- Clear attribution of information sources
- Timestamp or effective date
- Version control for updates
Validation Needs:
- Moderate (facts should be accurate, but errors are rarely catastrophic)
- Data quality checks
- Calculation verification
- Consistency across reports
Update Frequency:
- Varies widely: real-time dashboards to annual reports
- Often scheduled (monthly, quarterly)
- Triggered by data changes in underlying systems
Design Considerations:
- Clarity and readability paramount
- Visual hierarchy to show importance
- Tables and charts for data-heavy content
- Comparisons (current vs. previous, actual vs. target)
- Context (what do these numbers mean?)
Examples Across Domains:
- Education: Student rosters, grade reports, attendance summaries, enrollment statistics
- Business: Sales reports, employee directories, inventory lists, financial statements
- Legal: Document indices, witness lists, exhibit lists, case chronologies
- Real Estate: Property listings, market analyses, comparable sales reports
- Healthcare: Patient summaries, lab results, census reports, quality metrics
3.2.2 Performative Documents: Enacting Change
Purpose: To create new states of affairs, not merely describe them. The document itself performs an action.
Characteristics:
- Have legal or social force
- Create obligations, rights, or status changes
- Require authority (only certain people/roles can issue)
- Often require signatures or formal approval
- Become part of official record
- Have compliance and audit requirements
Subtypes:
Contracts and Agreements (Mutual obligations)
- Purchase agreements (buyer-seller)
- Service contracts (provider-client)
- Employment contracts (employer-employee)
- Lease agreements (landlord-tenant)
- Partnership agreements (between entities)
Certificates and Credentials (Status conferral)
- Academic certificates (completion, achievement)
- Professional licenses (authorization to practice)
- Awards and recognitions (acknowledgment)
- Birth/death certificates (vital records)
- Property titles (ownership)
Financial Instruments (Payment obligations)
- Invoices (requests for payment)
- Receipts (acknowledgment of payment)
- Purchase orders (commitment to buy)
- Bills of sale (transfer of ownership)
- Loan documents (debt obligations)
Authorizations (Permission grants)
- Permission slips (parental consent)
- Access grants (security clearances)
- Licenses (right to use/do something)
- Permits (building, event, business)
- Powers of attorney (delegation of authority)
Legal Filings (Court/regulatory actions)
- Complaints and petitions (initiate proceedings)
- Motions and briefs (request court action)
- Regulatory filings (compliance documents)
- Registrations (official notice)
- Notices (formal communication)
Data Requirements:
- All parties must be correctly identified
- Terms must be complete and unambiguous
- Dates and effective periods must be clear
- Consideration or obligations must be specified
- Conditions and contingencies must be explicit
Validation Needs:
- High (errors can have legal and financial consequences)
- Authority verification (is issuer authorized?)
- Completeness checking (all required fields present?)
- Consistency checking (no contradictions?)
- Signature/approval tracking
- Audit trail preservation
Update Frequency:
- Generally immutable once executed
- Amendments require formal process
- Superseding documents explicitly reference predecessors
- Version control critical
Design Considerations:
- Precision and unambiguity essential
- Standardized language and clauses
- Clear visual hierarchy for key terms
- Signature blocks and date fields
- Reference numbers for tracking
- Witnessing and notarization provisions where required
Examples Across Domains:
- Education: Diplomas, transcripts, enrollment agreements, permission forms
- Business: Contracts, NDAs, offer letters, stock certificates
- Legal: Pleadings, orders, judgments, powers of attorney
- Real Estate: Purchase agreements, deeds, leases, disclosures
- Healthcare: Consent forms, DNR orders, HIPAA authorizations
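Three of the validation needs listed for performative documents (completeness checking, authority verification, and signature tracking) lend themselves to a simple pre-issue check. The sketch below is a minimal illustration; the field names, the `AUTHORIZED_ROLES` set, and the dictionary layout are all hypothetical, and consistency checking and audit trails are out of scope.

```python
from datetime import date

# Hypothetical required fields, loosely following the data requirements above.
REQUIRED_FIELDS = ("parties", "terms", "effective_date", "obligations", "issuer_role")
# Hypothetical roles allowed to issue this document type.
AUTHORIZED_ROLES = {"contracts_manager", "registrar"}


def validate_performative(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if not doc.get(field):
            problems.append(f"missing required field: {field}")
    # Authority: only certain roles may issue the document.
    if doc.get("issuer_role") not in AUTHORIZED_ROLES:
        problems.append("issuer is not authorized to issue this document")
    # Signature tracking: every named party must have signed.
    unsigned = set(doc.get("parties", [])) - set(doc.get("signatures", []))
    for party in sorted(unsigned):
        problems.append(f"missing signature: {party}")
    return problems


agreement = {
    "parties": ["Buyer", "Seller"],
    "terms": "Sale of the property at the agreed price",
    "effective_date": date(2024, 1, 15),
    "obligations": "Buyer pays; Seller transfers title",
    "issuer_role": "contracts_manager",
    "signatures": ["Buyer"],
}
print(validate_performative(agreement))  # ['missing signature: Seller']
```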
3.2.3 Directive Documents: Instructing Behavior
Purpose: To tell readers what to do, how to do it, or when to do it.
Characteristics:
- Action-oriented
- Often sequential or procedural
- May include conditions ("if X, then do Y")
- Clarity is paramount (ambiguity causes errors)
- May have compliance implications
Subtypes:
Procedures and Instructions (How-to guides)
- Standard operating procedures (organizational processes)
- User manuals (product operation)
- Recipes (cooking/chemistry instructions)
- Assembly instructions (construction steps)
- Treatment protocols (medical procedures)
Assignments and Tasks (What to accomplish)
- Work orders (specific jobs)
- Homework assignments (academic tasks)
- Project charters (initiative definitions)
- Service requests (customer needs)
- Action items (meeting outcomes)
Schedules and Calendars (When to do things)
- Class schedules (when/where classes meet)
- Work schedules (shift assignments)
- Project timelines (milestone dates)
- Event calendars (upcoming activities)
- Maintenance schedules (recurring tasks)
Policies and Guidelines (Rules and norms)
- Policy manuals (organizational rules)
- Style guides (writing/design standards)
- Code of conduct (behavioral expectations)
- Best practice guides (recommended approaches)
- Compliance requirements (regulatory rules)
Data Requirements:
- Tasks/actions described clearly
- Sequence or dependencies specified
- Required resources identified
- Deadlines or timeframes included
- Responsible parties assigned
Validation Needs:
- Moderate to high (depends on consequence of errors)
- Completeness (all necessary steps included?)
- Logical order (dependencies respected?)
- Feasibility (resources available? timeframes realistic?)
- Clarity (unambiguous language?)
Update Frequency:
- Procedures: Updated when process changes
- Assignments: One-time or recurring
- Schedules: Updated regularly (term, season, project phase)
- Policies: Reviewed periodically, updated as needed
Design Considerations:
- Numbered steps for procedures
- Clear visual hierarchy
- Warnings and cautions highlighted
- Checklists where appropriate
- Examples and illustrations
- Easy navigation (indexed, searchable)
Examples Across Domains:
- Education: Assignment sheets, lesson plans, academic calendars, classroom procedures
- Business: SOPs, project plans, work orders, maintenance schedules
- Legal: Court procedures, filing instructions, compliance checklists
- Real Estate: Showing instructions, closing checklists, maintenance guides
- Healthcare: Treatment protocols, medication administration, emergency procedures
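The "logical order" and "completeness" checks for directive documents can be partly automated when step dependencies are recorded as data. The following sketch assumes a procedure is stored as an ordered list of step identifiers plus a mapping from each step to its prerequisites; that representation and the function name are assumptions made for illustration. It uses the standard library's `graphlib` module (Python 3.9 or later) to detect circular dependencies.

```python
from graphlib import CycleError, TopologicalSorter


def check_step_order(steps: list[str], depends_on: dict[str, set[str]]) -> list[str]:
    """Check that documented steps respect their declared prerequisites."""
    try:
        # Consuming static_order() raises CycleError if dependencies are circular.
        tuple(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return ["circular dependency detected"]

    problems = []
    position = {step: index for index, step in enumerate(steps)}
    for step in steps:
        for prereq in sorted(depends_on.get(step, ())):
            if prereq not in position:
                # Completeness: a prerequisite is never documented.
                problems.append(f"{step} depends on undocumented step {prereq}")
            elif position[prereq] > position[step]:
                # Logical order: a prerequisite appears too late.
                problems.append(f"{step} appears before its prerequisite {prereq}")
    return problems


dependencies = {"assemble": {"unpack"}, "test": {"assemble"}}
print(check_step_order(["unpack", "assemble", "test"], dependencies))  # []
print(check_step_order(["assemble", "unpack", "test"], dependencies))
```

The second call reports that "assemble" is documented before "unpack", which is the kind of ordering error a reviewer can easily miss in a long procedure.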
3.2.4 Commissive Documents: Committing to Actions
Purpose: To pledge future performance or commit to delivering something.
Characteristics:
- Forward-looking (describe future, not present/past)
- Create expectations and accountability
- Often include deliverables, timelines, costs
- May become legally binding
- Performance measured against commitments
Subtypes:
Proposals (Offers to deliver)
- Business proposals (project bids)
- Grant proposals (funding requests)
- Research proposals (study plans)
- Sales proposals (product/service offers)
- Partnership proposals (collaboration offers)
Project Plans (Work commitments)
- Project charters (scope, objectives, deliverables)
- Implementation plans (how work will be done)
- Delivery schedules (when outputs arrive)
- Resource plans (who/what will be allocated)
- Risk management plans (mitigation strategies)
Service Agreements (Ongoing support)
- Service level agreements (performance guarantees)
- Maintenance contracts (recurring support)
- Subscription terms (continuous service)
- Retainer agreements (available capacity)
- Support agreements (help desk, technical assistance)
Warranties and Guarantees (Quality promises)
- Product warranties (repair/replace commitments)
- Satisfaction guarantees (refund promises)
- Performance bonds (financial backing)
- Quality assurances (standards compliance)
- Professional guarantees (work quality)
Data Requirements:
- What will be delivered (scope, specifications)
- When it will be delivered (timelines, milestones)
- How much it will cost (pricing, payment terms)
- Who is responsible (parties, roles)
- What happens if commitments aren't met (remedies)
Validation Needs:
- High (unfulfilled commitments damage relationships and finances)
- Feasibility checking (can we actually do this?)
- Resource validation (do we have capacity?)
- Cost verification (pricing accurate?)
- Risk assessment (what could go wrong?)
- Approval workflows (authority to commit?)
Update Frequency:
- Proposals: One-time documents
- Plans: Updated as project progresses
- Agreements: Periodic renewal
- Versioning critical when changes occur
Design Considerations:
- Clear statement of what's promised
- Explicit timelines and milestones
- Costs and payment terms prominent
- Conditions and contingencies stated
- Measurement criteria for success
- Exit or termination provisions
Examples Across Domains:
- Education: Course syllabi (instructor commitments), partnership MOUs, improvement plans
- Business: Sales proposals, project plans, service contracts, product roadmaps
- Legal: Settlement agreements (future payments), plea agreements (defendant promises)
- Real Estate: Letters of intent, development proposals, property management agreements
- Healthcare: Treatment plans, care coordination agreements, quality improvement plans
3.2.5 Expressive Documents: Conveying Attitudes
Purpose: To express feelings, opinions, or subjective assessments.
Characteristics:
- Subjective rather than objective
- Often persuasive or evaluative
- May support other functions (express AND recommend)
- Credibility depends on source authority/expertise
- Often shorter and less formal than other types
Subtypes:
Testimonials and References (Positive assessments)
- Customer testimonials (product/service praise)
- Letters of recommendation (candidate endorsements)
- Professional references (colleague assessments)
- Success stories (implementation experiences)
- Case studies (detailed positive examples)
Reviews and Evaluations (Critical assessments)
- Product reviews (feature/quality analysis)
- Performance reviews (employee assessments)
- Peer reviews (manuscript/proposal evaluation)
- Service reviews (experience ratings)
- Course evaluations (teaching assessments)
Feedback and Comments (Responses and reactions)
- Comment letters (regulatory responses)
- Survey responses (opinion collection)
- User feedback (improvement suggestions)
- Complaint letters (dissatisfaction expression)
- Thank you notes (appreciation expression)
Opinions and Perspectives (Viewpoint statements)
- Opinion pieces (editorial content)
- Position papers (organizational stance)
- White papers (perspective with evidence)
- Commentary (interpretation of events)
- Expert opinions (professional judgments)
Data Requirements:
- Subject being evaluated
- Evaluator identity and credentials
- Criteria or dimensions of assessment
- Specific observations or experiences
- Recommendations or conclusions
Validation Needs:
- Low to moderate (opinions are subjective by nature)
- Authenticity (is the reviewer who they claim to be?)
- Relevance (does the reviewer have a basis for the opinion?)
- Appropriateness (any conflicts of interest?)
- Respectfulness (no abuse or inappropriate content)
Update Frequency:
- Generally point-in-time (opinions reflect a moment)
- May be superseded by new opinions
- Historical opinions retained (show evolution)
Design Considerations:
- Author/source prominently displayed
- Date of opinion clear
- Context provided (what prompted this?)
- Balance (acknowledge limitations)
- Supporting evidence or examples
- Separate facts from opinions
Examples Across Domains:
- Education: Teacher recommendations, course evaluations, peer assessments
- Business: Performance reviews, customer testimonials, vendor assessments
- Legal: Expert opinions, character references, witness statements
- Real Estate: Property appraisals, neighborhood assessments, agent reviews
- Healthcare: Treatment recommendations, patient satisfaction surveys, quality reviews
3.2.6 Hybrid and Composite Functions
Many real-world documents combine multiple functions:
Example: Grant Proposal
- Commissive: "We will deliver these outputs on this timeline"
- Declarative: "Our organization has these qualifications and this track record"
- Directive: "Here's our project plan with tasks and milestones"
- Expressive: "We believe this approach will be effective"
Example: Annual Report
- Declarative: "Here are our financial results and operational metrics"
- Expressive: "We're proud of these achievements"
- Commissive: "Here are our goals for next year"
Example: Invoice with Terms
- Performative: "You owe this amount"
- Declarative: "Here's what we delivered"
- Directive: "Payment due by this date via these methods"
Design Implication: Templates must accommodate primary and secondary functions, with appropriate emphasis on each.
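One way a template system might record primary and secondary functions is sketched below. The validation levels paraphrase the "Validation Needs" lists in Sections 3.2.1 through 3.2.5; rounding ranges such as "moderate to high" upward, and letting the strictest function govern a hybrid document, are assumptions of this sketch rather than rules stated in the taxonomy.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Validation(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


# Paraphrased from the per-function "Validation Needs" lists.
# Ranges are rounded up (assumption): "moderate to high" becomes HIGH,
# "low to moderate" becomes MODERATE.
VALIDATION_BY_FUNCTION = {
    "declarative": Validation.MODERATE,
    "performative": Validation.HIGH,
    "directive": Validation.HIGH,
    "commissive": Validation.HIGH,
    "expressive": Validation.MODERATE,
}


@dataclass
class FunctionProfile:
    primary: str
    secondary: list[str] = field(default_factory=list)

    def required_validation(self) -> Validation:
        """Assume the strictest function governs the whole document."""
        functions = [self.primary, *self.secondary]
        return max(VALIDATION_BY_FUNCTION[name] for name in functions)


annual_report = FunctionProfile("declarative", ["expressive", "commissive"])
testimonial = FunctionProfile("expressive")

print(annual_report.required_validation().name)  # HIGH
print(testimonial.required_validation().name)    # MODERATE
```

A gentler policy would weight the primary function more heavily; the point here is only that secondary functions should be recorded explicitly so such policies can be applied.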
3.3 Information Architecture Patterns
Now we shift from "what documents do" to "how documents are structured." These are the fundamental organizational patterns that recur across document types and domains.
These patterns are technology-agnostic—they apply whether you're creating Word documents, PDFs, web pages, or printed materials. They describe logical structure, not physical implementation.
Pattern 1: Atomic Pattern
Also known as: Single-instance, One-record-one-document
Context: You need to create individual documents for discrete entities or events, where each document is self-contained and independent.
Structure:
Document = Single Record + Template
One student → One certificate
One property → One listing flyer
One invoice → One billing document
One patient → One chart summary
Visual Characteristics:
- Typically single page (or few pages)
- All information about one entity
- No repeated sections or loops
- Often formatted for printing/framing/filing
- Self-contained (doesn't require other documents to make sense)
Data Requirements:
- One master record (the primary entity)
- Attributes of that record
- Optionally: lookup data (e.g., issuer information, organization logo)
- Optionally: calculated fields (e.g., age from birthdate)
When to Use:
- Certificates and credentials
- ID cards and badges
- Individual cover letters
- Property flyers
- Patient discharge summaries
- Purchase receipts
- Award certificates
- License documents
Variations:
Simple Atomic: Just field substitution
- Certificate with name, date, achievement
- Badge with photo, name, role
- Minimal formatting, focus on key data
Rich Atomic: Complex layout and content
- Property flyer with multiple photos, detailed descriptions, maps
- Resume with multiple sections, formatting, graphics
- More sophisticated design
Conditional Atomic: Content varies by data
- Invoice (may or may not have discount fields)
- Certificate (may include honors notation if GPA > 3.5)
- Personalized letters (content blocks appear based on recipient attributes)
Examples Across Domains:
Education:
- Completion certificate for individual student
- Individual progress report
- Transcript for one student
- Award certificate
Legal:
- Engagement letter for one client
- Retainer agreement
- Certificate of service
- Individual contract
Real Estate:
- Property listing flyer
- Comparative market analysis for one property
- Offer presentation
Retail:
- Product specification sheet
- Individual item label
- Single product flyer
Healthcare:
- Patient discharge summary
- Lab result report for one patient
- Prescription document
Implementation Considerations:
- Simplest pattern to implement
- No relationship resolution needed
- Can batch generate (100 certificates from 100 student records)
- Each output document completely independent
- Easy to parallelize generation
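The Atomic Pattern reduces to "one record plus one template". The sketch below shows simple field substitution with one conditional block (the honors notation from the Conditional Atomic variation) and batch generation over several records. It uses Python's standard `string.Template`; the template wording, field names, and sample records are invented for illustration.

```python
from string import Template

CERTIFICATE = Template(
    "Certificate of Completion\n"
    "Awarded to $name for completing $course on $date.$honors\n"
    "Issued by $issuer"
)
HONORS_GPA = 3.5  # threshold taken from the honors example in the text


def render_certificate(student: dict, issuer: str) -> str:
    """Render one self-contained document from one record."""
    # Conditional Atomic: the honors sentence appears only when earned.
    honors = " Awarded with honors." if student["gpa"] > HONORS_GPA else ""
    return CERTIFICATE.substitute(
        name=student["name"],
        course=student["course"],
        date=student["date"],
        honors=honors,
        issuer=issuer,  # lookup data shared by every certificate
    )


students = [
    {"name": "A. Rivera", "course": "Algebra I", "date": "2024-06-01", "gpa": 3.8},
    {"name": "B. Chen", "course": "Algebra I", "date": "2024-06-01", "gpa": 3.2},
]

# Batch generation: each output is independent, so this loop parallelizes trivially.
certificates = [render_certificate(s, issuer="Example Academy") for s in students]
print(certificates[0])
```

`Template.substitute` raises `KeyError` when a field is missing, which is usually the desired behavior for documents that must be complete before they are issued.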
Pattern 2: Directory Pattern
Also known as: List, Roster, Catalog, Registry
Context: You need to present multiple similar entities in a structured way for browsing, reference, or comparison.
Structure:
Document = Collection of Similar Records + Consistent Format
All students → Student roster
All products → Product catalog
All employees → Staff directory
All properties → Listing book
Visual Characteristics:
- Repeating blocks or rows
- Consistent format for each entity
- May include photos/images
- Often includes categorization or grouping
- May span multiple pages
- Often includes index or search aids
Layout Variants:
Grid Layout:
- Entities arranged in rows and columns
- Like trading cards or photo grid
- Good for image-heavy content (student photos, product images)
- Fixed space per entity
- Visual scanning friendly
List Layout:
- Entities in single column
- More detailed information per entity
- Variable space per entity
- Good for text-heavy content
- Scrolling-friendly for digital
Table Layout:
- Rows for entities, columns for attributes
- Dense information presentation
- Easy comparison across entities
- Good for data-heavy content (specifications, metrics)
- Spreadsheet-like
Card Layout:
- Individual "cards" for each entity
- Can be physical cards (student ID, baseball card)
- Or visual cards on page/screen
- Flexible space per entity
- Can include front/back
Data Requirements:
- One table with multiple records
- Consistent attributes across records
- Optional grouping/category field
- Optional sort key (alphabetical, by date, by category)
When to Use:
- Student/employee rosters
- Product catalogs
- Contact lists
- Property listings
- Course catalogs
- Event attendee lists
- Membership directories
- Inventory lists
Grouping and Organization:
Flat Directory: All entities at one level
- Alphabetical by name
- Chronological by date
- Numerical by ID
Grouped Directory: Entities organized into categories
- Students by grade level
- Products by category
- Employees by department
- Properties by neighborhood
Hierarchical Directory: Nested categories (see Hierarchical Pattern)
Filtering and Searching:
- May include search features (digital)
- May include index (alphabetical, by attribute)
- May include table of contents
- May highlight featured/priority items
Examples Across Domains:
Education:
- Class roster with student photos
- Instructor directory with bios
- Course catalog (if simple list; hierarchical if complex)
- Student contact list
- Class schedule (all classes)
Legal:
- Document index for case
- Witness list
- Exhibit list
- Court docket
- Attorney directory
Real Estate:
- Property listings by agent
- Open house schedule
- Agent team directory
- Available properties
Retail:
- Product catalog
- Price list
- Inventory report
- Vendor directory
Healthcare:
- Provider directory
- Patient census
- Equipment inventory
- Medication list
Implementation Considerations:
- Single data source (one table)
- Consistent formatting critical
- Pagination strategy for large datasets
- Sorting/filtering options
- Performance: can generate thousands of rows
- Consider breaking into sections for very large directories
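A grouped directory needs only one table, a grouping field, and a sort key. The sketch below turns a flat list of records into a grouped, alphabetized listing; the record layout and the `grouped_directory` helper are illustrative assumptions. Note that `itertools.groupby` only merges adjacent items, so the records must be sorted by the group key first.

```python
from itertools import groupby
from operator import itemgetter


def grouped_directory(records: list[dict], group_key: str, sort_key: str) -> list[str]:
    """Render records as group headings followed by indented, sorted entries."""
    # Sort by group first, then by the entry key within each group.
    ordered = sorted(records, key=itemgetter(group_key, sort_key))
    lines = []
    for group, members in groupby(ordered, key=itemgetter(group_key)):
        lines.append(str(group))
        for record in members:
            lines.append(f"  {record[sort_key]}")
    return lines


staff = [
    {"name": "Rivera", "department": "Sales"},
    {"name": "Chen", "department": "Engineering"},
    {"name": "Adams", "department": "Sales"},
]

for line in grouped_directory(staff, group_key="department", sort_key="name"):
    print(line)
# Engineering
#   Chen
# Sales
#   Adams
#   Rivera
```

A flat directory is the same function with the grouping step removed; pagination and layout (grid, list, table, or card) are separate concerns applied to the ordered records.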
Pattern 3: Master-Detail Pattern
Also known as: Header-Lines, Parent-Child, One-to-Many
Context: You need to show a parent entity with its related child records, where the relationship is one-to-many.
Structure:
Document = Master Record + Collection of Detail Records
Invoice = Invoice Header + Line Items
Report Card = Student Header + Grades by Subject
Order = Order Header + Products Ordered
Case = Case Header + Documents Filed
Visual Characteristics:
- Header section with master information
- Detail section with related records (often in table)
- Usually includes aggregations (totals, counts, averages)
- May repeat entire structure (multiple invoices, each with line items)
Components:
Master/Header Section:
- Primary entity information
- Aggregated/calculated fields (subtotal, total, GPA, etc.)
- Context information (dates, parties, references)
- Typically appears once per document
Detail Section:
- Related records in structured format
- Often tabular (rows for detail records)
- May include calculations per row
- May include subtotals by category
Footer/Summary Section:
- Grand totals across all details
- Final calculations
- Signatures or authorizations
- Notes or terms
Data Requirements:
- Master table with one record per document
- Detail table with multiple records per master
- Foreign key relationship (detail.master_id = master.id)
- Often calculations (sum, average, count over details)
When to Use:
- Financial documents (invoices, orders, receipts)
- Academic documents (report cards, transcripts)
- Any "header + line items" scenario
- Shopping carts and orders
- Project summaries with task lists
- Case files with document lists
Relationship Patterns:
Simple One-to-Many:
- One invoice → many line items
- One student → many grades
- Straightforward foreign key
Categorized Details:
- Details grouped by category
- Student grades grouped by subject/class
- Order items grouped by product category
- Requires secondary grouping field
Nested Details (becomes Hierarchical Pattern):
- Master → Detail → Sub-detail
- Order → Line Items → Components
- Project → Tasks → Subtasks
Calculations and Aggregations:
Row-level calculations:
- Line total = quantity × unit price
- Weighted score = points earned × weight
- Charge = hours × rate
Subtotal calculations:
- Total by category
- Average by group
- Count by type
Grand total calculations:
- Sum of all line items
- Overall GPA across all classes
- Total hours across all tasks
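The three levels of calculation can be seen together in a small invoice example. The sketch below assumes a master record, a list of detail records joined on a hypothetical `invoice_id` foreign key, and a `category` field for subtotals. `Decimal` is used rather than floating point so that currency totals stay exact.

```python
from collections import defaultdict
from decimal import Decimal

invoice = {"id": 1001, "customer": "Example Co."}
line_items = [
    {"invoice_id": 1001, "category": "Hardware", "quantity": 2, "unit_price": Decimal("19.99")},
    {"invoice_id": 1001, "category": "Hardware", "quantity": 1, "unit_price": Decimal("5.00")},
    {"invoice_id": 1001, "category": "Services", "quantity": 3, "unit_price": Decimal("40.00")},
    {"invoice_id": 1002, "category": "Hardware", "quantity": 1, "unit_price": Decimal("9.99")},
]


def summarize(master: dict, details: list[dict]) -> dict:
    """Compute row totals, category subtotals, and the grand total for one master."""
    # Resolve the one-to-many relationship: keep only this master's details.
    rows = [d for d in details if d["invoice_id"] == master["id"]]
    subtotals = defaultdict(Decimal)  # Decimal() starts each category at 0
    for row in rows:
        line_total = row["quantity"] * row["unit_price"]  # row-level calculation
        subtotals[row["category"]] += line_total          # subtotal by category
    return {
        "line_count": len(rows),
        "subtotals": dict(subtotals),
        "grand_total": sum(subtotals.values(), Decimal("0")),
    }


summary = summarize(invoice, line_items)
print(summary["subtotals"]["Hardware"])  # 44.98
print(summary["grand_total"])            # 164.98
```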
Examples Across Domains:
Education:
- Report card: Student + grades in multiple subjects
- Transcript: Student + all courses taken over time
- Class summary: Class + list of students with grades
- Assignment sheet: Class + list of assignments due
Legal:
- Invoice: Matter + time entries by attorney
- Case document list: Case + documents filed
- Discovery response: Case + documents produced
- Exhibit list: Trial + exhibits presented
Real Estate:
- Comparative market analysis: Subject property + comparable sales
- Agent activity report: Agent + properties shown/sold
- Property features: Property + amenities/features
Retail:
- Invoice: Order + line items
- Purchase order: PO + items ordered
- Packing slip: Shipment + items included
- Catalog page: Brand + products in that brand
Healthcare:
- Patient encounter: Visit + procedures/diagnoses
- Lab report: Patient + test results
- Treatment plan: Patient + interventions ordered
- Medication list: Patient + current medications
Implementation Considerations:
- Must resolve relationships before generation
- Calculation logic can be complex
- Need to handle variable number of detail records
- Pagination: what if details don't fit on one page?
- Subtotals and groups add complexity
- Validation: ensure detail totals match master totals
Validation Challenges:
- Referential integrity: every detail must have valid master
- Calculation accuracy: sum(line items) should equal invoice total
- Missing details: invoice with zero line items is invalid
- Orphaned details: detail records with no master
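The validation challenges above can be checked in one pass over the data before any document is generated. This is a minimal sketch under assumed field names (`id`, `master_id`, `amount`, `total`); a production system would more likely enforce referential integrity in the database itself.

```python
from decimal import Decimal


def validate_master_detail(masters: list[dict], details: list[dict]) -> list[str]:
    """Report orphaned details, masters without details, and total mismatches."""
    problems = []
    totals = {m["id"]: Decimal("0") for m in masters}
    counts = {m["id"]: 0 for m in masters}

    for detail in details:
        master_id = detail["master_id"]
        if master_id not in totals:
            # Referential integrity: the detail points at a missing master.
            problems.append(f"orphaned detail {detail['id']}: no master {master_id}")
            continue
        totals[master_id] += detail["amount"]
        counts[master_id] += 1

    for master in masters:
        if counts[master["id"]] == 0:
            problems.append(f"master {master['id']} has no detail records")
        elif totals[master["id"]] != master["total"]:
            problems.append(
                f"master {master['id']}: stored total {master['total']} "
                f"!= sum of details {totals[master['id']]}"
            )
    return problems


masters = [
    {"id": 1, "total": Decimal("30.00")},
    {"id": 2, "total": Decimal("10.00")},
    {"id": 3, "total": Decimal("0.00")},
]
details = [
    {"id": "a", "master_id": 1, "amount": Decimal("10.00")},
    {"id": "b", "master_id": 1, "amount": Decimal("20.00")},
    {"id": "c", "master_id": 2, "amount": Decimal("7.50")},
    {"id": "d", "master_id": 9, "amount": Decimal("1.00")},
]

for problem in validate_master_detail(masters, details):
    print(problem)
```

With this sample data the check reports one orphaned detail, one total mismatch for master 2, and one master (3) with no details.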
Pattern 4: Hierarchical Pattern
Also known as: Tree, Nested, Outline, Catalog
Context: You need to organize information into nested sections and subsections, where relationships form a tree structure.
Structure:
Document = Nested Sections + Subsections
Course Catalog:
├── College of Engineering
│   ├── Computer Science Department
│   │   ├── CS 101: Introduction to Programming
│   │   ├── CS 102: Data Structures
│   │   └── CS 201: Algorithms
│   └── Electrical Engineering Department
│       ├── EE 101: Circuits
│       └── EE 201: Electronics
└── College of Arts and Sciences
    ├── English Department
    └── History Department
Visual Characteristics: - Clear visual hierarchy (headings, indentation, numbering) - Table of contents often included - Nested numbering (1.0, 1.1, 1.1.1, etc.) - Consistent formatting by level - Navigation aids (page numbers, headers) - May include indexes or cross-references
Hierarchy Levels:
Shallow Hierarchy (2-3 levels): - Section → Item - Category → Product - Department → Employee - Manageable, easy to navigate
Medium Hierarchy (4-5 levels): - Division → Department → Team → Role → Employee - Category → Subcategory → Product Line → Product → SKU - More complex but still navigable
Deep Hierarchy (6+ levels): - Complex organizational structures - Detailed taxonomies - Technical specifications - Can become hard to navigate without good design
Data Requirements: - Hierarchical table(s) with parent-child relationships - Could be self-referential (each record points to parent) - Or multiple tables (Department table, Employee table with dept_id) - Level indicator or depth calculation - Sort order within each level
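To illustrate how self-referential records, a depth calculation, and a per-level sort order fit together, here is a small sketch. The row layout (id, parent_id, sort_order, name) and the `outline` helper are assumptions made for this example, not a prescribed schema.

```python
from collections import defaultdict

# Hypothetical self-referential rows: (id, parent_id, sort_order, name).
rows = [
    (1, None, 1, "College of Engineering"),
    (2, 1, 1, "Computer Science Department"),
    (3, 2, 1, "CS 101: Introduction to Programming"),
    (4, 2, 2, "CS 102: Data Structures"),
    (5, 1, 2, "Electrical Engineering Department"),
    (6, None, 2, "College of Arts and Sciences"),
]


def outline(rows):
    """Yield (number, depth, name) for every record in document order."""
    children = defaultdict(list)
    for id_, parent_id, sort_order, name in rows:
        children[parent_id].append((sort_order, id_, name))

    def walk(parent_id, prefix, depth):
        # Siblings are ordered by sort_order; numbering is derived, not stored.
        for position, (_, id_, name) in enumerate(sorted(children[parent_id]), start=1):
            number = f"{prefix}.{position}" if prefix else str(position)
            yield number, depth, name
            yield from walk(id_, number, depth + 1)

    yield from walk(None, "", 1)


for number, depth, name in outline(rows):
    print("  " * (depth - 1) + f"{number} {name}")
```

Deriving the nested numbers at generation time, rather than storing them, means that moving a section only requires changing its parent and sort order.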
When to Use: - Catalogs (course catalogs, product catalogs) - Organizational charts - Policy manuals with sections/subsections - Technical specifications - Bill of materials (parts and assemblies) - Content management (websites, documentation)
Design Patterns:
Indented Outline:
1. Parent Item
   1.1 Child Item
   1.2 Child Item
       1.2.1 Grandchild Item
2. Parent Item
Nested Sections with Headings:
Section 1: Parent Topic
   1.1 Subtopic
       Content...
   1.2 Subtopic
       Content...
Tree Diagram: - Visual representation - Boxes and connecting lines - Good for organizational charts - Hard to scale to large hierarchies
Navigation Features: - Table of contents (with page numbers) - Breadcrumbs (showing path: Home > Category > Subcategory) - Index (alphabetical listing with page refs) - Running headers showing current section - Bookmarks or links (in digital versions)
Examples Across Domains:
Education: - Course catalog organized by college/department/course - Curriculum guide by grade/subject/topic - School handbook by section/policy/procedure - Academic program structure
Legal: - Contract with articles and sections - Policy manual with chapters and sections - Legal code with titles, chapters, sections - Court rules and procedures
Real Estate: - Property features hierarchically organized - Neighborhood guide by area/amenity - Building specifications by system/component - Development plans by phase/building/unit
Retail: - Product catalog by category/subcategory/product - Parts catalog by system/assembly/part - Vendor directory by category/company - Store directory by floor/department
Healthcare: - Medical records organized by encounter/section/note - Treatment protocols by condition/stage/intervention - Facility directory by building/floor/department - Formulary by therapeutic class/drug class/medication
Implementation Considerations: - Recursive data structures or multiple table joins - Depth limits (how many levels to support?) - Consistent formatting across levels - Table of contents generation - Page numbering strategies (restart in each section?) - Cross-references between sections - What if hierarchy changes frequently?
Challenges: - Maintaining hierarchy can be complex (moving sections) - Deep hierarchies become hard to navigate - Inconsistent categorization across hierarchies - Version control: how to track changes in structure? - Print vs. digital: different navigation affordances
Pattern 5: Matrix Pattern
Also known as: Table, Grid, Cross-tab, Comparison Matrix
Context: You need to show relationships between two dimensions, compare multiple entities across multiple attributes, or display data where both rows and columns have meaning.
Structure:
Document = Two-Dimensional Grid
Room   | Monday  | Tuesday | Wednesday | Thursday | Friday
-------+---------+---------+-----------+----------+--------
Room A | Math    | Science | Math      | History  | Math
Room B | English | Art     | English   | Music    | English
Room C | PE      | PE      | Library   | PE       | Study
Students (rows) × Assignments (columns) = Grades
Products (rows) × Features (columns) = Specifications
Locations (rows) × Time Slots (columns) = Schedule
Visual Characteristics: - Table structure with labeled rows and columns - Both axes have meaning (not just decoration) - Often includes totals or summaries in margins - May use color coding or icons for data - Cells may contain text, numbers, checkmarks, or status indicators
Matrix Types:
Comparison Matrix:
- Entities in rows
- Attributes in columns
- Values in cells
- Purpose: Compare options
- Example: Product comparison (products × features)
Assignment Matrix:
- Resources in rows
- Time periods in columns
- Assignments in cells
- Purpose: Show who does what when
- Example: Staff schedule (employees × shifts)
Status Matrix:
- Items in rows
- Stages or criteria in columns
- Status indicators in cells
- Purpose: Track progress
- Example: Project tasks × status (planned/started/completed)
Relationship Matrix:
- Entities in both rows and columns
- Relationships in cells
- Purpose: Show connections
- Example: Prerequisites (courses × prerequisite courses)
Data Requirements: - Data organized by two dimensions - Can come from flat table with two grouping columns - Or from cross-reference table (many-to-many relationship) - May include calculated cells (totals, percentages)
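Turning a flat table with two grouping columns into a grid is essentially a pivot. The following sketch assumes hypothetical (student, assignment, grade) records and a `pivot` helper written for this example; missing combinations are filled with a placeholder.

```python
# Hypothetical flat records: (row_key, column_key, value).
records = [
    ("Ana", "HW1", 90),
    ("Ana", "HW2", 85),
    ("Ben", "HW1", 70),
]


def pivot(records, missing="N/A"):
    """Return (row_labels, column_labels, grid) built from flat records."""
    row_labels = sorted({row for row, _, _ in records})
    column_labels = sorted({col for _, col, _ in records})
    cells = {(row, col): value for row, col, value in records}
    grid = [
        [cells.get((row, col), missing) for col in column_labels]
        for row in row_labels
    ]
    return row_labels, column_labels, grid


row_labels, column_labels, grid = pivot(records)
for label, row in zip(row_labels, grid):
    print(label, row)
```

The same function works for a junction table from a many-to-many relationship, since each junction row already names one row key and one column key.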
When to Use: - Schedules (time × location, or students × classes) - Comparison tables (products × features) - Grade sheets (students × assignments) - Skill matrices (employees × skills) - Project tracking (tasks × status) - Seating charts (rows × columns)
Layout Considerations:
Column Width: - Uniform width (cleaner) vs. variable (accommodates content) - Narrow columns for checkmarks/icons - Wide columns for text descriptions
Row Height: - Uniform height (easier to scan) - Variable height (accommodates varying content)
Headers: - Column headers at top - Row headers at left - May need repeated headers if table spans pages - Consider frozen/fixed headers for scrolling (digital)
Aggregations: - Row totals (rightmost column) - Column totals (bottom row) - Grand total (bottom-right cell)
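The margin totals described above can be computed directly from the grid. This sketch assumes a hypothetical numeric matrix (for example, product quantities by location) stored as a list of rows.

```python
# Hypothetical matrix: products (rows) × locations (columns) = quantities.
grid = [
    [3, 0, 2],
    [1, 4, 0],
]

row_totals = [sum(row) for row in grid]            # rightmost column
column_totals = [sum(col) for col in zip(*grid)]   # bottom row
grand_total = sum(row_totals)                      # bottom-right cell

# Cross-check: both margins must add up to the same grand total.
assert grand_total == sum(column_totals)
print(row_totals, column_totals, grand_total)
```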
Examples Across Domains:
Education: - Grade sheet: Students (rows) × Assignments (columns) = Grades - Schedule: Time slots (rows) × Classrooms (columns) = Classes - Curriculum map: Grade levels (rows) × Standards (columns) = Coverage - Skill assessment: Students (rows) × Skills (columns) = Proficiency
Legal: - Document responsibility matrix: Documents (rows) × Parties (columns) = Role or obligation - Timeline: Events (rows) × Dates (columns) = Status - Privilege log: Documents (rows) × Attributes (columns) = Values
Real Estate: - Property comparison: Properties (rows) × Features (columns) = Values - Showing schedule: Properties (rows) × Time slots (columns) = Showings - Market analysis: Neighborhoods (rows) × Metrics (columns) = Values
Retail: - Product comparison: Products (rows) × Features (columns) = Specifications - Inventory: Products (rows) × Locations (columns) = Quantities - Pricing: Products (rows) × Customer types (columns) = Prices
Healthcare: - Medication administration: Patients (rows) × Time slots (columns) = Medications - Treatment protocol: Conditions (rows) × Treatments (columns) = Indications - Staff schedule: Providers (rows) × Shifts (columns) = Assignments
Implementation Considerations: - Can generate very large tables (100s of rows × dozens of columns) - Pagination: how to break across pages? - Column wrapping for wide tables? - Sorting and filtering options? - Conditional formatting (color cells based on values) - How to handle missing data (empty cells, N/A, default values)
Challenges: - Large matrices don't fit on one page - Variable content length in cells - How to show when cell values are complex (not just text/numbers) - Digital vs. print: scrolling vs. page breaks - Accessibility: screen readers struggle with complex tables
Pattern 6: Narrative Flow Pattern
Also known as: Flowing, Magazine-style, Newsletter, Multi-column
Context: You need to present multiple pieces of content in a continuous reading experience, where layout is flexible and content flows naturally.
Structure:
Document = Stream of Content Blocks
Newsletter:
├── Header (masthead, date, issue)
├── Lead article (spans 2 columns)
├── Sidebar story (1 column)
├── Feature article (2 columns with images)
├── Short items (1 column each)
└── Footer (contact info)
Visual Characteristics: - Multi-column layouts common - Text flows from column to column, page to page - Mixed content types (articles, images, sidebars, callouts) - Visually rich, magazine-like design - Breaking points determined by content and layout - Flexible rather than rigid structure
Content Organization:
Linear Flow: - Content progresses sequentially - Logical reading order - "Continued on page X" for long articles
Sectioned Flow: - Distinct sections (departments, topics) - Each section has multiple items - Consistent section formatting
Grid-based Flow: - Underlying grid determines layout - Content fills grid cells - Visual variety within structure
Data Requirements: - Content items (articles, stories, announcements) - Metadata (title, author, category, length, priority) - Assets (images, graphics) - Layout hints (featured vs. regular, column span)
When to Use: - Newsletters - Magazines - Marketing and sales brochures - Event programs - Annual reports (narrative sections)
Design Patterns:
Two-Column Layout: - Classic magazine style - Good readability for text-heavy content - Flexible for mixed content
Three-Column Layout: - More flexibility for varied content - Can combine columns (1-col sidebar + 2-col article) - Good for dense information
Modular Grid: - Content blocks of varying sizes - More design flexibility - Requires careful layout planning
Dynamic Layout: - System determines optimal layout based on content - Challenging to implement well - Best with flexible design system
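As a deliberately simplified picture of what a dynamic layout step does, the sketch below places each content block in whichever column is currently shortest. The block heights and the `flow_into_columns` helper are assumptions for illustration; a real layout engine must also handle column spans, images, page breaks, and reading order.

```python
def flow_into_columns(blocks, column_count=2):
    """Greedily place (title, height) blocks into the shortest column."""
    columns = [[] for _ in range(column_count)]
    heights = [0] * column_count
    for title, height in blocks:
        target = heights.index(min(heights))  # first shortest column
        columns[target].append(title)
        heights[target] += height
    return columns, heights


# Hypothetical blocks with heights in arbitrary layout units.
blocks = [
    ("Lead article", 40),
    ("Sidebar story", 15),
    ("Feature article", 30),
    ("Short item", 10),
]
columns, heights = flow_into_columns(blocks)
print(columns, heights)
```

Even this toy version shows why the pattern is hard to automate: balancing column heights can pull related items apart, which is exactly the kind of decision a designer would override.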
Examples Across Domains:
Education: - School newsletter with multiple articles - Co-op updates with announcements - Parent communication with various items - Event program with schedule and descriptions
Business: - Company newsletter - Annual report (narrative sections) - Marketing brochure - Product launch materials
Real Estate: - Property showcase with multiple listings - Neighborhood guide with various features - Market report with commentary and data
Retail: - Promotional flyer with multiple offers - Seasonal catalog with featured items - Product showcase with descriptions
Non-Profit: - Donor newsletter - Impact report - Event program - Annual appeal
Implementation Considerations: - Much more complex than other patterns - Requires sophisticated layout engine - Content may need to be "flowed" across pages - Images and text must integrate smoothly - Challenging to automate (many design decisions) - May be better suited to semi-automated approach (system generates content blocks, designer arranges)
When NOT to Use: - If generation must be fully automated (this pattern is difficult to automate) - If content structure is highly consistent (other patterns may work better) - If layout expertise isn't available - If professional print quality is required (automated layout rarely matches professional design)
3.4 Data Relationship Models
Document patterns connect directly to data relationship models. Understanding these relationships is crucial for both database design and template architecture.
One-to-One (1:1) Relationships
Definition: Each record in Table A relates to exactly one record in Table B, and vice versa.
Example: - Each Employee has one EmergencyContact - Each Student has one BirthCertificate - Each Property has one LegalDescription
Document Pattern: Typically Atomic - The related record is simply additional attributes - Merged into single document seamlessly
Implementation: - Can be the same table (just more fields) - Or separate tables that share a primary key - Or a foreign key with a uniqueness constraint, so each record has at most one counterpart
Rare in Practice: Most true 1:1 relationships could be combined into one table. Separate tables make sense when: - The data is optional (not all employees have an emergency contact) - Access controls differ (birth certificates are sensitive) - Update patterns differ (legal descriptions rarely change)
One-to-Many (1:N) Relationships
Definition: Each record in Table A relates to zero or more records in Table B, but each Table B record relates to exactly one Table A record.
Example: - One Student has many Grades - One Invoice has many LineItems - One Class has many Students enrolled - One Department has many Employees
Document Pattern: Master-Detail - Master record = Table A (the "one") - Detail records = Table B (the "many") - Detail section typically in table or list format
Implementation: - Foreign key in Table B points to Table A - Query: SELECT * FROM TableB WHERE table_a_id = ? - Natural for relational databases
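A runnable version of the foreign-key query, using Python's built-in sqlite3 module with an in-memory database. The `invoices` and `line_items` tables, their columns, and the sample rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE line_items (
        id INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoices(id),  -- FK to the "one" side
        description TEXT,
        amount_cents INTEGER
    );
    INSERT INTO invoices VALUES (1, 'Acme');
    INSERT INTO line_items VALUES (1, 1, 'Widget', 1500), (2, 1, 'Gadget', 2500);
""")


def load_invoice(conn, invoice_id):
    """Fetch the master row, then its detail rows via the foreign key."""
    master = conn.execute(
        "SELECT id, customer FROM invoices WHERE id = ?", (invoice_id,)
    ).fetchone()
    details = conn.execute(
        "SELECT description, amount_cents FROM line_items "
        "WHERE invoice_id = ? ORDER BY id",
        (invoice_id,),
    ).fetchall()
    return master, details


master, details = load_invoice(conn, 1)
print(master, details)
```

The master row fills the document header and the detail rows fill the table or list section, which is why this relationship maps so directly onto the Master-Detail pattern.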
Most Common Pattern: The workhorse of data modeling. Most business documents reflect 1:N relationships.
Many-to-Many (M:N) Relationships
Definition: Each record in Table A can relate to multiple records in Table B, and each Table B record can relate to multiple Table A records.
Example: - Many Students are in Many Classes - Many Products are in Many Orders - Many Instructors teach Many Courses - Many Attorneys work on Many Cases
Implementation: Requires junction/bridge table
Students Table
Enrollments Table (junction):
  - student_id (FK to Students)
  - class_id (FK to Classes)
  - enrollment_date
  - grade
Classes Table
Document Patterns:
- Can view from either side:
  - Student perspective: Student + Classes enrolled (Master-Detail)
  - Class perspective: Class + Students enrolled (Master-Detail)
- Or Matrix pattern: Students × Classes = Enrollment status
Complexity: - Requires two joins to traverse relationship - Junction table often has its own attributes (enrollment date, role, etc.) - More complex to query and maintain
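The two-join traversal looks like this in practice. The sketch uses Python's sqlite3 module; the table and column names follow the junction layout above, while the sample students and classes are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (
        student_id INTEGER REFERENCES students(id),
        class_id INTEGER REFERENCES classes(id),
        enrollment_date TEXT,
        grade TEXT,
        PRIMARY KEY (student_id, class_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO classes VALUES (10, 'Algebra'), (20, 'Biology');
    INSERT INTO enrollments VALUES
        (1, 10, '2024-09-01', 'A'),
        (1, 20, '2024-09-01', 'B'),
        (2, 10, '2024-09-02', NULL);
""")

# Student perspective: one student (master) + classes enrolled (details).
classes_for_ana = conn.execute("""
    SELECT c.title, e.grade
    FROM students s
    JOIN enrollments e ON e.student_id = s.id
    JOIN classes c ON c.id = e.class_id
    WHERE s.name = ?
    ORDER BY c.title
""", ("Ana",)).fetchall()

# Class perspective: one class (master) + students enrolled (details).
roster_for_algebra = conn.execute("""
    SELECT s.name
    FROM classes c
    JOIN enrollments e ON e.class_id = c.id
    JOIN students s ON s.id = e.student_id
    WHERE c.title = ?
    ORDER BY s.name
""", ("Algebra",)).fetchall()

print(classes_for_ana, roster_for_algebra)
```

Both queries read the same junction rows; only the choice of master changes, which is what lets one M:N relationship feed two different Master-Detail documents.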
Hierarchical (Tree) Relationships
Definition: Records organized in parent-child tree structure, where each record has zero or one parent, and zero or more children.
Example: - Organization: Company → Divisions → Departments → Teams - Course Catalog: College → Department → Course → Section - Bill of Materials: Assembly → Subassembly → Parts - File System: Folder → Subfolder → Files
Implementation Options:
Adjacency List (most common):
Categories Table:
  - category_id (PK)
  - category_name
  - parent_category_id (FK to Categories, NULL for root)
Nested Sets: Encode the tree as left/right boundary numbers on each node (subtree reads are efficient, but inserts and moves require renumbering)
Path Enumeration: Store full path as string (e.g., "/engineering/cs/courses")
Closure Table: Separate table storing all ancestor-descendant pairs
Document Pattern: Hierarchical - Visual hierarchy with indentation or numbering - Often includes table of contents - Navigation challenges with deep hierarchies
Challenges: - How deep can hierarchy go? (depth limits) - How to move nodes around tree? - How to query (all descendants, all ancestors)? - How to handle circular references (prevent them!)
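Two of these questions, querying all descendants and preventing circular references, can be sketched against the adjacency list shown earlier. The example uses a recursive common table expression in SQLite via Python's sqlite3 module; the sample categories and the `descendants` and `can_move` helpers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        category_id INTEGER PRIMARY KEY,
        category_name TEXT,
        parent_category_id INTEGER REFERENCES categories(category_id)
    );
    INSERT INTO categories VALUES
        (1, 'Engineering', NULL),
        (2, 'Computer Science', 1),
        (3, 'Courses', 2),
        (4, 'Arts', NULL);
""")


def descendants(conn, category_id):
    """Return the ids of all descendants of a node."""
    rows = conn.execute("""
        WITH RECURSIVE subtree(category_id) AS (
            SELECT category_id FROM categories WHERE parent_category_id = ?
            UNION  -- UNION (not UNION ALL) drops repeats, so bad data cannot loop forever
            SELECT c.category_id
            FROM categories c
            JOIN subtree s ON c.parent_category_id = s.category_id
        )
        SELECT category_id FROM subtree ORDER BY category_id
    """, (category_id,)).fetchall()
    return [row[0] for row in rows]


def can_move(conn, category_id, new_parent_id):
    """A move creates a cycle if the new parent is the node or one of its descendants."""
    return new_parent_id != category_id and new_parent_id not in descendants(conn, category_id)


print(descendants(conn, 1), can_move(conn, 1, 3), can_move(conn, 3, 4))
```

Running the check before every re-parenting operation is one way to keep the adjacency list a true tree.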
Network (Graph) Relationships
Definition: More complex than trees—nodes can have multiple parents and multiple children, forming general graphs.
Example: - Course prerequisites: CS202 requires both CS101 AND Math150 - Project dependencies: Task C depends on Task A and Task B - Citation networks: Paper cites multiple other papers - Social networks: Person knows multiple people
Implementation: - Similar to M:N (use junction/edge table) - Must prevent or handle cycles - Graph traversal algorithms needed
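For dependency-style graphs such as prerequisites, a topological sort both orders the nodes and detects cycles. This sketch uses `graphlib.TopologicalSorter` from the Python standard library (3.9+); the course data and the `course_order` wrapper are assumptions for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical edge data: course -> set of prerequisite courses.
prerequisites = {
    "CS202": {"CS101", "Math150"},
    "CS101": set(),
    "Math150": set(),
}


def course_order(prereqs):
    """Return courses so that every prerequisite precedes its dependents."""
    try:
        return list(TopologicalSorter(prereqs).static_order())
    except CycleError as exc:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"circular prerequisite: {exc.args[1]}") from exc


order = course_order(prerequisites)
print(order)
```

Graphs that legitimately contain cycles, such as social networks, need a different traversal (for example breadth-first search with a visited set) rather than cycle rejection.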
Document Patterns: - Diagrams (network visualizations) - Referenced lists with cross-references - Challenging to represent in linear documents
Less Common: Most business documents don't need full graph structures. When they do, often simplified or flattened for presentation.
Temporal Relationships
Definition: Relationships that change over time, requiring historical tracking.
Example: - Student enrolled in different classes each semester - Employee works in different departments over career - Product has different prices in different periods
Implementation Strategies:
Snapshot Approach: - Store state at points in time - Each record includes effective date
Transaction Log Approach: - Record all changes with timestamps - Reconstruct current/historical state from log
Slowly Changing Dimensions (from data warehousing): - Type 1: Overwrite (no history) - Type 2: Add new row with effective dates - Type 3: Add columns for current/previous
Document Considerations: - "As of" date matters: report card as of semester end - Historical documents must be reproducible - Audit trail requirements
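An "as of" lookup over Type 2 history can be sketched as follows. The row layout (valid_from inclusive, valid_to exclusive, None for the current version) and the `price_as_of` helper are assumptions chosen for this example; other conventions for effective dates are equally common.

```python
from datetime import date

# Hypothetical Type 2 history: one row per version of the price.
price_history = [
    {"product": "P1", "price": 10, "valid_from": date(2024, 1, 1), "valid_to": date(2024, 7, 1)},
    {"product": "P1", "price": 12, "valid_from": date(2024, 7, 1), "valid_to": None},
]


def price_as_of(history, product, as_of):
    """Return the price in effect on the given date, or None if no row applies."""
    for row in history:
        starts_on_or_before = row["valid_from"] <= as_of
        still_open = row["valid_to"] is None or as_of < row["valid_to"]
        if row["product"] == product and starts_on_or_before and still_open:
            return row["price"]
    return None


print(price_as_of(price_history, "P1", date(2024, 3, 15)))
print(price_as_of(price_history, "P1", date(2024, 7, 1)))
```

Regenerating a historical document with its original "as of" date then returns the values that were in effect at the time, which supports reproducibility and audit requirements.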
Aggregated/Derived Data
Definition: Data calculated or summarized from other data, not stored directly.
Example: - GPA calculated from grades - Invoice total calculated from line items - Inventory quantity from transactions (receipts - shipments)
Options:
Calculate on demand: - Always current - Expensive if complex - May be slow
Store and update: - Fast to retrieve - Risk of staleness - Must maintain correctly
Materialized views: - Database stores the computed result - Refresh is automatic or explicit, depending on the database system - Balance of performance and currency
Document Implications: - Where does calculation happen? (database, application, template) - What if calculation logic changes? (regenerated historical documents may no longer match the originals) - How to handle rounding/precision?
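The rounding question is easy to underestimate. The sketch below, using Python's decimal module with invented amounts, shows that rounding each line before summing and rounding once after summing can produce different totals for the same data.

```python
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")

# Hypothetical unrounded line amounts (for example, after applying a tax rate).
lines = [Decimal("0.335"), Decimal("0.335"), Decimal("0.335")]

# Strategy A: round every line to cents, then add.
round_each = sum(
    (amount.quantize(CENT, rounding=ROUND_HALF_UP) for amount in lines),
    Decimal("0"),
)

# Strategy B: add the exact amounts, then round the total once.
round_once = sum(lines, Decimal("0")).quantize(CENT, rounding=ROUND_HALF_UP)

print(round_each, round_once)
```

Whichever strategy is chosen, it should be applied in one place only, so that the detail lines printed on a document always add up to the total printed beneath them.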
3.5 The Document Pattern Catalog
We now have the foundations to create a comprehensive pattern catalog. This catalog formalizes each pattern following a standard structure.
For each pattern, we document:
- Pattern Name: Memorable identifier
- Aliases: Other names for same pattern
- Intent: What problem does this solve?
- Motivation: Why use this pattern?
- Context: When is this pattern appropriate?
- Structure: Visual/logical organization
- Data Requirements: What entities and relationships needed?
- Participants: Key entities and roles
- Collaborations: How entities interact
- Consequences: Benefits and limitations
- Implementation: Technical considerations
- Examples: Concrete instances across domains
- Variations: Common modifications
- Related Patterns: How it connects to others
The Six Core Patterns:
- Atomic Pattern: One record, one document
- Directory Pattern: Many similar records, one document
- Master-Detail Pattern: One parent with many children
- Hierarchical Pattern: Tree structure with nested sections
- Matrix Pattern: Two-dimensional grid
- Narrative Flow Pattern: Flexible multi-column layout
Derived/Combined Patterns:
- Form Pattern: Structured data entry (variation of Atomic with inputs)
- Dashboard Pattern: Multiple metrics with visualizations (variation of Matrix)
- Timeline Pattern: Chronological events (variation of Directory with temporal axis)
- Comparison Pattern: Side-by-side entities (variation of Matrix with emphasis on comparison)
- Card Deck Pattern: Stack of atomic documents (collection of Atomic)
- Accordion Pattern: Hierarchical with collapsible sections (Hierarchical with interaction)
Pattern Relationships and Composition
Patterns can nest and combine:
Hierarchical containing Directory: - Course catalog (hierarchical) where each department section contains a directory of courses
Master-Detail containing Matrix: - Student (master) with grade sheet (matrix of assignments × grading periods)
Directory with embedded Atomic: - Product catalog where each product listing is essentially an atomic document
Narrative Flow containing multiple patterns: - Newsletter with featured article (atomic), event list (directory), schedule (matrix)
Understanding these compositions enables sophisticated document design while maintaining conceptual clarity.
This formal framework provides:
- Systematic classification of all document types
- Shared vocabulary for discussing document structures
- Analytical tools for understanding any document domain
- Design patterns that guide implementation
- Foundation for all subsequent practical applications
Further Reading
On Ontology Development:
- Noy, Natalya F., and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology." Stanford, 2001. https://protege.stanford.edu/publications/ontology_development/ontology101.pdf
- Gruber, Thomas R. "Toward Principles for the Design of Ontologies Used for Knowledge Sharing." International Journal of Human-Computer Studies 43 (1995): 907-928.
- W3C SKOS Simple Knowledge Organization System: https://www.w3.org/2004/02/skos/ (Standard for taxonomies)
On Document Classification:
- ISO 15489: Records Management Standard. https://www.iso.org/standard/62542.html (International standard for records)
- Dublin Core Metadata Initiative: https://www.dublincore.org/ (Metadata standards for resources)
On Entity-Relationship Modeling:
- Chen, Peter Pin-Shan. "The Entity-Relationship Model—Toward a Unified View of Data." ACM Transactions on Database Systems 1 (1976): 9-36. (The original ER model paper)
- Teorey, Toby, et al. Database Modeling and Design, 5th Edition. Morgan Kaufmann, 2011.
- Ambler, Scott. "Data Modeling 101." http://agiledata.org/essays/dataModeling101.html (Practical introduction)
On Document Patterns:
- Coplien, James O., and Douglas C. Schmidt, eds. Pattern Languages of Program Design. Addison-Wesley, 1995.
- Fowler, Martin. Analysis Patterns: Reusable Object Models. Addison-Wesley, 1996. (Domain modeling patterns)
Related Patterns in This Trilogy:
- All of Volume 2: How to build intelligence on top of document ontologies
- Volume 3, Pattern 21 (Form-Document Coherence): Ensuring input structures match output needs
- See Appendix G: Cross-Volume Pattern Map: "Chain 5: Form Data → Document Generation → Document Analysis"
Tools for Ontology Development:
- Protégé: https://protege.stanford.edu/ (Open-source ontology editor)
- WebProtégé: https://webprotege.stanford.edu/ (Collaborative ontology development)