Volume 1: Domain-Specific Document Automation

Chapter 3: Document Ontology - A Formal Framework

In the previous chapter, we established theoretical foundations drawing from genre theory, information architecture, pattern languages, knowledge representation, and cognitive science. Now we apply these frameworks to create a systematic ontology of documents—a formal classification system that describes what documents are, how they're structured, and how they relate to underlying data.

This ontology serves multiple purposes:

  1. Analytical: Provides a framework for analyzing any document domain
  2. Descriptive: Creates shared vocabulary for discussing document structures
  3. Prescriptive: Guides design of new document types and templates
  4. Computational: Enables automated reasoning about documents
  5. Educational: Helps practitioners understand document patterns

We'll develop this ontology in layers, from abstract to concrete:

  - Core document dimensions (foundational axes of classification)
  - Communicative function taxonomy (what documents do)
  - Information architecture patterns (how information is structured)
  - Data relationship models (how documents map to data)
  - The document pattern catalog (reusable solutions)

3.1 Core Document Dimensions

Every document can be analyzed along four orthogonal dimensions. These are independent axes that together characterize any document type:

Dimension 1: Communicative Function

What does this document DO in the world?

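Speech Act Theory holds that utterances perform acts, and documents do the same. The five categories below adapt Searle's taxonomy with one difference in naming: Declarative here corresponds to Searle's assertives, and Performative corresponds to his declarations. A document's primary function shapes everything else about it: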

Declarative (Assert information)
  - State facts about the world
  - Report findings or status
  - Document what exists
  - Examples: Reports, directories, catalogs, dashboards

Performative (Enact change)
  - Create legal/social reality
  - Certify achievements
  - Authorize actions
  - Examples: Contracts, certificates, licenses, invoices

Directive (Instruct behavior)
  - Specify procedures
  - Assign tasks
  - Schedule activities
  - Examples: Manuals, work orders, schedules, assignments

Commissive (Commit to future actions)
  - Promise deliverables
  - Pledge support
  - Propose solutions
  - Examples: Proposals, agreements, project plans, warranties

Expressive (Convey attitudes)
  - Express opinions
  - Demonstrate satisfaction
  - Show appreciation
  - Examples: Testimonials, reviews, recommendations, references

Importance: Communicative function determines:
  - Required formality and authority
  - Validation and approval requirements
  - Legal and compliance considerations
  - Distribution and retention policies
  - Update frequency and versioning needs

Dimension 2: Information Architecture

How is information STRUCTURED within the document?

This describes the organizational logic—how information flows and relates within the document itself:

Atomic Structure
  - Single coherent unit
  - One record → one document instance
  - Self-contained
  - Examples: Individual certificate, single property flyer, one invoice

Linear Structure
  - Sequential presentation
  - Beginning-to-end narrative or logical flow
  - Order matters
  - Examples: Reports, procedures, timelines, stories

Hierarchical Structure
  - Nested sections and subsections
  - Tree-like organization
  - Parent-child relationships
  - Examples: Course catalogs, policy manuals, organizational charts

Tabular Structure
  - Row-and-column organization
  - Structured data display
  - Comparison-friendly
  - Examples: Price lists, schedules, comparison matrices

Network Structure
  - Non-hierarchical connections
  - Cross-references and links
  - Multiple navigation paths
  - Examples: Hyperlinked documentation, knowledge bases, legal citations

Flowing Structure
  - Continuous stream with dynamic breaks
  - Content determines boundaries
  - Flexible layout
  - Examples: Newsletters, magazines, multi-column layouts

Importance: Information architecture determines:
  - Layout complexity
  - Navigation requirements
  - How sections relate
  - Ease of modification
  - Scalability to large datasets

Dimension 3: Social Context

Who creates and uses this document, and WHY?

Documents exist within social contexts—organizational practices, professional communities, regulatory environments:

Organizational Context
  - Internal vs. external audience
  - Formal vs. informal tone
  - Hierarchical position (executive summary vs. detailed report)
  - Frequency (daily, monthly, annually, ad hoc)

Professional Community
  - Domain-specific conventions (legal briefs vs. medical records)
  - Industry standards (real estate MLS format, academic transcripts)
  - Certification requirements (who can issue what)
  - Regulatory compliance (FERPA for education, HIPAA for healthcare)

Stakeholder Relationships
  - Power dynamics (manager to employee, vendor to client)
  - Legal relationships (parties to contract, plaintiff to defendant)
  - Transactional vs. relational
  - Accountability and liability

Cultural Context
  - Regional variations (date formats, measurement systems)
  - Language and translation considerations
  - Accessibility requirements
  - Privacy and consent norms

Importance: Social context determines:
  - Required content and disclosures
  - Appropriate tone and formality
  - Distribution and access controls
  - Audit and compliance requirements
  - Signature and approval workflows

Dimension 4: Material Form

What is the document's physical/digital MANIFESTATION?

While content is primary, form matters for usability and function:

Format
  - Word processing document (editable)
  - PDF (fixed layout, printable)
  - Web page (responsive, searchable)
  - Spreadsheet (calculable)
  - Presentation (sequential slides)

Medium
  - Digital only
  - Print only
  - Digital-first with print option
  - Multi-modal (QR codes linking digital content)

Layout Characteristics
  - Page size and orientation
  - Column structure
  - Image density
  - Color vs. black-and-white
  - Interactive elements

Delivery Mechanism
  - Email attachment
  - Printed and mailed
  - Web portal access
  - API delivery
  - Embedded in other documents

Importance: Material form determines:
  - Authoring tools and workflows
  - Distribution infrastructure
  - Storage and archiving approach
  - Editability and version control
  - Accessibility considerations

The Four-Dimensional Classification Space

Every document can be plotted in this four-dimensional space:

Example: Student Report Card
  - Communicative Function: Declarative (reports academic performance)
  - Information Architecture: Hierarchical (sections by subject) with Tabular elements (grades table)
  - Social Context: Internal educational document; regulatory compliance (FERPA); parent-school relationship; quarterly frequency
  - Material Form: PDF for distribution; print option; portrait letter-size; contains sensitive data

Example: Real Estate Purchase Agreement
  - Communicative Function: Performative (creates binding contract)
  - Information Architecture: Hierarchical (articles and clauses) with some Linear elements (recitals)
  - Social Context: Legal document; buyer-seller relationship; regulatory compliance (real estate law); one-time transaction
  - Material Form: Word document for negotiation, then PDF for execution; requires signatures; legal-size pages

This four-dimensional framework enables:
  - Systematic comparison of document types
  - Identification of requirements (performative documents need more validation than declarative ones)
  - Pattern recognition (similar positions in the space suggest similar solutions)
  - Gap analysis (what document types are missing for a domain?)
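To make the classification space concrete, the two examples above can be encoded as data and compared dimension by dimension. This is only a minimal sketch: the class and function names (DocumentProfile, shared_dimensions) and the tag vocabulary for social context and material form are illustrative assumptions, not part of the ontology itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Tuple


class Function(Enum):
    DECLARATIVE = "declarative"
    PERFORMATIVE = "performative"
    DIRECTIVE = "directive"
    COMMISSIVE = "commissive"
    EXPRESSIVE = "expressive"


class Architecture(Enum):
    ATOMIC = "atomic"
    LINEAR = "linear"
    HIERARCHICAL = "hierarchical"
    TABULAR = "tabular"
    NETWORK = "network"
    FLOWING = "flowing"


@dataclass(frozen=True)
class DocumentProfile:
    """Position of one document type in the four-dimensional space."""
    name: str
    function: Function
    architecture: Tuple[Architecture, ...]  # primary structure first
    social_context: FrozenSet[str]          # free-form tags (assumed vocabulary)
    material_form: FrozenSet[str]           # free-form tags (assumed vocabulary)


def shared_dimensions(a: DocumentProfile, b: DocumentProfile) -> List[str]:
    """Name the dimensions on which two profiles coincide or overlap."""
    shared = []
    if a.function == b.function:
        shared.append("function")
    if a.architecture[0] == b.architecture[0]:
        shared.append("architecture")
    if a.social_context & b.social_context:
        shared.append("social_context")
    if a.material_form & b.material_form:
        shared.append("material_form")
    return shared


report_card = DocumentProfile(
    "Student Report Card",
    Function.DECLARATIVE,
    (Architecture.HIERARCHICAL, Architecture.TABULAR),
    frozenset({"education", "FERPA", "quarterly"}),
    frozenset({"pdf", "print", "portrait-letter"}),
)
purchase_agreement = DocumentProfile(
    "Real Estate Purchase Agreement",
    Function.PERFORMATIVE,
    (Architecture.HIERARCHICAL, Architecture.LINEAR),
    frozenset({"legal", "buyer-seller", "one-time"}),
    frozenset({"word", "pdf", "signatures", "legal-size"}),
)

print(shared_dimensions(report_card, purchase_agreement))
```

With these tags, the two documents share a primary architecture and a delivery format but differ in function and social context, which is the kind of comparison the framework is meant to support.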

3.2 Communicative Function Taxonomy

Let's examine each communicative function in depth, as this is the primary axis for understanding what documents accomplish.

3.2.1 Declarative Documents: Asserting Information

Purpose: To make claims about the state of the world, report findings, or document what exists.

Characteristics: - Factual rather than opinion-based - Can be verified as accurate or inaccurate - Often include data, measurements, observations - May aggregate or summarize information - Updated when facts change

Subtypes:

Reports (Temporal analysis) - Status reports (current state) - Progress reports (change over time) - Performance reports (metrics and analysis) - Financial reports (accounting data) - Research reports (findings and interpretation)

Directories (Entity listings) - Membership directories (who belongs) - Contact lists (how to reach people) - Organizational rosters (structure and roles) - Resource inventories (what's available) - Product catalogs (what's for sale)

Summaries (Information condensation) - Executive summaries (key points only) - Dashboards (metrics at-a-glance) - Abstracts (paper summaries) - Meeting minutes (discussion summary) - Case summaries (legal/medical briefs)

Documentation (Recorded information) - Medical records (patient history) - Case files (legal matter details) - Project documentation (technical specs) - Meeting notes (discussion capture) - Observation logs (chronological records)

Data Requirements: - Underlying data must be current and accurate - Clear attribution of information sources - Timestamp or effective date - Version control for updates

Validation Needs: - Moderate (facts should be accurate, but errors are usually correctable; financial statements and lab results warrant stricter checks) - Data quality checks - Calculation verification - Consistency across reports

Update Frequency: - Varies widely: real-time dashboards to annual reports - Often scheduled (monthly, quarterly) - Triggered by data changes in underlying systems

Design Considerations: - Clarity and readability paramount - Visual hierarchy to show importance - Tables and charts for data-heavy content - Comparisons (current vs. previous, actual vs. target) - Context (what do these numbers mean?)

Examples Across Domains: - Education: Student rosters, grade reports, attendance summaries, enrollment statistics - Business: Sales reports, employee directories, inventory lists, financial statements - Legal: Document indices, witness lists, exhibit lists, case chronologies - Real Estate: Property listings, market analyses, comparable sales reports - Healthcare: Patient summaries, lab results, census reports, quality metrics

3.2.2 Performative Documents: Enacting Change

Purpose: To create new states of affairs, not merely describe them. The document itself performs an action.

Characteristics: - Have legal or social force - Create obligations, rights, or status changes - Require authority (only certain people/roles can issue) - Often require signatures or formal approval - Become part of official record - Have compliance and audit requirements

Subtypes:

Contracts and Agreements (Mutual obligations) - Purchase agreements (buyer-seller) - Service contracts (provider-client) - Employment contracts (employer-employee) - Lease agreements (landlord-tenant) - Partnership agreements (between entities)

Certificates and Credentials (Status conferral) - Academic certificates (completion, achievement) - Professional licenses (authorization to practice) - Awards and recognitions (acknowledgment) - Birth/death certificates (vital records) - Property titles (ownership)

Financial Instruments (Payment obligations) - Invoices (requests for payment) - Receipts (acknowledgment of payment) - Purchase orders (commitment to buy) - Bills of sale (transfer of ownership) - Loan documents (debt obligations)

Authorizations (Permission grants) - Permission slips (parental consent) - Access grants (security clearances) - Licenses (right to use/do something) - Permits (building, event, business) - Powers of attorney (delegation of authority)

Legal Filings (Court/regulatory actions) - Complaints and petitions (initiate proceedings) - Motions and briefs (request court action) - Regulatory filings (compliance documents) - Registrations (official notice) - Notices (formal communication)

Data Requirements: - All parties must be correctly identified - Terms must be complete and unambiguous - Dates and effective periods must be clear - Consideration or obligations must be specified - Conditions and contingencies must be explicit

Validation Needs: - High (errors can have legal and financial consequences) - Authority verification (is issuer authorized?) - Completeness checking (all required fields present?) - Consistency checking (no contradictions?) - Signature/approval tracking - Audit trail preservation

Update Frequency: - Generally immutable once executed - Amendments require formal process - Superseding documents explicitly reference predecessors - Version control critical

Design Considerations: - Precision and unambiguity essential - Standardized language and clauses - Clear visual hierarchy for key terms - Signature blocks and date fields - Reference numbers for tracking - Witnessing and notarization provisions where required

Examples Across Domains: - Education: Diplomas, transcripts, enrollment agreements, permission forms - Business: Contracts, NDAs, offer letters, stock certificates - Legal: Pleadings, orders, judgments, powers of attorney - Real Estate: Purchase agreements, deeds, leases, disclosures - Healthcare: Consent forms, DNR orders, HIPAA authorizations

3.2.3 Directive Documents: Instructing Behavior

Purpose: To tell readers what to do, how to do it, or when to do it.

Characteristics: - Action-oriented - Often sequential or procedural - May include conditions ("if X, then do Y") - Clarity is paramount (ambiguity causes errors) - May have compliance implications

Subtypes:

Procedures and Instructions (How-to guides) - Standard operating procedures (organizational processes) - User manuals (product operation) - Recipes (cooking/chemistry instructions) - Assembly instructions (construction steps) - Treatment protocols (medical procedures)

Assignments and Tasks (What to accomplish) - Work orders (specific jobs) - Homework assignments (academic tasks) - Project charters (initiative definitions) - Service requests (customer needs) - Action items (meeting outcomes)

Schedules and Calendars (When to do things) - Class schedules (when/where classes meet) - Work schedules (shift assignments) - Project timelines (milestone dates) - Event calendars (upcoming activities) - Maintenance schedules (recurring tasks)

Policies and Guidelines (Rules and norms) - Policy manuals (organizational rules) - Style guides (writing/design standards) - Code of conduct (behavioral expectations) - Best practice guides (recommended approaches) - Compliance requirements (regulatory rules)

Data Requirements: - Tasks/actions described clearly - Sequence or dependencies specified - Required resources identified - Deadlines or timeframes included - Responsible parties assigned

Validation Needs: - Moderate to high (depends on consequence of errors) - Completeness (all necessary steps included?) - Logical order (dependencies respected?) - Feasibility (resources available? timeframes realistic?) - Clarity (unambiguous language?)

Update Frequency: - Procedures: Updated when process changes - Assignments: One-time or recurring - Schedules: Updated regularly (term, season, project phase) - Policies: Reviewed periodically, updated as needed

Design Considerations: - Numbered steps for procedures - Clear visual hierarchy - Warnings and cautions highlighted - Checklists where appropriate - Examples and illustrations - Easy navigation (indexed, searchable)

Examples Across Domains: - Education: Assignment sheets, lesson plans, academic calendars, classroom procedures - Business: SOPs, project plans, work orders, maintenance schedules - Legal: Court procedures, filing instructions, compliance checklists - Real Estate: Showing instructions, closing checklists, maintenance guides - Healthcare: Treatment protocols, medication administration, emergency procedures

3.2.4 Commissive Documents: Committing to Actions

Purpose: To pledge future performance or commit to delivering something.

Characteristics: - Forward-looking (describe future, not present/past) - Create expectations and accountability - Often include deliverables, timelines, costs - May become legally binding - Performance measured against commitments

Subtypes:

Proposals (Offers to deliver) - Business proposals (project bids) - Grant proposals (funding requests) - Research proposals (study plans) - Sales proposals (product/service offers) - Partnership proposals (collaboration offers)

Project Plans (Work commitments) - Project charters (scope, objectives, deliverables) - Implementation plans (how work will be done) - Delivery schedules (when outputs arrive) - Resource plans (who/what will be allocated) - Risk management plans (mitigation strategies)

Service Agreements (Ongoing support) - Service level agreements (performance guarantees) - Maintenance contracts (recurring support) - Subscription terms (continuous service) - Retainer agreements (available capacity) - Support agreements (help desk, technical assistance)

Warranties and Guarantees (Quality promises) - Product warranties (repair/replace commitments) - Satisfaction guarantees (refund promises) - Performance bonds (financial backing) - Quality assurances (standards compliance) - Professional guarantees (work quality)

Data Requirements: - What will be delivered (scope, specifications) - When it will be delivered (timelines, milestones) - How much it will cost (pricing, payment terms) - Who is responsible (parties, roles) - What happens if commitments aren't met (remedies)

Validation Needs: - High (unfulfilled commitments damage relationships and finances) - Feasibility checking (can we actually do this?) - Resource validation (do we have capacity?) - Cost verification (pricing accurate?) - Risk assessment (what could go wrong?) - Approval workflows (authority to commit?)

Update Frequency: - Proposals: One-time documents - Plans: Updated as project progresses - Agreements: Periodic renewal - Versioning critical when changes occur

Design Considerations: - Clear statement of what's promised - Explicit timelines and milestones - Costs and payment terms prominent - Conditions and contingencies stated - Measurement criteria for success - Exit or termination provisions

Examples Across Domains: - Education: Course syllabi (instructor commitments), partnership MOUs, improvement plans - Business: Sales proposals, project plans, service contracts, product roadmaps - Legal: Settlement agreements (future payments), plea agreements (defendant promises) - Real Estate: Letters of intent, development proposals, property management agreements - Healthcare: Treatment plans, care coordination agreements, quality improvement plans

3.2.5 Expressive Documents: Conveying Attitudes

Purpose: To express feelings, opinions, or subjective assessments.

Characteristics: - Subjective rather than objective - Often persuasive or evaluative - May support other functions (express AND recommend) - Credibility depends on source authority/expertise - Often shorter and less formal than other types

Subtypes:

Testimonials and References (Positive assessments) - Customer testimonials (product/service praise) - Letters of recommendation (candidate endorsements) - Professional references (colleague assessments) - Success stories (implementation experiences) - Case studies (detailed positive examples)

Reviews and Evaluations (Critical assessments) - Product reviews (feature/quality analysis) - Performance reviews (employee assessments) - Peer reviews (manuscript/proposal evaluation) - Service reviews (experience ratings) - Course evaluations (teaching assessments)

Feedback and Comments (Responses and reactions) - Comment letters (regulatory responses) - Survey responses (opinion collection) - User feedback (improvement suggestions) - Complaint letters (dissatisfaction expression) - Thank you notes (appreciation expression)

Opinions and Perspectives (Viewpoint statements) - Opinion pieces (editorial content) - Position papers (organizational stance) - White papers (perspective with evidence) - Commentary (interpretation of events) - Expert opinions (professional judgments)

Data Requirements: - Subject being evaluated - Evaluator identity and credentials - Criteria or dimensions of assessment - Specific observations or experiences - Recommendations or conclusions

Validation Needs: - Low to moderate (opinions are subjective by nature) - Authenticity (is the reviewer who they claim to be?) - Relevance (does the reviewer have a basis for the opinion?) - Appropriateness (any conflicts of interest?) - Respectfulness (no abuse or inappropriate content)

Update Frequency: - Generally point-in-time (opinions reflect moment) - May be superseded by new opinions - Historical opinions retained (show evolution)

Design Considerations: - Author/source prominently displayed - Date of opinion clear - Context provided (what prompted this?) - Balance (acknowledge limitations) - Supporting evidence or examples - Separate facts from opinions

Examples Across Domains: - Education: Teacher recommendations, course evaluations, peer assessments - Business: Performance reviews, customer testimonials, vendor assessments - Legal: Expert opinions, character references, witness statements - Real Estate: Property appraisals, neighborhood assessments, agent reviews - Healthcare: Treatment recommendations, patient satisfaction surveys, quality reviews

3.2.6 Hybrid and Composite Functions

Many real-world documents combine multiple functions:

Example: Grant Proposal
  - Commissive: "We will deliver these outputs on this timeline"
  - Declarative: "Our organization has these qualifications and this track record"
  - Directive: "Here's our project plan with tasks and milestones"
  - Expressive: "We believe this approach will be effective"

Example: Annual Report
  - Declarative: "Here are our financial results and operational metrics"
  - Expressive: "We're proud of these achievements"
  - Commissive: "Here are our goals for next year"

Example: Invoice with Terms
  - Performative: "You owe this amount"
  - Declarative: "Here's what we delivered"
  - Directive: "Payment due by this date via these methods"

Design Implication: Templates must accommodate primary and secondary functions, with appropriate emphasis on each.
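One way a template system might act on this is to derive a hybrid document's validation level from all of its functions. The sketch below encodes the validation needs stated in sections 3.2.1 through 3.2.5 on a three-step scale. The rule "take the strictest level among the functions present" and the rounding of ranges to their upper bound are assumptions made for illustration, not something the taxonomy prescribes.

```python
# Ordinal scale, weakest to strictest.
LEVELS = ["low", "moderate", "high"]

# Validation needs per function, taken from sections 3.2.1-3.2.5.
# Ranges are rounded up to their upper bound (an assumption).
VALIDATION = {
    "declarative": "moderate",
    "performative": "high",
    "directive": "high",       # stated as "moderate to high"
    "commissive": "high",
    "expressive": "moderate",  # stated as "low to moderate"
}


def required_validation(functions):
    """Return the strictest validation level among a document's functions."""
    return max((VALIDATION[f] for f in functions), key=LEVELS.index)


annual_report = ["declarative", "expressive", "commissive"]
testimonial_page = ["expressive", "declarative"]

print(required_validation(annual_report))     # strictest of the three functions
print(required_validation(testimonial_page))
```

Under this rule, the commissive section of an annual report pulls the whole document up to the highest level, even though most of its content is declarative.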

3.3 Information Architecture Patterns

Now we shift from "what documents do" to "how documents are structured." These are the fundamental organizational patterns that recur across document types and domains.

These patterns are technology-agnostic—they apply whether you're creating Word documents, PDFs, web pages, or printed materials. They describe logical structure, not physical implementation.

Pattern 1: Atomic Pattern

Also known as: Single-instance, One-record-one-document

Context: You need to create individual documents for discrete entities or events, where each document is self-contained and independent.

Structure:

Document = Single Record + Template

One student → One certificate
One property → One listing flyer
One invoice → One billing document
One patient → One chart summary

Visual Characteristics: - Typically single page (or few pages) - All information about one entity - No repeated sections or loops - Often formatted for printing/framing/filing - Self-contained (doesn't require other documents to make sense)

Data Requirements: - One master record (the primary entity) - Attributes of that record - Optionally: lookup data (e.g., issuer information, organization logo) - Optionally: calculated fields (e.g., age from birthdate)

When to Use: - Certificates and credentials - ID cards and badges - Individual cover letters - Property flyers - Patient discharge summaries - Purchase receipts - Award certificates - License documents

Variations:

Simple Atomic: Just field substitution - Certificate with name, date, achievement - Badge with photo, name, role - Minimal formatting, focus on key data

Rich Atomic: Complex layout and content - Property flyer with multiple photos, detailed descriptions, maps - Resume with multiple sections, formatting, graphics - More sophisticated design

Conditional Atomic: Content varies by data - Invoice (may or may not have discount fields) - Certificate (may include honors notation if GPA > 3.5) - Personalized letters (content blocks appear based on recipient attributes)
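The Conditional Atomic variation can be sketched with Python's standard string.Template. The template wording, the field names (name, program, date, gpa), and the helper render_certificate are hypothetical; only the honors rule (GPA above 3.5) comes from the text above.

```python
from string import Template

# One template, one record, one output document.
CERTIFICATE = Template(
    "This certifies that $name completed $program on $date.$honors"
)


def render_certificate(record: dict) -> str:
    """Fill the template; the honors notation appears only if GPA > 3.5."""
    honors = " Awarded with honors." if record.get("gpa", 0.0) > 3.5 else ""
    return CERTIFICATE.substitute(
        name=record["name"],
        program=record["program"],
        date=record["date"],
        honors=honors,
    )


with_honors = render_certificate(
    {"name": "Ada Park", "program": "CS 101", "date": "2024-05-01", "gpa": 3.8}
)
without_honors = render_certificate(
    {"name": "Lee Ortiz", "program": "CS 101", "date": "2024-05-01", "gpa": 3.5}
)
print(with_honors)
print(without_honors)
```

The condition is resolved in code and passed to the template as an ordinary field, which keeps the template itself free of logic.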

Examples Across Domains:

Education: - Completion certificate for individual student - Individual progress report - Transcript for one student - Award certificate

Legal: - Engagement letter for one client - Retainer agreement - Certificate of service - Individual contract

Real Estate: - Property listing flyer - Comparative market analysis for one property - Offer presentation

Retail: - Product specification sheet - Individual item label - Single product flyer

Healthcare: - Patient discharge summary - Lab result report for one patient - Prescription document

Implementation Considerations: - Simplest pattern to implement - No relationship resolution needed - Can batch generate (100 certificates from 100 student records) - Each output document completely independent - Easy to parallelize generation
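Because each atomic document depends only on its own record, batch generation reduces to mapping a render function over the records, and that map can run in parallel. The following is a minimal sketch using the standard library's thread pool; the template text and the names render and render_batch are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from string import Template

TEMPLATE = Template(
    "Certificate of Completion\nAwarded to: $name\nCourse: $course\n"
)


def render(record: dict) -> str:
    """Render one document from one record; no other data is consulted."""
    return TEMPLATE.substitute(record)


def render_batch(records, workers: int = 4):
    """Render all records concurrently; pool.map preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render, records))


students = [{"name": f"Student {i}", "course": "CS 101"} for i in range(1, 101)]
documents = render_batch(students)
print(len(documents))
```

For rendering that is CPU-bound rather than I/O-bound (for example, PDF layout), a process pool would likely be the better fit; the structure of the code stays the same.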

Pattern 2: Directory Pattern

Also known as: List, Roster, Catalog, Registry

Context: You need to present multiple similar entities in a structured way for browsing, reference, or comparison.

Structure:

Document = Collection of Similar Records + Consistent Format

All students → Student roster
All products → Product catalog
All employees → Staff directory
All properties → Listing book

Visual Characteristics: - Repeating blocks or rows - Consistent format for each entity - May include photos/images - Often includes categorization or grouping - May span multiple pages - Often includes index or search aids

Layout Variants:

Grid Layout: - Entities arranged in rows and columns - Like trading cards or photo grid - Good for image-heavy content (student photos, product images) - Fixed space per entity - Visual scanning friendly

List Layout: - Entities in single column - More detailed information per entity - Variable space per entity - Good for text-heavy content - Scrolling-friendly for digital

Table Layout: - Rows for entities, columns for attributes - Dense information presentation - Easy comparison across entities - Good for data-heavy content (specifications, metrics) - Spreadsheet-like

Card Layout: - Individual "cards" for each entity - Can be physical cards (student ID, baseball card) - Or visual cards on page/screen - Flexible space per entity - Can include front/back

Data Requirements: - One table with multiple records - Consistent attributes across records - Optional grouping/category field - Optional sort key (alphabetical, by date, by category)

When to Use: - Student/employee rosters - Product catalogs - Contact lists - Property listings - Course catalogs - Event attendee lists - Membership directories - Inventory lists

Grouping and Organization:

Flat Directory: All entities at one level - Alphabetical by name - Chronological by date - Numerical by ID

Grouped Directory: Entities organized into categories - Students by grade level - Products by category - Employees by department - Properties by neighborhood

Hierarchical Directory: Nested categories (see Hierarchical Pattern)
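A grouped directory can be produced from a flat list of records by sorting on the grouping field and then on the sort key. The sketch below uses itertools.groupby from the standard library; the record fields (name, department) and the function name grouped_directory are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def grouped_directory(records, group_key, sort_key):
    """Return {group: [sorted entries]} with groups in sorted order."""
    # groupby only merges adjacent items, so sort by group first.
    ordered = sorted(records, key=itemgetter(group_key, sort_key))
    return {
        group: [row[sort_key] for row in rows]
        for group, rows in groupby(ordered, key=itemgetter(group_key))
    }


employees = [
    {"name": "Rivera", "department": "Sales"},
    {"name": "Chen", "department": "Engineering"},
    {"name": "Adams", "department": "Sales"},
    {"name": "Baker", "department": "Engineering"},
]

directory = grouped_directory(employees, "department", "name")
for department, names in directory.items():
    print(department, names)
```

A flat directory is the degenerate case: skip the grouping and sort on a single key.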

Filtering and Searching: - May include search features (digital) - May include index (alphabetical, by attribute) - May include table of contents - May highlight featured/priority items

Examples Across Domains:

Education: - Class roster with student photos - Instructor directory with bios - Course catalog (if simple list; hierarchical if complex) - Student contact list - Class schedule (all classes)

Legal: - Document index for case - Witness list - Exhibit list - Court docket - Attorney directory

Real Estate: - Property listings by agent - Open house schedule - Agent team directory - Available properties

Retail: - Product catalog - Price list - Inventory report - Vendor directory

Healthcare: - Provider directory - Patient census - Equipment inventory - Medication list

Implementation Considerations: - Single data source (one table) - Consistent formatting critical - Pagination strategy for large datasets - Sorting/filtering options - Performance: output may run to thousands of rows - Consider breaking into sections for very large directories
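The simplest pagination strategy is a fixed number of entries per page, which suits grid and table layouts where each entity takes the same space. The sketch below shows that approach only; the function name paginate is an assumption, and list layouts with variable-height entries would need a measurement step that is not shown here.

```python
def paginate(records, per_page):
    """Split records into consecutive pages of at most per_page entries."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return [records[i:i + per_page] for i in range(0, len(records), per_page)]


roster = [f"Student {n}" for n in range(1, 8)]
pages = paginate(roster, per_page=3)

for number, page in enumerate(pages, start=1):
    print(f"Page {number} of {len(pages)}: {page}")
```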

Pattern 3: Master-Detail Pattern

Also known as: Header-Lines, Parent-Child, One-to-Many

Context: You need to show a parent entity with its related child records, where the relationship is one-to-many.

Structure:

Document = Master Record + Collection of Detail Records

Invoice = Invoice Header + Line Items
Report Card = Student Header + Grades by Subject
Order = Order Header + Products Ordered
Case = Case Header + Documents Filed

Visual Characteristics: - Header section with master information - Detail section with related records (often in table) - Usually includes aggregations (totals, counts, averages) - May repeat entire structure (multiple invoices, each with line items)

Components:

Master/Header Section: - Primary entity information - Aggregated/calculated fields (subtotal, total, GPA, etc.) - Context information (dates, parties, references) - Typically appears once per document

Detail Section: - Related records in structured format - Often tabular (rows for detail records) - May include calculations per row - May include subtotals by category

Footer/Summary Section: - Grand totals across all details - Final calculations - Signatures or authorizations - Notes or terms

Data Requirements: - Master table with one record per document - Detail table with multiple records per master - Foreign key relationship (detail.master_id = master.id) - Often calculations (sum, average, count over details)

When to Use: - Financial documents (invoices, orders, receipts) - Academic documents (report cards, transcripts) - Any "header + line items" scenario - Shopping carts and orders - Project summaries with task lists - Case files with document lists

Relationship Patterns:

Simple One-to-Many: - One invoice → many line items - One student → many grades - Straightforward foreign key

Categorized Details: - Details grouped by category - Student grades grouped by subject/class - Order items grouped by product category - Requires secondary grouping field

Nested Details (becomes Hierarchical Pattern): - Master → Detail → Sub-detail - Order → Line Items → Components - Project → Tasks → Subtasks

Calculations and Aggregations:

Row-level calculations: - Line total = quantity × unit price - Weighted score = points earned × weight - Charge = hours × rate

Subtotal calculations: - Total by category - Average by group - Count by type

Grand total calculations: - Sum of all line items - Overall GPA across all classes - Total hours across all tasks
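The three levels of calculation can be illustrated on a small invoice. This is a sketch under assumed field names (invoice_id, category, quantity, unit_price) and an assumed helper name, summarize. Decimal is used instead of float so that currency sums stay exact.

```python
from collections import defaultdict
from decimal import Decimal

invoice = {"id": 1001, "customer": "Acme Corp"}
line_items = [
    {"invoice_id": 1001, "category": "Hardware", "quantity": 2, "unit_price": Decimal("19.99")},
    {"invoice_id": 1001, "category": "Hardware", "quantity": 1, "unit_price": Decimal("5.00")},
    {"invoice_id": 1001, "category": "Services", "quantity": 3, "unit_price": Decimal("40.00")},
]


def summarize(master, details):
    """Compute row totals, per-category subtotals, and the grand total."""
    # Resolve the one-to-many relationship: details whose key matches the master.
    rows = [d for d in details if d["invoice_id"] == master["id"]]

    # Row level: line total = quantity x unit price.
    line_totals = [d["quantity"] * d["unit_price"] for d in rows]

    # Subtotal level: sum of line totals by category.
    subtotals = defaultdict(Decimal)
    for d, total in zip(rows, line_totals):
        subtotals[d["category"]] += total

    # Grand total: sum of all line totals.
    return {
        "line_totals": line_totals,
        "subtotals": dict(subtotals),
        "grand_total": sum(line_totals, Decimal("0")),
    }


summary = summarize(invoice, line_items)
print(summary["subtotals"])
print(summary["grand_total"])
```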

Examples Across Domains:

Education: - Report card: Student + grades in multiple subjects - Transcript: Student + all courses taken over time - Class summary: Class + list of students with grades - Assignment sheet: Class + list of assignments due

Legal: - Invoice: Matter + time entries by attorney - Case document list: Case + documents filed - Discovery response: Case + documents produced - Exhibit list: Trial + exhibits presented

Real Estate: - Comparative market analysis: Subject property + comparable sales - Agent activity report: Agent + properties shown/sold - Property features: Property + amenities/features

Retail: - Invoice: Order + line items - Purchase order: PO + items ordered - Packing slip: Shipment + items included - Catalog page: Brand + products in that brand

Healthcare: - Patient encounter: Visit + procedures/diagnoses - Lab report: Patient + test results - Treatment plan: Patient + interventions ordered - Medication list: Patient + current medications

Implementation Considerations: - Must resolve relationships before generation - Calculation logic can be complex - Need to handle variable number of detail records - Pagination: what if details don't fit on one page? - Subtotals and groups add complexity - Validation: ensure detail totals match master totals

Validation Challenges: - Referential integrity: every detail must have valid master - Calculation accuracy: sum(line items) should equal invoice total - Missing details: invoice with zero line items is invalid - Orphaned details: detail records with no master
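The checks listed above can be run before generation begins. The sketch below covers orphaned details, masters with no details, and total mismatches; the field names (id, master_id, amount, total) and the function name validate are hypothetical.

```python
from decimal import Decimal


def validate(masters, details):
    """Return a list of human-readable problems; empty means the data is consistent."""
    problems = []
    master_ids = {m["id"] for m in masters}

    # Referential integrity: every detail must point at an existing master.
    for d in details:
        if d["master_id"] not in master_ids:
            problems.append(f"orphaned detail {d['id']}: no master {d['master_id']}")

    for m in masters:
        rows = [d for d in details if d["master_id"] == m["id"]]
        # Missing details: a master with zero detail records is invalid.
        if not rows:
            problems.append(f"master {m['id']}: no detail records")
            continue
        # Calculation accuracy: stored total must equal the sum of details.
        computed = sum((d["amount"] for d in rows), Decimal("0"))
        if computed != m["total"]:
            problems.append(
                f"master {m['id']}: total {m['total']} != sum of details {computed}"
            )
    return problems


masters = [
    {"id": 1, "total": Decimal("30.00")},
    {"id": 2, "total": Decimal("10.00")},
    {"id": 3, "total": Decimal("0")},
]
details = [
    {"id": "a", "master_id": 1, "amount": Decimal("10.00")},
    {"id": "b", "master_id": 1, "amount": Decimal("20.00")},
    {"id": "c", "master_id": 2, "amount": Decimal("9.00")},
    {"id": "d", "master_id": 9, "amount": Decimal("1.00")},
]

problems = validate(masters, details)
for problem in problems:
    print(problem)
```

In a relational database, the first check is usually enforced by a foreign key constraint; the other two still need to be checked in application code or a query.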

Pattern 4: Hierarchical Pattern

Also known as: Tree, Nested, Outline, Catalog

Context: You need to organize information into nested sections and subsections, where relationships form a tree structure.

Structure:

Document = Nested Sections + Subsections

Course Catalog:
├── College of Engineering
│   ├── Computer Science Department
│   │   ├── CS 101: Introduction to Programming
│   │   ├── CS 102: Data Structures
│   │   └── CS 201: Algorithms
│   └── Electrical Engineering Department
│       ├── EE 101: Circuits
│       └── EE 201: Electronics
└── College of Arts and Sciences
    ├── English Department
    └── History Department

Visual Characteristics: - Clear visual hierarchy (headings, indentation, numbering) - Table of contents often included - Nested numbering (1.0, 1.1, 1.1.1, etc.) - Consistent formatting by level - Navigation aids (page numbers, headers) - May include indexes or cross-references

Hierarchy Levels:

Shallow Hierarchy (2-3 levels): - Section → Item - Category → Product - Department → Employee - Manageable, easy to navigate

Medium Hierarchy (4-5 levels): - Division → Department → Team → Role → Employee - Category → Subcategory → Product Line → Product → SKU - More complex but still navigable

Deep Hierarchy (6+ levels): - Complex organizational structures - Detailed taxonomies - Technical specifications - Can become hard to navigate without good design

Data Requirements: - Hierarchical table(s) with parent-child relationships - Could be self-referential (each record points to parent) - Or multiple tables (Department table, Employee table with dept_id) - Level indicator or depth calculation - Sort order within each level
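To make the self-referential option concrete, the following sketch turns parent-pointer rows into a numbered outline, deriving depth and nested numbering from the parent links and the per-level sort order. The row layout and the function name are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical self-referential rows: (id, parent_id, name, sort_order)
rows = [
    (1, None, "College of Engineering", 1),
    (2, 1, "Computer Science Department", 1),
    (3, 1, "Electrical Engineering Department", 2),
    (4, 2, "CS 101: Introduction to Programming", 1),
    (5, None, "College of Arts and Sciences", 2),
]

def numbered_outline(rows):
    """Yield (depth, number, name) tuples in document order."""
    children = defaultdict(list)
    for node_id, parent_id, name, sort_order in rows:
        children[parent_id].append((sort_order, node_id, name))

    def walk(parent_id, prefix, depth):
        siblings = sorted(children[parent_id])  # sort order within each level
        for position, (_, node_id, name) in enumerate(siblings, start=1):
            number = f"{prefix}{position}"
            yield depth, number, name
            yield from walk(node_id, number + ".", depth + 1)

    yield from walk(None, "", 1)

outline = list(numbered_outline(rows))
for depth, number, name in outline:
    print("  " * (depth - 1) + f"{number} {name}")
```

Depth is computed during traversal rather than stored, which avoids keeping a level column in sync when nodes move; a stored level indicator trades that maintenance cost for simpler queries.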

When to Use: - Catalogs (course catalogs, product catalogs) - Organizational charts - Policy manuals with sections/subsections - Technical specifications - Bill of materials (parts and assemblies) - Content management (websites, documentation)

Design Patterns:

Indented Outline:

1. Parent Item
   1.1 Child Item
   1.2 Child Item
       1.2.1 Grandchild Item
2. Parent Item

Nested Sections with Headings:

Section 1: Parent Topic
  1.1 Subtopic
      Content...
  1.2 Subtopic
      Content...

Tree Diagram: - Visual representation - Boxes and connecting lines - Good for organizational charts - Hard to scale to large hierarchies

Navigation Features: - Table of contents (with page numbers) - Breadcrumbs (showing path: Home > Category > Subcategory) - Index (alphabetical listing with page refs) - Running headers showing current section - Bookmarks or links (in digital versions)

Examples Across Domains:

Education: - Course catalog organized by college/department/course - Curriculum guide by grade/subject/topic - School handbook by section/policy/procedure - Academic program structure

Legal: - Contract with articles and sections - Policy manual with chapters and sections - Legal code with titles, chapters, sections - Court rules and procedures

Real Estate: - Property features hierarchically organized - Neighborhood guide by area/amenity - Building specifications by system/component - Development plans by phase/building/unit

Retail: - Product catalog by category/subcategory/product - Parts catalog by system/assembly/part - Vendor directory by category/company - Store directory by floor/department

Healthcare: - Medical records organized by encounter/section/note - Treatment protocols by condition/stage/intervention - Facility directory by building/floor/department - Formulary by therapeutic class/drug class/medication

Implementation Considerations: - Recursive data structures or multiple table joins - Depth limits (how many levels to support?) - Consistent formatting across levels - Table of contents generation - Page numbering strategies (restart in each section?) - Cross-references between sections - What if hierarchy changes frequently?

Challenges: - Maintaining hierarchy can be complex (moving sections) - Deep hierarchies become hard to navigate - Inconsistent categorization across hierarchies - Version control: how to track changes in structure? - Print vs. digital: different navigation affordances

Pattern 5: Matrix Pattern

Also known as: Table, Grid, Cross-tab, Comparison Matrix

Context: You need to show relationships between two dimensions, compare multiple entities across multiple attributes, or display data where both rows and columns have meaning.

Structure:

Document = Two-Dimensional Grid

              | Monday | Tuesday | Wednesday | Thursday | Friday
-----------------------------------------------------------------
Room A        | Math   | Science | Math      | History  | Math
Room B        | English| Art     | English   | Music    | English
Room C        | PE     | PE      | Library   | PE       | Study

Students (rows) × Assignments (columns) = Grades
Products (rows) × Features (columns) = Specifications
Locations (rows) × Time Slots (columns) = Schedule

Visual Characteristics: - Table structure with labeled rows and columns - Both axes have meaning (not just decoration) - Often includes totals or summaries in margins - May use color coding or icons for data - Cells may contain text, numbers, checkmarks, or status indicators

Matrix Types:

Comparison Matrix: - Entities in rows - Attributes in columns - Values in cells - Purpose: Compare options - Example: Product comparison (products × features)

Assignment Matrix: - Resources in rows - Time periods in columns - Assignments in cells - Purpose: Show who does what when - Example: Staff schedule (employees × shifts)

Status Matrix: - Items in rows - Stages or criteria in columns - Status indicators in cells - Purpose: Track progress - Example: Project tasks × status (planned/started/completed)

Relationship Matrix: - Entities in both rows and columns - Relationships in cells - Purpose: Show connections - Example: Prerequisites (courses × prerequisite courses)

Data Requirements: - Data organized by two dimensions - Can come from flat table with two grouping columns - Or from cross-reference table (many-to-many relationship) - May include calculated cells (totals, percentages)
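The step from a flat table with two grouping columns to a grid is a pivot. Here is a minimal sketch, assuming records arrive as (row key, column key, value) tuples; the `pivot` function and the sample grades are hypothetical.

```python
def pivot(records, missing=""):
    """Turn (row_key, col_key, value) records into a header and grid rows."""
    row_keys = sorted({row for row, _, _ in records})
    col_keys = sorted({col for _, col, _ in records})
    cells = {(row, col): value for row, col, value in records}

    header = [""] + col_keys
    body = [
        [row] + [cells.get((row, col), missing) for col in col_keys]
        for row in row_keys
    ]
    return header, body

# Hypothetical flat grade records: (student, assignment, grade)
grades = [
    ("Ana", "Essay 1", 88),
    ("Ana", "Quiz 1", 92),
    ("Ben", "Essay 1", 75),
]
header, body = pivot(grades, missing="N/A")
print(header)
for row in body:
    print(row)
```

The `missing` argument makes the empty-cell policy explicit, since a blank, "N/A", and a default value each tell the reader something different.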

When to Use: - Schedules (time × location, or students × classes) - Comparison tables (products × features) - Grade sheets (students × assignments) - Skill matrices (employees × skills) - Project tracking (tasks × status) - Seating charts (rows × columns)

Layout Considerations:

Column Width: - Uniform width (cleaner) vs. variable (accommodates content) - Narrow columns for checkmarks/icons - Wide columns for text descriptions

Row Height: - Uniform height (easier to scan) - Variable height (accommodates varying content)

Headers: - Column headers at top - Row headers at left - May need repeated headers if table spans pages - Consider frozen/fixed headers for scrolling (digital)

Aggregations: - Row totals (rightmost column) - Column totals (bottom row) - Grand total (bottom-right cell)
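Margin totals can be added once the grid exists. The sketch below assumes every cell is numeric and uses a hypothetical `add_margins` helper with invented inventory data.

```python
def add_margins(row_labels, col_labels, values):
    """Return table rows with a Total column and a Total row appended.

    values[i][j] is the numeric cell for row i, column j.
    """
    table = [[""] + list(col_labels) + ["Total"]]
    for label, row in zip(row_labels, values):
        table.append([label] + list(row) + [sum(row)])       # row total
    column_totals = [sum(column) for column in zip(*values)]  # column totals
    table.append(["Total"] + column_totals + [sum(column_totals)])  # grand total
    return table

# Hypothetical inventory: products (rows) x locations (columns) = quantities
inventory = add_margins(
    ["Widget", "Gadget"],
    ["Store A", "Store B"],
    [[5, 3], [2, 7]],
)
for row in inventory:
    print(row)
```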

Examples Across Domains:

Education: - Grade sheet: Students (rows) × Assignments (columns) = Grades - Schedule: Time slots (rows) × Classrooms (columns) = Classes - Curriculum map: Grade levels (rows) × Standards (columns) = Coverage - Skill assessment: Students (rows) × Skills (columns) = Proficiency

Legal: - Document responsibility matrix: Documents (rows) × Parties (columns) = Responsibility - Timeline: Events (rows) × Dates (columns) = Status - Privilege log: Documents (rows) × Attributes (columns) = Values

Real Estate: - Property comparison: Properties (rows) × Features (columns) = Values - Showing schedule: Properties (rows) × Time slots (columns) = Showings - Market analysis: Neighborhoods (rows) × Metrics (columns) = Values

Retail: - Product comparison: Products (rows) × Features (columns) = Specifications - Inventory: Products (rows) × Locations (columns) = Quantities - Pricing: Products (rows) × Customer types (columns) = Prices

Healthcare: - Medication administration: Patients (rows) × Time slots (columns) = Medications - Treatment protocol: Conditions (rows) × Treatments (columns) = Indications - Staff schedule: Providers (rows) × Shifts (columns) = Assignments

Implementation Considerations: - Can generate very large tables (hundreds of rows × dozens of columns) - Pagination: how to break across pages? - Column wrapping for wide tables? - Sorting and filtering options? - Conditional formatting (color cells based on values) - How to handle missing data (empty cells, N/A, default values)

Challenges: - Large matrices don't fit on one page - Variable content length in cells - How to display complex cell values (not just text/numbers) - Digital vs. print: scrolling vs. page breaks - Accessibility: screen readers struggle with complex tables

Pattern 6: Narrative Flow Pattern

Also known as: Flowing, Magazine-style, Newsletter, Multi-column

Context: You need to present multiple pieces of content in a continuous reading experience, where layout is flexible and content flows naturally.

Structure:

Document = Stream of Content Blocks

Newsletter:
├── Header (masthead, date, issue)
├── Lead article (spans 2 columns)
├── Sidebar story (1 column)
├── Feature article (2 columns with images)
├── Short items (1 column each)
└── Footer (contact info)

Visual Characteristics: - Multi-column layouts common - Text flows from column to column, page to page - Mixed content types (articles, images, sidebars, callouts) - Visually rich, magazine-like design - Breaking points determined by content and layout - Flexible rather than rigid structure

Content Organization:

Linear Flow: - Content progresses sequentially - Logical reading order - "Continued on page X" for long articles

Sectioned Flow: - Distinct sections (departments, topics) - Each section has multiple items - Consistent section formatting

Grid-based Flow: - Underlying grid determines layout - Content fills grid cells - Visual variety within structure

Data Requirements: - Content items (articles, stories, announcements) - Metadata (title, author, category, length, priority) - Assets (images, graphics) - Layout hints (featured vs. regular, column span)
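These requirements translate naturally into a small record type. The sketch below is one possible shape, with hypothetical field names and an assumed convention that a lower priority number means a more important item; it only decides reading order and leaves actual layout to a layout engine or designer.

```python
from dataclasses import dataclass

@dataclass
class ContentBlock:
    title: str
    category: str
    word_count: int
    priority: int           # assumed convention: lower number = more important
    column_span: int = 1    # layout hint: how many columns the block asks for
    featured: bool = False  # layout hint: featured vs. regular

def reading_order(blocks):
    """Featured blocks first, then by priority, then longer pieces first."""
    return sorted(blocks, key=lambda b: (not b.featured, b.priority, -b.word_count))

# Hypothetical newsletter content
blocks = [
    ContentBlock("Bake sale", "Events", 120, priority=2),
    ContentBlock("Principal's letter", "Lead", 600, priority=1, column_span=2, featured=True),
    ContentBlock("Lunch menu", "Notices", 80, priority=2),
]
ordered = reading_order(blocks)
for block in ordered:
    print(block.title, block.column_span)
```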

When to Use: - Newsletters - Magazines - Marketing brochures - Event programs - Annual reports (narrative sections) - Sales materials

Design Patterns:

Two-Column Layout: - Classic magazine style - Good readability for text-heavy content - Flexible for mixed content

Three-Column Layout: - More flexibility for varied content - Can combine columns (1-col sidebar + 2-col article) - Good for dense information

Modular Grid: - Content blocks of varying sizes - More design flexibility - Requires careful layout planning

Dynamic Layout: - System determines optimal layout based on content - Challenging to implement well - Best with flexible design system

Examples Across Domains:

Education: - School newsletter with multiple articles - Co-op updates with announcements - Parent communication with various items - Event program with schedule and descriptions

Business: - Company newsletter - Annual report (narrative sections) - Marketing brochure - Product launch materials

Real Estate: - Property showcase with multiple listings - Neighborhood guide with various features - Market report with commentary and data

Retail: - Promotional flyer with multiple offers - Seasonal catalog with featured items - Product showcase with descriptions

Non-Profit: - Donor newsletter - Impact report - Event program - Annual appeal

Implementation Considerations: - Much more complex than other patterns - Requires sophisticated layout engine - Content may need to be "flowed" across pages - Images and text must integrate smoothly - Challenging to automate (many design decisions) - May be better suited to semi-automated approach (system generates content blocks, designer arranges)

When NOT to Use: - If fully automated generation is required (it is difficult to do well for this pattern) - If content structure is highly consistent (other patterns may work better) - If layout expertise isn't available - If print quality matters (automated layout rarely matches professional design)

3.4 Data Relationship Models

Document patterns connect directly to data relationship models. Understanding these relationships is crucial for both database design and template architecture.

One-to-One (1:1) Relationships

Definition: Each record in Table A relates to exactly one record in Table B, and vice versa.

Example: - Each Employee has one EmergencyContact - Each Student has one BirthCertificate - Each Property has one LegalDescription

Document Pattern: Typically Atomic - The related record is simply additional attributes - Merged into single document seamlessly

Implementation: - Can be same table (just more fields) - Or separate tables that share a primary key, with the second table's key also serving as a foreign key to the first

Rare in Practice: Most true 1:1 relationships could be combined into one table. Separate tables make sense for: - Optional data (not all employees have an emergency contact) - Different access controls (birth certificates are sensitive) - Different update patterns (legal descriptions rarely change)

One-to-Many (1:N) Relationships

Definition: Each record in Table A relates to zero or more records in Table B, but each Table B record relates to exactly one Table A record.

Example: - One Student has many Grades - One Invoice has many LineItems - One Class has many Assignments - One Department has many Employees

Document Pattern: Master-Detail - Master record = Table A (the "one") - Detail records = Table B (the "many") - Detail section typically in table or list format

Implementation: - Foreign key in Table B points to Table A - Query: SELECT * FROM TableB WHERE table_a_id = ? - Natural for relational databases
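As a concrete illustration of the foreign-key query, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are hypothetical stand-ins for Table A and Table B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE invoices (              -- Table A: the "one" side
        invoice_id INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL
    );
    CREATE TABLE line_items (            -- Table B: the "many" side
        line_item_id INTEGER PRIMARY KEY,
        invoice_id   INTEGER NOT NULL REFERENCES invoices(invoice_id),
        description  TEXT NOT NULL,
        amount_cents INTEGER NOT NULL
    );
    INSERT INTO invoices VALUES (1, 'Acme School Supply');
    INSERT INTO line_items VALUES (1, 1, 'Notebooks', 1200), (2, 1, 'Pencils', 350);
""")

# Master record, then its detail records via the foreign key
master = conn.execute(
    "SELECT customer FROM invoices WHERE invoice_id = ?", (1,)
).fetchone()
details = conn.execute(
    "SELECT description, amount_cents FROM line_items "
    "WHERE invoice_id = ? ORDER BY line_item_id",
    (1,),
).fetchall()

print(master[0])
for description, amount_cents in details:
    print(f"  {description}: {amount_cents / 100:.2f}")
```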

Most Common Pattern: The workhorse of data modeling. Most business documents reflect 1:N relationships.

Many-to-Many (M:N) Relationships

Definition: Each record in Table A can relate to multiple records in Table B, and each Table B record can relate to multiple Table A records.

Example: - Many Students are in Many Classes - Many Products are in Many Orders - Many Instructors teach Many Courses - Many Attorneys work on Many Cases

Implementation: Requires junction/bridge table

Students Table
Enrollments Table (junction):
  - student_id (FK to Students)
  - class_id (FK to Classes)
  - enrollment_date
  - grade
Classes Table

Document Patterns: - View from either side as Master-Detail: Student perspective (Student + Classes enrolled) or Class perspective (Class + Students enrolled) - Or use the Matrix pattern: Students × Classes = Enrollment status

Complexity: - Requires two joins to traverse relationship - Junction table often has its own attributes (enrollment date, role, etc.) - More complex to query and maintain
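The two-join traversal can be shown with the Students, Enrollments, and Classes layout described above. This sketch uses Python's sqlite3 module; the sample rows and the two helper function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes  (class_id   INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollments (           -- junction table with its own attributes
        student_id      INTEGER REFERENCES students(student_id),
        class_id        INTEGER REFERENCES classes(class_id),
        enrollment_date TEXT,
        grade           TEXT,
        PRIMARY KEY (student_id, class_id)
    );
    INSERT INTO students VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO classes  VALUES (10, 'Algebra'), (20, 'Biology');
    INSERT INTO enrollments VALUES
        (1, 10, '2024-09-01', 'A'),
        (1, 20, '2024-09-01', 'B'),
        (2, 10, '2024-09-02', NULL);
""")

def classes_for_student(student_id):
    """Student perspective: one student plus the classes they are enrolled in."""
    return conn.execute(
        "SELECT c.title, e.grade FROM students s "
        "JOIN enrollments e ON e.student_id = s.student_id "
        "JOIN classes c ON c.class_id = e.class_id "
        "WHERE s.student_id = ? ORDER BY c.title",
        (student_id,),
    ).fetchall()

def roster_for_class(class_id):
    """Class perspective: one class plus the students enrolled in it."""
    return conn.execute(
        "SELECT s.name FROM classes c "
        "JOIN enrollments e ON e.class_id = c.class_id "
        "JOIN students s ON s.student_id = e.student_id "
        "WHERE c.class_id = ? ORDER BY s.name",
        (class_id,),
    ).fetchall()

print(classes_for_student(1))
print(roster_for_class(10))
```

The same junction rows feed both Master-Detail views; only the starting table and the direction of the joins change.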

Hierarchical (Tree) Relationships

Definition: Records organized in parent-child tree structure, where each record has zero or one parent, and zero or more children.

Example: - Organization: Company → Divisions → Departments → Teams - Course Catalog: College → Department → Course → Section - Bill of Materials: Assembly → Subassembly → Parts - File System: Folder → Subfolder → Files

Implementation Options:

Adjacency List (most common):

Categories Table:
- category_id (PK)
- category_name
- parent_category_id (FK to Categories, NULL for root)

Nested Sets: Encode the tree as left/right boundary numbers (fast subtree reads, but inserts and moves require renumbering)

Path Enumeration: Store full path as string (e.g., "/engineering/cs/courses")

Closure Table: Separate table storing all ancestor-descendant pairs
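For the adjacency list option, fetching a whole subtree usually means a recursive query. The sketch below uses a recursive common table expression in SQLite via Python's sqlite3 module; the sample categories are hypothetical, and other databases may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        category_id        INTEGER PRIMARY KEY,
        category_name      TEXT NOT NULL,
        parent_category_id INTEGER REFERENCES categories(category_id)  -- NULL for root
    );
    INSERT INTO categories VALUES
        (1, 'Engineering', NULL),
        (2, 'Computer Science', 1),
        (3, 'Electrical Engineering', 1),
        (4, 'CS 101', 2);
""")

# Start at one node, then repeatedly join children onto the rows found so far.
DESCENDANTS_SQL = """
    WITH RECURSIVE subtree(category_id, category_name, depth) AS (
        SELECT category_id, category_name, 0
        FROM categories
        WHERE category_id = ?
        UNION ALL
        SELECT c.category_id, c.category_name, s.depth + 1
        FROM categories c
        JOIN subtree s ON c.parent_category_id = s.category_id
    )
    SELECT category_name, depth FROM subtree ORDER BY depth, category_name
"""

subtree = conn.execute(DESCENDANTS_SQL, (1,)).fetchall()
for name, depth in subtree:
    print("  " * depth + name)
```

The computed depth column doubles as the level indicator a hierarchical template needs for indentation or heading styles.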

Document Pattern: Hierarchical - Visual hierarchy with indentation or numbering - Often includes table of contents - Navigation challenges with deep hierarchies

Challenges: - How deep can hierarchy go? (depth limits) - How to move nodes around tree? - How to query (all descendants, all ancestors)? - How to handle circular references (prevent them!)
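Preventing circular references comes down to one check when a node is moved: the new parent must not be the node itself or one of its descendants. A minimal sketch, assuming the tree is held as a hypothetical `parent_of` mapping from node to parent:

```python
def would_create_cycle(parent_of, node, new_parent):
    """True if re-parenting node under new_parent would make node its own ancestor."""
    seen = set()
    current = new_parent
    while current is not None:
        if current == node or current in seen:
            return True  # node found among ancestors, or data is already cyclic
        seen.add(current)
        current = parent_of.get(current)
    return False

def move_node(parent_of, node, new_parent):
    """Re-parent node, refusing moves that would break the tree."""
    if would_create_cycle(parent_of, node, new_parent):
        raise ValueError(f"cannot move {node!r} under {new_parent!r}: cycle")
    parent_of[node] = new_parent

# Hypothetical organization tree: child -> parent (None for the root)
parent_of = {
    "Company": None,
    "Division": "Company",
    "Department": "Division",
    "Team": "Department",
}

move_node(parent_of, "Team", "Division")       # fine: Team moves up one level
try:
    move_node(parent_of, "Division", "Team")   # rejected: Team is below Division
except ValueError as error:
    print(error)
```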

Network (Graph) Relationships

Definition: More complex than trees—nodes can have multiple parents and multiple children, forming general graphs.

Example: - Course prerequisites: CS202 requires both CS101 AND Math150 - Project dependencies: Task C depends on Task A and Task B - Citation networks: Paper cites multiple other papers - Social networks: Person knows multiple people

Implementation: - Similar to M:N (use junction/edge table) - Must prevent or handle cycles - Graph traversal algorithms needed
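For dependency-style graphs such as prerequisites, a topological sort both produces a valid ordering and detects cycles. The sketch below uses graphlib from the Python standard library (Python 3.9 or later); the CS301 course and the `study_order` function name are hypothetical additions for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Each course maps to the set of courses that must come first
prerequisites = {
    "CS202": {"CS101", "Math150"},
    "CS301": {"CS202"},
}

def study_order(prereqs):
    """Return courses in an order that respects every prerequisite."""
    return list(TopologicalSorter(prereqs).static_order())

order = study_order(prerequisites)
print(order)

# A cycle (A requires B, B requires A) is reported rather than looping forever
try:
    study_order({"A": {"B"}, "B": {"A"}})
except CycleError as error:
    print("cycle detected:", error.args[1])
```

The relative order of independent courses (here CS101 and Math150) is not significant, so a document should not treat it as meaningful.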

Document Patterns: - Diagrams (network visualizations) - Referenced lists with cross-references - Challenging to represent in linear documents

Less Common: Most business documents don't need full graph structures. When they do, often simplified or flattened for presentation.

Temporal Relationships

Definition: Relationships that change over time, requiring historical tracking.

Example: - Student enrolled in different classes each semester - Employee works in different departments over career - Product has different prices in different periods

Implementation Strategies:

Snapshot Approach: - Store state at points in time - Each record includes effective date

Transaction Log Approach: - Record all changes with timestamps - Reconstruct current/historical state from log

Slowly Changing Dimensions (from data warehousing): - Type 1: Overwrite (no history) - Type 2: Add new row with effective dates - Type 3: Add columns for current/previous

Document Considerations: - "As of" date matters: report card as of semester end - Historical documents must be reproducible - Audit trail requirements
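An "as of" lookup over Type 2 style history can be sketched as follows. The row layout is an assumption: each row carries an inclusive `effective_from` date and an exclusive `effective_to` date, with None meaning the row is still current. The product and prices are invented.

```python
from datetime import date

# Hypothetical Type 2 history: one row per validity period
price_history = [
    {"product": "Widget", "price": 10,
     "effective_from": date(2023, 1, 1), "effective_to": date(2024, 1, 1)},
    {"product": "Widget", "price": 12,
     "effective_from": date(2024, 1, 1), "effective_to": None},
]

def as_of(history, product, on):
    """Return the row that was in effect for product on the given date, or None."""
    for row in history:
        if row["product"] != product:
            continue
        started = row["effective_from"] <= on
        not_ended = row["effective_to"] is None or on < row["effective_to"]
        if started and not_ended:
            return row
    return None

print(as_of(price_history, "Widget", date(2023, 6, 30))["price"])
print(as_of(price_history, "Widget", date(2024, 1, 1))["price"])
```

Regenerating a historical document with the same "as of" date returns the same rows, which is what makes the document reproducible.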

Aggregated/Derived Data

Definition: Data calculated or summarized from other data rather than entered directly.

Example: - GPA calculated from grades - Invoice total calculated from line items - Inventory quantity from transactions (receipts minus shipments)

Options:

Calculate on demand: - Always current - Expensive if complex - May be slow

Store and update: - Fast to retrieve - Risk of staleness - Must maintain correctly

Materialized views: - Database handles refresh - Balance of performance and currency

Document Implications: - Where does calculation happen? (database, application, template) - What if calculation logic changes? (regenerated historical documents may no longer match the originals) - How to handle rounding/precision?
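The rounding question is easiest to see with currency. The sketch below is one possible policy, assuming amounts are held as Decimal and tax is rounded half-up to the cent once per invoice; the function name, the tax rate, and the line items are hypothetical, and real rounding rules depend on the jurisdiction and the business.

```python
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")

def invoice_total(line_items, tax_rate):
    """Return (subtotal, tax, total) for (quantity, unit_price) line items."""
    subtotal = sum((qty * price for qty, price in line_items), Decimal("0"))
    # Round once, at a defined point, with an explicit rounding mode
    tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return subtotal, tax, subtotal + tax

subtotal, tax, total = invoice_total(
    [(3, Decimal("19.99")), (1, Decimal("4.95"))],
    tax_rate=Decimal("0.0825"),
)
print(subtotal, tax, total)
```

Keeping the rounding rule in one named function also helps when calculation logic changes, since the version in force can be recorded alongside each generated document.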

3.5 The Document Pattern Catalog

We now have the foundations to create a comprehensive pattern catalog. This catalog formalizes each pattern following a standard structure.

For each pattern, we document:

  1. Pattern Name: Memorable identifier
  2. Aliases: Other names for same pattern
  3. Intent: What problem does this solve?
  4. Motivation: Why use this pattern?
  5. Context: When is this pattern appropriate?
  6. Structure: Visual/logical organization
  7. Data Requirements: What entities and relationships needed?
  8. Participants: Key entities and roles
  9. Collaborations: How entities interact
  10. Consequences: Benefits and limitations
  11. Implementation: Technical considerations
  12. Examples: Concrete instances across domains
  13. Variations: Common modifications
  14. Related Patterns: How it connects to others

The Six Core Patterns:

  1. Atomic Pattern: One record, one document
  2. Directory Pattern: Many similar records, one document
  3. Master-Detail Pattern: One parent with many children
  4. Hierarchical Pattern: Tree structure with nested sections
  5. Matrix Pattern: Two-dimensional grid
  6. Narrative Flow Pattern: Flexible multi-column layout

Derived/Combined Patterns:

  1. Form Pattern: Structured data entry (variation of Atomic with inputs)
  2. Dashboard Pattern: Multiple metrics with visualizations (variation of Matrix)
  3. Timeline Pattern: Chronological events (variation of Directory with temporal axis)
  4. Comparison Pattern: Side-by-side entities (variation of Matrix with emphasis on comparison)
  5. Card Deck Pattern: Stack of atomic documents (collection of Atomic)
  6. Accordion Pattern: Hierarchical with collapsible sections (Hierarchical with interaction)

Pattern Relationships and Composition

Patterns can nest and combine:

Hierarchical containing Directory: - Course catalog (hierarchical) where each department section contains a directory of courses

Master-Detail containing Matrix: - Student (master) with grade sheet (matrix of assignments × grading periods)

Directory with embedded Atomic: - Product catalog where each product listing is essentially an atomic document

Narrative Flow containing multiple patterns: - Newsletter with featured article (atomic), event list (directory), schedule (matrix)

Understanding these compositions enables sophisticated document design while maintaining conceptual clarity.
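Composition can also be represented in data, which is one way to support automated reasoning about documents. The sketch below models a composed document as a small tree of pattern nodes; the `PatternNode` type and `patterns_used` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class PatternNode:
    pattern: str                                   # name of a core pattern
    label: str                                     # what this part of the document is
    children: list = field(default_factory=list)   # nested PatternNode objects

def patterns_used(node):
    """Return the set of pattern names appearing anywhere in a composed document."""
    found = {node.pattern}
    for child in node.children:
        found |= patterns_used(child)
    return found

# The newsletter composition described above
newsletter = PatternNode("Narrative Flow", "Newsletter", [
    PatternNode("Atomic", "Featured article"),
    PatternNode("Directory", "Event list"),
    PatternNode("Matrix", "Schedule"),
])

print(sorted(patterns_used(newsletter)))
```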


This formal framework provides: - Systematic classification of all document types - Shared vocabulary for discussing document structures - Analytical tools for understanding any document domain - Design patterns that guide implementation - Foundation for all subsequent practical applications

Further Reading

On Ontology Development: - Noy, Natalya F., and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology." Stanford, 2001. https://protege.stanford.edu/publications/ontology_development/ontology101.pdf - Gruber, Thomas R. "Toward Principles for the Design of Ontologies Used for Knowledge Sharing." International Journal of Human-Computer Studies 43 (1995): 907-928. - W3C SKOS Simple Knowledge Organization System: https://www.w3.org/2004/02/skos/ (Standard for taxonomies)

On Document Classification: - ISO 15489: Records Management Standard. https://www.iso.org/standard/62542.html (International standard for records) - Dublin Core Metadata Initiative: https://www.dublincore.org/ (Metadata standards for resources)

On Entity-Relationship Modeling: - Chen, Peter Pin-Shan. "The Entity-Relationship Model—Toward a Unified View of Data." ACM Transactions on Database Systems 1 (1976): 9-36. (The original ER model paper) - Teorey, Toby, et al. Database Modeling and Design, 5th Edition. Morgan Kaufmann, 2011. - Ambler, Scott. "Data Modeling 101." http://agiledata.org/essays/dataModeling101.html (Practical introduction)

On Document Patterns: - Coplien, James O., and Douglas C. Schmidt, eds. Pattern Languages of Program Design. Addison-Wesley, 1995. - Fowler, Martin. Analysis Patterns: Reusable Object Models. Addison-Wesley, 1996. (Domain modeling patterns)

Related Patterns in This Trilogy: - All of Volume 2: How to build intelligence on top of document ontologies - Volume 3, Pattern 21 (Form-Document Coherence): Ensuring input structures match output needs - See Appendix G: Cross-Volume Pattern Map: "Chain 5: Form Data → Document Generation → Document Analysis"

Tools for Ontology Development: - Protégé: https://protege.stanford.edu/ (Open-source ontology editor) - WebProtégé: https://webprotege.stanford.edu/ (Collaborative ontology development)