Page Metadata & Data Freshness

This documentation site now includes automated data freshness tracking to help identify which pages need updates when new medical data is received.

How It Works

1. Git-Based Timestamps (Automatic)

Every page shows when it was last modified in git. This appears at the bottom of each page:

  • Last update: Shows the git commit timestamp
  • Fully automatic - no maintenance needed
  • Accurate to the second
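Conceptually, this is the same information you get by asking git for the last commit that touched a file. A minimal sketch in Python (the helper name and direct use of `git log` are illustrative, not the site's actual implementation):

```python
import subprocess

def last_commit_iso(path):
    """Return the ISO-8601 committer timestamp of the last commit touching `path`."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%cI", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

Because the timestamp comes from git history rather than the filesystem, it survives fresh checkouts and Docker rebuilds.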

2. Data Dependency Tracking (Automated)

Key pages have frontmatter metadata that tracks:

  • data_last_updated: When the underlying data files were last modified
  • data_sources: Which data files the page depends on

These are automatically updated by running:

docker exec remission python src/exporters/update_page_metadata.py --verbose
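Conceptually, the updater derives `data_last_updated` from the modification times of a page's data sources. A minimal sketch of that idea (the function name is illustrative; the real script's internals may differ):

```python
import os
from datetime import datetime

def data_last_updated(source_paths):
    """Newest modification time among a page's data files, formatted
    like the frontmatter field (YYYY-MM-DD HH:MM:SS)."""
    newest = max(os.path.getmtime(p) for p in source_paths)
    return datetime.fromtimestamp(newest).strftime("%Y-%m-%d %H:%M:%S")
```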

3. Data Freshness Report

Generate a comprehensive status report:

docker exec remission python src/exporters/update_page_metadata.py --report

This creates docs/technical/data-status.md showing:

  • Which data files were updated and when
  • Which pages depend on which data files
  • Which pages likely need content updates
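At its core, such a report groups pages under the data files they depend on. A minimal sketch of that rendering step (illustrative only; the actual `--report` output format may differ):

```python
def render_status_report(page_dependencies, timestamps):
    """Render a simple markdown status report mapping data files to
    dependent pages. `page_dependencies` maps page -> list of data files;
    `timestamps` maps data file -> human-readable mtime string."""
    lines = ["# Data Status", ""]
    for data_file, stamp in sorted(timestamps.items()):
        lines.append(f"## {data_file} (updated {stamp})")
        for page, deps in sorted(page_dependencies.items()):
            if data_file in deps:
                lines.append(f"- {page}")
        lines.append("")
    return "\n".join(lines)
```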

Page Dependencies

The following pages have automated dependency tracking:

| Page | Depends On |
| --- | --- |
| docs/index.md | blood_tests.csv, medical_reports_catalog.json, treatment_timeline.json |
| docs/medical-summary.md | blood_tests.csv, medical_reports_catalog.json |
| docs/health-data/index.md | blood_tests.csv, treatment_timeline.json |
| docs/health-data/trends.md | blood_tests.csv, biomarker_catalog.json |
| docs/health-data/timeline.md | treatment_timeline.json |
| docs/medical-reports/index.md | medical_reports_catalog.json |
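The most useful query against this mapping runs the other way: given a data file that just changed, which pages should be reviewed? A small sketch of that inversion (the dependency entries are abbreviated to file names for brevity):

```python
# A subset of the dependency table, as a page -> data-files mapping
PAGE_DEPENDENCIES = {
    "docs/index.md": ["blood_tests.csv", "medical_reports_catalog.json",
                      "treatment_timeline.json"],
    "docs/health-data/trends.md": ["blood_tests.csv", "biomarker_catalog.json"],
}

def pages_affected_by(data_file, deps=PAGE_DEPENDENCIES):
    """Invert the mapping: if this data file changed, which pages need a look?"""
    return sorted(page for page, files in deps.items() if data_file in files)
```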

Checking Page Metadata

To see which pages need updating, check the frontmatter in each markdown file:

---
data_last_updated: 2025-12-03 10:52:10
data_sources: blood_tests.csv, biomarker_catalog.json
---

Or generate the status report to see all dependencies at once.
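If you want to inspect this metadata programmatically, the frontmatter block is easy to read back out of a page. A minimal sketch (a real implementation would use a proper YAML parser):

```python
def read_frontmatter(markdown_text):
    """Extract key: value pairs from a leading `---`-delimited frontmatter block."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta
```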

Integration with Data Update Workflow

The metadata updater is integrated into the complete data update workflow. When new blood tests or medical reports are added:

  1. Parse/extract the new data
  2. Consolidate into master CSV files
  3. Regenerate visualizations
  4. Update page metadata (identifies which pages need updating)
  5. Manually update page content where needed
  6. Rebuild vector database
  7. Restart Docker containers

See the Copilot Instructions for the complete workflow.

Adding New Page Dependencies

To track dependencies for a new page, edit src/exporters/update_page_metadata.py:

PAGE_DEPENDENCIES = {
    "docs/your-new-page.md": [
        "data/processed/some_data.csv",
        "data/processed/other_data.json"
    ],
    # ... existing entries
}

Then run the updater to initialize the metadata.
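A quick sanity check after adding a new entry is to confirm the declared data files actually exist on disk. A small illustrative helper (not part of update_page_metadata.py):

```python
import os

def missing_sources(page_dependencies):
    """Return, per page, any declared data files that do not exist on disk."""
    return {
        page: [p for p in paths if not os.path.exists(p)]
        for page, paths in page_dependencies.items()
        if any(not os.path.exists(p) for p in paths)
    }
```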

Benefits

This system solves several problems:

✅ Never forget to update pages - The status report shows which pages depend on recently updated data

✅ No manual timestamp tracking - Git and the metadata updater handle it automatically

✅ Clear data lineage - See exactly which pages use which data sources

✅ Zero maintenance overhead - Runs as part of the existing data processing workflow


For more technical documentation, see the Technical Documentation Index.