Page Metadata & Data Freshness
This documentation site now includes automated data freshness tracking to help identify which pages need updates when new medical data is received.
How It Works
1. Git-Based Timestamps (Automatic)
Every page shows when it was last modified in git. This appears at the bottom of each page:
- Last update: Shows the git commit timestamp
- Fully automatic - no maintenance needed
- Accurate to the second
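The site's own mechanism for rendering this timestamp is not described here. As a rough sketch, the same information can be read from git directly; the function names below are hypothetical and are not part of the project's code.

```python
import subprocess
from datetime import datetime
from typing import Optional


def parse_git_timestamp(raw: str) -> Optional[datetime]:
    """Parse strict ISO 8601 output from `git log --format=%cI`.

    Empty output means no commit touches the path.
    """
    raw = raw.strip()
    if not raw:
        return None
    # Some git versions print UTC as a trailing "Z", which older Python
    # versions of datetime.fromisoformat() do not accept.
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    return datetime.fromisoformat(raw)


def last_commit_time(path: str) -> Optional[datetime]:
    """Return the committer date of the latest commit that touched `path`."""
    result = subprocess.run(
        ["git", "log", "-1", "--format=%cI", "--", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_git_timestamp(result.stdout)
```

Calling `last_commit_time("docs/index.md")` from inside the repository would return a timezone-aware `datetime`, or `None` if the file has never been committed.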
2. Data Dependency Tracking (Automated)
Key pages have frontmatter metadata that tracks:
- data_last_updated: When the underlying data files were last modified
- data_sources: Which data files the page depends on
These fields are refreshed by running the metadata updater:
docker exec remission python src/exporters/update_page_metadata.py --verbose
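How the updater derives `data_last_updated` is not shown in this page. One plausible approach, assumed here purely for illustration, is to take the newest filesystem modification time among a page's data sources. The helper names are hypothetical.

```python
import os
from datetime import datetime


def newest_mtime(paths):
    """Return the latest modification time among the files that exist, or None."""
    times = [os.path.getmtime(p) for p in paths if os.path.exists(p)]
    return datetime.fromtimestamp(max(times)) if times else None


def format_stamp(moment):
    """Format a datetime like the frontmatter example: YYYY-MM-DD HH:MM:SS."""
    return moment.strftime("%Y-%m-%d %H:%M:%S")
```

Missing files are skipped rather than raising, so a page whose sources have not been generated yet simply yields `None`.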
3. Data Freshness Report
Generate a comprehensive status report:
docker exec remission python src/exporters/update_page_metadata.py --report
This creates docs/technical/data-status.md showing:
- Which data files were updated and when
- Which pages depend on which data files
- Which pages may need content updates as a result
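The "which pages depend on which data files" view is essentially the page-to-sources mapping turned around. A minimal sketch of that inversion, assuming a dictionary shaped like `PAGE_DEPENDENCIES` (the function name is hypothetical, and the real report generator may work differently):

```python
from collections import defaultdict


def pages_by_data_file(page_dependencies):
    """Invert a page -> data files mapping into data file -> sorted pages."""
    inverted = defaultdict(list)
    for page, sources in page_dependencies.items():
        for source in sources:
            inverted[source].append(page)
    # Sort both keys and values so the report order is stable between runs.
    return {source: sorted(pages) for source, pages in sorted(inverted.items())}


deps = {
    "docs/health-data/trends.md": ["blood_tests.csv", "biomarker_catalog.json"],
    "docs/health-data/timeline.md": ["treatment_timeline.json"],
    "docs/medical-summary.md": ["blood_tests.csv", "medical_reports_catalog.json"],
}
by_file = pages_by_data_file(deps)
```

With this input, `by_file["blood_tests.csv"]` lists both `docs/health-data/trends.md` and `docs/medical-summary.md`, which are the pages to review after new blood test data arrives.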
Page Dependencies
The following pages have automated dependency tracking:
| Page | Depends On |
|---|---|
| docs/index.md | blood_tests.csv, medical_reports_catalog.json, treatment_timeline.json |
| docs/medical-summary.md | blood_tests.csv, medical_reports_catalog.json |
| docs/health-data/index.md | blood_tests.csv, treatment_timeline.json |
| docs/health-data/trends.md | blood_tests.csv, biomarker_catalog.json |
| docs/health-data/timeline.md | treatment_timeline.json |
| docs/medical-reports/index.md | medical_reports_catalog.json |
Checking Page Metadata
To see whether a page needs updating, check the frontmatter in its markdown file and compare data_last_updated with the page's own last-update timestamp:
---
data_last_updated: 2025-12-03 10:52:10
data_sources: blood_tests.csv, biomarker_catalog.json
---
Or generate the status report to see all dependencies at once.
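For checking many pages by hand, the frontmatter can also be read programmatically. The sketch below assumes the simple `key: value` layout shown above and does not handle general YAML; `read_frontmatter` is a hypothetical helper, not part of the updater.

```python
from datetime import datetime


def read_frontmatter(text):
    """Return the key/value pairs between the leading '---' delimiters."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as having no frontmatter


page = """---
data_last_updated: 2025-12-03 10:52:10
data_sources: blood_tests.csv, biomarker_catalog.json
---
Page body starts here.
"""
meta = read_frontmatter(page)
sources = [s.strip() for s in meta["data_sources"].split(",")]
updated = datetime.strptime(meta["data_last_updated"], "%Y-%m-%d %H:%M:%S")
```

Here `sources` holds the two file names and `updated` is a naive `datetime` that can be compared against the page's last edit time.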
Integration with Data Update Workflow
The metadata updater is integrated into the complete data update workflow. When new blood tests or medical reports are added:
- Parse/extract the new data
- Consolidate into master CSV files
- Regenerate visualizations
- Update page metadata (identifies which pages need updating)
- Manually update page content where needed
- Rebuild vector database
- Restart Docker containers
See the Copilot Instructions for the complete workflow.
Adding New Page Dependencies
To track dependencies for a new page, edit src/exporters/update_page_metadata.py:
PAGE_DEPENDENCIES = {
    "docs/your-new-page.md": [
        "data/processed/some_data.csv",
        "data/processed/other_data.json",
    ],
    # ... existing entries
}
Then run the updater to initialize the metadata.
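Before running the updater on a new entry, it can help to confirm that the page and its data files actually exist, since a typo in a path is easy to miss. This is an optional sanity check sketched under the assumption that all paths are relative to the repository root; `missing_dependencies` is a hypothetical helper and is not provided by the updater script.

```python
from pathlib import Path


def missing_dependencies(page_dependencies, root="."):
    """Return {page: [paths that do not exist]} for entries with problems."""
    base = Path(root)
    problems = {}
    for page, sources in page_dependencies.items():
        # Check the page itself as well as each data file it lists.
        candidates = [page, *sources]
        missing = [p for p in candidates if not (base / p).exists()]
        if missing:
            problems[page] = missing
    return problems
```

An empty result means every listed path was found; otherwise each key is a page and each value lists the paths that could not be located under `root`.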
Benefits
This system solves several problems:
✅ Never forget to update pages - The status report shows which pages depend on recently updated data
✅ No manual timestamp tracking - Git and the metadata updater handle it automatically
✅ Clear data lineage - See exactly which pages use which data sources
✅ Zero maintenance overhead - Runs as part of the existing data processing workflow
For more technical documentation, see the Technical Documentation Index.