Data Updates

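Data is kept fresh by GitHub Actions workflows that run on a cron schedule or are triggered manually. Each workflow fetches from upstream sources and validates the response. Valid data is uploaded to Cloudflare R2 and also committed to git as a fallback; an invalid response is logged and the update is skipped.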

Workflow overview

| Workflow | Schedule | Primary Sources | Taxonomy Product |
|---|---|---|---|
| housing.yml | manual | Inside Airbnb, Property24, PayProp | housing |
| crime_safety.yml | manual | CrimeHub, SAPS, Google News | crime-safety |
| loadshedding.yml | every 5 hours (:05) | OurPower, Eskom | loadshedding |
| loadshedding_cities.yml | every 5 hours (:35) | OurPower | loadshedding |
| dam_levels.yml | weekly (Tue 06:00 UTC) | DWS | dam-levels |
| update_metrics.yml | monthly (1st, 06:00 UTC) | Property24, Lightstone | |
| traffic.yml | every 10 min (01:00–17:59 UTC) | Google Maps | |
| update-fx-latest.yml | twice daily (08:00, 16:00 UTC) | exchangerate.host | |
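To make the schedule column concrete, here is a small sketch that maps each scheduled workflow to a cron string and expands it into daily firing times. The cron strings are assumptions inferred from the schedule descriptions, not copied from the workflow files, and `expand_field` and `daily_times` are hypothetical helpers that handle only the simple field syntax used here.

```python
# Hypothetical cron expressions inferred from the schedule descriptions;
# the real workflow files may use different strings.
ASSUMED_CRON = {
    "loadshedding.yml": "5 */5 * * *",
    "loadshedding_cities.yml": "35 */5 * * *",
    "dam_levels.yml": "0 6 * * 2",
    "update_metrics.yml": "0 6 1 * *",
    "traffic.yml": "*/10 1-17 * * *",
    "update-fx-latest.yml": "0 8,16 * * *",
}


def expand_field(field: str, low: int, high: int) -> list[int]:
    """Expand one cron field (*, */n, a-b, a-b/n, a,b,c) into sorted values."""
    values: set[int] = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        stride = int(step) if step else 1
        if base == "*":
            start, end = low, high
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, stride))
    return sorted(values)


def daily_times(cron: str) -> list[str]:
    """Return HH:MM (UTC) firing times within one day, ignoring day and month fields."""
    minute, hour, *_ = cron.split()
    hours = expand_field(hour, 0, 23)
    minutes = expand_field(minute, 0, 59)
    return [f"{h:02d}:{m:02d}" for h in hours for m in minutes]


if __name__ == "__main__":
    for workflow, cron in ASSUMED_CRON.items():
        times = daily_times(cron)
        print(f"{workflow}: {cron} -> {len(times)} slot(s), first {times[0]}")
```

If the loadshedding workflows do use a `*/5` hour field, runs fire at hours 00, 05, 10, 15, and 20 UTC, so the gap across midnight is four hours rather than five.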

How it works

flowchart LR
    A[Cron / manual trigger] --> B[GitHub Actions]
    B --> C[Fetch from APIs]
    C --> D{Valid?}
    D -->|yes| E[Upload to R2]
    D -->|no| F[Log error, skip]
    E --> G[Git commit fallback]
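The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual code: `run_update`, `is_valid`, the JSON file naming, and the injected `fetch` and `upload` callables are all hypothetical, and the R2 write is represented by whatever function the caller passes in.

```python
import json
import logging
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("data_updates")


def is_valid(payload: object) -> bool:
    """Hypothetical check: accept only a non-empty dict or list."""
    return isinstance(payload, (dict, list)) and len(payload) > 0


def run_update(
    name: str,
    fetch: Callable[[], object],
    upload: Callable[[str, bytes], None],
    fallback_dir: Path,
) -> Optional[Path]:
    """Fetch, validate, upload, then write the git-tracked fallback copy.

    Returns the fallback path on success, or None when the payload is rejected.
    """
    payload = fetch()
    if not is_valid(payload):
        # Mirrors the "Log error, skip" branch: nothing is uploaded or written.
        logger.error("%s: invalid upstream response, skipping", name)
        return None

    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    upload(f"{name}.json", body)  # stand-in for the R2 upload

    fallback_dir.mkdir(parents=True, exist_ok=True)
    path = fallback_dir / f"{name}.json"
    path.write_bytes(body)  # a later workflow step would commit this file
    return path
```

Passing `fetch` and `upload` as callables keeps the sketch free of any particular HTTP or storage client, so it can be exercised locally with an in-memory dict standing in for the bucket.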

Data Product Taxonomy

Each data product has a DataProductManifest defining its sub-datasets, dependencies, and refresh mechanics. These are registered in reports/manifest_registry.py and visible in the admin at /admin/reports/dataset/taxonomy/.

| Product | Registry | Sub-datasets | Composite slug |
|---|---|---|---|
| Sea Point Crime & Safety | sea_point_registry.py | 7 | sea-point-composite |
| Cape Town Dam Levels | dam_levels_registry.py | 4 | dam-levels-composite |
| SA Loadshedding Status | loadshedding_registry.py | 5 | loadshedding-composite |
| Cape Town Ward Crime & Safety | crime_safety_registry.py | 7 | crime-safety-composite |
| Cape Town Housing Market | housing_registry.py | 7 | housing-composite |

See reports app docs for the full taxonomy reference.

See the Auto-generated Reference section for per-workflow details built from data/updates.json.