capetowndata.com#

Documentation#

Cape Town data portal — housing, safety, energy, and tourism data products.

GitHub Actions Workflows#

All workflows live in .github/workflows/.

Data pipelines#

| Workflow | File | Schedule | Description |
| --- | --- | --- | --- |
| Update Cape Town Loadshedding Data | loadshedding.yml | Every 5 hours (:05) | Cape Town loadshedding stage fetch with retry |
| Update Other Cities Loadshedding Data | loadshedding_cities.yml | Every 5 hours (:35) | Joburg, Durban, Pretoria, PE loadshedding |
| Update Cape Town Traffic Data | traffic.yml | Every 10 min (01:00–17:59 UTC) | Waterfront ↔ Airport commute times via Google Maps Routes API |
| Update Cape Town Dam Levels | dam_levels.yml | Weekly (Tue 06:00 UTC) | Dam level data from DWS; syncs to R2 + git |
| Update Housing Data | housing.yml | Manual | Airbnb and rental market data fetch |
| Update Crime Safety Data | crime_safety.yml | Manual | Crime safety data with retry logic |
| Update Table Mountain Status | table_mountain.yml | Manual (schedule disabled) | Cableway status and weather |
| Update Neighbourhood Metrics | update_metrics.yml | Monthly (1st, 06:00 UTC) | Neighbourhood metrics + property snapshots |

Monitoring & reports#

| Workflow | File | Schedule | Description |
| --- | --- | --- | --- |
| Identity Overview | identity_overview.yml | Daily (07:00 UTC) | 24h data integrity lookback; emails admin on warnings |
| Neighbourhood Overview | neighbourhood_overview.yml | Daily (08:00 UTC) | Neighbourhood data quality report |
| Translation Overview | translation_overview.yml | Daily (07:30 UTC) | Translation coverage across .po files and model fields |

CI / Docs#

| Workflow | File | Schedule | Description |
| --- | --- | --- | --- |
| CI | ci.yml | Push / PR | Linting (ruff), Python tests, Sphinx docs build |
| Docs (daily) | docs_daily.yml | Daily (05:30 UTC) + manual | Generate docs, build HTML, link check, upload artifact |

Data Models (35 models, 10 apps)#

erDiagram
    District ||--o{ Neighbourhood : contains
    Neighbourhood ||--o{ NeighbourhoodHighlight : highlights
    Neighbourhood ||--o{ PropertySnapshot : tracks
    Neighbourhood ||--o{ NeighbourhoodWard : maps_to
    Ward ||--o{ NeighbourhoodWard : maps_to
    Ward ||--o{ WardSafetySnapshot : has
    Category ||--o{ BlogPost : categorises
    BlogPost ||--o{ Comment : has
    Comment ||--o{ Comment : replies
    BlogPost ||--o{ Favorite : saved_as
    Itinerary ||--o{ Favorite : saved_as
    Itinerary }o--o{ TravelMode : uses
    SubscriptionTier ||--o{ Subscription : defines
    SubscriptionTier ||--o{ GuestSubscription : defines
    Customer ||--o{ Subscription : owns
    Dataset ||--o{ DatasetAccess : grants
    CrimeDataPurchase ||--o{ DatasetAccess : migrated_to
    Area ||--o{ QuizNeighbourhood : groups
    Quiz ||--o{ Question : contains
    Question ||--o{ QuestionChoice : has
    Quiz ||--o{ QuizResult : produces
    QuizNeighbourhood ||--o{ QuizResult : recommended_in

See Data Models for full field-level ER diagrams by domain.

FAQ — Where does the data come from?#

Where does the safety score come from?#

Every neighbourhood has a single canonical safety_score (0.0–10.0). The score is taken from the highest-priority source available, and the safety_source field records which source was used:

| Priority | Source | safety_source | When used |
| --- | --- | --- | --- |
| 1 (highest) | BlogPost.safety (editorial override) | blogpost-fk | A blog post is linked to the neighbourhood and has a safety rating |
| 2 | Weighted ward snapshot average | ward-snapshot | WardSafetySnapshot data exists for the linked wards |
| 3 (fallback) | Neighbourhood.blog_safety_score (legacy) | blog-json | No blog post or ward data available |

flowchart LR
    SAPS[SAPS / CrimeHub] -->|fetch_crime_safety| JSON[crime_safety.json]
    JSON -->|ingest_ward_safety| WSS[WardSafetySnapshot\nper ward per quarter]
    WSS -->|weighted avg via\nNeighbourhoodWard.weight| NS[Neighbourhood.safety_score]
    BP[BlogPost.safety\neditorial override] -.->|priority 1| NS
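The priority hierarchy and the weighted ward average can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `WardScore` class, the `resolve_safety_score` function, and the rounding to one decimal place are all assumptions made for the example; only the `safety_source` values and the use of `NeighbourhoodWard.weight` come from the description above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class WardScore:
    """Hypothetical stand-in for one linked ward's latest safety snapshot."""
    score: float   # ward-level safety score on the 0.0-10.0 scale
    weight: float  # corresponds to NeighbourhoodWard.weight


def resolve_safety_score(
    blogpost_safety: Optional[float],
    ward_scores: Sequence[WardScore],
    legacy_blog_score: Optional[float],
) -> Tuple[Optional[float], Optional[str]]:
    """Return (safety_score, safety_source) following the priority table."""
    # Priority 1: editorial override from a linked blog post.
    if blogpost_safety is not None:
        return blogpost_safety, "blogpost-fk"

    # Priority 2: weighted average over the linked wards' snapshots.
    total_weight = sum(w.weight for w in ward_scores)
    if total_weight > 0:
        average = sum(w.score * w.weight for w in ward_scores) / total_weight
        return round(average, 1), "ward-snapshot"  # rounding is an assumption

    # Priority 3: legacy fallback value stored on the neighbourhood.
    if legacy_blog_score is not None:
        return legacy_blog_score, "blog-json"

    # No source available at all.
    return None, None
```

For example, two wards scoring 4.0 and 8.0 with weights 1.0 and 3.0 would give a neighbourhood score of 7.0, unless a linked blog post carries its own safety rating.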

Score levels:

| Score | Level | Map colour |
| --- | --- | --- |
| 8.0–10.0 | Very Safe | 🟢 #2E7D32 |
| 6.0–7.9 | Safe | 🟢 #66BB6A |
| 4.0–5.9 | Moderate | 🟡 #FFD54F |
| 2.0–3.9 | Caution | 🟠 #FF8A65 |
| 0.0–1.9 | Avoid | 🔴 #E53935 |
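One way to turn a score into its level and map colour is a threshold lookup. The sketch below is illustrative only: `SAFETY_LEVELS` and `safety_level` are hypothetical names, and treating each band's lower bound as an inclusive threshold (so a value such as 7.95 falls into "Safe") is an assumption about how values between the listed bands are handled.

```python
from typing import List, Tuple

# (inclusive lower bound, level, map colour), highest band first.
SAFETY_LEVELS: List[Tuple[float, str, str]] = [
    (8.0, "Very Safe", "#2E7D32"),
    (6.0, "Safe", "#66BB6A"),
    (4.0, "Moderate", "#FFD54F"),
    (2.0, "Caution", "#FF8A65"),
    (0.0, "Avoid", "#E53935"),
]


def safety_level(score: float) -> Tuple[str, str]:
    """Return (level, hex colour) for a safety score between 0.0 and 10.0."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"safety score out of range: {score}")
    # Walk the bands from highest to lowest and take the first match.
    for lower_bound, level, colour in SAFETY_LEVELS:
        if score >= lower_bound:
            return level, colour
    raise AssertionError("unreachable: the 0.0 band matches every valid score")
```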

Where do housing prices and rental data come from?#

Housing data follows a two-tier system in which district-level aggregates are overridden by per-neighbourhood ground-truth metrics:

| Source | Data provided |
| --- | --- |
| Inside Airbnb | Listing count, avg nightly rate |
| PayProp / TPN | Median rents, YoY growth, vacancy rates |
| Lightstone | Avg house prices, price/m², YoY changes |
| Property24 | District rental and property price ranges |

flowchart LR
    SRC[Property24 / PayProp\nLightstone / Inside Airbnb] -->|fetch_housing| HJ[housing.json\ndistrict-level]
    HJ -->|populate_neighbourhoods\n+ district_mappings.json| NH[Neighbourhood model]
    MET[neighbourhood_metrics.json\nper-neighbourhood ground truth] -->|update_neighbourhood_metrics| NH
    NH -->|snapshot| PS[PropertySnapshot\ntrend tracking]

neighbourhood_metrics.json includes sample sizes (n) and standard deviations (σ) per neighbourhood so data quality is transparent.
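The override behaviour of the two tiers can be pictured as a simple dictionary merge. This is only a sketch under stated assumptions: the function `merge_housing` and every field name in the usage example (`median_rent`, `avg_price`, `n`, `sigma`) are hypothetical and do not describe the real schema of housing.json or neighbourhood_metrics.json.

```python
from typing import Any, Dict


def merge_housing(
    district_values: Dict[str, Any],
    neighbourhood_values: Dict[str, Any],
) -> Dict[str, Any]:
    """Start from district-level aggregates, then let per-neighbourhood
    metrics override any field for which they provide a value."""
    merged = dict(district_values)  # copy so the inputs stay untouched
    for field, value in neighbourhood_values.items():
        if value is not None:  # a missing metric keeps the district value
            merged[field] = value
    return merged


# Hypothetical field names and placeholder numbers, for illustration only.
district = {"median_rent": 15000, "avg_price": 2500000}
neighbourhood = {"median_rent": 18500, "avg_price": None, "n": 42, "sigma": 2100.0}
print(merge_housing(district, neighbourhood))
```

In this illustration the neighbourhood's own median rent replaces the district figure, while the average price falls back to the district aggregate because no neighbourhood-level value is present.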

Where does loadshedding data come from?#

Three sources are fetched in parallel and cross-validated:

| Source | Role |
| --- | --- |
| OurPower.co.za | Primary: Cape Town stage + schedule |
| Eskom GetStatus API | Fallback: national Eskom stage |
| EskomSePush API | Optional: both Eskom and Cape Town stages |

A verification layer compares all three, calculates a confidence score (0–1), and prevents suspicious stage jumps. The output JSON includes _verification.confidence and sources_agreeing.

flowchart LR
    OP[OurPower] --> V[Verification\nconsensus]
    EK[Eskom GetStatus] --> V
    SP[EskomSePush] --> V
    V --> LS[loadshedding.json]
    LS --> R2[Cloudflare R2]
    LS --> API[LoadsheddingView\nJSON API]
    LS --> PAGE[status_page\nHTML]
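A consensus check of this kind might look like the sketch below. It is a deliberately simplified illustration, not the project's verification logic: `verify_stage`, the `max_jump` threshold, the definition of confidence as the share of configured sources that agree with the majority, and the nesting of `sources_agreeing` under `_verification` as a count are all assumptions. It also compares stages directly, ignoring the difference between national Eskom stages and Cape Town stages.

```python
from collections import Counter
from typing import Any, Dict, Optional


def verify_stage(
    readings: Dict[str, Optional[int]],
    previous_stage: int,
    max_jump: int = 2,  # hypothetical threshold for a "suspicious" jump
) -> Dict[str, Any]:
    """Pick a consensus stage from several sources and score confidence."""
    # Sources that failed to respond are represented as None.
    available = {name: s for name, s in readings.items() if s is not None}
    if not available:
        # Nothing to compare: keep the previous stage with zero confidence.
        return {
            "stage": previous_stage,
            "_verification": {"confidence": 0.0, "sources_agreeing": 0},
        }

    # Majority vote across the sources that responded.
    consensus_stage, votes = Counter(available.values()).most_common(1)[0]
    confidence = votes / len(readings)

    # Hold the previous stage when a large jump is not backed by every
    # responding source.
    suspicious = (
        abs(consensus_stage - previous_stage) > max_jump
        and votes < len(available)
    )
    return {
        "stage": previous_stage if suspicious else consensus_stage,
        "_verification": {"confidence": confidence, "sources_agreeing": votes},
    }
```

With three sources reporting stages 6, 6, and 1 and a previous stage of 1, this sketch would keep stage 1, because the jump exceeds `max_jump` and the sources do not all agree.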

Where does traffic data come from?#

  • Google Maps Routes API v2 with TRAFFIC_AWARE routing

  • Route: V&A Waterfront ↔ Cape Town International Airport (both directions, prioritises N2)

  • Returns duration, distance, and route description

flowchart LR
    GM[Google Maps\nRoutes API v2] -->|fetch_traffic| TJ[traffic.json]
    TJ --> R2[Cloudflare R2]
    TJ --> VIEW[traffic view\n+ Leaflet map]
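As a rough sketch of what a TRAFFIC_AWARE request to the Routes API looks like, the helper below builds the headers and JSON body for a `computeRoutes` call without sending it. The function names, the use of address strings as waypoints, and the chosen field mask are assumptions for illustration; the real fetch_traffic command may be structured differently.

```python
import json
from typing import Dict, Tuple

ROUTES_URL = "https://routes.googleapis.com/directions/v2:computeRoutes"


def build_routes_request(
    origin: str, destination: str, api_key: str
) -> Tuple[Dict[str, str], str]:
    """Return (headers, JSON body) for a traffic-aware driving route."""
    headers = {
        "Content-Type": "application/json",
        "X-Goog-Api-Key": api_key,
        # The field mask limits the response to the fields that are needed.
        "X-Goog-FieldMask": (
            "routes.duration,routes.distanceMeters,routes.description"
        ),
    }
    body = {
        "origin": {"address": origin},
        "destination": {"address": destination},
        "travelMode": "DRIVE",
        "routingPreference": "TRAFFIC_AWARE",
    }
    return headers, json.dumps(body)


def parse_duration_seconds(duration: str) -> int:
    """Convert a duration string such as "1500s" to whole seconds."""
    return int(float(duration.rstrip("s")))
```

The headers and body would then be sent as a POST to `ROUTES_URL`, once per direction, with any HTTP client.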

Where do dam level readings come from?#

  • Department of Water and Sanitation (DWS) — HTML table scraped from dws.gov.za

  • 6 dams tracked: Berg River, Steenbras Lower, Steenbras Upper, Theewaterskloof, Voëlvlei, Wemmershoek

  • Metrics per dam: capacity (MCM), this week %, last week %, last year %

  • A weighted total is calculated from the individual dam percentages

flowchart LR
    DWS[DWS\ndws.gov.za] -->|fetch_dam_levels\nHTML scrape| DJ[dam_levels.json]
    DJ --> R2[Cloudflare R2]
    DJ --> VIEW[dam_levels view\n+ Leaflet map]
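The weighted total could be computed along the following lines. The text does not say which weights are used, so weighting each dam's percentage by its capacity in MCM is an assumption here, as are the function name `weighted_total` and the keys `capacity_mcm` and `this_week_pct`. The numbers in the usage example are placeholders, not real readings.

```python
from typing import Dict, List


def weighted_total(dams: List[Dict[str, float]]) -> float:
    """Capacity-weighted average of the per-dam fill percentages."""
    total_capacity = sum(dam["capacity_mcm"] for dam in dams)
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    # Larger dams contribute proportionally more to the combined level.
    stored = sum(dam["capacity_mcm"] * dam["this_week_pct"] for dam in dams)
    return stored / total_capacity


# Placeholder values for two hypothetical dams.
example = [
    {"capacity_mcm": 100.0, "this_week_pct": 50.0},
    {"capacity_mcm": 300.0, "this_week_pct": 90.0},
]
print(weighted_total(example))  # 80.0
```

A plain average of the two percentages would give 70.0; the capacity weighting pulls the total toward the larger dam.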