capetowndata.com
Documentation
Cape Town data portal — housing, safety, energy, and tourism data products.
GitHub Actions Workflows
All workflows live in .github/workflows/.
Data pipelines
| Workflow | File | Schedule | Description |
|---|---|---|---|
| Update Cape Town Loadshedding Data | | Every 5 hours (:05) | Cape Town loadshedding stage fetch with retry |
| Update Other Cities Loadshedding Data | | Every 5 hours (:35) | Joburg, Durban, Pretoria, PE loadshedding |
| Update Cape Town Traffic Data | | Every 10 min (01:00–17:59 UTC) | Waterfront ↔ Airport commute times via Google Maps Routes API |
| Update Cape Town Dam Levels | | Weekly (Tue 06:00 UTC) | Dam level data from DWS; syncs to R2 + git |
| Update Housing Data | | Manual | Airbnb and rental market data fetch |
| Update Crime Safety Data | | Manual | Crime safety data with retry logic |
| Update Table Mountain Status | | Manual (schedule disabled) | Cableway status and weather |
| Update Neighbourhood Metrics | | Monthly (1st, 06:00 UTC) | Neighbourhood metrics + property snapshots |
Monitoring & reports
| Workflow | File | Schedule | Description |
|---|---|---|---|
| Identity Overview | | Daily (07:00 UTC) | 24h data integrity lookback; emails admin on warnings |
| Neighbourhood Overview | | Daily (08:00 UTC) | Neighbourhood data quality report |
| Translation Overview | | Daily (07:30 UTC) | Translation coverage across .po files and model fields |
CI / Docs
| Workflow | File | Schedule | Description |
|---|---|---|---|
| CI | | Push / PR | Linting (ruff), Python tests, Sphinx docs build |
| Docs (daily) | | Daily (05:30 UTC) + manual | Generate docs, build HTML, link check, upload artifact |
Data Models (35 models, 10 apps)
See Data Models for full field-level ER diagrams by domain.
FAQ: Where does the data come from?
Where does the safety score come from?
Every neighbourhood has a single canonical safety_score (0.0–10.0). It is
determined by a priority hierarchy — the safety_source field records which
source was used:
| Priority | Source | | When used |
|---|---|---|---|
| 1 (highest) | | | A blog post is linked to the neighbourhood and has a safety rating |
| 2 | Weighted ward snapshot average | | |
| 3 (fallback) | | | No blog post or ward data available |
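The hierarchy above can be read as a simple fallback chain. The following is a minimal sketch under stated assumptions, not the portal's actual code: the function name, the `(score, weight)` shape of the ward snapshots, the source labels returned, and the fallback value are all hypothetical, since this page does not name them.

```python
from typing import Optional, Sequence, Tuple


def _clamp(score: float) -> float:
    """Keep a score inside the documented 0.0-10.0 range."""
    return max(0.0, min(10.0, score))


def resolve_safety_score(
    blog_rating: Optional[float],
    ward_snapshots: Sequence[Tuple[float, float]],
    fallback_score: float,
) -> Tuple[float, str]:
    """Return (safety_score, safety_source) following the priority order.

    ward_snapshots holds hypothetical (score, weight) pairs; the real
    weighting scheme is not described on this page.
    """
    # Priority 1: a linked blog post that carries a safety rating.
    if blog_rating is not None:
        return _clamp(blog_rating), "blog_post"

    # Priority 2: weighted average over the available ward snapshots.
    total_weight = sum(weight for _, weight in ward_snapshots)
    if total_weight > 0:
        weighted_sum = sum(score * weight for score, weight in ward_snapshots)
        return _clamp(weighted_sum / total_weight), "ward_average"

    # Priority 3: neither a blog rating nor ward data is available.
    return _clamp(fallback_score), "fallback"
```

Returning the source label alongside the score mirrors how `safety_source` records which tier produced the value.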
Score levels:
| Score | Level | Map colour |
|---|---|---|
| 8.0–10.0 | Very Safe | 🟢 |
| 6.0–7.9 | Safe | 🟢 |
| 4.0–5.9 | Moderate | 🟡 |
| 2.0–3.9 | Caution | 🟠 |
| 0.0–1.9 | Avoid | 🔴 |
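As an illustration, the bands can be expressed as a lookup on each band's lower bound. This is a sketch only: the function name is hypothetical, and treating values that fall between two listed bands (for example 7.95) as belonging to the lower band is an assumption.

```python
def safety_level(score: float) -> str:
    """Map a 0.0-10.0 safety score to its level name."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    # Check bands from highest to lowest using their lower bounds.
    if score >= 8.0:
        return "Very Safe"
    if score >= 6.0:
        return "Safe"
    if score >= 4.0:
        return "Moderate"
    if score >= 2.0:
        return "Caution"
    return "Avoid"
```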
Where do housing prices and rental data come from?
A two-tier system — district-level aggregates are overridden by per-neighbourhood ground-truth metrics:
| Source | Data provided |
|---|---|
| Inside Airbnb | Listing count, avg nightly rate |
| PayProp / TPN | Median rents, YoY growth, vacancy rates |
| Lightstone | Avg house prices, price/m², YoY changes |
| Property24 | District rental and property price ranges |
neighbourhood_metrics.json includes sample sizes (n) and standard
deviations (σ) per neighbourhood so data quality is transparent.
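The two-tier override can be pictured as a dictionary merge in which neighbourhood values win wherever they exist. This is a minimal sketch: the function name and every key used in the example (`median_rent`, `vacancy_rate`, `n`, `sigma`) are hypothetical, and the real layout of `neighbourhood_metrics.json` may differ.

```python
from typing import Any, Dict


def merge_housing_metrics(
    district: Dict[str, Any], neighbourhood: Dict[str, Any]
) -> Dict[str, Any]:
    """Start from district aggregates and overlay neighbourhood metrics."""
    merged = dict(district)
    for key, value in neighbourhood.items():
        # A missing (None) neighbourhood value keeps the district figure.
        if value is not None:
            merged[key] = value
    return merged
```

Under these assumptions, a neighbourhood entry that supplies `median_rent`, `n`, and `sigma` replaces the district rent and adds the quality fields, while a district-only figure such as `vacancy_rate` passes through unchanged.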
Where does loadshedding data come from?
Three sources are fetched in parallel and cross-validated:
| Source | Role |
|---|---|
| | Primary: Cape Town stage + schedule |
| Eskom GetStatus API | Fallback: national Eskom stage |
| EskomSePush API | Optional: both Eskom and Cape Town stages |
A verification layer compares all three, calculates a confidence score
(0–1), and prevents suspicious stage jumps. The output JSON includes
_verification.confidence and sources_agreeing.
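One way such a verification layer could work is a majority vote with a jump guard. The sketch below is illustrative only: the function name, the confidence formula (agreeing sources divided by configured sources), the `MAX_STAGE_JUMP` threshold, the `suspicious_jump` flag, and the nesting of `sources_agreeing` under `_verification` are assumptions, not the pipeline's documented behaviour.

```python
from collections import Counter
from typing import Any, Dict, Optional

MAX_STAGE_JUMP = 2  # hypothetical threshold for a "suspicious" change


def verify_stage(
    readings: Dict[str, Optional[int]], previous_stage: Optional[int] = None
) -> Dict[str, Any]:
    """Cross-check stage readings from several sources.

    readings maps a source name to its reported stage, or None if the
    source failed to respond.
    """
    available = [stage for stage in readings.values() if stage is not None]
    if not available:
        # No source answered: keep the last known stage with zero confidence.
        return {
            "stage": previous_stage,
            "_verification": {
                "confidence": 0.0,
                "sources_agreeing": 0,
                "suspicious_jump": False,
            },
        }

    # Majority vote; on a tie the value seen first wins.
    stage, agreeing = Counter(available).most_common(1)[0]
    # Failed sources count against confidence, keeping it within 0-1.
    confidence = agreeing / len(readings)

    suspicious = (
        previous_stage is not None
        and abs(stage - previous_stage) > MAX_STAGE_JUMP
    )
    if suspicious:
        # Hold the previous stage rather than accept a large jump.
        stage = previous_stage

    return {
        "stage": stage,
        "_verification": {
            "confidence": confidence,
            "sources_agreeing": agreeing,
            "suspicious_jump": suspicious,
        },
    }
```

Dividing by the number of configured sources rather than the number that responded means an outage at one source lowers confidence even when the remaining sources agree.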
Where does traffic data come from?
- Google Maps Routes API v2 with TRAFFIC_AWARE routing
- Route: V&A Waterfront ↔ Cape Town International Airport (both directions, prioritises N2)
- Returns duration, distance, and route description
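A request of this kind could be assembled as shown below. This is a hedged sketch rather than the pipeline's code: the helper name and the use of free-text addresses for the endpoints are assumptions (the real job may use coordinates or place IDs), and how the N2 preference is applied is not shown because the page does not describe it. The endpoint, the `routingPreference` value, and the field-mask header are part of the public Routes API.

```python
import json
import urllib.request

ROUTES_URL = "https://routes.googleapis.com/directions/v2:computeRoutes"


def build_route_request(
    origin: str, destination: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a traffic-aware computeRoutes request."""
    body = {
        "origin": {"address": origin},
        "destination": {"address": destination},
        "travelMode": "DRIVE",
        "routingPreference": "TRAFFIC_AWARE",
    }
    headers = {
        "Content-Type": "application/json",
        "X-Goog-Api-Key": api_key,
        # Ask only for the fields this page says are returned.
        "X-Goog-FieldMask": "routes.duration,routes.distanceMeters,routes.description",
    }
    return urllib.request.Request(
        ROUTES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Calling the helper twice with the endpoints swapped would cover both directions; sending the request with `urllib.request.urlopen` requires a valid API key.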
Where do dam level readings come from?
- Department of Water and Sanitation (DWS): HTML table scraped from dws.gov.za
- 6 dams tracked: Berg River, Steenbras Lower, Steenbras Upper, Theewaterskloof, Voëlvlei, Wemmershoek
- Metrics per dam: capacity (MCM), this week %, last week %, last year %
- A weighted total is calculated from the individual dam percentages
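The page does not say which weights are used for the total. A natural choice, assumed here, is to weight each dam's percentage by its capacity in MCM, which gives the combined stored volume as a share of combined capacity. The function name and the input shape are hypothetical.

```python
from typing import Dict, Tuple


def weighted_total_percent(dams: Dict[str, Tuple[float, float]]) -> float:
    """Capacity-weighted fill percentage across all dams.

    dams maps a dam name to (capacity_mcm, percent_full).
    """
    total_capacity = sum(capacity for capacity, _ in dams.values())
    if total_capacity <= 0:
        raise ValueError("total capacity must be positive")
    # Convert each percentage to a stored volume, then back to a percentage.
    stored = sum(capacity * percent / 100.0 for capacity, percent in dams.values())
    return 100.0 * stored / total_capacity
```

With made-up figures, a 100 MCM dam at 50% and a 300 MCM dam at 90% hold 320 of 400 MCM, so the weighted total is 80% rather than the unweighted mean of 70%.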
Hand-written Docs
Application Reference
Auto-generated Reference
- Data Products (generated)
- Data Updates (generated)
- Data Models (generated)
- Area
- Article
- BlogPost
- Category
- CategoryLayout
- Comment
- CookieConsent
- CrimeDataPurchase
- Customer
- Dataset
- DatasetAccess
- District
- EbookPurchase
- Favorite
- GuestSubscription
- IdentityEvent
- Invoice
- Itinerary
- Neighbourhood
- NeighbourhoodHighlight
- NeighbourhoodWard
- PropertySnapshot
- Question
- QuestionChoice
- Quiz
- QuizResult
- StayPreference
- StripeCheckoutEvent
- StripeWebhookEvent
- Subscription
- SubscriptionTier
- TravelMode
- TravelPreference
- Ward
- WardSafetySnapshot