methodology · v1 · 2026-05
From four satellite channels to one map color.
Updated each quarter, and whenever caveats change. The pipeline is a single Python script (pipeline/pipeline.py), reproducible by anyone with Earth Engine access.
flowchart LR
S2NIR["Sentinel-2 B8
NIR delta"]
S2SWIR["Sentinel-2 B11
SWIR delta"]
LST["Landsat 8/9 ST_B10
LST anomaly"]
VIIRS["VIIRS DNB
nightlight delta"]
MASK["ESA WorldCover
built-up mask"]
Z["z-score per signal
across 61 cities"]
W["weighted sum
-0.40 * NIR, -0.30 * SWIR
-0.20 * LST, +0.10 * NL"]
MAP["choropleth fill"]
S2NIR --> Z
S2SWIR --> Z
LST --> Z
VIIRS --> Z
MASK -.->|restricts pixels| Z
Z --> W
W --> MAP
What SolarMap.PH measures
A composite of four satellite-derived signals computed at city/municipality scale across Meralco's franchise area, each quarter:
- NIR delta (Sentinel-2 B8): median near-infrared reflectance over the city, current quarter minus 2022 baseline. Crystalline silicon panels darken NIR. Most direct PV signature in the literature. Weight: 0.40.
- SWIR delta (Sentinel-2 B11): same delta on the short-wave infrared band. Confirmatory; weaker signal than NIR. Weight: 0.30.
- LST anomaly (Landsat 8/9 ST_B10): midday land surface temperature current vs baseline. Solar arrays run cooler than bare roof in midday sun. Faint but independent. Weight: 0.20.
- Nightlight delta (VIIRS DNB): monthly average radiance current vs baseline. Proxy for load-shape change. Noisy. Use as triangulation. Weight: 0.10.
How signals become a composite
Each city's raw signal is z-scored across all cities in the franchise. The composite is:

composite = -0.40 × z_nir - 0.30 × z_swir - 0.20 × z_lst + 0.10 × z_nightlight

The negative weights on NIR/SWIR/LST encode that "darker" or "cooler" indicates more solar; the small positive weight on nightlight treats rising radiance as a load-shape proxy. The initial weights are v1 priors based on the PV remote-sensing literature; if/when ground-truth labels are obtained, the weights become tunable.
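In code, the whole composite step is a few lines. A minimal sketch; column names here are illustrative, the real ones live in pipeline/pipeline.py:

```python
import pandas as pd

# Hypothetical column names standing in for the pipeline's real schema.
SIGNALS = ["nir_delta", "swir_delta", "lst_anomaly", "nightlight_delta"]
WEIGHTS = {"nir_delta": -0.40, "swir_delta": -0.30,
           "lst_anomaly": -0.20, "nightlight_delta": 0.10}

def composite(df: pd.DataFrame) -> pd.Series:
    """Z-score each signal across all cities, then take the weighted sum."""
    z = (df[SIGNALS] - df[SIGNALS].mean()) / df[SIGNALS].std(ddof=0)
    return sum(WEIGHTS[s] * z[s] for s in SIGNALS)
```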
Cloud and noise handling
- Sentinel-2: SCL band classes 1, 3, 8, 9, 10 masked out (saturated/defective, cloud shadow, medium- and high-probability cloud, thin cirrus).
- Landsat: QA_PIXEL bits 3 and 4 masked out (cloud and shadow).
- Built-up area mask from ESA WorldCover 2021 class 50 reduces vegetation and water noise.
- Heavy-cloud quarters: pipeline broadens the date window to a 4-month rolling median.
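The two masks look roughly like this in the Earth Engine Python client; a sketch of the documented band semantics, not a verbatim excerpt from pipeline/pipeline.py:

```python
import ee  # assumes ee.Initialize() has already run with the service account

def mask_s2(img):
    # Drop SCL classes 1 (saturated/defective), 3 (cloud shadow),
    # 8/9 (medium/high-probability cloud), 10 (thin cirrus).
    scl = img.select("SCL")
    bad = (scl.eq(1).Or(scl.eq(3)).Or(scl.eq(8))
              .Or(scl.eq(9)).Or(scl.eq(10)))
    return img.updateMask(bad.Not())

def mask_landsat(img):
    # Landsat Collection 2 QA_PIXEL: bit 3 = cloud, bit 4 = cloud shadow.
    qa = img.select("QA_PIXEL")
    cloud = qa.bitwiseAnd(1 << 3).neq(0)
    shadow = qa.bitwiseAnd(1 << 4).neq(0)
    return img.updateMask(cloud.Or(shadow).Not())
```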
Aggregation unit
City/municipality level is the headline scale (~50 polygons). Barangay drilldown is computed only where built-up area exceeds 1 km²; below that, the signal is too noisy for a useful claim.
What SolarMap.PH does not claim
- No individual-rooftop detection. We do not, and will not, identify any specific home or building as solar.
- No address-level data. No PII is ingested or published.
- No causal claim. The composite correlates with PV adoption but could also reflect roof reconstruction, paint changes, or sensor drift. We say "consistent with," never "proves."
- No prosecution-grade evidence. This data is for civic transparency and homeowner research, not enforcement.
Data joiners
- Microsoft GlobalMLBuildingFootprints for per-city building counts as a denominator.
- Meralco aggregate registry counts from public reports, refreshed quarterly. See pipeline/meralco_aggregates.json.
- DOE LGU-level net-metering counts, where published.
- LGU permit cost and delay table: hand-curated, verified rows only. See pipeline/lgu_friction.json. Open a PR to contribute.
- PVGIS (European Commission, free) for the homeowner tool's solar-potential math.
Reproducibility
The pipeline is one Python script (pipeline/pipeline.py) plus a validator. Earth Engine auth via service account. Anyone with EE access and the repo can run it and produce the same outputs. The full algorithm, weights, and data files are MIT/CC-BY licensed.
Ground-truth validation, and why the pin layer was dropped (2026Q2)
We hand-checked all hot-spot pins from the 2026Q2 release against Esri World Imagery (~1-3 yr vintage in PH) and ran a follow-up extraction with multi-channel filtering. The pin layer was retired before launch when both passes returned precision below the threshold for an honest claim.
v1.0: NIR-only local-anomaly extraction
The v1.0 extraction produced 181 hot-spot pins. We inspected each against Esri World Imagery (~1-3 year vintage in PH), viewing a 240 m-wide tile centered on the hot-spot's lat/lon, with four labels:
- solar: a clear panel-grid array was visible on a rooftop at the centered location.
- not_solar: the rooftop had visibly changed (reroofing, asphalt, construction, paint) but no panel grid was visible.
- unclear: ambiguous; could be panels, could be material change.
- missing: the Esri tile request returned an HTTP 500 for that lat/lon (12 of 181 hot-spots, mostly remote rural areas outside hi-res coverage).
| Label | Count | % of 169 reviewable |
|---|---|---|
| solar | 6 | 3.6% |
| unclear | 16 | 9.5% |
| not_solar | 147 | 87.0% |
| missing imagery | 12 | excluded from denominator |
| total | 181 | |
Per-pin labels: docs/groundtruth/labels.json. Strict precision (treat unclear as not_solar): 6/169 = 3.6%.
Lenient precision (treat unclear as solar): 22/169 = 13.0%.
Both numbers are low. Even at the most generous interpretation, fewer than 1 in 5 hot-spots show visible solar panels in the reviewable hi-res imagery.
What that actually means
- The pin layer surfaces statistically anomalous rooftop change, not confirmed solar arrays. Most pixel-level darkening is reroofing, asphalt resurfacing, paint changes, or atmospheric residue the cloud mask missed, not solar panels.
- This is a lower bound. Esri PH imagery is typically 1-3 years stale. Solar installs from 2024-2026 may exist on the ground but not yet in our reference imagery, so they get marked "not_solar" in the validation. Real precision is higher than 3.6% but we cannot prove how much higher without fresher imagery.
- The hot-spot extraction used NIR-only local-anomaly detection, while the map-level diff visualization uses the tighter NIR-and-SWIR fingerprint. We expected a v1.1 hot-spot extraction tightened to multi-channel to raise precision substantially; the next section shows what actually happened.
- The city-aggregate composite score remains usable: individual false positives wash out across hundreds of pixels, and the overall darkening pattern still tracks rooftop turnover (which correlates with solar adoption when zoomed out).
v1.1: multi-channel AND extraction
We tried tightening: require BOTH NIR drop AND SWIR drop (the same fingerprint used in the diff visualization), top-1 per city, larger minimum cluster size. v1.1 produced 61 hot-spots from 60 cities. After re-validation against fresh Esri tiles, strict precision was 1/57 = 1.75%, worse than v1.0. The multi-channel filter selected different clusters but they weren't more solar-y. Asphalt resurfacing, fresh concrete, and roof material changes all darken in BOTH NIR and SWIR. The AND was less specific than we hoped.
Conclusion: pixel-level NIR/SWIR differencing at Sentinel-2's 10m resolution isn't specific enough to claim per-location solar. The signal exists at city-aggregate scale (multi-signal composites average out per-pixel noise), but individual pixels and small clusters are too noisy to publish as identification claims.
What we kept and what we dropped
- Kept: the city-aggregate composite (choropleth on the map). Multi-signal blending at the per-city scale washes out individual false positives. Use this to compare cities and track quarter-over-quarter trends.
- Kept: the before/after imagery modal per city. Same neighborhood in 2022 and current quarter with the multi-channel diff overlay. Lets users judge for themselves.
- Promoted: the 6 hand-verified case studies (Makati, Valenzuela, Meycauayan, San Mateo, Dasmariñas, Carmona). These are the only locations where Esri hi-res imagery showed a visibly distinct rooftop solar panel array. They become the "notable confirmed installations" sidebar on the map page.
- Dropped: the per-pin hot-spot layer. The data is still produced by the pipeline (hot_spots_*.geojson) for researchers who want to inspect candidate locations themselves, but it is no longer rendered on the public map.
How to use SolarMap.PH honestly
- Cite the composite score per city as a directional indicator of rooftop change consistent with solar growth. Don't claim per-pin solar.
- Cite the 6 confirmed case studies as concrete evidence that satellite-detectable solar adoption is happening at warehouse scale in the franchise.
- For address-level claims, hand-verify each candidate against fresh hi-res imagery. Mapbox satellite tiles are sometimes fresher than Esri's in PH and occasionally show panels Esri does not, but we lack a free API for batch validation.
- Future quarters will let us add a persistence filter (only flag pixels that stay dark across 2-3 consecutive quarters), which is the most likely path to defensible per-location precision.
CNN-based per-tile detection (v3)
The Sentinel-2 pipeline above gives a useful city-level signal but is structurally too coarse to claim solar at a specific roof (one S2 pixel = 100 m²; a typical residential 5 kWp install fits inside a single pixel). To make per-roof claims defensible we built a separate detection layer on hi-res Esri imagery and a CNN classifier.
The ML stack, end to end
Three layers, in order: a frozen CLIP-ViT-L/14 encoder turns each 600 × 600 px Esri tile into a 768-dim embedding; a logistic-regression head turns embeddings into a tile-level solar probability; a SAM-based localizer (covered below) turns each high-confidence tile into a per-building panel polygon. Training is offline, ~30 min on an M-series Mac. Inference over all of NCR is ~13 min (cold) or ~3 min (when re-classifying a cached tile set).
flowchart TB
subgraph TRAIN[Training]
direction TB
OSM["OSM Overpass
312 solar locations"]
GT["6 hand-verified
case studies"]
NEG["46 GT not_solar
+ 154 random NCR"]
FETCH["Esri tile fetch
240 m view, 0.4 m/px"]
AUG["4x augmentation
rotation, flip, jitter"]
EMB["CLIP-ViT-L/14 frozen
768-dim embedding"]
LR["Logistic Regression
class-balanced, C=1.0"]
CV["5-fold group-aware CV"]
CLF["clf_v3.joblib"]
OSM --> FETCH
GT --> FETCH
NEG --> FETCH
FETCH --> AUG
AUG --> EMB
EMB --> LR
LR --> CV
CV --> CLF
end
subgraph SCAN[NCR scan]
direction TB
GRID["240 m grid over NCR
16,544 tiles"]
SCANEMB["CLIP embedding"]
SCORE["clf_v3 score per tile"]
HIGH["high tier
score 0.85 and up"]
CAND["candidate tier
score 0.70 to 0.85"]
GEO["rooftop_solar_ncr.geojson
280 high + 235 candidate"]
GRID --> SCANEMB
SCANEMB --> SCORE
SCORE --> HIGH
SCORE --> CAND
HIGH --> GEO
CAND --> GEO
end
CLF -.->|loaded for inference| SCORE
Pipeline
- Positive set bootstrap. We queried OpenStreetMap (Overpass) for nodes and ways tagged generator:source=solar AND location=roof within the Meralco franchise bounding box (14.0-15.2 N, 120.6-121.5 E): 312 verified rooftop-solar locations, deduped to 294 unique buildings at ~10 m clustering. We added the 6 hand-verified case studies on top.
- Tile fetch. For each location, fetch a 600×600 px Esri World Imagery tile centered on the OSM lat/lon, covering ~240 m × 240 m at ~0.4 m/pixel.
- Negative set. Re-used the 46 hand-labeled not_solar tiles from the ground-truth pass, plus 154 random tiles drawn uniformly from the NCR bounding box: 200 negative sources in total.
- Encoder. CLIP-ViT-L/14 (openai/clip-vit-large-patch14), frozen. We tried two pretrained DeepSolar-flavored Hugging Face Mask2Former models first; both saturated the segmentation mask at 100% on every image and were unusable. CLIP zero-shot with a prompt ensemble gave F1 ≈ 0.5. A linear head on CLIP features beats both.
- Augmentation. 4× per source (rotation 0/90/180/270, optional horizontal flip, mild brightness ±15% and color ±10% jitter): 2,500 rows total, 1,500 positive and 1,000 negative.
- Classifier. Logistic regression with class-balanced weights on the 768-dim CLIP image embedding. Trained for 2,000 max iterations, regularization C=1.0.
- Validation. 5-fold group-aware cross-validation: each source's augmentations are held out together (no within-source leakage), with per-source max-score aggregation. A sketch of the loop follows this list.
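A minimal sketch of the train-and-validate loop. The scikit-learn calls are real; the variable names and data loading are illustrative:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GroupKFold, cross_val_predict

def train_clf(X, y, groups):
    """X: (2500, 768) frozen CLIP embeddings; y: 1=solar, 0=not_solar;
    groups: source-image id, so a source's augmentations share a fold."""
    clf = LogisticRegression(class_weight="balanced", C=1.0, max_iter=2000)
    # Group-aware 5-fold CV: no augmented variant of a source leaks into
    # the fold that scores it.
    scores = cross_val_predict(clf, X, y, groups=groups,
                               cv=GroupKFold(n_splits=5),
                               method="predict_proba")[:, 1]
    clf.fit(X, y)        # final model on all rows -> clf_v3.joblib
    return clf, scores   # report the per-source max of `scores`
```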
Results (v2 baseline)
- Best F1 = 0.906 at threshold 0.525 (TP=271, FP=27, FN=29, TN=173). Precision 90.9%, recall 90.3%.
- High-precision setting: threshold 0.85 → precision 98.1%, recall 68.0% (TP=204, FP=4, TN=196, FN=96). This is what we used for the original "high-confidence detection" tier.
- Even-higher precision: threshold 0.96 → precision 100%, recall 34.7%. Catches roughly a third of the rooftop-solar locations with zero false positives in the validation set.
- Honest caveat: at NCR-wide deployment (16,544 tiles, ~0.1-0.5% prevalence), even a 2% LOSO false-positive rate produces hundreds of false alarms against only a few dozen real positives; see the worked example after this list. The candidate tier (score 0.70-0.85) is a flagging surface, not a verified inventory.
- Label-noise discovery: the classifier scored two of the original 6 case studies (case_valenzuela, case_san_mateo) below 0.30, suggesting the panels in those tiles may be greenhouse roofs or painted metal, not actual photovoltaics. OSM-bootstrapped labels turned out cleaner than single-pass vision-model labeling on ambiguous PH industrial roofs.
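The base-rate arithmetic behind the caveat, using hypothetical mid-range numbers (0.2% prevalence, 2% false-positive rate):

```python
tiles = 16_544
real = tiles * 0.002                      # ~33 genuinely-solar tiles
false_alarms = (tiles - real) * 0.02      # ~330 false alarms
precision = real / (real + false_alarms)  # ~0.09 deployed precision
```

A classifier that looks near-perfect in CV still yields roughly ten false alarms per real positive at this prevalence, which is why the candidate tier is a flagging surface and nothing more.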
v3 cleanup pass: label-noise removed
v2 surfaced its own label noise: 4 random-NCR tiles that scored above 0.85 (rneg_0074, rneg_0086, rneg_0116, rneg_0136) were unmistakably real solar when checked by eye. v2 also kept the 2 noisy case studies as positives. We retrained clf_v3 with those 4 promoted to positives and the 2 noisy cases dropped: pure label cleanup, no architectural changes.
- v3 best F1 = 0.918 (vs v2 0.906). Precision 91.2%, recall 92.4% at threshold 0.519.
- v3 at t=0.85: precision 99.1%, recall 70.5% (vs v2 P=98.1% R=68.0%). The threshold gives nearly perfect precision while finding more true solar.
- v3 NCR scan: 114 high + 280 candidate (vs v2 92 + 209). +24% high, +34% candidate. 23 tiles upgraded from candidate to high; 1 lost from high (the dropped noisy case study area).
- v3 OSM cross-match: high tier 24 confirmed + 90 new (79% NEW vs v2 75%). The cleaned classifier finds more genuinely-undocumented installations.
The "false positives" are real positives
The 4 highest-scored "negatives" in the validation set (random NCR tiles) all contain visible rooftop solar arrays when checked by eye. These were never in OSM but the classifier found them. Two examples:
- rneg_0116 (score 0.951): commercial complex with 4+ separate rooftop panel arrays.
- rneg_0136 (score 0.954): institutional building with a clearly visible blue-gray panel grid.
The LOSO precision number quoted above (98% at threshold 0.85) is therefore a lower bound: the "false positives" inflating the negative class are mostly OSM-coverage gaps. That is exactly where the CNN earns its keep: it surfaces solar the OSM community hasn't yet mapped.
v4: a second active-learning round + honest holdout calibration
The v3 CV precision is biased upward: data augmentation creates near-duplicate variants of the same source image, and group-aware folds can leak similarity through the augmentation chain. To get an honest precision number we ran another active-learning round and split out a never-trained 20% holdout for Platt-sigmoid calibration.
- Active learning, round 2. All 114 v3 high-confidence tiles were re-inspected at the source 600 px Esri resolution (not the 480 px verification thumbnails): 111 confirmed real solar, 1 confirmed false positive (uniform blue painted metal roof, no inter-panel structure), 2 still ambiguous. Two kinds of thumbnail-level mislabels surfaced: a sawtooth industrial roof initially called false (the panels sit on its south-facing slopes), and 29 thumbnails initially called ambiguous that turned out to host clearly visible PV at full resolution.
- Calibration. 20% of OSM-positive sources held out, never seen by the trained classifier. The remaining 80% trains the base LR; the held-out set fits a Platt sigmoid P = sigmoid(1.29 × decision_function(x) + 0.16) mapping raw scores to calibrated probabilities (sketched after this list).
- Honest holdout, threshold 0.85: precision 95.9%, recall 79.7%, F1 0.870 across 59 positive + 39 negative held-out sources. The honest precision is ~3 points lower than the LOSO number, consistent with augmentation leakage. To hit P ≥ 0.99 on this honest holdout, the threshold has to climb to 0.97 (recall drops to 66%).
- v4 NCR scan: 130 high + 216 candidate (vs v3 114 + 280). 40 tiles upgraded to high, 2 lost. 82% of the high tier is NEW (not in OSM).
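A sketch of the Platt step, using scikit-learn's LogisticRegression as the sigmoid fitter; the pipeline's actual fitting code may differ:

```python
from scipy.special import expit  # logistic sigmoid
from sklearn.linear_model import LogisticRegression

def fit_platt(margins, y):
    """Fit P = sigmoid(a * margin + b) on the held-out sources.
    margins: base LR decision_function outputs (ndarray); y: 0/1 labels.
    Large C approximates the classic unregularized Platt fit."""
    platt = LogisticRegression(C=1e6)
    platt.fit(margins.reshape(-1, 1), y)
    return platt.coef_[0, 0], platt.intercept_[0]  # v4 landed at ~1.29, ~0.16

def calibrated_prob(base_clf, X, a, b):
    return expit(a * base_clf.decision_function(X) + b)
```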
Encoder ablation: CLIP wins
Before locking the encoder, we ran the same training set through two alternative encoders with the same LR head and the same Platt holdout:
- openai/clip-vit-large-patch14 (current): calibrated holdout F1 = 0.870, P = 0.959.
- facebook/dinov2-large: F1 = 0.830 (-4 pts), P = 0.936 (-2.3 pts).
- satlas Aerial_SwinB_SI (aerial-imagery pretrained): F1 = 0.727 (-14 pts), P = 0.900 (-5.9 pts).
The aerial-pretrained encoder underperforming the general-purpose one was the most surprising result. CLIP's web-scale text-paired image pretraining apparently encodes "is this a solar panel array" better than satlas's narrower aerial-task pretraining. CLIP-ViT-L stays locked.
Cross-match against OpenStreetMap
For every CNN detection we look up the nearest OSM-tagged solar location. Detections within 200 m count as "confirmed by OSM" (the model rediscovers known solar); detections farther away count as "new" (the model proposes solar OSM has not tagged). On the production v4 round-3 NCR scan (16,544 tiles on a 240 m grid, 515 detections; the Bulacan / Cavite / Rizal / Laguna extension is covered under Coverage scan below):
- HIGH-CONFIDENCE TIER (280 detections, score ≥ 0.85): 36 confirmed by OSM (13%) · 244 new (87%). 277 of these 280 land inside a city polygon; 3 fall on LGU borders.
- CANDIDATE TIER (235 detections, score 0.70-0.85): 20 confirmed (9%) · 215 new (91%).
- Per-building polygons after SAM segmentation + residential suppression: 384 buildings, 69.9 MWp aggregate. The published per-building dataset is non-residential only by policy; see /privacy.
This is a useful sanity check for the classifier. If almost every detection re-mapped an OSM point, the model would just be memorizing training data; if none did, the model would be hallucinating. The ~13% confirmed / ~87% new split at the high tier is consistent with a classifier that learned the visual signature of rooftop PV well enough to recognize it on buildings the OSM community has not yet mapped, in a region where OSM rooftop-solar tagging is sparse. "Within 200 m" is the same proximity convention used by DeepSolar (Stanford, 2018) and SPECTRUM (ICSC, 2025).
The "candidate" tier is noisier: many genuine borderline cases plus more visually-ambiguous false positives. The high-confidence tier is the production set.
Coverage scan
We tile NCR (14.20-14.85 N, 120.88-121.22 E) on a 240 m grid, no overlap: 16,544 tiles. A Phase 5 extension to the Bulacan / Cavite / Rizal / Laguna industrial belts (45,752 tiles total) is staged in detection/scan/luzon_scan.py and queued for a future quarter. Each tile is fetched from Esri, embedded with CLIP-ViT-L, and scored by the classifier. Results at score ≥ 0.70 are written to site/public/data/rooftop_solar_ncr.geojson with a tier property (high for ≥ 0.85, candidate otherwise). The "your roof" page surfaces detections within 1.5 km of the user's geocoded address.
The base detection layer outputs tile-level Points (the 240 m square center). v2.1 (next section) localizes panels at building-polygon resolution.
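The grid generation is the only geometry in the scan. A sketch; note the raw bbox grid is larger than 16,544 tiles, so the published count implies the pipeline also clips to the NCR boundary, which we omit here:

```python
import math

LAT_MIN, LAT_MAX = 14.20, 14.85
LON_MIN, LON_MAX = 120.88, 121.22
STEP_M = 240
M_PER_DEG = 111_320  # meters per degree of latitude

def tile_centers():
    """Yield (lat, lon) centers of 240 m tiles over the NCR bbox, no overlap."""
    dlat = STEP_M / M_PER_DEG
    lat = LAT_MIN
    while lat < LAT_MAX:
        # Longitude step widens with latitude to keep tiles ~240 m across.
        dlon = STEP_M / (M_PER_DEG * math.cos(math.radians(lat)))
        lon = LON_MIN
        while lon < LON_MAX:
            yield (lat + dlat / 2, lon + dlon / 2)
            lon += dlon
        lat += dlat
```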
v2.1 Building-level panel localization
Tile-level detection answers "is there solar in this 240 m grid square?" It cannot answer "which roof?", and it cannot estimate installed kWp without a footprint. v2.1 adds panel-polygon resolution by combining three signals.
flowchart TB
TILE["High-confidence tile
from clf_v3 scan"]
SAM["SAM ViT-B
auto-mask generator
~150 masks per tile"]
COLOR["Color signature filter
brightness 25 to 140
blue bias -25 to +60"]
CROP["192 px context crop
centered on each mask"]
CEMB["CLIP-ViT-L embed"]
LR2["clf_v3 score"]
COMB["combined score
0.6 * CLIP + 0.4 * color"]
KEEP["keep if combined >= 0.70"]
OSM2["OSM buildings within 200 m
via Overpass"]
MATCH["Centroid-in-polygon
or bbox overlap"]
MERGE["Per-building merge
sum panel area
cap at footprint"]
KWP["kWp estimate
area_m2 divided by 6"]
OUT["per_building_solar_ncr.geojson
384 buildings, 69.9 MWp"]
TILE --> SAM
SAM --> COLOR
COLOR -->|drops ~70% of masks| CROP
CROP --> CEMB
CEMB --> LR2
LR2 --> COMB
COMB --> KEEP
KEEP --> MATCH
OSM2 --> MATCH
MATCH --> MERGE
MERGE --> KWP
KWP --> OUT
- SAM (Segment Anything, ViT-B checkpoint). For each high-confidence v3 tile we run SAM's automatic mask generator. SAM is class-agnostic: it carves the image into ~150-200 generic regions per tile (every roof slope, road, treeline becomes its own mask).
- Color signature filter. Solar panels in PH aerial imagery are dark with neutral-to-blue tone. We compute per-mask mean RGB and reject masks whose brightness or blue bias is incompatible with panels (light roofs, vegetation, asphalt, brightly-tarpaulined informal roofs all get filtered out before the next step).
- CLIP+LR scoring on a 192 px context window. Each surviving mask is centered in a 192×192 px window (~76 m), embedded with CLIP-ViT-L, and scored by clf_v3. Final segment confidence is 0.6 × CLIP_score + 0.4 × color_score; we keep segments with combined confidence ≥ 0.70.
- OSM building intersection. Each kept segment polygon is matched against OSM building footprints around the tile (Overpass query, 200 m radius). The segment is assigned to the building containing its centroid (or, if outside any building, to the building with the largest bbox overlap).
- Per-building merge. SAM frequently over-segments a single rooftop array into 8-30 pieces; without grouping, the same SM Mall installation would emit 30 features. We group all segments by OSM building id, sum their panel area (capped at the building footprint), use the highest-confidence segment as the displayed polygon, and keep the max segment confidence as the building's score (sketched after this list).
- kWp estimate. Industry rule of thumb: ~6 m² of crystalline-silicon panel per kWp installed, so kwp_estimate = panel_area_m2 / 6. This is approximate: efficiency varies, mounting density varies, and SAM's polygon often clips the array boundary by ±10%.
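A sketch of the merge step; the segment schema here is illustrative, not the pipeline's actual field names:

```python
from collections import defaultdict

def merge_by_building(segments):
    """One output feature per OSM building that has kept segments.
    Each segment: building_id, footprint_m2, area_m2, confidence, polygon."""
    grouped = defaultdict(list)
    for seg in segments:
        grouped[seg["building_id"]].append(seg)
    features = []
    for bid, segs in grouped.items():
        best = max(segs, key=lambda s: s["confidence"])
        area = min(sum(s["area_m2"] for s in segs),  # cap summed panel area
                   segs[0]["footprint_m2"])          # at the building footprint
        features.append({
            "building_osm_id": bid,
            "panel_area_m2": area,
            "kwp_estimate": area / 6,           # ~6 m2 of c-Si panel per kWp
            "confidence": best["confidence"],   # max segment confidence
            "n_segments_merged": len(segs),
            "geometry": best["polygon"],        # display the best segment's polygon
        })
    return features
```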
Output: site/public/data/per_building_solar_ncr.geojson, one Feature per building with detected panels. Properties include building_osm_id, building_type, panel_area_m2, kwp_estimate, confidence, n_segments_merged. The "your roof" tool does an osm_id lookup against this file: when the user's address matches a building that v2.1 found panels on, the result page surfaces a "your roof MAY already have solar" banner with the detection thumbnail and estimated kWp.
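The "your roof" lookup is a plain dictionary index over that file. A sketch, assuming the GeoJSON schema above:

```python
import json

def load_detections(path="site/public/data/per_building_solar_ncr.geojson"):
    with open(path) as f:
        fc = json.load(f)
    # Index feature properties by OSM building id for O(1) address-match lookups.
    return {f["properties"]["building_osm_id"]: f["properties"]
            for f in fc["features"]}

# Usage: props = load_detections().get(user_building_osm_id)
# If props is not None, surface the "your roof MAY already have solar" banner
# with props["kwp_estimate"] and the detection thumbnail.
```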
What this is not: a verified panel-area number. SAM's polygons are approximate (contour-of-a-mask, not a precise rectified panel outline); CLIP+LR was trained at tile scale and applied at segment scale; OSM building footprints in PH are uneven. Treat the kWp number as ±20–30%. The high-value claim is "there is solar on this specific OSM building"; the m² and kWp are useful framing, not audit-grade.
Honest limitations
- 10 m Sentinel-2 resolution caps inference at city/barangay scale, not roof level.
- Composite weights are v1 priors, not validated against ground truth.
- Cloud-heavy quarters introduce noise; broadening the window helps but does not eliminate it.
- Baseline 2022 means installs prior to 2022 are baked into the baseline, not detected.
- Aggregate Meralco counts come from public reports, not audited filings.
- LGU friction table is small (~30 LGUs) and unevenly verified.
References
- Yu, J., Wang, Z., Majumdar, A., Rajagopal, R. DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States. Joule (2018). link
- ICSC (Institute for Climate and Sustainable Cities). icsc.ngo
- Meralco public statements (2026-05): Bilyonaryo, Tribune, CleanTechnica.
- Inquirer Opinion, "Guerrilla solar installers in summer of discontent": link
- Manila Times, "A brewing solar controversy" (2026-05-07): link
- Power Philippines, "Inconsistent LGU Permits Stalling Rooftop Solar Growth": link
- pv-magazine, "Philippines accelerates permits for solar net-metering" (2026-02-04): link
- ESA WorldCover v200: esa-worldcover.org
- PVGIS: European Commission JRC