What this is

I built this demo to turn a simple weather model into a calibrated probabilistic forecaster. I use Koopman theory to encode the 24-hour daily cycle in a latent state, and a Student-t Ensemble Kalman Filter (EnKF) to handle outliers and adapt uncertainty. The demo lets you see calibration, not just point error: forecast bands, PIT histograms, reliability diagrams, and a plain-language scenario view (for example, frost risk). GitHub repo: https://github.com/Magret-Oladunjoye/calibrated-weather-lran-enkf

Data source and scope

  • Dataset: Historical Hourly Weather Data 2012–2017 (Kaggle). Source: https://www.kaggle.com/datasets/selfishgene/historical-hourly-weather-data/
  • Geography: 36 cities — 30 in the US/Canada and 6 in Israel.
  • Cities: Vancouver, Portland, San Francisco, Seattle, Los Angeles, San Diego, Las Vegas, Phoenix, Albuquerque, Denver, San Antonio, Dallas, Houston, Kansas City, Minneapolis, Saint Louis, Chicago, Nashville, Indianapolis, Atlanta, Detroit, Jacksonville, Charlotte, Miami, Pittsburgh, Toronto, Philadelphia, New York, Montreal, Boston, Beersheba, Tel Aviv District, Eilat, Haifa, Nahariyya, Jerusalem.
  • Period: Hourly measurements, roughly 2012–2017.
  • Variables used here: temperature.

What I did to the data (so results are reproducible)

  • I converted all temperatures to °C at load time for consistency across files.
  • I added time features (sin/cos of hour-of-day and day-of-year).
  • I split by year: train < 2016, validation = 2016, test = 2017.
  • I benchmarked against a seasonal baseline and an EDMD (Koopman) baseline, then compared both with my LRAN + Student-t EnKF.
  • I evaluated point accuracy (RMSE) and probabilistic accuracy (CRPS), plus calibration via PIT/reliability diagrams and interval coverage/width.
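The preprocessing and split steps above can be sketched as follows. The column names (`datetime`, `temperature`) and the Kelvin-to-°C conversion are assumptions about the Kaggle file layout, not the repo's actual loaders; adapt them as needed.

```python
import numpy as np
import pandas as pd

def add_time_features(df: pd.DataFrame) -> pd.DataFrame:
    """Convert to °C and add sin/cos encodings of hour-of-day and day-of-year."""
    out = df.copy()
    t = pd.to_datetime(out["datetime"])
    out["temp_c"] = out["temperature"] - 273.15  # assumes the source file is in Kelvin
    out["sin_hour"] = np.sin(2 * np.pi * t.dt.hour / 24)       # 24-hour cycle
    out["cos_hour"] = np.cos(2 * np.pi * t.dt.hour / 24)
    out["sin_doy"] = np.sin(2 * np.pi * t.dt.dayofyear / 365.25)  # annual cycle
    out["cos_doy"] = np.cos(2 * np.pi * t.dt.dayofyear / 365.25)
    out["year"] = t.dt.year
    return out

def split_by_year(df: pd.DataFrame):
    """Year-based split: train < 2016, validation = 2016, test = 2017."""
    return (
        df[df["year"] < 2016],
        df[df["year"] == 2016],
        df[df["year"] == 2017],
    )
```

Splitting by whole years (rather than randomly) keeps the test set strictly in the future of the training data, which matters for time-series evaluation.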

How to read this UI

  • Choose a model to compare approaches.
  • Pick dates to focus on a window. The shaded band is the model’s predicted range.
  • PIT should be roughly flat if probabilities are honest. Reliability points should sit near the diagonal.
  • The Scenario helper turns the band into a simple risk number (for example, “chance of ≤ 0 °C”).
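As a concrete reading of the PIT bullet: each PIT value is the predictive CDF evaluated at the observed outcome, so calibrated forecasts produce uniform PIT values. A minimal sketch, assuming a Gaussian predictive band at each hour (the actual predictive distribution in the repo may differ):

```python
import math
import numpy as np

def pit_values(y_true, mean, std):
    """Predictive CDF evaluated at the truth: ~Uniform(0, 1) when calibrated.

    A U-shaped histogram of these values means the bands are too narrow;
    a hill (hump) shape means they are too wide.
    """
    z = (np.asarray(y_true) - np.asarray(mean)) / np.asarray(std)
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
```

A quick self-check: drawing `y_true` from the same Gaussian as the forecast should give a roughly flat PIT histogram.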

What this is not

  • This is a historical demo — no real-time data or nowcasting.
  • Coverage is limited to the 36 cities and 2012–2017 period in the dataset.

Why this matters

  • It’s a physics-aware backbone (Koopman/LRAN) with a robust Bayesian filter (Student-t EnKF).
  • It stresses calibration, not just error — crucial for decision-making.
  • The code is structured for reproducibility and extension (new cities/variables or other filters).
Choose a model

Switch between saved models if multiple .npz files are present.

Saved probabilistic forecast (mean and uncertainty) loaded from file — units converted to °C

  • Forecast vs truth: the shaded band is the range I think the future will fall in. The line is what actually happened.
  • PIT: if the bars are roughly flat, my probabilities are honest. A U-shape means the bands are too narrow; a hill shape means they are too wide.
  • Reliability: the points should sit near the diagonal if I am well calibrated.
  • Scenario helper: turns the band into a simple risk number for frost or heat.
  • Otto, S. E., & Rowley, C. W. (2019). Linearly Recurrent Autoencoder Networks for Learning Dynamics. SIAM Journal on Applied Dynamical Systems, 18(1), 558–593. DOI: 10.1137/18M1177846
  • Preprint: arXiv:1712.01378

The Scenario helper turns the band into a simple decision signal. For example, a city operator might care about frost (≤ 0 °C) or heat days (≥ 25 °C). It computes the average forecast probability, the observed frequency in the window, and the peak-risk time.
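The scenario computation above can be sketched under a Gaussian-band assumption; the function name, dict layout, and thresholds are mine, not the repo's.

```python
import math
import numpy as np

def scenario_risk(mean, std, y_true, threshold=0.0, below=True):
    """Average forecast probability, observed frequency, and peak-risk index
    for an event like frost (<= 0 °C) or a heat day (>= 25 °C),
    assuming a Gaussian forecast band (mean, std) at each hour."""
    z = (threshold - np.asarray(mean)) / np.asarray(std)
    p_below = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    p = p_below if below else 1.0 - p_below
    y = np.asarray(y_true)
    hits = (y <= threshold) if below else (y >= threshold)
    return {
        "avg_forecast_prob": float(p.mean()),   # mean event probability over the window
        "observed_freq": float(hits.mean()),    # how often the event actually happened
        "peak_risk_index": int(p.argmax()),     # hour with the highest event probability
    }
```

Comparing `avg_forecast_prob` with `observed_freq` over a window is itself a coarse calibration check: for an honest forecaster the two should be close on average.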

Event type