Where Does The Weather (Data) Come From? Visualizations of Worldwide Weather Stations

This post was written by Hohyon Ryu, who interned with us this past summer as a Catalog Engineer.  He’s currently pursuing a PhD at the University of Texas’ School of Information.

The idea for this project started from one of the simplest and most essential questions in computer science. How close is the nearest X from where I stand?  To explore how to answer this question, I used our NCDC Weather Station API and attempted to answer, “What is the closest weather station from where I stand”?

stations map left 1024x517 Where Does The Weather (Data) Come From?  Visualizations of Worldwide Weather Stations

A brute force algorithm that calculates distances from all the weather stations will have to go through 2.5 million weather stations. It works but it just takes long long time.

mapgrid Where Does The Weather (Data) Come From?  Visualizations of Worldwide Weather Stations

One better solution is dividing the earth by grids. We may divide the globe into small tiles and find the closest station in the grid that I’m standing in. This solution is very fast, but there’s another problem. In the map below, let’s say I’m standing in New Orleans. There are 2 stations: one in Baton Rouge and one in Slidell. The closest station would be the one in Slidell, it is in a different tile. So this algorithm would find the one in Baton Rouge as the closest point.

voronoi 1024x560 Where Does The Weather (Data) Come From?  Visualizations of Worldwide Weather Stations

So, came up with this solution, a Voronoi diagram for all the stations in the world! It looks like a very complex calculation should be involved to generate a map like the following, but it takes only a few minutes to build the world scale map with 2.5 million points. Each station has a polygon that indicates the range it covers.

The best solution for us was grid + Voronoi lattice. Now let’s go back to the New Orleans problem. We’re in New Orleans and it is in the grid that intersects with the Voronoi polygons of Slidell and Baron Rouge. So now, we know that we have 2 candidate stations and the one in Slidell is the closest one.

Want to try out making your own visualization of our weather station data?  You can find the NCDC Weather Stations API here and the Voronoi Lattice library written in Python is available at Github.

Comments are closed.