The Summarizer: The Infochimps Cure for Geolocation Overload

Last week, we revealed our brand-new Infochimps Geo APIs. Not only are the APIs chock-full of millions of points of interest and contextual data, but the schema is also dead simple to learn and implement. And since all the new Geo APIs are unified under the same schema, no matter what location data you are looking to access, the API should always work exactly as you’d expect it to.

In the process of developing our new Geo APIs, we developed one very important and useful feature: The Summarizer. It makes presenting venues user-friendly by intelligently clustering locations, and you can take advantage of it automatically. Let’s dive in a little deeper so you can see exactly what we’re talking about.

[Image: Texas churches, all]

The image above is a plot of all the churches in Texas, taken from Geonames Places. Cyan is Dallas, blue is Austin, green is San Antonio, and purple is Houston. As you can easily see, there are way more points being plotted than it makes sense to present to the user. The common remedy is to present a smaller sample of data. In practice though, that sample ends up being more random than not.

[Image: Texas churches, search 1]

For example, the image above shows a search of churches in Texas. Since search doesn’t fetch every single church, the map isn’t nearly as overwhelming. Zoom out, however, and the issue becomes obvious.

[Image: Texas churches, search 2]

Suddenly Abilene, indicated by a red box, doesn’t have any churches! Just a second ago there were several. Presenting a faithful representation of the data is important, which is why The Summarizer can make a big difference. Instead of using the “search” action, use the “list” action. To perform the “list” action, you must specify a “layer.” For more information on actions and layers, check out the Getting Started with Geo Documentation.

For example, the “search” API query producing the map above is:
&rend.shape=geo_json&_apikey=[INSERT API KEY HERE]

The “list” API query, which uses The Summarizer, producing the map below is:
&rend.shape=geo_json&_apikey=[INSERT API KEY HERE]

[Image: Texas churches, list]

As seen in the image above, instead of an overwhelming number of points, we are presented with intelligent groupings. Abilene has four cluster points representing the churches there.

How does it know how to group points? We take into account the distance and density of the locations to show only the optimal number of clusters. The number will vary as you zoom in and out, because the distribution changes from zoom level to zoom level. Eventually, once you zoom in close enough that it’s better just to show the individual points, we stop clustering altogether.
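
To make the idea of zoom-dependent grouping concrete, here is a rough sketch in Python. It is not The Summarizer's actual algorithm, which isn't spelled out here beyond "distance and density." It simply buckets points into grid cells that shrink as the zoom level grows and leaves sparse cells unclustered. The function name `summarize`, the cell-size formula, the `min_cluster` threshold, and the output keys are all assumptions made up for illustration.

```python
from collections import defaultdict

def summarize(points, zoom, min_cluster=3):
    """Illustrative only: group (lon, lat) points into zoom-sized grid cells."""
    # Hypothetical sizing rule: the cell width in degrees halves with each zoom step.
    cell = 360.0 / (2 ** zoom)

    buckets = defaultdict(list)
    for lon, lat in points:
        key = (int(lon // cell), int(lat // cell))
        buckets[key].append((lon, lat))

    results = []
    for members in buckets.values():
        if len(members) >= min_cluster:
            # Dense cell: emit one cluster at the average position of its members.
            lon = sum(p[0] for p in members) / len(members)
            lat = sum(p[1] for p in members) / len(members)
            results.append({"kind": "cluster", "total": len(members),
                            "coordinates": [lon, lat]})
        else:
            # Sparse cell: show the individual points as they are.
            for lon, lat in members:
                results.append({"kind": "point", "total": 1,
                                "coordinates": [lon, lat]})
    return results
```

With a handful of points close together, a low zoom value collapses them into a single cluster, while a high enough zoom value returns each point on its own, which mirrors the behavior described above.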

You might wonder, though: how many points are being represented by each of the clusters? Or how do I even know it’s a cluster? Maybe you’d like to present clusters differently to your users and show the number of points in that cluster as part of the icon. Have no fear: the response includes both the “_type” of point and the “_total” number of points being summarized, as shown in the code snippet below.

"type": "Feature",
"id": "000accc57b71d465bdd698672274e17c",
"geometry": {
  "type": "Point",
  "coordinates": [
"properties": {
  "_type": "cluster_point",
  "_total": 151
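
As a sketch of how a client might use these two fields, the Python below picks a label for each feature depending on whether it is a cluster. It assumes the response is a GeoJSON FeatureCollection with a top-level "features" list, which seems likely given `rend.shape=geo_json` but isn't confirmed here. The sample data, the ids, the coordinates, the non-cluster "_type" value, and the `label_for` helper are all made up for illustration; only "_type", "_total", and the "cluster_point" value come from the snippet above.

```python
import json

def label_for(feature):
    """Return a display label for one GeoJSON feature (hypothetical helper)."""
    props = feature.get("properties", {})
    if props.get("_type") == "cluster_point":
        # A cluster: show how many points it summarizes.
        return "cluster of {}".format(props.get("_total", 0))
    return "single point"

# Made-up sample response; a real response would come from the "list" query.
sample = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "id": "example-cluster",
     "geometry": {"type": "Point", "coordinates": [-99.73, 32.45]},
     "properties": {"_type": "cluster_point", "_total": 151}},
    {"type": "Feature", "id": "example-point",
     "geometry": {"type": "Point", "coordinates": [-99.70, 32.40]},
     "properties": {"_type": "example_place"}}
  ]
}
""")

labels = [label_for(f) for f in sample["features"]]
print(labels)
```

The same check could drive icon choice on a map, for example drawing a numbered badge for clusters and a plain pin for everything else.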

The Summarizer, while powerful, is a work in progress. We’d love for you to test out the feature and let us know what you think!
