Data Visualization Prevents Curation Bias in Social Media

Many social media analysts predict that curation will help solve the problem of social media overload. Curation has been touted as “the chosen” social media buzzword du jour and as a new form of search that will prove more useful than Google’s spammed result pages. Rather than paying attention to anyone and everyone, we will defer to the nine percent of people who actively search for content, and listen to them on networks like Twitter or Quora.

How does this change our perception of reality? After all, we will be listening to a very select set of sources and filtering out the inconvenient users of social media who just so happen to disagree with us. We then listen to these same sources over and over. What happens when we encounter someone who either contradicts our life paradigm or is simply too unfamiliar with our priorities to even make conversation?

Visualizing social media data allows us to make sense of massive amounts of raw data in a very clear way. Rather than relying on someone to sift through the noise to find the useful nuggets of information, data visualization gives us a holistic view so that we can make sense of a lot of information within seconds. It also prevents us from shielding our eyes from the inconvenient truths provided by those who just so happen to be outside our social streams.

Rio Akasaka, a first-year Master’s student in Human Computer Interaction at Stanford and an Infochimps user, created a good example of how data visualization can help us make sense of what occurs via social media. Rio first downloaded an Infochimps data set of tweets pertaining to the Haiti earthquake that occurred a year ago. Using the Google Maps API, he plotted these tweets on a map to show when they occurred and where they came from.

You can actually see this data visualization in action here and learn more about how Rio created it here.

How would it alter someone’s perception to see only curated stories about the Haiti earthquake or the aftermath of the Gabrielle Giffords shooting, versus the bird’s-eye view that Rio’s visualization provides?


  1. Oliver Starr January 29, 2011 at 10:54 pm


    A truly thought-provoking post. As Chief Evangelist of Pearltrees, the first social curation community, I find your ideas on curators’ biases very interesting. Right now we are very actively engaged in getting to know our community and learning a lot more about people who curate content and how tools like ours can help democratize curation the way that Twitter has democratized publishing.

    The issue of bias as an inevitable aspect of curation may best be addressed by a truly socialist social system. At present such a system is working for Pearltrees, as our recently released “Team” feature has allowed teams and small groups to collaboratively curate any topic. The lack of controls on editing rights, and even on the addition of new team members, can avoid a lot of potential bias. It is highly likely, as in many of the teams of which I am a member, that the people who have “teamed up” have no prior knowledge of one another and, aside from a common interest that they wish to curate, may share no other information or interests with anyone else on the team.

    It is fascinating to look at what’s been curated by some of the larger teams as a result of these mechanics in the application.

    Here are a couple of examples for you to check out: -Abstract, Supernatural – New Media Business Models – Music

    If you have any questions about Pearltrees you can find me @owstarr on Twitter.

  2. michelle January 19, 2011 at 11:47 am

    @Alex, it is true. There is no way to escape one’s own bias. Someone implementing data visualizations does get to pick what fields to show and how prominently certain data should be displayed.

    That being said, so many data visualizations actually pull from APIs. The creator is forced to show ALL of the data versus select nuggets. I find that viewing data visualizations often allows me to better escape my little tech bubble perfect world to see all of the “neighborhoods” and relationships that exist on the web.

    Thanks to you and @Chris for leaving intelligent comments on our blog! :D

  3. Alex Jones (@BaldMan) January 19, 2011 at 11:12 am

    Great post Michelle, it sparks some great questions.

    The recent swing in popularity of the term ‘curation’ masks the fact that the majority of humans have relied on few sources of knowledge and information throughout our history. Whether we look at the current media landscape, the newspapers before that, or place our lens on the town criers of Medieval Europe or Rome, we see that most people do not have the time, resources and/or inclination to seek out multiple sources of news and information.

    Data visualization isn’t a curator’s tool – it’s a form of content creation and it has the same weaknesses as any other content medium. There isn’t a way to eliminate the natural biases of the curators, nor the instinct of most people to seek like-minded sources for their content and news (MSNBC, Fox News, HuffPo, etc.).

    As with any other form of content, interesting visualizations will have a higher rate of selection by curators.

    Sadly, those with overt biases will select and twist data to suit their views. Conversely, even the most trusted of sources will be doubted by those who don’t like the results, even with an amazing and accurate visualization.

    The power of the visualization lies in recognizing trends at a much higher level or over a greater period of time than has ever been possible. That knowledge will guide the creators and those who seek a deeper understanding.

  4. Chris Almond January 19, 2011 at 11:02 am

    Very interesting post.

    It all depends on how it’s curated. In the case of a large-scale disaster like the Haiti earthquake, it would be great if the curation could rank tweets on some kind of response importance scale. But then you’d have to have some way of facilitating near-real-time curation of the stream. Now there’s a big juicy problem to solve (hello I would think any type of breaking news event or something requiring an ASAP large-scale coordinated response could benefit from that. With a huge crowd tweeting you would need a small crowd curating. A Twitter SOS first responder corps?

    Another angle – a Super Bowl commercial incites the audience to tweet a reaction (w/correct hashtag of course). The gross stream (uncurated) in a visualization like this would show the response heat map by geo loc. The curated stream could surface various filtered patterns for marketologists to crunch, etc.