Patients Like Me: Get Your Data Out of the Silo

I stumbled upon this TED Talk yesterday and found it too compelling not to share. The speaker, Jamie Heywood, describes his brother’s fight with ALS and the ingenious website they built together, where people share and track data on their illnesses. Their discovery? The enormous power of collective data to explain and predict disease progression.

An individual patient can know their own symptoms and which drugs and treatments they are using. But understanding what others with the same ailment are going through, and how their experiences can help you on your own path, is far more powerful knowledge.

Eternal Flame

Thanks, xkcd.

The Importance of Office Joy

“You can please some of the people some of the time, all of the people some of the time, some of the people all of the time, but you can never please all of the people all of the time.” – Anonymous (although often attributed to Abraham Lincoln)

As Minister of Office Joy, the OM (Office Manager) of Infochimps, I am well aware of the truth of this statement. However, my career goal is to prove it wrong (at least occasionally), and the chimps have given me a chance to attain this seemingly impossible dream.

My job is pretty simple: remove obstacles for my team, provide them with the tools they need to shine, and increase team joy. The thought is that happy teammates are productive teammates and productive teammates are happy teammates. A few months ago, we started the Office Joy fund. Responsibility for this fund was put in my hands (insert evil laughter) and I thought, “There has to be a way to get everyone involved and please the majority of the team most of the time (and maybe even all of them at once every now and then).”


Chimps in Chicago | Infochimps Developer Contest

What are you doing on Friday, July 22nd at 1:20pm CT? You could be knocking back brewskis with our CEO, Nick Ducoff, at historic Wrigley Field. He’s got prime tickets to the Cubs vs. Astros game and wants to take one lucky & talented developer out to the ball game. He’ll treat you to all the beer and peanuts you can handle – oh yeah!

All you’ve got to do is hack together an application or data visualization using at least one of our data sets or APIs and submit it to us by Wednesday, July 20th at 5pm CT. We’ll pick our favorite and let you know if you’re our winner by Thursday, July 21st!

Don’t live in Chicago, but still want to play along? We’ll send every person who enters a handful of starter decks of Startup: The Hackering, Infochimps’ infamous SXSW card game, plus some sweet stickers.

Here are some of our favorite data sets and APIs that we’d recommend starting with:

For inspiration, check out what others have built using Infochimps data: App Gallery.

Entries must be submitted by Wednesday, July 20th at 5pm CT.  Click here for our entry form.

Data In Sight: Visualizations Built In Two Days

This past weekend, Jacob Perkins and I attended data in sight: making the transparent visual, a data visualization competition organized by Creative Commons, Swissnex San Francisco and the Kingdom of the Netherlands.

Held in the Adobe SF office and structured as a competitive hackathon, the aim for teams was to create a complete data visualization from scratch in two days.  Participants came from all over the world and included folks from established large companies, small start-ups, academia, non-profits, and several lone freelancers.

On Friday evening, the contestants were briefed on the challenge. Our very own Jacob delivered a stellar presentation of a carefully curated collection of useful datasets, including specific suggestions for how the data might best be used. This layer of practical explanation really helped folks quickly understand and get excited about the beautiful possibilities of Infochimps datasets.

After the presentations, participants formed into 19 teams of 3-5 developers, designers and data experts.  The groups worked continuously until Sunday at lunchtime and in the end, 14 of the teams delivered a final presentation, and 8 of the 14 used Infochimps data.  (You can peruse those 8 visualizations here: Pathlist, Marvel Universe Social Graph, UFO Siter, Uber Shady, Parkalator, CuriouSnakes, Disaster Strikes: A World In Sight and Silenced.)

A group of 11 judges (including myself) evaluated the teams’ efforts and while most of the teams created some impressive results, we quickly agreed upon the ones we thought were the best.  There were five prize categories, and 4 out of the 5 winners used Infochimps data!

Parkalator is a multi-model parking cost optimization tool for San Francisco residents. It helps drivers decide where to park to save money or whether it’d be cheaper to take a cab.
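The park-or-cab decision described above boils down to a small cost comparison. Here is a minimal Python sketch of that idea. It is not Parkalator’s actual code, and the option names, hourly rates, and cab fare are all made-up figures for illustration.

```python
from dataclasses import dataclass


@dataclass
class ParkingOption:
    name: str
    hourly_rate: float  # dollars per hour (hypothetical figure)


def parking_cost(option: ParkingOption, hours: float) -> float:
    """Total cost of parking at one spot for the given number of hours."""
    return option.hourly_rate * hours


def cheapest_choice(options, hours, cab_round_trip_fare):
    """Return (label, cost) for the cheapest way to make the trip.

    Each parking option is priced for the whole stay, then compared
    against a flat round-trip cab fare (an assumed input).
    """
    candidates = [(o.name, parking_cost(o, hours)) for o in options]
    candidates.append(("cab", cab_round_trip_fare))
    return min(candidates, key=lambda c: c[1])


# Hypothetical parking options, not real San Francisco prices.
options = [ParkingOption("Garage A", 3.50), ParkingOption("Meter B", 2.00)]

print(cheapest_choice(options, hours=2, cab_round_trip_fare=18.00))
print(cheapest_choice(options, hours=12, cab_round_trip_fare=18.00))
```

With these invented numbers, a short stay favors the cheapest parking spot, while a long stay tips the balance toward the cab. A real tool would presumably pull rates from actual parking and fare data rather than hard-coded values.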

Sloppy Joes, Slop-Sloppy Joes, anyone?

There is no such thing as a free lunch… except at Infochimps.

Free lunch is policy at Infochimps because it helps people maximize productivity: no one has to decide where and what to eat, go out, get it, and come back. Eating at the same time and place also helps us bond as teammates. We’re constantly running the cost/benefit analysis on this practice, but at least for now the benefits seem to far outweigh the costs.

Infochimps is about increasing both user and programmer joy in whatever ways we can. We’re always streamlining our processes, tweaking, and fixing things along the way. It’s amazing how sometimes a small but nagging pain point can be fixed with some deft coding.

As we’ve grown larger, making lunch easy to get for everyone and making sure that everyone could have their tastes accommodated was becoming a problem. I created a small database with all the restaurants we order from on a regular basis so that I could find their contact info and menus more quickly, and that helped for a little while, but even with that tool, we had a “lunch coup” one day. (It was peacefully resolved with some Asian take-out, and no one was harmed in the process.)

Enter: The Lunchlady

[Image: the lunch lady from The Simpsons]

No, not that one!

Overview of Open Government Budget Crisis

It’s hard to say what will become of and Researcher and Scholar Vivek Wadhwa claims the sites have plenty of support from government officials, but do they have enough support from lawmakers to stay afloat? Reports claim that budget for and will plummet from $35 million to $2 million.

If there’s one thing we like to do at Infochimps, it’s collecting interesting nuggets of information for you to use. So here are some useful posts on the matter. Please share them with your friends so we can ensure support for open government:


A Data Driven Race to Solve America’s Health Care Woes

Over $30 billion was spent on unnecessary hospital admissions in 2006. Each of these unnecessary admissions took away one hospital bed from someone else who needed it more. Rather than waiting for politicians to settle their arguments about how to implement health care reform, health care provider Heritage Provider Network teamed up with data modeling and prediction competition network Kaggle to offer a very interesting solution.

Heritage Provider Network launched the Heritage Health Prize with one goal in mind: to develop a breakthrough algorithm that uses available patient data, including health records and claims data, to predict and prevent unnecessary hospitalizations. They’ve invited data scientists to help crack the problem, and the winner will receive $3 million.

$3 million sounds like a lot, but the winning algorithm could save Heritage Provider Network a considerable amount in superfluous claims and make our healthcare system much more efficient. How effective do you think algorithms can be at distinguishing life-saving visits from unnecessary ones? What data and precautions could be crucial for this contest to be a success?

To register your interest in the Heritage Health Prize that begins on April 4, please visit the official website. Be sure to check out other current and upcoming competitions at

Help Release Over 40,000 Songs with Lyrics at

My friend Tahir Hemphill has built the Hip Hop Word Count, a searchable database of over 40,000 songs with lyrics and metadata – including dates and geolocation of the artists.  Check out Tahir talking about the project:

He was picked up in ReadWriteWeb recently, and he’s raised over $6,000 through his Kickstarter campaign, from the likes of Clay Shirky no less, to launch the service publicly. He’s also started to share his data on Infochimps: you can now download a pack of Jay-Z lyrics. You can find similar data on Infochimps by searching the music tag.

Show your support for another developer/artist who’s doing something cool with data, and contribute to his fundraising campaign. Tahir will be using the proceeds to release the data, and his tool, to the public.

Stay tuned next week for a release of data from the Million Song Dataset project, a massive dataset that catalogs the features of a million songs. Music data like this, and like the data from the HHWC project, helps create web services like Pandora and neat graphics about whether “crunk” was first used in the South, and it makes the dreams of us data hobbyists come true.

Sharing the Love

Data visualizations are like houses and neighborhoods, monuments even, built on the foundation that Infochimps is laying with our big data gathering and processing. We love it when people do really cool things with the information that we have on our site and just wanted to share a recent example with you. One of our users, Kennedy Elliott (@kennelliott), found subway trend data on our site and used it to make a really cool holiday greeting card that she sent to us. :)