25 Years of the World Wide Web

Anyone reading this blog post right now knows the significance of the World Wide Web. It’s an invention that has revolutionized our world and given rise to seemingly boundless creativity, innovation, collaboration and knowledge — but it hasn’t yet reached its full potential. Father of the Web, Tim Berners-Lee, named some of the key challenges we still face:

  • How do we connect the nearly two-thirds of the planet who can’t yet access the Web?
  • Who has the right to collect and use our personal data, for what purpose and under what rules?
  • How do we create a high-performance open architecture that will run on any device, rather than fall back into proprietary alternatives?

Berners-Lee and the Web Foundation are launching the net-neutrality Web We Want campaign to promote changes in public policy to make sure the web stays open, free and accessible. In his guest blog post for Google, Berners-Lee writes, “On the 25th birthday of the web, I ask you to join in—to help us imagine and build the future standards for the web, and to press for every country to develop a digital bill of rights to advance a free and open web for everyone. Learn more at and speak up for the sort of web we really want with #web25.”

The Quantified Cow: The Internet of Things for the Dairy Industry

The future is here. We’ve all heard about the Internet of Things, another buzzword circulating in the tech community recently. Although technically in existence for more than two decades, the Internet of Things movement has gained greater momentum in the last few years—most notably stepping into a bigger spotlight with Google’s $3.2 billion purchase of Nest Labs, a home device company responsible for the best-selling Nest thermostat. By keeping track of manually inputted temperature settings and surrounding environmental data like room humidity and lighting, Nest eventually collects enough data to learn the daily behavior and preferences of the residents in the home.

These ideas tie into the concept of the Quantified Self, the movement to incorporate technology into data acquisition on aspects of a person’s daily life. Things like daily food consumption, quality of surrounding air, blood oxygen levels, physical and mental performance, and even mood and arousal can be tracked, measured and analyzed—all in the name of improving daily functions and making better decisions (or maybe just nodding thoughtfully at the data instead).

Milking the Benefits in the Dairy Industry

So how do the Quantified Self and the Internet of Things fit with cows, pastures, farmers and milk? Three words: robotic milking machines. Dutch company Lely, self-proclaimed innovators in agriculture, created the Astronaut A4, a state-of-the-art “fully automated milk harvester.” Although the robotic milking machine will set you back about $200,000, the Lely Astronaut A4 collects a large range of cow data to help dairy farmers make better decisions regarding milk production and herd management.

The A4 keeps track of each individual cow’s feeding and health history, preventing cows from sneaking back into the machine for more food if they return too close to their last visit. The system tracks different variables on each cow as it’s being milked: its weight, milk production, time required for milking, amount of feed eaten, and how long the cow chews on its cud. If there’s a health issue with one of the herd, farmers can isolate the problem right away. The machine collects data on the milk itself too, checking the color, fat and protein content, temperature, somatic cell count and overall quality.

Equipped with access to more data, dairy farmers are able to gain greater insight into their industry and thus maximize outputs. All of this data has translated to better decision-making for the farmers, better quality control of milk production and generally happier cows—and who doesn’t want happy cows? Having a machine do the work allows farmers to focus their energy elsewhere too, freeing up time for really anything else. The trend is clear: as the technology continues to get better, I have a feeling we’ll be seeing a lot more quantifiable and actionable data. The Quantified Self is becoming the movement to incorporate technology into data acquisition on aspects of not only a person’s daily life, but a cow’s daily life as well.

Take Our CIO & Big Data Survey (and win $100 Amazon gift cards)

We learned a lot from our 2013 CIOs & Big Data report — for instance, that 96% of enterprises have Big Data in their top 10 priorities list, but 55% of Big Data projects aren’t completed. This year we’re interested in seeing if the stats have changed, and we want to hear from you.

Take this quick survey and tell us what you really want your execs to know when doing Big Data projects. It only takes 10 minutes, and you can win awesome prizes like $100 Amazon gift cards and a membership to, one of the largest community-driven sites focused on enterprise technology, IT education and professional growth.

Although the gift cards cap out at $100, we think letting your execs know how you truly feel is pretty priceless. Take the survey here.

Movies + Charts = Nerdy Creativity

I love movies. I love charts. I have to say that FlowingData did it again – this is brilliant:


In celebration of their 100-year anniversary, the American Film Institute selected the 100 most memorable quotes from American cinema. FlowingData took those quotes and created the 100 most memorable quotes in chart form.

See the chart in bigger detail here.

As always, thank you FlowingData for providing interesting posts for us data nerds.

Part 2: The Lucky Break Scoreboard

Last week, Infochimps CTO Flip Kromer introduced his truth on the failures that led to the successful acquisition by CSC in his blog post, Part 1: The Truth – We Failed, We Made Mistakes.  Flip continues his blog series with Part 2, his love letter – the real Infochimps story.


7 years ago, having switched majors from Computer Science in college to Physics in grad school, and failing twice to successfully execute a plan of research in Physics, I decided to switch to Education – my favorite part of grad school was teaching. A year before, my ever-patient advisor, physics professor Mike Marder, had started a wildly successful alternative program for a public-school teaching certification. It replaced a full general education curriculum with frequent in-classroom experience and focused education classes  — and it let me reuse the scientific coursework I already had way too much of.

A year later, I was near the end of the program and preparing my teaching portfolio, which led me to spend a lot of time thinking about what I wanted my students to learn, and why. For many of them, my course would be their last formal chance to acquire the skill of quantitatively understanding their universe. As I started to write (less bluntly), I had no interest in burdening them with three different forms of the quadratic equation, or pretending that as a practicing physicist I’d ever used the formula for the perimeter of a trapezoid.

What they should be learning was the ability to make use of a complex information stream, understand sophisticated information displays, and extract straightforward insight using tools such as … … ‽‽

I paused, struck, mid-sentence. Those tools do not exist. Not for a high school student, not for a domain expert in another field, and only after years of study, for me. That’s what I was supposed to be working on: democratizing the ability to see, explore and organize rich information streams.

So as a lapsed computer scientist and failed physicist, I decided to abandon education as well and start yet a different new thing, one that was none of those and all of those together.

Challenge Accepted

I asked Mike Marder if I could come back to his research group and work on tools to visualize data; we could figure out along the way how to tie it into a research plan. I had some savings (thanks largely to my Grandmother, who was just your typical successful 1940’s woman entrepreneur), so I wouldn’t cost him any money. Mike reasoned that although I didn’t know how to solve my own problems, I was frequently useful in helping others solve theirs — and who knows, I seemed really fired up about this new idea whatever it was. So all in all it was an easy decision to hide me away in a shared office and let me get to work.

Building the visualization tool required demonstration data sets to prove the concept, and there are few better than the ocean of numbers around Major League Baseball.

In addition to the retrosheet project — the history of every major-league baseball game back to the 1890s — was publishing one of the most remarkable data sets I knew of. For the past seven years, it gives every single game, every single at-bat, every single play, down to the actual trajectory of every single pitch. I first started playing with the retrosheet data, and found some scattered errors — things like a game-time wind speed of 60mph.

(Lucky break scoreboard: most patient graduate advisor ever; financial safety and family support.)

Weekend Project Gone Awry

Well, the NOAA has weather data. Lots of weather data. The hour-by-hour global weather going back 50 years and more, hundreds of atmospheric measurements for every country in the world, free for the asking. And the Keyhole (now Google Earth) community published map files giving the geolocation of every current and historical baseball stadium.

So if you’re following, we have:

  • A full characterization of every game event
  • … including the time of the game and the stadium it was played in,
  • … and so using the stadium map files, the event’s latitude and longitude
  • … and using that lat/long, all the nearby weather stations
  • … and using the game date and time, the atmospheric conditions governing that event
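
The chain above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the stadium and station identifiers, the record layouts and the sample values are all hypothetical, and the real MLB, map and NOAA files each come in their own formats.

```python
import math
from datetime import datetime

# Hypothetical sample records; the real data sets use their own layouts.
STADIUMS = {"STAD1": (30.0, -97.0)}  # stadium id -> (lat, lon)
STATIONS = {"WX_A": (30.1, -97.1), "WX_B": (35.0, -90.0)}  # station id -> (lat, lon)
OBSERVATIONS = {  # station id -> list of (timestamp, wind speed in mph)
    "WX_A": [(datetime(2007, 6, 1, 19, 0), 8.0),
             (datetime(2007, 6, 1, 20, 0), 11.0)],
    "WX_B": [(datetime(2007, 6, 1, 19, 0), 3.0)],
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def conditions_for_event(stadium_id, event_time):
    """Return (station id, wind mph) from the nearest station's closest reading."""
    where = STADIUMS[stadium_id]  # game event -> stadium -> lat/long
    station = min(STATIONS, key=lambda s: haversine_km(where, STATIONS[s]))
    _, wind = min(OBSERVATIONS[station], key=lambda obs: abs(obs[0] - event_time))
    return station, wind


result = conditions_for_event("STAD1", datetime(2007, 6, 1, 19, 50))
```

Each lookup is trivial on its own; as the rest of the story makes clear, the effort goes into getting the real files into a shape where lookups like these are possible at all.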

I connected the data sets looking to correct and fill in the weather data, and found out I accidentally wired up a wind tunnel. There’s no laboratory with the budget to have every major league pitcher throw thousands of pitches for later research purposes — none, except the data set I described.

What’s screwy (and here’s where every practicing data scientist groans and shakes their head) is that the hard part wasn’t performing the analysis. The hard parts were a) making that data useful, and b) connecting the data sets, making them use the same concepts and measurement scales.

But all that work — the mundane, generic work anybody would have to do — just sat there on my hard disk. If I created a useful program, or improved an existing public project, I knew right where to go: open-source collaboration hubs like SourceForge or GitHub. But no such thing existed for data. I had to spend weeks transforming the MLB game data into a form that you could load into a database. If we could avoid that repetition of labor, we would solve the problem of every practicing data scientist.

On Christmas Day 2007, I bought a book on how to build websites using the “Ruby on Rails” framework, and figured I’d knock something useful out in, y’know, a week or so. By sometime that Spring, I had something useful: a few interesting data sets and a website to generically host and describe any further data sets. The initial version of the site was read-only, because I didn’t know how to do join models or form inputs in Ruby on Rails, but I could add new data sets directly to the database. And just like that, Infochimps was born.

I cold-emailed blogger Andy Baio, who linked to “Infochimps, an insane collection of open datasets”. For a guy working alone in an ivory tower, the resulting response was overwhelming.

One of the individuals who emailed to encourage us was Jeff Hammerbacher, founder of the data team at Facebook. Chatting on the phone with him, he told me about a new data analysis tool that Facebook was using, called Hadoop. I looked into it, but couldn’t see how I would ever need to use it. Still, it was really exciting that big names in data were taking interest.

On a trip to San Francisco a few weeks later, I went to a meetup at Freebase. @skud, their community manager, recognized that Infochimps was the perfect raw-data complement to Freebase. She asked me to come back the next month and give a meetup talk. Kurt Bollacker, head of their data team (and future teammate and profoundly valuable mentor), asked me to come back the next day and give an internal lunch lecture. I stayed up all night using google docs on my uncle’s powerpoint-less computer, and gave some hot mess of a presentation to their internal group. Kirrily didn’t uninvite me, so it wasn’t too bad.

It was clear that the lack of a collaboration hub was a problem many people were feeling.

So as a lapsed computer scientist, failed physicist, and no-show educator, I decided to abandon working on a visualization tool and make a collaboration hub instead. Yup.

(Lucky break scoreboard: most patient graduate advisor ever; financial safety and family support; incipient critical mass of public data sets; new breakthroughs in the world; big names taking interest in the project and deciding to market it.)


One of the new faces on Mike’s research team when I returned was Dhruv Bansal, who was working on a fascinating problem bridging Mike’s two interests: physics and education. They used a freedom-of-information request to acquire a fascinating data set: the anonymized test scores for every student, on every question, for the yearly exam taken by every schoolchild in Texas.

They used the physics equations for fluid flow to model the year-on-year change in student test scores, highlighting patterns that demanded immediate action within the education community.

As you can guess again, the costliest part of that project was not performing the analytics; or applying the Fokker-Planck equation for fluid-flow; or working the paper through peer review. No, the costliest part of the project was the 3-month process of acquiring the data and cleaning it for use. For the random researcher who discovered and requested the data, Dhruv would spend a few hours burning the data to a DVD and physically mail a copy. For reasons I still don’t understand, while researchers in Sociology, Psychology, and other “soft” sciences immediately latched on to the usefulness of Infochimps from the very start, Physicists and Computer Scientists almost never understood what we were doing or why it might be valuable. Dhruv and Mike’s split focus meant they got it immediately.

This is probably the most unlikely lucky break, and most crucial development, of this adventure: sitting a few offices away from where I worked was one of the most talented programmers I’ve ever worked with, possessed with a mountainous drive to change the world, the laconic cool to keep me level, and a furious anger at the same exact problem I was working to solve.

(Lucky break scoreboard: most patient graduate advisor ever; financial safety and family support; incipient critical mass of public data sets; new breakthroughs in the world; big names taking interest in the project and deciding to market it; sharing the same advisor as Dhruv.)

Twitter Dreams

At around this time Twitter was blowing up in popularity, though still a tool largely used by nerds to tell each other about what they had for lunch. We couldn’t explain, any more than most, the appeal of Twitter as a social service.

But to two physicists with a background in the theory of random network graphs, Twitter as a data set was more than a social service; it was a scientific breakthrough. It implemented a revolutionary new measurement device, giving us an unprecedented ability to quantify relationships among people and conversations within communities. Just as the microscope changed biology, and the X-ray transformed medicine, we knew seeing into a new realm placed us on the cusp of a new understanding of the human condition. Making this data available for analysis and collaboration was the best way to provide value and draw attention to the Infochimps site. We emailed Alex Payne, engineering lead at Twitter, for permission to pull in that data and share it with others. He gave me a ready thumbs-up: better that scientists download the data from us, than that they pound it out of his servers.

We wrote a program to ‘crawl’ the user graph: download a user, list their followers, download those users, list their followers, repeat. That was the easy part. Sure, each hundred followers had hundreds of followers themselves, but we could make thousands of requests per hour, millions of requests per week.
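
For the curious, the crawl loop looks roughly like the sketch below. It is not the program we actually ran: `fetch_followers` is a hypothetical stand-in for whatever request returns a user's followers, the `max_users` cap is an added assumption, and the toy graph replaces a live service.

```python
from collections import deque


def crawl(seed, fetch_followers, max_users=1000):
    """Breadth-first crawl of a follower graph starting from one user.

    fetch_followers is a hypothetical callable (user id -> list of follower
    ids) standing in for the real API request.
    """
    graph = {}              # user id -> list of follower ids
    queue = deque([seed])   # users waiting to be downloaded
    queued = {seed}         # every user ever placed on the queue
    while queue and len(graph) < max_users:
        user = queue.popleft()
        followers = fetch_followers(user)
        graph[user] = followers
        for follower in followers:
            if follower not in queued:  # never download the same user twice
                queued.add(follower)
                queue.append(follower)
    return graph


# Toy in-memory follower lists in place of a live service.
TOY = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
graph = crawl("a", lambda user: TOY.get(user, []))
```

The `queued` set is what keeps the loop from circling forever once followers start pointing back at users already seen.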

The hard part came over the next few weeks as we realized that none of our tools were remotely capable of managing, let alone analyzing, the scale of data we so easily pulled in. As quickly as we could learn MySQL, the data set outgrew it. Sure, Dhruv and I could request supercomputer time for research, but supercomputers weren’t actually a good match — they’d be more like a rocketship when what we needed was a fleet of dump trucks. We realized what we needed was Hadoop, the tool Jeff Hammerbacher mentioned to me a few months earlier.

But where could we set up Hadoop? The physics department’s computers were scattered all over and largely locked down. But I also had an account on the UT Math department’s computers. Their sysadmin, Patrick Goetz, was singularly passionate about enabling researchers with the tools they needed to make breakthroughs. He took the much more courageous (and time-consuming for him) route of allowing expert users to install new software across departmental machines.

What’s more, the Math department had just installed a 70-machine educational lab. During the day, it was filled with frustrated freshmen fighting Matlab and math majors making their integrals converge. From evening to 6am, however, they were just sitting there… running… inviting someone to put them to good use.

So that’s what we did; put them to good use. We set up Hadoop on each of the machines, modifying their configuration for the comparatively wussy undergrad-lab hardware, and set about using this samizdat supercluster on the Twitter user graph.

(Lucky break scoreboard: most patient graduate advisor ever; financial safety and family support; incipient critical mass of public data sets; new breakthroughs in the world; big names taking interest in the project and deciding to market it; sharing the same advisor as Dhruv; the explosion of social media data; the invention of Hadoop.)

Data Community

All through 2006-2009, people walking different paths — social media, bioinformatics, web log analysis, graphic design, physics, open government, computational linguistics — were arriving in this wide-open space, forming communities around open data and Big Data.

On Twitter, we were finally seeing what all the people in our favorite data set knew: a novel communication medium that enabled frictionless exchange of ideas and visible community. I’ll call out people like @medriscoll (CEO of Metamarkets), @peteskomoroch (Principal Data Scientist at LinkedIn), @mndoci (Product Manager of Amazon EC2), @hackingdata (Founder of Cloudera, now professor at Mt Sinai School of Medicine), @dpatil (everything), @neilkod and @datajunkie (Facebook data team), @wattsteve (Head of Big Data at Red Hat), among dozens more. It didn’t matter if someone was a random academic, a bored database engineer, a consultant escaping one field into this new one, a big name building the core technology. When you saw a person you respected talking to a person with a good idea, you hit “follow”, and you learned. And when you heard that someone in the Big Data space wasn’t on Twitter, you harangued them until they joined. (Hi, Tom!)

Meanwhile, Aaron Swartz had started the Get.theinfo Google Group. This most minor of his contributions had a larger impact than most know, and was typical of why he’s so missed. He recognized a problem (no conversation space for open-data enthusiasts), built just enough infrastructure to solve it (a google group and a website), then galvanized the community to take over (gifting enthusiastic members with the white elephant of moderator permissions), and offered guidance to make it grow.

The relationships we built and communities we joined became critical catalysts for our growth.

Twitter Reality

We spent the next several months building out the site during the day and running analysis on the growing hundreds of gigabytes by night (does that seem quaintly small now?). Right before Christmas break, we did a set of runs producing data suitable for people in the community to find useful. Hours before hopping on the plane to visit my family, I finished compressing and uploading them, wrote up a minimal readme file, and posted a note to the Get.Theinfo mailing list. I knew the folks there wouldn’t mind the rough cut version, so I figured I’d mention it quietly there, but wait to do a proper release after break — after all, there was no internet where I’d be staying.

Well, two predictable things happened: 1) a huge response, far more than expected, flowing up the chain to large tech blogs and twitter-ers, and 2) a polite but forceful email from Ev Williams (Twitter’s CEO) asking us to take the data files down while they figured out a data terms-of-service. We reluctantly removed the data.

Sure, the experience was a partial success. It brought great publicity, and of course you probably caught the foreshadowing of how important Hadoop was about to become for us. But we failed at the important goal, sharing this immensely valuable data we invested months to release.

Minister of Simplicity

Now to introduce Joe Kelly into the story. Our research center decided to hire someone to build our new website, and one of the respondents to our Craigslist ad was Joe, a former UT business school student who had been working with his roommate to get their general contracting firm off the ground. He didn’t really know how to design websites, but he absolutely loved reading about the science our center was doing, so he applied.

His interview was amazing. He had the design sense of a paper bag compared to the other candidates, but every one of us left the room saying, “wow, that guy was awesome, the kind of person you just want to work with on a project”. Only Dhruv was smart enough to take the face-slappingly obvious next step — replying 1-to-1 to a later email from Joe to say, “well, hey, we also have this other project going on; we don’t really need your help on the website, but there’s a lot of work to do”. Within days, Joe had set up a bank account and PO box, organized the papers to make us an official partnership, and generally turned this ramshackle project into an infant company. It was an easy decision for Dhruv and me to make him a co-founder.

An easy decision until a few days later, when I read some cautionary article about how the #1 mistake companies make is choosing co-founders hastily. Well, hell. We just made this guy we randomly met a couple weeks ago a co-founder, handing him a huge chunk of the company. I didn’t know if we just made a huge mistake or not.

So the next day, we were hanging out at the Posse East bar (our “office” for the first several months of the company), and Joe introduced us to the idea of an Elevator Pitch. “If we’re going to be at the South by Southwest (SXSW) Conference, we need to be able to explain Infochimps”. I replied with some kind of rambling high-concept noodle. Dhruv rang in with his version — more scientific, more charm and cool, but no more useful than mine.

Joe replied, “No. What Infochimps is, is this: ‘A website to find or share any data set in the world’”.

I rocked back in my chair and knew Dhruv and I made one of the best decisions of our lives. His version said everything essential, and nothing more. In one week, he understood what we were doing better than we did after a year. Joe’s role emerged as our “Minister of Simplicity”. He removed all complications, handled all necessary details, smoothed all lines of communications, making it possible for our team to Just Hack. Everything essential, and nothing more.

Capital Factory

With the decision to move forward as a company, not an academic project, we applied to the starting class of Capital Factory (Austin’s startup accelerator). It was an amazing experience, and we went hard at it: we hit all the meetings, spent hours working on our pitch, tried to make contact with every mentor, and made an epic application video. (One of Dhruv’s housemates was a professional filmmaker. Friends in high places.)

We got great feedback and obvious interest from the mentors, and were chosen as finalists. We were confident that we had the right combination of team and big idea to merit acceptance.

They rejected us.

After the acquisition, Bryan Menell — one of the Capital Factory founders — posted a graciously bold blog post explaining what happened. As we later heard from several mentors, they each individually loved our company. Once in the same room though, they found that none of them loved the same company. This mentor loved Infochimps, a company that would monetize social media data. This other one loved Infochimps, a set of brilliant scientists who could help businesses understand their data. Some of them just knew we worked our asses off and were incredibly passionate about whatever the hell it is we were doing but couldn’t explain. A few of the mentors loved Infochimps because we were building something so cool and potentially huge that surely some business value would later emerge. Whichever idea a mentor did like, they generally didn’t like the others.

I can’t overstate how difficult it was to explain what we were doing back then. After two years, we can now crisply state what we had in mind: “A platform connecting every public and commercially available database in the world. We will capture value by bringing existing commercial data to new markets, and creating new data sets from their connections.” It’s easy(er) now, partly because of the time we spent to crystallize an explanation of the idea. Even more so, people now have had years of direct experience and background buzz preparing them to hear the idea. For example, the concept that “sports data” or “twitter data” might have commercial value was barely defensible then, but is increasingly obvious now.

Above all that though, the Capital Factory mentors were right: we were all those ideas, and all of those ideas were (as we’d find out) mostly terrible. And working on the combination of all of them was a beyond-terrible idea. On that point, Capital Factory was right to reject us.

We worked hard, had the perfect opportunity, and failed.

For good reasons and bad, we failed to get in. Or, well, we mostly failed to get in. Some of the mentors liked what they heard enough to stay in touch — meeting for beers and advice, making introductions, and being generous with their time and contacts in many other ways. The Austin startup scene was about to explode, led by Joshua Baer, Jason Cohen, Damon Clinkscales, Alex Jones and others. The energy that the Capital Factory mentors and these other leaders put into mentoring startups like ours ricocheted and multiplied within the community, in the kind of “liquid network” that Steven Johnson writes about. Although the companies within the first CapFac class benefited the most, it was like every startup in Austin was admitted.

The Truth

On the one hand, we had a bunch of fans in blog land, some website code, and a good team. But we had no idea how to make money and a finite runway. Our most notable validation as a project was a failed effort to share data, and our most notable validation as a business was an honorable mention ribbon.

Are you seeing it?

We were experiencing success after success after success.

Every time we failed, a smaller opportunity opened: one that was sharper; one that was more real; one that brought us closer to the right leverage point for changing the world.

These opportunities were smaller, but the energy behind them was the same. We were following what inspired people — to use data sets from Infochimps, to post a data set, to join our pied-piper team, to tweet about us, to make an intro, to have coffee and teach us something. All our ideas were useless crap, except in one essential way: to gather and inspire the people who would help us uncover a few ideas that were good, and execute on them.

(Lucky break scoreboard: most patient graduate advisor ever; financial safety and family support; incipient critical mass of public data sets; new breakthroughs in the world; big names taking interest in the project and deciding to market it; sharing the same advisor as Dhruv; the explosion of social media data; the invention of Hadoop; the completely random intersection with Joe; starting Infochimps just as the Austin startup scene exploded.)

The 3rd part of this blog series will highlight the journey from “project that inspired people” to “business that solved a real problem” — powered by individuals who made sizable investments of time, energy, money and kindness to produce repeated successes from repeated failures, and by the early customers of Infochimps who believed in us.

As we go, that  “lucky break scoreboard” will get more and more improbable, enough to make that word “lucky” ludicrously inapplicable.

Philip (Flip) Kromer is co-founder and CTO of Infochimps where he built scalable architecture that allows app programmers and statisticians to quickly and confidently manipulate data streams at arbitrary scale. He holds a B.S. in Physics and Computer Science from Cornell University and attended graduate school in Physics at the University of Texas at Austin. He authored the O’Reilly book on data science in practice, and has spoken at South by Southwest, Hadoop World, Strata, and CloudCon. Email Flip at or follow him on Twitter at @mrflip.

Infochimps SXSW Panels: Voting Closes Tomorrow

Calling all supporters, calling all supporters, it’s that time of year again.

SXSW Panel Voting! Voting ends tomorrow, Friday, September 6, 2013 (11:59pm CST) – Please read the panel submissions below and vote for your Chimps.

Growing an Open-Source Project: Code to Community 

  • Speaker: Infochimps CTO Flip Kromer
  • Description: How do you grow an open source project from “It’s public and has a LICENSE file” to “Caught fire; people we’ve never met commit more code than we do”?
  • We’ll explore:
    • How do you promote awareness and word-of-mouth, and foster the early community?
    • How do you navigate and balance the twin goals of production stability and community-driven features?
    • How do you ensure code quality without discouraging involvement?
    • Of the values gained from open source – free velocity, hiring, credibility, reputation, and so forth – how much tangible value are you deriving and when does that return start exceeding investment?



Managing Effective Documentation Effectively

  • Speaker: Infochimps Customer Support Engineer Rachel McCuistion
  • Description: Maintaining accurate, up-to-date, and effective documentation requires time, devoted content producer(s), and expertise. The essence of a company’s documentation should not hinder but accelerate the company’s focus and productivity. We’ll discuss the importance of creating effective documentation, how to maintain a healthy lifecycle for internal and external documentation, the common pitfalls that can lead to less effective documentation, answer the most common and difficult questions, and finally introduce an effective workflow for maintaining accurate and helpful documentation including the best tools that have been proven to increase efficiency and minimize downtime.

Inbound Marketing for the Lean Startup

  • Speakers:
  • Description: Lean methodology has provided a great framework for validating your business assumptions and model. By combining an Inbound Marketing model with Lean, you can benchmark against your hypotheses while also growing your business in real, measurable ways. Proven lean startup veterans will teach you how to set up an inbound marketing engine and use it to test, validate, and grow your business using Lean tools and approaches. With over a decade of experience, we will share best practices, lessons learned, and pitfalls to look out for. This workshop will be four hours long.

Thank you for all your support and we hope to talk Big Data with you at SXSW.

The President + Infochimps + Austin

As Austin continues to thrive, making Top 10 Lists for everything from innovation to affordable housing, President Obama himself came to see what’s going on. “I’ve come to listen and learn and highlight some of the good work that’s being done,” Obama said during his visit to Austin. “Folks around here are doing something right.”

We are doing something right – these lists speak for themselves:

  • Best City for Small Business nationally by The Business Journals
  • #1 large city for young entrepreneurs according to Under30CEO
  • #1 among the 100 largest U.S. metros for recovery from the pre-recession peak to the present, measured by employment, unemployment, output, and house prices, according to the Brookings Institution
  • #3 fastest-growing tech job market according to
  • #3 “Best Cities for Good Jobs” list according to Forbes

During his recent visit to Austin, President Obama stopped by Capital Factory, an incubator for technology startups, where he learned about Austin’s technology community and was introduced to Infochimps. Want to move to Austin?

Come Work With Us >>

Developer Resources: 12 Tools + Ironfan Webinar + THE Book on Big Data

Here at Infochimps, we value the developer community. From Founder and CTO Flip Kromer being named by GitHub as one of the Top 100 Contributors in 2012, to hosting community discussion webinars, Infochimps tries to keep developers’ needs and interests in mind.

Here are some resources catered towards the developer community:

In case you missed it: Knowing that analytics tools designed for developers’ needs are in high demand, Derrick Harris wrote an article highlighting the top 12 Big Data tools developers need to know: “A programmer’s guide to big data: 12 tools to know”

Pre-Order Now – Big Data for Chimps: In addition to being a prolific code contributor and one of the nation’s leading data scientists, Flip Kromer is the author of Big Data for Chimps: A Guide to Massive Scale Data Processing, published by O’Reilly and available for pre-order now.

Upcoming Webinar: Ironfan – A Community Discussion
Thursday, January 31 @ 10 am PT, 12 pm CT, 1 pm ET

Join Nathaniel Eliot, DevOps Engineer and lead on Ironfan, in this community discussion. Ironfan is a lightweight cluster orchestration toolset, built on top of Chef, that makes it possible to spin up Hadoop clusters in under 20 minutes. Nathan has been responsible for Ironfan’s core plugin code, cookbooks, and other components that stabilize both Infochimps’ open source offerings and its internal architectures.

Big Data Love + Upcoming Events

Last Friday, we hosted our famous Big Data Love Event at Capital Factory’s snazzy new co-working space.

Aside from the great view from the 16th floor of the Omni Hotel, neatly organized office space, and a secret meeting room behind bookshelves (so cool), we had the opportunity to catch up with Austin’s most successful entrepreneurs and many friends. For those of you who made it out, thank you. Hope to see you at the next event!

This week, Infochimps is presenting at the Intel Developer Forum in San Francisco and moderating a panel at the Big Data Innovation Summit in Boston. You can also find us at the following events coming soon:

If you’re in Austin, join us for these upcoming community events:

  • Austin R User Group, Thurs, Sept. 27: We love our local Meetup, created to support and share R experience and knowledge among the Austin community
  • ATX Startup Crawl, Thurs, Oct. 11: A chance to mingle in Austin’s hottest startups’ office space, chat with some of Austin’s most renowned entrepreneurs, and drink a free beverage – all at the same time

For other Austin community events:

  • Lean Startup Machine, Fri, Sept. 21: A 3-day workshop where attendees use Customer Development and Lean Startup principles to validate an idea for a new product or service
  • Girl Hacker Drink-up, Wed, Sept. 26: An informal group of female developers in Austin who meet once a month to discuss projects, share new insights, and do some coding
  • Austin CTO, Tues, Oct. 2: An opportunity for members of Austin CTO to discuss strategies and thought-leadership over dinner

Much gratitude to Joshua Baer and Capital Factory.

SXSW Panels: Vote Today for Infochimps

One. Week. Left.

You’re familiar with South by Southwest (SXSW) held in Austin each March, yes?

SXSW panels are up for voting and we’d love your support. Read the panel submissions below and be sure to vote!

Voting ends August 31st.

A huge thank you to those of you who supported us on Twitter:

  • @hashonomy_gus: SXSW PanelPicker #bigdata #sxsw (via @infochimps)
  • @dteten: RT @ffvc RT @infochimps: SXSW 2013 PanelPicker: Vote Today for @josephkelly: The Tao Te Chimp: A Principle Driven Approach
  • @eldonnn: SXSW 2013 panelpicker under way. wanna go! waah! this panel by @infochimps

Image courtesy of SXSW