To Open or Not to Open Data: A Private Organization’s Dilemma

Open data has thus far largely been associated with government data. Though government data is indeed valuable, the potential of the data that private organizations gather has been overlooked. These organizations usually don’t realize the potential that their data holds.

At the Data Cluster last month, our own Dhruv Bansal and Gil Elbaz of Factual led the Open Data Birds-of-a-feather session. Drawing on that discussion and on some ideas of our own, we want to highlight the pros and cons of opening data to help organizations determine whether it is the right move for them:

Pros:

1. Profit generation – Almost all data will have some value to someone else, whether an organization realizes it or not. Putting data up for sale would help these organizations realize how valuable their data is and may even provide another revenue stream from this latent resource. For example, a firm with data on parking meter locations and occupancy rates can sell it to a firm building an iPhone app to help you reliably find parking in our nation’s downtowns.
2. Crowd-sourced curation – Gil commented that a lot can be gained from crowd-sourced curation. Firstly, the organization avoids the costs of curating the data itself. Secondly, the pool of brains working on the data can produce incredible products that were not immediately evident, especially when your data is mashed with others’. In Factual’s table of Nationwide Restaurants, for example, geo data is mashed with information and reviews of restaurants from sites such as Yelp, Yahoo!, Citysearch and Zagat to make an interactive search table.
3. Potential uses – There are many different uses for data, ranging from cool informational data visualizations to applications to mining for insights. By opening its data for others to use, the organization avoids the costs of setting up infrastructure and gathering manpower to translate the data into these products.
Some examples of what has already been done with open government data can be found in a previous blog post, “Open data applications.”
4. Exposure – Organizations can gain exposure from opening their data, especially now while it’s still relatively uncommon, positioning themselves on the cutting edge of the data sphere. Additionally, transparency is demanded more these days, and opening data is one of the ways to achieve it. Best Buy, for example, offers an open API to its product catalog called Best Buy Remix. With this open API, the company not only leaves the development of apps to others, but also gains exposure and generates business from apps that would, for example, allow users to search for products they want and get details on them (location, price, specs, etc.).

Cons:

1. Historically difficult – The market for alternative data is relatively new. Opening data used to be incredibly difficult, expensive and labor-intensive. Large amounts of data took a lot of time to process and were extremely hard, if not impossible, to handle. However, cloud computing and processing tools like Hadoop have helped address these problems, making the whole data process a lot easier.
2. Privacy concerns – These are of two types. First, some companies might be concerned about certain data being accessed by their competitors. This problem can be avoided, since companies can choose what data they open and keep more sensitive data secret. In the end, these organizations might find that crowd-sourced work on their data yields interesting insights that further develop their product or service. Second, there are concerns about users’ personal data. Efforts need to be made to ensure that users understand how their data is being used, that its security is upheld, and that they know how to opt out if they choose to do so.
3. Data processing – Some organizations don’t have the capabilities to process the data for public consumption, but if they really do have valuable data, then a cost-benefit analysis might show that setting up the required infrastructure is worth it. If a company just doesn’t have the resources for this, it can, as mentioned earlier, leave some of the data processing to the crowd.
4. Reservations about crowd-sourcing – Someone from Wolfram Alpha pointed out that companies may believe that expert curation is better than crowd-sourcing. What these companies fail to realize is that more and more people are fluent in data. Crowd-sourcing their many talents and ideas means that a lot more can be done with the data, including things that one expert alone may overlook.

Verdict? Open your data! The data market is growing, and infrastructure is developing alongside it. The traditional hindrances to opening data, such as the scarcity of people who can curate data, the difficulty of identifying buyers, and the impossibility of handling large amounts of data, are dissipating. In their place lies a lot of potential, from financial gains to increased brand recognition. With all this in mind, companies need to take a second look at their data and evaluate its worth.


  1. agatheb May 6, 2010 at 10:53 am

    How about the simple financial and competitive advantage kinda reasons? Also, many organizations are very bad at doing anything useful with whatever data they have. They’re basically incompetent at extracting the real value from it. In some cases, I think an organization believes some of its data is a gold mine, obviously does not want to share it, but is itself incapable of leveraging it. The data ends up sitting somewhere for no one’s benefit.