Monthly Archives February 2014

Reinvention of Enterprise Analytics

Authors: Mark Lenke and Shawn Nelson

Breaking the Barriers of Time and Expense

There’s no doubt that companies have benefited tremendously from business intelligence (BI) applications. Enterprise business intelligence (EBI) has enabled companies to spot emerging trends, identify new markets, serve customers more effectively and improve operational efficiencies.

Recently, though, EBI solutions have had a hard time adapting to the information explosion that companies have experienced. Attempting to stuff massive volumes of data into the structure required by traditional BI systems is inefficient, expensive and time-consuming.

What’s more, systems have become more complex and difficult to use, limiting the types of insights that can be generated in a reasonable time frame. Big data solutions, which can efficiently handle large volumes of data, can also require people with a specific and hard-to-find skill set in order to get results. Typically, business process experts feel shut out from advances because new systems are too hard to use.

As a result, many companies have avoided implementing more advanced BI or next-generation big data solutions because there is a perception — real or imagined — that they take too much time and are too hard to use to justify the expense involved in implementation.

But avoiding the change means that companies are missing out on opportunities to gain new insights that can radically transform their business. The next generation of BI systems, commonly referred to as big data, offers a huge leap forward in capabilities and features. Big data and analytics can help companies ask sophisticated, forward-looking questions that make new connections between seemingly unrelated trends. Big data and analytics can power new types of applications that provide real-time feedback, putting insights directly into the hands of people who can use that information.

This paper examines advances in big data infrastructure and applications that can help companies overcome the challenges associated with bringing these new systems to life.

Read the full article “Reinvention of Enterprise Analytics” and gain access to the “Breaking the Barriers of Time and Expense” white paper on the CSC Big Data & Analytics blog.

Does the Big Data Solution Exist?

What is a Big Data solution and what does it take to make a project successful? Perform your own experiment by posing this question to technology companies in the Big Data space. Then pose the same question to the pure service providers that are focused on Big Data. Finally, pose the same question to a few customers. Here is what I have found:

Technology providers will talk in terms of their specific contribution to the solution. Let’s think of the architectural stack from the bottom up. In the simplest terms, the Big Data solution is enabled by the infrastructure, the platform on which the analytics are performed, data software (which includes everything from data ingestion to statistical analysis), the visualization of the data, and the applications that depend on this solution. It is the sum of these parts, which no single vendor provides, that makes up the enabling technologies known as “Big Data.”

Service providers will talk in terms of business needs to understand what value there is in the data (e.g., use case discoveries, the data science engagements, proof-of-value offerings, implementation assistance, and application development).

Customers interested in Big Data are looking to simplify things to get to the incremental and previously unattainable insights that are the promise of Big Data. That journey, however, is a very complex one and one that is not without risk. The customer answer depends on whom you ask. Ask IT and they may talk technology and the partners they prefer. Ask the application team or analytics team and your answers will straddle both the business value discussions and the technology needed to get to those answers. Lastly, the more progressive line-of-business decision makers aren’t interested in the complexities that make up a Big Data solution, but they are interested in the game-changing insight that will allow them to create new service offerings or make the business more efficient as a result of the analytics being performed.

Is it now time to say that all of these answers combined are what make up a Big Data solution? Not quite. Compliance and security are considerations businesses must address. Add to this the deployment options, which include on-premise bare metal, on-premise private cloud, a private secured cloud, a hybrid approach with both data center and cloud resources available, and finally public options like Amazon, Google, AT&T, and others. On top of that, the talent needed to do all of this in-house isn’t readily available to customers of any size.

The war to win in the Big Data space is being waged and customers are in the middle of it. Continuing the analogy further, customers would like to sit the war out and have the Big Data solution provided to them, removing the confusion, complexity and concern.

Now ask yourself the question, “What is a Big Data solution and what does it take to make your project successful?” Now the answer…it’s easier than you think. Ask yourself who has the technology expertise, services capabilities, and customer proof points, provides flexibility in deployment, and has the option to provide all of this as a managed service so that you pay for just what you use. Those who provide “The Big Data Solution” exist. You just need to ask the right questions and look in the right places for those answers.

Alan Geary, VP of business development at Infochimps, a CSC Big Data Business, has focused on business and channel development at software and technology companies that have grown through partnering. Alan has a unique combination of Big Data and Cloud experience from working over the last decade at both a Hadoop distribution company and VMware. Both companies doubled revenue year over year, with partnerships playing a significant role in the adoption of Hadoop and virtualization, respectively.

Strata Santa Clara Recap + What Comes Next

Another year, another great Strata! Team Infochimps is back in Austin, Texas, but hopefully you stopped by and grabbed a t-shirt at our booth before you left. We saw a few familiar faces and made a lot of new friends — but in case you weren’t at Strata, here’s what you may have missed…

We got a lot of questions about our workshop, where you can meet with our experienced data scientists to learn key Big Data concepts and best practices. We’re offering personalized recommendations for each workshopper and sharing the secrets of success we’ve seen from other businesses. If you missed us at the booth or haven’t heard about it yet, you can learn more and request a workshop here too.

Many of you asked about our second annual “CIOs & Big Data: What Your IT Team Wants You to Know” report. Last year’s report showed some interesting takeaways: 55 percent of Big Data projects aren’t completed, with 58 percent citing “inaccurate scope” as the reason for failure. We launched the questionnaire again this year and would love to hear your thoughts on what you want your CIO to know about Big Data. It only takes 10 minutes, and you can win cool prizes like an Amazon gift card.

Many of you were interested in the data sheets we had at the booth too, but if you missed them you can still check out our adoption lifecycle data sheet and our solution overview online.

On a fun note, we hosted a Big Data Mixer, co-sponsored by Silicon Valley Data Science and Pacific Crest Securities, and we had a great time picking the brains of other industry pioneers. More than 120 of the best and brightest in Big Data joined in for food, drinks, and good conversation. We took a lot of photos that night (see a few teaser photos below), and you can see the entire album (around 500 total) here.

[Photos from the Big Data Mixer]

Overall we had a blast at this year’s Strata Santa Clara! We hope you had a great time at Strata too, and we can’t wait to see you at the next one in New York. Until then, I’ll leave you with another awesome photo from our mixer featuring a couple of our chimps, Tim Gasper and Cameron Peek. Oh and if you can think of a good caption, I’m all ears.

[Photo: Tim Gasper and Cameron Peek at the Big Data Mixer]

Live at Strata: Announcing a Workshop with our Big Data Experts

And we’re live at the 2014 O’Reilly Strata Conference! For the next three days, we’ll be joining the most brilliant minds in the Data and Analytics space to discuss the latest (and emerging) tools, technologies, trends and best practices. This year at Strata, Infochimps CEO Jim Kaskade will describe the state of Big Data from the perspective of our company’s work with some of the world’s top companies. He’ll provide a vision of what’s in store for the business landscape in 2014 and share some surprising trends in the world of data-driven decisions. Learn more about what Jim and the rest of our team are up to at the conference here.

February Strata season always gets us excited, but this year we’re thrilled to present a specialized workshop with our leading Big Data Experts. With individualized attention to your business, our experienced team will help you apply key Big Data concepts and teachings to your own business problems and opportunities. If you’re interested in getting personalized recommendations, you can request a workshop here or ask any of the chimps at booth #740 for more info.

On that note, we can’t wait to chat with our peers here in Santa Clara, so be sure to stop by and say hello to us at booth #740 (we’ll be handing out awesome t-shirts too—seriously, take a look). See you out on the floor!

Data Science: State of the Industry

O’Reilly has released their 2013 Data Science Salary Survey, and it’s a treasure trove of interesting information about the work of data science.

One of the most informative things I found was a breakdown of the data tools that were used most often by data scientists.

[Chart: data tools used most often by data scientists, from the O’Reilly 2013 Data Science Salary Survey]

This confirms a lot of hunches about the state of the industry:

  • SQL is the mack daddy of data science. It is used literally twice as much as Hadoop.

  • Excel and R are the analysis tools of choice. Since both of these tools can do multiple things (analysis and visualization), it makes sense that these would be more popular than single-use tools.

  • Scripting is widespread and diverse. Python, R, JavaScript, and Ruby are the glue of data science, with an especially strong showing for Python.

The big surprise to me was the relative unpopularity of SAS/SPSS. I think this effect may be exaggerated by the nature of the survey population (it was limited to people attending the Strata conference). However, a 4x disparity between R and legacy vendors really highlights what I see as an accelerating trend towards open tools.

Another fascinating visualization was the breakdown of how different tools are used together by data scientists.

[Graph: how different tools are used together by data scientists, from the O’Reilly 2013 Data Science Salary Survey]

In geek speak, this is a graph that describes the positive and negative correlations between tool usage. Visually, this separates into the traditional I/T world (in blue) and the new Hadoop world (in orange). “Visualization” might be a way to describe the red cluster, although Weka really breaks the mold.
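For readers who want to see what "correlation between tool usage" means in practice, here is a minimal sketch in Python. It is not the survey's method or data: the `responses` list is made up for illustration, and the `phi` helper is a hypothetical name for the standard correlation of two yes/no indicators.

```python
from itertools import combinations
from math import sqrt

# Hypothetical responses (NOT survey data): each set lists the tools
# that one respondent says they use.
responses = [
    {"SQL", "Excel", "R"},
    {"SQL", "Excel"},
    {"Hadoop", "Hive", "Python"},
    {"Hadoop", "Python"},
    {"SQL", "R", "Python"},
    {"Hadoop", "Hive"},
]

def phi(tool_a, tool_b, rows):
    """Correlation of two yes/no usage indicators (the phi coefficient)."""
    n11 = sum(1 for r in rows if tool_a in r and tool_b in r)
    n10 = sum(1 for r in rows if tool_a in r and tool_b not in r)
    n01 = sum(1 for r in rows if tool_a not in r and tool_b in r)
    n00 = sum(1 for r in rows if tool_a not in r and tool_b not in r)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # If either tool is used by everyone or by no one, correlation is undefined;
    # this sketch reports 0.0 in that case.
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

# Every pair of tools becomes one edge of the graph, weighted by its correlation.
tools = sorted(set().union(*responses))
edges = {(a, b): phi(a, b, responses) for a, b in combinations(tools, 2)}

for (a, b), weight in sorted(edges.items(), key=lambda kv: kv[1]):
    print(f"{a:>7} - {b:<7} {weight:+.2f}")
```

Pairs with a strongly positive weight would be drawn close together in a picture like the one described here, and pairs with a negative weight would end up in different clusters. In this made-up sample, SQL and Hadoop never appear together, so they land at the negative extreme.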

What this tells me is that there is a definite geography to the work of data science. If traditional I/T is North America and Hadoop is South America, Tableau would be the Panama Canal, the conduit between the two continents. Also, this picture makes it easy to see why SQL is so popular. Like Starbucks, there’s at least one SQL-like tool in each of the clusters (Hive, MySQL, PostgreSQL, SQL, and SQL Server), with more on the way soon.

Looking at the big picture, this tells us three important things:

  1. Data science can come from anywhere. Innovation does not require the resources of the Fortune 500, nor the specialization of Silicon Valley. The work can leverage the strengths of either environment, and the best people can work anywhere.

  2. Virtually any company either already has or can inexpensively acquire the tools to do data science. If you can download RStudio and have a SQL database, you can start working like the pros.

  3. Data science isn’t thinking about real-time analytics, yet. Storm, Spark, and other tools are still cutting edge. Watch out for this in the 2014 survey.

Thanks O’Reilly, for the insight into data science and data scientists!

Dhruv Bansal is the chief science officer and co-founder of Infochimps, a CSC Big Data Business. He holds a B.A. in math and physics from Columbia University in New York and attended graduate school for physics at The University of Texas at Austin. For more information, email Dhruv at or follow him on Twitter at @dhruvbansal.
