Monthly Archives January 2013

[Webinar] Ironfan: A Community Discussion

Traditionally, systems configuration involves a time-consuming process that is vulnerable to human error. Infochimps leverages the power and simplicity of Ironfan as its provisioning and deployment layer, allowing end-users to easily launch and orchestrate repeatable infrastructure. If you’re interested in digging deeper into the details of Ironfan, take advantage of:

Ironfan: A Community Discussion
Thursday, January 31 @ 10a P, 12p C, 1p E

Join Nathaniel Eliot, DevOps Engineer and lead on Ironfan, in this community discussion. Ironfan is a lightweight cluster orchestration toolset, built on top of Chef, that makes it possible to spin up Hadoop clusters in under 20 minutes. Nathan has been responsible for Ironfan’s core plugin code, cookbooks, and other components that stabilize both Infochimps’ open source offerings and its internal architectures.

Register Today >>

 







CIOs & Big Data: What IT Teams Want Their CIOs to Know

It’s no secret that enterprises today face an increasingly competitive and erratic global business environment, and that Big Data is more than just another IT project – it’s truly a finger on the pulse of the business. To say that in 2013 Big Data is “mission critical” is to put it mildly – organizations that ignore the insights that Big Data can deliver are flying blind. So, it is all the more disconcerting that 55% of Big Data projects don’t get completed, and many others fall short of their objectives.

To understand the reasons for this, Infochimps partnered with SSWUG.org, one of the largest enterprise technology-focused, community-driven sites and a source for answers to IT-related questions and professional growth for more than 570,000 members. Together we gathered survey responses from over 300 IT department staffers, 58% of whom have current Big Data projects underway, on what they most wanted their CIOs to know about the process of implementing Big Data projects.

Read the full report here. >>

Key findings are summarized in the following infographic:
[Infographic: CIOs & Big Data: What IT Teams Want Their CIOs to Know]

While the findings reveal many reasons for Big Data project failure, undoubtedly one of the biggest factors is lack of communication between top managers, who provide the overall project vision, and the data scientists and other IT staff charged with actually implementing it. Far too frequently the implementers’ opinions are treated as an afterthought, and consequently considered only when projects veer off-course.

Given the stakes, it’s imperative that CIOs have a 360-degree view of all that a Big Data project will involve – not just the various Big Data technologies that are so frequently at the forefront of Big Data discussions.

The insight we gleaned reveals much about both enterprise technology and enterprise culture. In order for companies to succeed with Big Data, executives will need to rethink long-held notions of how diverse departments should function together. In the past “breaking down silos” was a nice mantra. Now, it is imperative. Additionally, CIOs and other enterprise executives may find it necessary to educate their organizations on the advantages of new Big Data applications and processes that will give them better customer insights, make their jobs infinitely easier and give their departments the elasticity needed to meet virtually any business need in real-time.

We hope this report will serve not only as a source of insight but also as a reminder to seek the invaluable perspective of IT staff as early as possible in the process of developing new, technology-intensive projects.

Read the press release here. >>

 

A Sneak Preview: Big Data for Chimps, The Book

  • Amanda McGuckin Hager

I’ve been reading Flip’s book, Big Data for Chimps: A Guide to Massive Scale Data Processing, available for pre-order now from O’Reilly. While I’m no data engineer, I am able to follow along. After reading a bit, it comes as no surprise that Flip helped to found Infochimps with the philosophy of making the world’s knowledge accessible to anyone. The content is unexpected and engaging. Take, for example, the story of Chimpanzee and Elephant Start a Business, from The Stream Chapter:

Chimpanzee and Elephant Start a Business

As you know, chimpanzees love nothing more than sitting at typewriters processing and generating text. Elephants have a prodigious ability to store and recall information, and will carry huge amounts of cargo with great determination. The chimpanzees and the elephants realized there was a real business opportunity from combining their strengths, and so they formed the Chimpanzee and Elephant Data Shipping Corporation. They were soon hired by a publishing firm to translate the works of Shakespeare into every language. In the system they set up, each chimpanzee sits at a typewriter doing exactly one thing well: read a set of passages, and type out the corresponding text in a new language. Each elephant has a pile of books, which she breaks up into “blocks” (a consecutive bundle of pages, tied up with string).

Read the full chapter (available here: The Stream Chapter) to understand what this example, pig latin, simple streamers, and running Hadoop jobs have to do with each other. You’ll also get two exercises and a Ruby helper section containing tips and tricks.
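For readers who want a feel for the chimpanzee’s job before opening the chapter, here is a minimal sketch in Python. It is not taken from the book (which uses Ruby helpers); the function names and the simplified pig latin rules are assumptions made purely for illustration. Each “chimpanzee” does exactly one thing: read a line of text and type out its translation.

```python
VOWELS = "aeiouAEIOU"

def pig_latin(word):
    """Translate one word using simplified, illustrative pig latin rules."""
    # Tokens with punctuation or digits pass through unchanged.
    if not word.isalpha():
        return word
    # Words that start with a vowel just get "way" appended.
    if word[0] in VOWELS:
        return word + "way"
    # Otherwise move the leading consonant cluster to the end and add "ay".
    for i, ch in enumerate(word):
        if ch in VOWELS:
            return word[i:] + word[:i] + "ay"
    # No vowels at all: append "ay" and move on.
    return word + "ay"

def translate_line(line):
    """Translate every whitespace-separated word on a line."""
    return " ".join(pig_latin(w) for w in line.split())

def translate_block(lines):
    """The chimpanzee's whole job: one translated line out per line in."""
    for line in lines:
        yield translate_line(line.rstrip("\n"))

if __name__ == "__main__":
    block = ["to be or not to be\n", "that is the question\n"]
    for translated in translate_block(block):
        print(translated)
```

Because `translate_block` only needs an iterable of lines, the same function could in principle be fed standard input, which is how a streaming-style job hands each worker its block of text.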

Amanda McGuckin Hager is a high-tech marketing professional with over 17 years of experience focused on driving demand through strategic marketing programs and is the Director of Marketing at Infochimps. Follow Amanda on Twitter.







Developer Resources: 12 Tools + Ironfan Webinar + THE Book on Big Data

Here at Infochimps, we value the developer community. From Founder and CTO Flip Kromer being named by GitHub as one of the Top 100 Contributors in 2012, to hosting community discussion webinars, Infochimps tries to keep developers’ needs and interests in mind.

Here are some resources geared toward the developer community:

In case you missed it: Knowing that analytics tools designed for developers’ needs are in high demand, Derrick Harris wrote an article highlighting the top 12 Big Data tools developers need to know: “A programmer’s guide to big data: 12 tools to know.”

Pre-Order Now – Big Data for Chimps: In addition to being a prolific code contributor and one of the nation’s leading data scientists, Flip Kromer is the author of Big Data for Chimps: A Guide to Massive Scale Data Processing, published by O’Reilly and available for pre-order now.

Upcoming Webinar: Ironfan – A Community Discussion
Thursday, January 31 @ 10a P, 12p C, 1p E

Join Nathaniel Eliot, DevOps Engineer and lead on Ironfan, in this community discussion. Ironfan is a lightweight cluster orchestration toolset, built on top of Chef, that makes it possible to spin up Hadoop clusters in under 20 minutes. Nathan has been responsible for Ironfan’s core plugin code, cookbooks, and other components that stabilize both Infochimps’ open source offerings and its internal architectures.

Register Now








Infochimps CTO Named Top 100 Contributors to GitHub 2012

Flip Kromer, Infochimps Founder and CTO, also known as MrFlip, was named by GitHub as one of the Top 100 Contributors in 2012. Flip made over 2,300 contributions to the global, open source developer community.

And he’s in good company. Also on the list are Linus Torvalds of Linux, Erik Michaels-Ober, and Dr. Nic Williams.

In addition to being a prolific code contributor and one of the nation’s leading data scientists, Flip is the author of Big Data for Chimps: A Guide to Massive Scale Data Processing, published by O’Reilly and available for pre-order now.

About GitHub: GitHub, a Forbes Top Tech Company of 2012 and the largest code host in the world, was founded in 2008 and is leading enterprises to adopt open source technology. GitHub, known for social coding, was founded as a place for developers to code together, as teams and individuals.

About Infochimps: The Infochimps Platform for Big Data combines leading data technologies with managed cloud services and a strong partner network to empower customers with unprecedented speed, scale and flexibility in their Big Data initiatives. Infochimps is a privately held, venture-backed company with offices in Austin, TX and Silicon Valley. Follow @infochimps on Twitter.







Live Webinar: 5 Big Data Use Cases for 2013

Title: 5 Big Data Use Cases for 2013
Date: Thursday, January 24, 2013
Time: 10a Pacific/12p Central/1p Eastern

Register Now

Jump-start 2013 by exploring how Big Data can transform your business. In this live webinar, Infochimps Director of Product Tim Gasper covers the leading use cases for 2013, sharing where the data comes from, how the systems are architected and, most importantly, how they drive business insights for data-driven decisions. Examples of “next-generation” use cases that offer sustainable, data-driven differentiation include:

  • Risk analysis and fraud detection
  • Brand and sentiment analysis
  • Targeted marketing and personalization
  • Customer insights/behavior
  • Big Data business intelligence

Join the webinar here. Looking forward to seeing you Thursday, January 24, 2013 @ 10a PT, 12p CT, 1p ET!







[Infographic] What Big Data & Your Kids Have in Common

  • Amanda McGuckin Hager

Big Data is changing the game for strategic marketing programs. More data is available to marketers than ever before, and intelligently using this data is driving huge increases to the bottom line.

As a marketer, I’m always looking to improve campaign conversions. Show me the data, and I’ll start asking questions with the curiosity of a child.  Check out some of the top questions our customers are asking of Big Data, and some of the corresponding questions that might be asked by a curious child.

Props to the curious, to the driven and to the ones asking the questions. The answers will drive your company forward.

[Infographic: What Big Data & Your Kids Have in Common]
Amanda McGuckin Hager, a high-tech marketing professional with over 17 years of experience focused on strategic marketing programs that drive demand, is the Director of Marketing at Infochimps. Follow Amanda on Twitter.








Strata Conference: Flip Kromer + Booth#P5


In 2005, Tim O’Reilly predicted: “Data is the Next Intel Inside.” This year, from February 26th – 28th at the Santa Clara Convention Center in Santa Clara, California, O’Reilly’s Strata Conference will bring together the leading minds in Big Data. The decision makers using the power of Big Data to drive business strategy as well as the practitioners who collect, analyze, and manipulate the data will come together for inspiring keynotes, intensely practical & information-rich sessions, and a sponsor pavilion with key players and products.

As a proponent of empowering businesses through Big Data, Infochimps will be exhibiting at Strata, and Co-Founder and CTO Flip Kromer will give a session:

Title: How Hadoop in the Cloud Affects Developer-Friendly Decision Making
Date: Thursday, February 28, 2013
Time: 10:40a Pacific/12:40p Central/1:40p Eastern
Track: Hadoop in Practice
Location: Ballroom CD
Why: In this talk, Flip Kromer will walk you through a series of decision trees outlining why Hadoop in the cloud can be a powerful combination, helping to make clusters cheaper and developers happier.

Register Today, save 25% with discount code: INFOCHIMPS, and be sure to stop by the Innovators Pavilion, Booth#P5 to chat with us about Big Data.







Intelligent Applications: The Big Data Theme for 2013

[Figure: Intelligent Applications, the Big Data theme for 2013]

My prediction for 2013 is that competitive advantage will translate into enterprises using sophisticated Big Data analytics to create a new breed of applications - Intelligent Applications.

“It’s more than just insights from MapReduce,” a CIO from a Fortune 100 company told me. “It’s about using data to make our customer touch points more engaging, more interactive, more intelligent.”

So when you hear about “Big Data solutions”, you need to translate that into a new category of “Intelligent Applications”. At the end of the day, it’s not about people poring over petabytes of data. It’s actually about how one turns the data into revenue (or profits).

This means that you MUST:

  1. Start with the business problem first (preferably one with revenue upside versus cost savings)
  2. Determine which data elements you can leverage AFTER #1
  3. Define an analytical three-tier architecture (as shown above)

Which Big Data market segments will grow the fastest in 2013?

Morgan Stanley named the top ten as follows:

  1. Healthcare
  2. Entertainment
  3. Com/Media
  4. Manufacturing
  5. Financial
  6. Business Services
  7. Transportation
  8. Web Tech
  9. Distribution
  10. Engineering

Many have predicted which industry is the most attractive (see McKinsey’s Quarterly for another). I personally like Ad-Tech and Financial Services for verticals, followed by Information Management, Health (if you can partner to speed up sales cycles), and Communications.

But what about market segments by technology?

[Chart: The Growth of Cloud Based Big Data]

I predict that Data Analytics as a Service (also referred to as Big Data as a Service, or BDaaS) will have the highest growth (obviously building from a small revenue base, given its level of maturity). Business Intelligence as a Service is the next high-growth segment, given the need for easier ways to present and visualize data, followed by Logging as a Service.

But don’t take my word for this….my data comes from prominent research organizations. I’m just compiling and presenting their data in a slightly new way.

What challenges will end-user organizations struggle with the most in 2013?

End-users will continue to struggle with making sense of the many technologies available. Is it EMC Greenplum connected to EMC Hadoop? Is it Cloudera Impala + Hadoop? Is it AsterData + Hortonworks? Is it MapR HBase + HDFS? I think one thing is definite: you have lots of options.

The biggest problem will be whether they are actually satisfying the needs of the business problem. Here are my leading predictions for end-user organizations:

  1. End users just want to solve problems, but will continue to fight IT over who owns the platform powering their much-needed data-driven applications
  2. Ultimately, end-users will be forced to chase “shiny objects” because IT groups will persuade them to wait for the “technology bake-offs” around the Big Data platform soon to be launched (24 months from now)
  3. In the end, many organizations will fail at creating value from Big Data due to a lack of focus on business problems, time-to-market, and in some cases the wrong technology choice

What are some of the key technologies that will dominate the Big Data market in 2013?

So many equate Big Data with Hadoop. But as you begin to see with announcements like Impala from Cloudera, it’s more than just Hadoop. It’s about servicing all the application response time requirements. It’s about volume, velocity, and variety but also time-to-value with your data analytics.

My prediction for 2013 is that you will need the following technology components:

  • Real-time stream processing
  • Ad-hoc near real-time analytics (see NoSQL and NewSQL data stores)
  • Batch Analytics

Not one, but all three!

What steps can customers take to maximize competitive advantage with Big Data in 2013?

Competitive advantage is ALL about time-to-market. I have no doubt that every Global 2000 company will launch their Big Data initiatives in 2013. The question is when they will turn those initiatives into additional revenue…how long will it take from the time that they hire Accenture, CSC, Capgemini, IBM or the like to implement their Big Data strategies, to launching an intelligent application?

My prediction for 2013:

Cloud will become a large part of big data deployment – established by a new cloud ecosystem.

This will be driven by the need for time-to-market and ultimately, competitive advantage. Cloud usually lags any disruption made behind the firewall by at least 12 months. In the case of Big Data, the launch of Apache Hadoop 1.0 in December of 2011 basically makes 2013 the year for Cloud-based Big Data.

That being said, large volumes of data, privacy and public cloud are not usually mentioned in the same paragraph by IT in a Global 2000 enterprise. That’s why we’re going to see elastic big data clouds behind the firewall and within trusted third party data center providers.



