Resources

How To Do a Big Data Project: A Template for Success

Big Data is sweeping the business world – and while it can mean different things to different people, one thing always rings true: data-driven decisions and applications create immense value by utilizing data sources to discover, present, and operationalize important business insights.

While there is broad industry consensus on the value of Big Data, there is no standardized approach for how to begin and complete a project. This how-to guide draws on our repeated success working with enterprises to stand up the Infochimps Cloud solution in complex organizations and technical environments.

We’ve narrowed it down to 4 key steps for successfully implementing your Big Data project. Part how-to, part working doc, this template will empower your organization to achieve its defined business objectives through Big Data, regardless of technical environment.

This Template Also Includes:

  1. Real-life Use Cases
  2. Technical Requirements Worksheet
  3. Business Overview Worksheet
  4. Tips, Tricks, and How-To’s

Download Now and achieve a faster path to ROI; prove the value of Big Data internally; and scale to support more data sources and use cases.

“We’ve successfully empowered a number of Fortune 1000 companies with Big Data systems used to increase bottom lines, and we’ve done so at incredible speed. We’ve done this by combining the power of cloud as a delivery model, along with best practices represented in this project guide.”

Serial entrepreneur Jim Kaskade, CEO of Infochimps, the company that is bringing Big Data to the cloud, has been leading startups from their founding to acquisition for more than ten years of his 25 years in technology. Prior to Infochimps, Jim was an Entrepreneur-in-Residence at PARC, a Xerox company, where he established PARC’s Big Data program, and helped build its Private Cloud platform. Jim also served as the SVP, General Manager and Chief of Cloud at SIOS Technology, where he led global cloud strategy. Jim started his analytics and data-warehousing career working at Teradata for 10 years, where he initiated the company’s in-database analytics and data mining programs.




Ironfan Takes Chef to the Next Level

Using Ironfan with Chef

Think like a system diagram, not your package installer.
Ironfan builds on Chef’s robust systems integration framework with enhanced tools, powerful abstractions, and built-in best practices.

  • Ironfan automates not only machine configuration, but entire clusters — allowing you to quickly describe your entire data system, including tools for storage, computation, data ingestion, scraping, monitoring, and logging.
  • Ironfan’s enhanced automatic integration means the entire stack auto-wires itself together, so bringing systems to life feels like waving a magic wand.
  • Ironfan doesn’t just make initial deployment easy. It enables fast changes to configuration, addition of new components, and orchestration of elastic resources that can be spun up and down as needed.

Build an entire Hadoop cluster from scratch in less than an hour with just a handful of lines of code, and deploy it in minutes with a single command.

[New Whitepaper] Real-Time Data Aggregation

Fast response times generate cost savings and greater revenue. Enterprise data architectures are incomplete unless they can ingest, analyze, and react to data in real time as it is generated. Scalable, affordable real-time solutions were previously inaccessible or too complex, but they are now available to any enterprise.

Read Infochimps’ newest whitepaper on Infochimps Cloud::Streams, a proprietary stream processing framework based on four years of experience with sourcing and analyzing both bulk and in-motion data sources. It offers a linearly scalable and fault-tolerant stream processing engine that leverages a number of well-proven web-scale solutions built by Twitter and LinkedIn engineers, with an emphasis on enterprise-class scalability, robustness, and ease of use.

In this whitepaper, you’ll learn:

  • Definitions & History – batch processing, stream processing
  • Comparison of Stream vs. Batch for Selected Use Cases – includes industry use case: aviation
  • Why Cloud::Streams is the leading stream processing framework
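
To make the batch vs. stream distinction a little more concrete, here is a minimal sketch in plain Python. It is not Cloud::Streams code and does not use its API; the flight-delay events, field names, and function names are all hypothetical, chosen only to echo the aviation use case. The batch function waits for the full data set before aggregating, while the streaming class folds each event into a running total as it arrives.

```python
from collections import Counter

# Hypothetical flight-delay events; carrier names and fields are invented for this sketch.
EVENTS = [
    {"airline": "alpha", "delay_min": 12},
    {"airline": "bravo", "delay_min": 0},
    {"airline": "alpha", "delay_min": 45},
]


def batch_delay_totals(events):
    """Batch: wait for the complete data set, then aggregate it in one pass."""
    totals = Counter()
    for event in events:
        totals[event["airline"]] += event["delay_min"]
    return totals


class StreamingDelayTotals:
    """Stream: fold each event into a running total the moment it arrives."""

    def __init__(self):
        self.totals = Counter()

    def ingest(self, event):
        self.totals[event["airline"]] += event["delay_min"]
        # The up-to-date total for this airline is available after every event.
        return self.totals[event["airline"]]


stream = StreamingDelayTotals()
running = [stream.ingest(event) for event in EVENTS]
```

Both approaches end with the same totals. The difference is when an answer is available: the streaming version can be queried after every event, while the batch version only answers once all the data has been collected.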

CIOs & Big Data: What IT Teams Want Their CIOs to Know

It’s no secret that enterprises today face an increasingly competitive and erratic global business environment, and that Big Data is more than just another IT project – it’s truly a finger on the pulse of the business. To say that in 2013 Big Data is “mission critical” is to put it mildly – organizations that ignore the insights that Big Data can deliver are flying blind. So, it is all the more disconcerting that 55% of Big Data projects don’t get completed, and many others fall short of their objectives.

In order to understand the reasons for this, Infochimps partnered with SSWUG.org, one of the largest enterprise technology-focused, community-driven sites and a source for answers to IT-related questions and professional growth for more than 570,000 members. Together we got survey responses from over 300 IT department staffers – 58% of whom have current Big Data projects underway – on what they most wanted their CIOs to know about the process of implementing Big Data projects.

Read the full report here. >>

Key findings are summarized in the following infographic:
[Infographic: CIOs & Big Data survey findings]

While the findings reveal many reasons for Big Data project failure, undoubtedly one of the biggest factors is lack of communication between top managers, who provide the overall project vision, and the data scientists and other IT staff charged with actually implementing it. Far too frequently, the opinions of IT staff are treated as an afterthought and considered only when projects veer off-course.

Given the stakes, it’s imperative that CIOs have a 360-degree view of all that a Big Data project will involve – not just the various Big Data technologies that are so frequently at the forefront of Big Data discussions.

The insight we gleaned reveals much about both enterprise technology and enterprise culture. In order for companies to succeed with Big Data, executives will need to rethink long-held notions of how diverse departments should function together. In the past, “breaking down silos” was a nice mantra. Now, it is imperative. Additionally, CIOs and other enterprise executives may find it necessary to educate their organizations on the advantages of new Big Data applications and processes that will give them better customer insights, make their jobs infinitely easier, and give their departments the elasticity needed to meet virtually any business need in real time.

We hope this report will serve not only as a source of insight, but also as a reminder to seek the invaluable perspective of IT staff as early as possible in the process of developing new, technology-intensive projects.

Read the press release here. >>

 

A Sneak Preview: Big Data for Chimps, The Book

  • Amanda McGuckin Hager

I’ve been reading Flip’s book, Big Data for Chimps: A Guide to Massive Scale Data Processing, available for pre-order now from O’Reilly. While I’m no data engineer, I am able to follow along. After reading a bit, it comes as no surprise that Flip helped to found Infochimps with the philosophy of making the world’s knowledge accessible to anyone. The content is unexpected and engaging. Take, for example, the story of Chimpanzee and Elephant Start a Business, from The Stream Chapter:

Chimpanzee and Elephant Start a Business

As you know, chimpanzees love nothing more than sitting at typewriters processing and generating text. Elephants have a prodigious ability to store and recall information, and will carry huge amounts of cargo with great determination. The chimpanzees and the elephants realized there was a real business opportunity from combining their strengths, and so they formed the Chimpanzee and Elephant Data Shipping Corporation. They were soon hired by a publishing firm to translate the works of Shakespeare into every language. In the system they set up, each chimpanzee sits at a typewriter doing exactly one thing well: read a set of passages, and type out the corresponding text in a new language. Each elephant has a pile of books, which she breaks up into “blocks” (a consecutive bundle of pages, tied up with string).

Read the full chapter (available here: The Stream Chapter) to understand what this example, pig latin, simple streamers, and running Hadoop jobs have to do with each other. You’ll also get two exercises and a Ruby helper section containing tips and tricks.
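
For the curious, the division of labor in the story can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the book: the function names are hypothetical, and the pig latin rule is a simplified one (move the leading consonants to the end and add "ay", or add "way" if the word starts with a vowel). The elephant splits a book into blocks, and each chimpanzee translates one block without needing to know about any other.

```python
import re

VOWELS = "aeiou"


def pig_latin_word(word):
    """Translate one word using a simplified pig latin rule (output is lowercased)."""
    lower = word.lower()
    if lower[0] in VOWELS:
        return lower + "way"
    # Move the leading consonant cluster to the end, then add "ay".
    for i, ch in enumerate(lower):
        if ch in VOWELS:
            return lower[i:] + lower[:i] + "ay"
    return lower + "ay"  # word contains no vowels at all


def split_into_blocks(lines, block_size):
    """The elephant's job: break a book into consecutive blocks of lines."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def translate_block(block):
    """The chimpanzee's job: translate every passage in one block, and nothing else."""
    return [
        re.sub(r"[A-Za-z]+", lambda m: pig_latin_word(m.group(0)), line)
        for line in block
    ]


def translate_book(lines, block_size=2):
    """Hand each block to a chimpanzee and collect the translated passages in order."""
    translated = []
    for block in split_into_blocks(lines, block_size):
        translated.extend(translate_block(block))
    return translated
```

Because each block is translated independently, the blocks could just as easily be handed to many chimpanzees working at the same time, which is the part a real Hadoop job takes care of.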

Amanda McGuckin Hager is a high-tech marketing professional with over 17 years of experience focused on driving demand through strategic marketing programs and is the Director of Marketing at Infochimps. Follow Amanda on Twitter.




Developer Resources: 12 Tools + Ironfan Webinar + THE Book on Big Data

Here at Infochimps, we value the developer community. From Founder and CTO Flip Kromer being named by GitHub as one of the Top 100 Contributors in 2012, to hosting community discussion webinars, Infochimps tries to keep developers’ needs and interests in mind.

Here are some resources catered towards the developer community:

In case you missed it: Knowing that analytics tools designed for developers’ needs are in high demand, Derrick Harris wrote an article highlighting the top 12 Big Data tools developers need to know: “A programmer’s guide to big data: 12 tools to know.”

Pre-Order Now – Big Data for Chimps: In addition to being a prolific code contributor and one of the nation’s leading data scientists, Flip Kromer is the author of Big Data for Chimps: A Guide to Massive Scale Data Processing, published by O’Reilly, and available for pre-order now.

Upcoming Webinar: Ironfan – A Community Discussion
Thursday, January 31 @ 10a Pacific/12p Central/1p Eastern

Join Nathaniel Eliot, DevOps Engineer and lead on Ironfan, in this community discussion. Ironfan is a lightweight cluster orchestration toolset, built on top of Chef, that makes it possible to spin up Hadoop clusters in under 20 minutes. Nathan has been responsible for Ironfan’s core plugin code, cookbooks, and other components to stabilize both Infochimps’ open source offerings and internal architectures.

Register Now





Video: Making Sense of Big Data

Did you miss our Making Sense of Big Data: An Infochimps Thought Leadership Webinar? Or maybe you just want to watch the webinar again?

Well, you’re in luck. We recorded the webinar for your convenience; watch it here.

Given by Big Data expert and Infochimps CEO Jim Kaskade, the recorded webinar will engage you in discussion about:

  • How to effectively manage, protect and leverage big data in your enterprise
  • How to become a data-driven organization
  • Best practices – from business problem definition to ROI
  • Compelling use cases for business transformation
  • How to develop data-centric applications – using Infochimps Big Data PaaS

Watch the Webinar

For more recorded webinars, including our High Speed Retail Analytics webinar, visit our Resources Page.




Case Study: Koupon Media + Infochimps

The Infochimps Big Data Platform provides customers with an affordable and repeatable architecture that helps them see return from their big data efforts. Customers get to data insights fast, with the full power of Big Data technology and developer-friendly simplicity.

Our newest case study highlights Koupon Media, a Digital Campaign Management platform provider. Koupon Media uses the Infochimps Platform to build a real-time demographics dashboard reporting on coupon programs to their customers, a true competitive advantage.

Download the case study here to read about how Koupon Media gained their competitive advantage with Infochimps.

Want more? For more customer case studies, visit our Resources Page.



Live Webcast: Making Sense of Big Data

Title: Making Sense of Big Data: An Infochimps Thought Leadership Webinar
Date: Thursday, October 11, 2012
Time: 10a Pacific/12p Central/1p Eastern

Register Today!  

Big data is likely the most hyped term in tech of the past two years; however, amidst all the hype, we may have actually missed the point of having all of this data in the first place: to generate more value for businesses. Importantly, it’s this gap between hype and value that speaks to why we need to define how to leverage big data technology in the enterprise.

Register for this live webcast and listen to big data expert and Infochimps CEO, Jim Kaskade, explain where to start and how to effectively manage, protect and leverage the growing amounts of data in your enterprise. In addition, you will hear engaging discussion about:

  • How to become a data-driven organization
  • Best practices – from business problem definition to ROI
  • Compelling use cases for business transformation
  • How to develop data-centric applications – using Infochimps Big Data PaaS

Join the webcast here. Looking forward to seeing you Thursday, October 11th!

Technical White Paper: Big Insights from Big Data

Curious about the leading technology behind the Infochimps™ Platform?

Download our free technical white paper and gain big insights from big data.

Infochimps Platform Technical Overview
The Infochimps Platform is an integrated solution set that makes it easy, fast and simple to perform big data analytics and create big data applications. It’s a collection of open source and proprietary software for big data processing, data collection and integration, data storage, data analysis and visualization, and infrastructure management. Coupled with our expert team and a revolutionary approach to tying it all together, we help you accelerate your big data projects.

This technical overview explores these key areas in more detail:

Data Delivery Service™

  • Collect Data
  • Perform Stream Processing with Decorators

Data Management

  • Query Data and Build Applications

Cloud Hadoop

  • Perform Hadoop Processing

Download the white paper here to take a deep dive into the leading technology behind the Infochimps Platform.
