The President + Infochimps + Austin


As Austin continues to thrive, making Top 10 Lists for everything from innovation to affordable housing, President Obama himself came to see what’s going on. “I’ve come to listen and learn and highlight some of the good work that’s being done,” Obama said during his visit to Austin. “Folks around here are doing something right.”

We are doing something right – these lists speak for themselves:

  • Best City for Small Business nationally by The Business Journals
  • #1 large city for young entrepreneurs according to Under30CEO
  • #1 among the 100 largest U.S. metros for recovery from pre-recession peak to the present, measured by employment, unemployment, output, and house prices, according to Brookings Institution
  • #3 fastest-growing tech job market according to Dice.com
  • #3 on the “Best Cities for Good Jobs” list according to Forbes

In his recent visit to Austin, President Obama stopped by Capital Factory, an incubator for technology startups, where he learned about Austin’s technology community and was introduced to Infochimps. Want to move to Austin?

Come Work With Us >>









Source: http://austintexas.gov/rankings

Tell Your Children to Learn Hadoop

I spent some time with several vendors and users of Hadoop, the formless data repository that is the current favorite of many dot coms and the darling of the data nerds. It was instructive. Moms and Dads, tell your kids to start learning this technology now. The younger the better.

I still know relatively little about the Hadoop ecosystem, but it is a big tent and getting bigger. To grok it, you have to cast aside several long-held tech assumptions. First, that you know what you are looking for when you build your databases: Hadoop encourages pack rats to store every log entry, every Tweet, every Web transaction, and other Internet flotsam and jetsam. The hope is that one day some user will come with a question that can’t be answered in any way other than to comb through this morass. Who needs to spend months on requirements documents and data dictionaries when we can just shovel our data into a hard drive somewhere? Turns out, a lot of folks.

Think of Hadoop as the ultimate in agile software development: we don’t even know what we are developing at the start of the project, just that we are going to find that proverbial needle in all those zettabytes.

Hadoop also casts aside the notion that we in IT have even the slightest smidgen of control over our “mission critical” infrastructure. It also casts aside the idea that we turn to open source code only when we have reached a commodity product class that can support a rich collection of developers. And that we need solid n.1 versions, after the n.0 release has been debugged and straightened out. And that those versions are offered by largish vendors who have inked deals with thousands of customers.

No, no, no and no. The IT crowd isn’t necessarily leading the Hadooping of our networks. Departmental analysts can get their own datasets up and running, although you need skilled folks who have a handle on the dozen or so helper technologies to make Hadoop truly useful. And Hadoop is anything but a commodity: there are at least eight different distributions with varying degrees of support and add-ons, including ones from its originators at Yahoo. And the current version? Try something like 0.2. Maybe this is an artifact of the open source movement, which loves those decimal points in its release versions. Another company released its 1.0 version just last week, and they have been at it for several years.

And customers? Some of the major Hadoop purveyors have dozens, in some cases close to triple digits. Not exactly impressive, until you run down the list. Yahoo (which began the whole shebang as a way to help its now forlorn search engine) has the largest Hadoop cluster around at more than 42,000 nodes. And I met someone else who has a mere 30-node cluster: he was confident that by this time next year he would be storing a petabyte on several hundred nodes. That’s a thousand terabytes, for those who aren’t used to thinking of that part of the metric system.

Three years ago I would have told you to teach your kids WordPress, but that seems passé, even quaint now. Now even grade schoolers can set up their own blogs and websites without knowing much code at all, and those who are sufficiently motivated can learn Perl and PHP online. But Hadoop clearly has captured the zeitgeist, or at least a lot of our data, and it is poised to gather more of it as time goes on. Lots of firms are hiring too, and the demand is only growing.

Infochimps has some great resources to get you started here >>

David Strom is a world-renowned expert on networking and communications technologies. Whether you got your first PC at age 60 or grew up with an Apple in your crib, Strom can help you understand how to use your computers, keep them secure, and create and deploy a variety of Internet applications and services. He has worked extensively in the Information Technology end-user computing industry and has managed editorial operations for trade publications in the network computing, electronics components, computer enthusiast, reseller channel and security markets.








Image Source: huffingtonpost.com

See You @ CloudCon Expo

CloudCon Expo & Conference, May 14-15, San Francisco

CloudCon Expo & Conference brings you the opportunity to learn best practices and strategies for cloud deployment. It is designed for IT professionals and decision makers looking to implement cloud technology to achieve benefits like reliability, adaptability and cost reduction. Infochimps will be exhibiting at CloudCon as well as participating in the following sessions:

Title: PaaS on Software Defined Data Centers (SDDC)
Description: Discover the emerging area of Software-defined Data Center. This model provides the culmination of server, storage, and network virtualization where resource pools — regardless of their physical location — are automatically provisioned to fit the demands of an organization’s applications. The model promises an unprecedented level of flexibility and simplicity for companies embracing cloud computing, but the porous nature of the cloud also exposes companies to a greater array of security threats.
Where: Grand Ballroom E
When: Wed, May 15 @ 10:30am-11:30am PT
Speakers:

  • Jim Kaskade, CEO of Infochimps
  • Brandon Hoff, Director of Product at Emulex

Title: POWER PANEL: The Business of Cloud & Big Data
Description: As Cloud computing is becoming a de facto standard, we are collectively storing massive amounts of data in the cloud. Companies are now trying to mine this immense source of information. Assuming everything is stored in the cloud, interesting questions arise. Listen to the thought leaders as they discuss business issues and lessons learnt by companies in the Cloud and Big Data space.
Where: Grand Ballroom
When: Wed, May 15 @ 12:15pm-1:00pm PT
Speakers:

  • Moderator: Jim Kaskade, CEO of Infochimps
  • Laurance Guillory, CEO of Racemi
  • Chris C. Kemp, CEO of Nebula
  • Marten Mickos, CEO of Eucalyptus Systems
  • Ron Bodkin, CEO of Think Big Analytics

Stop by the Expo, Booth #11, to chat about Big Data or set up a 1-1 meeting with one of our Big Data experts. See you there!







Ironfan Takes Chef to the Next Level

Using Ironfan with Chef

Think like a system diagram, not your package installer.
Ironfan builds on Chef’s robust systems integration framework with enhanced tools, powerful abstractions, and built-in best practices.

  • Ironfan automates not only machine configuration, but entire clusters — allowing you to quickly describe your entire data system, including tools for storage, computation, data ingestion, scraping, monitoring, and logging.
  • Ironfan’s enhanced automatic integration ability means the entire stack auto wires itself together, making bringing systems to life feel like waving a magic wand.
  • Ironfan doesn’t just make it easy for initial deployment. It enables powerfully fast changes to configuration, addition of new components, and orchestration of elastic resources that can be spun up and down as needed.

Build an entire Hadoop cluster from scratch in less than an hour with just a handful of code lines, and deploy it in minutes with a single command.
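To make the "system diagram" idea a little more concrete, here is a minimal Python sketch of declaring components and letting them wire themselves together. This is not Ironfan's actual syntax or API; the `Component` class, the `wire` function, and the component and service names are all hypothetical, chosen only to illustrate the idea of describing a whole cluster rather than individual machines.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One box in the system diagram (hypothetical, for illustration only)."""
    name: str
    provides: str                               # service this component announces
    needs: list = field(default_factory=list)   # services it must discover


def wire(components):
    """Resolve every declared need to the component that provides it."""
    providers = {c.provides: c.name for c in components}
    wiring = {}
    for c in components:
        missing = [n for n in c.needs if n not in providers]
        if missing:
            raise ValueError(f"{c.name} needs unprovided services: {missing}")
        wiring[c.name] = {n: providers[n] for n in c.needs}
    return wiring


# The whole cluster is described in one place, like a diagram.
cluster = [
    Component("namenode", provides="hdfs"),
    Component("jobtracker", provides="mapreduce", needs=["hdfs"]),
    Component("worker", provides="compute", needs=["hdfs", "mapreduce"]),
    Component("monitor", provides="metrics", needs=["compute"]),
]

wiring = wire(cluster)
for name, deps in wiring.items():
    print(name, "->", deps)
```

Nothing in the description says which machine talks to which; each component only states what it offers and what it needs, and the connections are derived from that.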

Read >>







Splice Data Scientist DNA Into Your Existing Team

As organizations continue to grapple with Big Data demands, they may find that business managers who understand data may meet their “data scientist” needs better than the hard core data technologists.

There’s little doubt that data-derived insight will be a key differentiator in business success, and even less doubt that those who produce such insight are going to be in very high demand. Harvard Business Review called “data scientist” the “sexiest” job of the 21st century, and McKinsey predicts a shortfall of about 140,000 by 2018. Yet most companies are still clueless as to how they’re going to meet this shortfall.

Unfortunately, the job description for a data scientist has become quite lofty. Unless your company is Google-level cool, you’re going to struggle to hire your Big Data dream team (well, at least right now), and few firms out there could recruit them for you. Ultimately, most organizations will need to enlist the support of existing staff to achieve their data-driven goals, and train them to become data scientists. To accomplish this, you must determine the basic elements of data scientist “DNA” and strategically splice it into the right people.

Read >>

 

 

Serial entrepreneur Jim Kaskade, CEO of Infochimps, the company that is bringing Big Data to the cloud, has been leading startups from their founding to acquisition for more than ten years of his 25 years in technology. Prior to Infochimps, Jim was an Entrepreneur-in-Residence at PARC, a Xerox company, where he established PARC’s Big Data program, and helped build its Private Cloud platform. Jim also served as the SVP, General Manager and Chief of Cloud at SIOS Technology, where he led global cloud strategy. Jim started his analytics and data-warehousing career working at Teradata for 10 years, where he initiated the company’s in-database analytics and data mining programs.







VMworld Panel Voting Closes Today


It’s that time of year again: VMworld panel voting!

Panel voting closes today, May 6th at 5:00pm PDT.

Infochimps would love your support for the following sessions:

Vote >>

 

 

Thank you for showing your support and see you at VMworld.







Big Data and Banking – More than Hadoop


Fraud is definitely top of mind for all banks. Steve Rosenbush at the Wall Street Journal recently wrote about Visa’s new Big Data analytic engine which has changed the way the company combats fraud. Visa estimates that its new Big Data fraud platform has identified $2 billion in potential annual incremental fraud savings. With Big Data, their new analytic engine can study as many as 500 aspects of a transaction at once. That’s a sharp improvement from the company’s previous analytic engine, which could study only 40 aspects at once. And instead of using just one analytic model, Visa now operates 16 models, covering different segments of its market, such as geographic regions.

Do you think Visa, or any bank for that matter, uses just batch analytics to provide fraud detection? Hadoop can play a significant role in building models. However, only a real-time solution will allow you to take those models and apply them in a timeframe that can make an impact.
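As a rough illustration of that split, here is a minimal Python sketch in which a batch step builds a simple per-segment model from historical data and a real-time step applies it to each transaction as it arrives. It is not Visa's system or any bank's actual method; the segments, amounts, threshold, and function names are all made up, and a real model would look at far more than one aspect of a transaction.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_models(history):
    """Batch step: learn a per-segment spending profile from historical data."""
    by_segment = defaultdict(list)
    for segment, amount in history:
        by_segment[segment].append(amount)
    return {seg: (mean(vals), pstdev(vals)) for seg, vals in by_segment.items()}


def is_suspicious(models, segment, amount, threshold=3.0):
    """Real-time step: score one transaction against the prebuilt model."""
    if segment not in models:
        return True  # no profile for this segment: flag for review
    mu, sigma = models[segment]
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Made-up historical transactions: (segment, amount)
history = [("us", 20.0), ("us", 25.0), ("us", 30.0), ("us", 22.0),
           ("eu", 40.0), ("eu", 45.0), ("eu", 50.0), ("eu", 42.0)]
models = build_models(history)  # rebuilt periodically, e.g. nightly

# Incoming transactions are checked one at a time as they arrive.
stream = [("us", 24.0), ("us", 900.0), ("eu", 44.0)]
flags = [is_suspicious(models, seg, amt) for seg, amt in stream]
print(flags)
```

The expensive work (combing through history) happens offline, while the per-transaction check is a cheap lookup and comparison, which is what makes it feasible to act while the transaction is still in flight.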

The banking industry is based on data – the products and services in banking have no physical presence – and as a consequence, banks have to contend with ever-increasing volumes (and velocity, and variety) of data. Beyond the basic transactional data concerning debits/credits and payments, banks now:

  • Gather data from many external sources (including news) to gain insight into their risk position;
  • Chart their brand’s reputation in social media and other online forums.

This data is both structured and unstructured, as well as very time-critical. And, of course, in all cases financial data is highly sensitive and often subject to extensive regulation. By applying advanced analytics, the bank can turn this volume, velocity, and variety of data into actionable, real-time and secure intelligence with applications including:

  • Customer experience
  • Risk Management
  • Operations Optimization

It’s important to note that applying new technologies like Hadoop is only a start (it addresses 20% of the solution). Turning your insights into real-time actions will require additional Big Data technologies that help you “operationalize” the output of your batch analytics.

Customer Experience

Banks are trying to become more focused on the specific needs of their customers and less on the products that they offer. They need to:

  • Engage customers in interactive/personalized conversations (real-time)
  • Provide a consistent, cross-channel experience including real-time touch points like web and mobile
  • Act at critical moments in the customer sales cycle (in the moment)
  • Market and sell based on customer real-time activities

Noticing a general theme here? Big Data can assist banks with this transformation and reduce the cost of customer acquisition, increase retention, increase customer acceptance of marketing offers, increase sales by targeted marketing activities, and increase brand loyalty and trust. Big Data presents a phenomenal opportunity. However, the definition of Big Data HAS to be broader than Hadoop.

Big Data promises the following technology solutions to help with this transformation:

  • Single View of Customer (all detailed data in one location)
  • Targeted Marketing with micro-segmentation (sophisticated analytics on ALL of the data)
  • Multichannel Customer Experience (operationalizing back out to all the customer touch points)

Risk Management

Risk management is also critically important to the bank. Risk management needs to be pervasive within the organizational culture and operating model of the bank in order to make risk-aware business decisions, allocate capital appropriately, and reduce the cost of compliance. Ultimately, this means making data analytics as accessible as it is at Yahoo! If the bank could provide a “data playground” where all data sources were readily available with tools that were easy to use…well, let’s just say that new risk management products would be popping up left and right.

Big Data promises a way of providing the organization integrated risk management solutions, covering:

 

  • Financial Risk (Risk Architecture, Data Architecture, Risk Analytics, Performance & reporting)
  • Operational Risk & Compliance
  • Financial Crimes (AML, Fraud, Case Management)
  • IT Risk (Security, Business Continuity and Resilience)

The key is to focus on one use-case first, and expand from there. But no matter which risk use-case you attack first, you will need batch, ad hoc, and real-time analytics.

Operations Optimization

Large banks often become unwieldy organizations through many acquisitions. Increasing flexibility and streamlining operations is therefore even more important in today’s more competitive banking industry. A bank that is able to increase its flexibility and streamline operations by transforming its core functions will be able to drive higher growth and profits; develop more modular back-room office systems; and respond quickly to changing business needs in a highly flexible environment.

This means that banks need new core infrastructure solutions. One example might be a bank reducing loan origination times by using Big Data to standardize its loan processes across all entities. Streamlining and automating these business processes will result in higher loan profitability, while complying with new government mandates.

Operational leverage improves when banks can deliver global, regional and local transaction and payment services efficiently and also when they use transaction insights to deliver the right services at the right price to the right clients.

Many banks are seeking to innovate in the areas of processing, data management and supply chain optimization. For example, in the past, when new payment business needs would arise, the bank would often build a payments solution from scratch to address it, leading to a fragmented and complex payments infrastructure. With Big Data technologies, the bank can develop an enterprise payments hub solution that gives a better understanding of product and payments platform utilization and improved efficiency.

Are you a bank and interested in new Big Data technologies like Hadoop, NoSQL datastores, and real-time stream processing? Interested in one integrated platform of all three?

Jim Kaskade serves as CEO of Austin-based Infochimps, the leading Big Data Platform-as-a-Service provider. Jim is a visionary leader in both large and small company environments with over 25 years of experience building hi-tech businesses, leading startups in cloud computing enterprise software, software as a service (SaaS), online and mobile digital media, online and mobile advertising, and semiconductors from their founding to acquisition.







[New Whitepaper] Real-Time Data Aggregation

Fast response times generate cost savings and greater revenue. Enterprise data architectures are incomplete unless they can ingest, analyze, and react to data in real-time as it is generated. Previously inaccessible or too complex, scalable and affordable real-time solutions are now finally available to any enterprise.


Read Infochimps’ newest whitepaper to learn about Infochimps Cloud::Streams, a proprietary stream processing framework based on four years of experience with sourcing and analyzing both bulk and in-motion data sources. It offers a linearly scalable and fault-tolerant stream processing engine that leverages a number of well-proven web-scale solutions built by Twitter and LinkedIn engineers, with an emphasis on enterprise-class scalability, robustness, and ease of use.

In this whitepaper, you’ll learn:

  • Definitions & History – batch processing, stream processing
  • Comparison of Stream vs. Batch for Selected Use Cases – includes industry use case: aviation
  • Why Cloud::Streams is the leading stream processing framework
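For readers new to the distinction, the difference between batch and stream aggregation can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Cloud::Streams API; the `StreamingCounts` class, the `batch_counts` function, and the event data are made up.

```python
from collections import Counter


def batch_counts(events):
    """Batch: wait for the full dataset, then aggregate in one pass."""
    return Counter(kind for kind, _ in events)


class StreamingCounts:
    """Stream: update the aggregate as each event arrives."""

    def __init__(self):
        self.counts = Counter()

    def ingest(self, event):
        kind, _ = event
        self.counts[kind] += 1
        return self.counts[kind]  # current answer, available immediately


# Made-up events: (kind, timestamp in seconds)
events = [("delay", 1), ("on_time", 2), ("delay", 3), ("delay", 5)]

stream = StreamingCounts()
running = [stream.ingest(e) for e in events]

print(running)
print(stream.counts == batch_counts(events))
```

Both approaches end with the same totals; the streaming version simply has an up-to-date answer after every event instead of only once the whole dataset has been collected.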

Download >>







Infochimps Recognized in Inaugural Big Data 100 List

Infochimps is proud to be named to UBM Tech Channel’s CRN 2013 Big Data 100 list, developed by the CRN editorial team to include “vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses manage Big Data.” The list consists of 3 categories: business analytics, data management, and infrastructure and services.

Infochimps was named within the Big Data infrastructure and services category – identified as 1 out of 25 “IT vendors who can do it all, from data storage hardware and software, to management tools, to business analytics.” We are proud to be recognized alongside other innovative companies such as Amazon Web Services, Oracle, and Rackspace.

Thank you, CRN, for understanding the struggle with the increasing volume, speed and variety of information being generated today, and for identifying Infochimps Enterprise Cloud as a solution to help companies address their Big Data needs.








Image Source: CRN

[Webinar] Measure Twice, Build Once: Real-Time Predictive Analytics

Thurs, May 9 @ 11am PT, 1pm CT, 2pm ET

Measure Twice, Build Once: Hadoop and other Big Data technologies are not solutions to business problems in and of themselves, but they do have the capability of supporting your business goals and impacting your top and bottom lines.  This webinar walks you through essential steps of identifying your business goal and then building the right infrastructure to support it. We will provide use cases of the types of data that should be collected and the real-time, predictive or insightful analytic applications needed to ensure success.

Register for this live webcast and listen to Infochimps CSO and Co-Founder, Dhruv Bansal, and Think Big Analytics Principal Architect, Douglas Moore, share successful use cases and recommendations for building real-time predictive analytics in your enterprise.

Who should attend? This webcast is ideal for CIOs, CMOs, CEOs, Project Managers, Analysts, and IT professionals with expected or current Big Data projects at any stage.

Register Today >>



