ARN

Profile: Cloudera co-founder Mike Olson

Talks on his company’s meteoric rise, the Internet of Things and the future of Big Data
Mike Olson, Cloudera CSO and co-founder

Having worked with such database legends as Mike Stonebraker at Berkeley in the 1980s, through to working on Oracle databases and kicking off the first ‘true’ Hadoop specialised Big Data company with Facebook, Google and Yahoo luminaries, Mike Olson is a true Big Data visionary.

Mike Olson sat down with ARN to discuss how he founded the leading Hadoop company in the world, the future of Big Data and the Internet of Things, and the social change they can bring.

Tell us how you got started in IT...

Born in Minneapolis, raised in KC Missouri. Dad worked for a hospital firm. I still consider myself to be Midwestern. I don’t know what the Australian equivalent is, but we’re nice people. We work hard, we’re honest, and not very full of ourselves. I go back every year to visit my family, and I still love it.

When I was growing up, my stepdad bought an Apple II computer. Serial number 125 – built by the Steves in a garage. That’s how I learned to program, around 1976-77. I knew that I wanted to go to college and study electrical engineering, so I applied to the University of California, Berkeley and a few other schools. Berkeley had (and still has) a great engineering program and, most importantly, it was a long way away from home. That was 1979, and I’ve lived there ever since.

During your time at Berkeley, you worked with some of the early legends in your field, how did that come about?

Well I got a summer job working for a guy called Bob Fabry, so I was working on the Berkeley Unix Project, but in no interesting way. Just making tape backups and whatnot.

Coming up on 1982, I was a good hacking-in-the-labs kinda guy, but a middling student. I was worried about declaring a major, so I decided to save my money and take off to see Europe in the spring. I told myself I’d declare my major and get serious when I came back.

I ran out of money during my European travels, when I happened to be in Amsterdam. I was too embarrassed to call my parents and ask for help, so I got myself a job cooking in a Mexican restaurant. It was a fantastic city, I was having a fantastic time, and I ended up staying until 1985.

A Mexican chef? That’s quite a detour, right there…

Yes, I did go back and got a job with a database company with some friends in 1986. In 1988 I finally went back to Berkeley as an undergraduate – my boss basically forced me. She dragged me down there to meet Mike Stonebraker, and he gave me a job.

So from 1988-1991 I was working on the Postgres database – it’s well known these days, Stonebraker’s own version of Ingres. He convinced me to stay around and get my master’s, which I got in 1992. I stayed one more year to work on my PhD, and then Stonebraker recruited the entire research team, except me, to work at Illustra. I was doing my PhD and finally realised I didn’t actually like research – I liked my team. So I dropped out of Berkeley for the last time.

I joined his company, Illustra, in 1993. That was bought by Informix in 1995. I bumped around at Informix for a bit, then ended up at Berkeley DB, a company my friends started in 1998. We had eight years of success with no outside investment before Oracle acquired it in 2006.

So you have this extensive background in databases, how did you make the transition to Hadoop, and ride the Big Data wave?

I met some key people: from Facebook, Jeff Hammerbacher, who was driving their data strategy; Yahoo’s Amr Awadallah, who was running all their Big Data back-end infrastructure; and Christophe Bisciglia from Google, doing their evangelism and new systems management.

All three of them were really excited about Hadoop, and so we banded together to form Cloudera. Our timing was really good. Before anyone else had realised that Hadoop was going to be a big deal, we kind of got everyone that was really thinking about it together in one room.

We took our initial funding round from Ping Li, at Accel Partners. He had decided he wanted to back a Hadoop company, and he really was the only investor thinking that way.

Then the global financial markets totally cratered. Which sounds like a bad thing, but it wasn’t for us. What it meant was nobody could raise venture capital for anything – for a good 12 months. We fortunately already had $5 million in the bank – we were off to the races, and we had no competition at all. It basically gave us a year uncontested to get to know customers, to explore our strategy, and to build the product. It has been an absolutely unbelievable seven years.

From there, our story is reasonably well attested – lots of growth. Only one time in your career do you get a chance to do something like this. I just think we were incredibly lucky to be where we were, when we were, and to have seized the opportunity.

The change in the industry over the past seven years has just been breathtaking.

So why specifically Hadoop, when so little was known about it at the time?

Look, Hadoop back then was a technology that very few people knew much about, outside of Google. Google invented this Hadoop technology back in the early 2000s, and really it was to sell more advertisements. That’s the thinking behind it.

It turned out to be transportable to more traditional industries, and flexible enough to capture all of the data that is being generated. Fast forward to today, and we’re talking to giant telecommunications companies, retailers and financial services players both here and overseas. The use cases are just amazing. It’s hard to believe that data management is now sexy.

The explosion in data has led to a dramatic expansion in what you can do with it.

It now drives business value, as well as some pretty compelling social returns. Hospitals are now delivering better care for patients, getting better results for chronic disease at lower cost. Energy is being produced and distributed more cheaply and effectively, and people are using less of it.

All of this is because we can analyse data on consumption and production. We’re able to explore how disease sets in and how it progresses, and to reason about and drive value from that data.

What do you mean by ‘social returns’?

For example, patients at paediatric clinics have only one way to express distress – crying. Nobody was capturing the sound. Simply by capturing that aural environment, including the kids’ cries, they can correlate that with all the rest of the data they have got. They can measure the kids’ progress over time as they get treated, and they can then reason about the quality of outcome based upon each baby’s care.

The nurses in this hospital went nuts when they realised what data was available to them right now. They now have data about babies’ stress, and can work on solutions.

Yes, of course Big Data is going to help with fraud and financial services, and we’re going to optimise our supply chains, and the rest. But I don’t often get to work on problems like this – I mean, who’s not rooting for the babies in this story?

We’re going to drive social change, not just business value. And we’re not even getting started yet.

You’re referring to the Internet of Things here, right?

Yes, hospital sensors. Building sensors, their temperature gauges, power meters, their elevators – all sensors that will be connected in the next five years. Traffic lights, speed cameras. That’s when Big Data is really going to happen.

So this is what you all wanted to chase at Cloudera?

The key driver for all of us [the founders] was the conviction that what had happened with Google and Facebook, namely this explosion of information driven by machines, was going to happen to banks, insurance companies, hospitals, and other traditional non-IT enterprises.

We have a very strong presence in a number of very high value verticals. Financial services, telecommunications, retail, healthcare, energy. Those are spaces where we have seen customers adopt the platform and do really innovative things.

In the old days, humans made data. For example, you buy something and a transaction record happens. Nowadays, you and I don’t have to be involved. Machines talk to machines, sensors in buildings report to machines.

Think about how modern stock market trading is done these days. The vast majority of the volume is program trading. Machines trading with machines. Capturing and analysing that volume of data from those machines…

Where to next?

The pace of innovation in the data analysis market is absolutely breathtaking right now.

Machine learning is an area of enormous investment. These days you’re seeing companies like Google and Baidu buying deep learning companies that do neural networking. I think we’re still going to see an ongoing explosion of activity there.

Even just from a cost saving perspective, if I leave a petabyte of data somewhere, I want to know that I can do even more with it next year than I can this year. What Hadoop was 5 years ago is very different to the platform you see today.

What’s your channel strategy?

We are 70 per cent direct sales today – but we obviously have our partnerships.

Let’s be honest, no one rolls out a Cloudera in a bare naked datacentre. There’s always HP or Cisco gear. There’s warehousing, document analysis. The relationships with those big vendors are all good.

We’re pleased with the team we have hired here, and we know it isn’t big enough. And we know we can’t run this region from the US. Our recent hire, Richard Jones, is brilliant, and has great experience in the local ecosystem. We have 32 staff locally, excluding Intel.

That’s a good point, Intel has been a key partner for Cloudera for a while. How did that come about?

Our partnership with Intel allows us to look at their chip roadmap for the next 10 years, so that as soon as the silicon hits the street, our software can take advantage of it. Intel’s CIO Kim Stevenson sits on our board, and her company owns 18 per cent of us.

We kind of raised $740m by accident, without having to go to market.

So are you looking at an IPO? Or an outright acquisition? IBM has made a huge shift into data analytics through Watson et al – are they a threat?

It would be disastrous if IBM swooped in and bought the company. If IBM did make a move, Kim would recuse herself, saddle up and wade into the fray. That way, Intel only has to buy 82 per cent of the company, while IBM has to buy 100 per cent – that gives us insurmountable protection from aggressive acquisitions.

When you have that kind of cash you can buy companies as need be. If we do have an IPO one day, it won't be a fundraiser. We’re in a position where we can invest in our platform, in the industry, and in growth, outside the glare of the public market.

We are the largest vendor in this space. By every meaningful topline metric, we are at least a factor of two bigger than the other guys. If you take out our shareholders, then that number becomes five. We’re outperforming the metrics we’re tracking. We’re very pleased with where we are.

It's not about us vs IBM, or Pivotal. We’ve all got to realise, we’re still just building this industry.

Allan Swann is the Editor of ARN, published by IDG Communications Australia. Follow Allan on Twitter @allanswann, and at Google+.