Episode 54: Justin Borgman, CEO of Starburst, the Company Behind the Presto Project
Intro
Mike: Hello, and welcome to Open Source Underdogs. I’m your host Mike Schwartz, and this is episode 54, with Justin Borgman, Chairman, CEO, and Co-Founder of Starburst, the company behind the Presto Data Access Project.
Before we get started, I have a quick request – we all want to help open-source founders and startups. I make the podcast, but I need your help to get the word out, so tell your friends, post on LinkedIn, tweet out a link, post on Hacker News, or follow me and share one of my posts on LinkedIn, whatever you think makes sense, go for it.
One of the themes of Machiavelli’s The Prince is virtu e fortuna: virtu meaning excellence in your domain, and fortuna meaning luck, whether good or bad. I really like how the story of Starburst exemplifies this 500-year-old insight.
Justin has a ton of domain virtu. He has deep technical knowledge, but he’s also on the lookout to harness fortuna. He’s one of the few podcast guests to acknowledge it. And Starburst earns its name because it’s one of the most stellar open-source business success stories I’ve heard in the last few years.
There are so many great insights in this episode, a lot to think about. So, without further ado, let’s get on with the interview.
What Is Presto?
Mike: Justin, thanks for joining the podcast today.
Justin: Hey, Mike, super glad to be with you.
Mike: Before we dive into the business stuff, I find it’s helpful to talk a little bit about the technology. Can you start by giving a brief history of the Presto project? What it’s good at, and how the community coalesced around it?
Justin: It was really back in 2012 that four developers at Facebook, Martin, Dain, David, and Eric, came together to create a new infrastructure project that would be a faster way of querying data at Facebook. Facebook, of course, collects massive amounts of data, hundreds of petabytes worth of data, and needed a faster alternative to a prior project that they had also developed, called Hive.
Hive was a SQL engine for Hadoop, and it just wasn’t fast enough. So, Presto was created to be a faster means of accessing that data. But it has one really important differentiation in addition to the speed, which is the ability to access data anywhere. So, it’s like a database without storage – that’s kind of one way to think about it.
So, it looks at storage in other systems, which could be Hadoop, it could be S3 in AWS, it could be a traditional database, like Oracle, or Teradata, or Snowflake. And regardless of where that data lives, Presto can reach it, query it, and deliver SQL-based analytics.
So, that’s kind of what makes it special, is the ability to access the data everywhere. And that’s gained particular momentum, I would say more recently, as many large enterprises have data silo problems, where they have data in a bunch of different databases, and are now perhaps moving to the Cloud in some fashion.
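To make the "database without storage" idea concrete, here is a minimal sketch in Python. It is not Presto and does not use Presto's API: the standard-library sqlite3 module stands in for the query engine, and two attached in-memory databases stand in for independent storage systems. The names lake, warehouse, events, users, build_engine, and clicks_per_user are hypothetical and chosen only for illustration. What it shows is the general shape of a federated query, where a single SQL statement joins tables that live in separate stores by qualifying each table with the name of its source, similar in spirit to Presto's catalog.schema.table naming.

```python
import sqlite3


def build_engine():
    """Return a connection that plays the role of the query engine."""
    conn = sqlite3.connect(":memory:")
    # Two separate in-memory databases stand in for independent stores
    # (hypothetical names: a data lake and a data warehouse).
    conn.execute("ATTACH DATABASE ':memory:' AS lake")
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("CREATE TABLE lake.events (user_id INTEGER, clicks INTEGER)")
    conn.execute("CREATE TABLE warehouse.users (user_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO lake.events VALUES (?, ?)",
        [(1, 10), (2, 5), (1, 7)],
    )
    conn.executemany(
        "INSERT INTO warehouse.users VALUES (?, ?)",
        [(1, "ada"), (2, "grace")],
    )
    return conn


def clicks_per_user(conn):
    """Join data from both stores in one SQL statement."""
    query = """
        SELECT u.name, SUM(e.clicks) AS total_clicks
        FROM lake.events AS e
        JOIN warehouse.users AS u ON u.user_id = e.user_id
        GROUP BY u.name
        ORDER BY u.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(clicks_per_user(build_engine()))
```

In a real deployment the two sources would be different systems reached through connectors rather than two SQLite databases in one process, but the point for the analyst is the same: one query, written once, regardless of where each table lives.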
Mike: And if I’m not mistaken, high concurrency is one of the areas that make sort of this data access plane different?
Justin: Yes, exactly, it’s very fast, and can support high concurrency. And in a lot of ways, this technology was sort of, I like to say, built in reverse, in the sense that it was tested at ridiculous scale from day one. You know, very often, when you start something new, you don’t really know how it’ll work at scale until you get people using it. But because it was really born out of the internet companies, and Facebook, Uber, Airbnb, and Netflix were all early adopters of the technology, it was really tested at scale, and as a result delivers great performance and concurrency.
Origin Story
Mike: Starburst is not your first company. You were part of the team at a company called Hadapt that sold to Teradata after about three and a half years, I think.
Justin: Yep.
Mike: How did that experience lead you to Presto?
Justin: In a lot of ways, this is really a continuation of that journey that began 10 years ago. So, it was 2010 when I started Hadapt. Hadapt was a spin-out actually from Yale University and the computer science department. There was some research called HadoopDB, which was pretty pioneering research at the time, in terms of thinking about Hadoop as a data warehousing solution, and being able to deliver fast SQL analytics on top of Hadoop.
So, we spun that out, raised Venture Capital, built that business over nearly four years, as you mentioned, and then sold it to Teradata. We had ups and downs, definitely lessons learned through that experience. And I think, really, my discovery of Presto after arriving at Teradata in 2014 was kind of an exciting opportunity to reimagine the strategy that we had with Hadapt.
So, Hadapt was the SQL engine for Hadoop, and Presto is a SQL engine for anything, essentially; it allows you to access data anywhere. It was an opportunity to basically take all the lessons learned from the first experience and start to apply them over again.
It was actually my team from Hadapt that ended up contributing a tremendous amount of software to Presto, and working with the guys at Facebook who created it, to really make it an enterprise-grade piece of technology. And I think, as we started to see Presto get more and more capable, and see more and more people use it, that was what created the idea in our head that maybe there was a business to be formed around this.
Community Engagement
Mike: It’s a really interesting opportunity, and I can’t actually think of another example like it, but when I’m talking about open source, I sometimes talk about three types of open-source companies. One would be volunteer, where a bunch of guys or girls get together and write some piece of software that they love, but not necessarily for a business.
And then, I talk about corporate open source, where there’s some piece of software that a company funds, but it’s not their core business, and they realize it makes sense for them to collaborate, like Kubernetes and Google, let’s say. And then there are these pure-play open-source companies, where the company behind it is developing it, and they’re the main contributors.
And so, lots of great open-source projects come out of this corporate open-source area. The podcast is mostly focused on pure-play because we’re trying to help entrepreneurs and founders use open source as part of their business model. But you’ve sort of, like, created a very interesting situation, where you have a mix of corporate and pure-play because you’re benefiting from not just the community, but, really, Facebook is a big contributor to the project too; I heard almost 50/50. So, how’s that really evolved, and how do you continue to encourage this very symbiotic relationship?
Justin: You’re right. Presto has a very interesting history to it, an interesting journey. It started as a small project at Facebook. When we got involved at Teradata, we were able to apply a few million dollars a year of R&D budget into advancing that as well. And then, of course, you’ve got a few other companies contributing also along the way.
And, as a result, all of that kind of accelerates the development of the project. And I think that maybe what’s most unique here is not only that Facebook created great infrastructure software as a byproduct of their business – they’ve certainly done that before – but rather that there was kind of a commercial partner very early on, with myself and my team at Teradata thinking about the commercial applications of this.
So, you know, back in 2014, Presto was still in its early days. Facebook wasn’t trying to monetize it, obviously; that’s not their business. But we were already thinking about how this could be used by Fortune 500 customers, and what difference this could make to their business. And I think that led to its very enterprise-applicable evolution, and set us up really well to eventually commercialize this in 2017, when we left Teradata. The creators of Presto joined us from Facebook, and we went off on our way to build this business.
Idea Incubation
Mike: So, you were working on Presto while you were at Teradata. Did Teradata ask for any equity, or how did that work when you told Teradata, “We want to start this company,” basically while working at Teradata? Like, what was that like?
Justin: Yeah, well, what was interesting about that – and I guess just to set the context, I think Teradata, from 2014, when they acquired my company through to probably today, has gone through various iterations of kind of rethinking their overall strategy, in terms of how they evolved into this next generation of sort of Big Data platforms. Because they had great success in the ‘80s, ‘90s, and early 2000s, as this kind of monolithic data warehouse, where you would ingest everything and store it in one place.
But obviously that became very expensive over time. And the appliance model, hardware and software combined, wasn’t necessarily set up for this future as people move to the cloud. So, they’ve gone through a lot of iterations. And it was really in that iterative process, where they weren’t really clear where they wanted to go, t




