How Redpanda Extracts Business Value from Data Events with Alex Gallego
Alex Gallego, CEO & Founder of Redpanda, joins Corey on Screaming in the Cloud to discuss his experience founding and scaling a successful data streaming company over the past 4 years. Alex explains how it’s been a fun and humbling journey to go from being an engineer to being a founder, and how he’s built a team he trusts to hand the production off to. Corey and Alex discuss the benefits and various applications of Redpanda’s data streaming services, and Alex reveals why it was so important to him to focus on doing one thing really well when it comes to his product strategy. Alex also shares details on the Hack the Planet scholarship program he founded for individuals in underrepresented communities.
Alex Gallego is the founder and CEO of Redpanda, the streaming data platform for developers. Alex has spent his career immersed in deeply technical environments, and is passionate about finding and building solutions to the challenges of modern data streaming. Prior to Redpanda, Alex was a principal engineer at Akamai, as well as co-founder and CTO of Concord.io, a high-performance stream-processing engine acquired by Akamai in 2016. He has also engineered software at Factset Research Systems, Forex Capital Markets and Yieldmo; and holds a bachelor’s degree in computer science and cryptography from NYU.
- Redpanda: https://redpanda.com/
- Twitter: https://twitter.com/emaxerrno
- Redpanda community Slack: https://redpandacommunity.slack.com/join/shared_invite/zt-1xq6m0ucj-nI41I7dXWB13aQ2iKBDvDw
- Hack The Planet Scholarship: https://redpanda.com/scholarship
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Tired of slow database performance and bottlenecks on MySQL or PostgresSQL when using Amazon RDS or Aurora? How’d you like to reduce query response times by ninety percent? Better yet, how would you like to get me to pronounce database names correctly? Join customers like Zscaler, Intel, Booking.com, and others that use OtterTune’s artificial intelligence to automatically optimize and keep their databases healthy. Go to ottertune dot com to learn more and start a free trial. That’s O-T-T-E-R-T-U-N-E dot com.
Corey: Welcome to Screaming in the Cloud, I’m Corey Quinn, and this promoted guest episode is brought to us by our friends at Redpanda, which I’m thrilled about because I have a personal affinity for companies that have cartoon mascots in the form of animals and are willing to at least be slightly creative with them. My guest is Alex Gallego, the founder and CEO over at Redpanda. Alex, thanks for joining me.
Alex: Corey, thanks for having me.
Corey: So, I’m not asking about the animal; I’m talking about the company, which I imagine is a frequent source of disambiguation when you meet people at parties and they don’t quite understand what it is that you do. And you folks are big in the data streaming space, but data streaming can mean an awful lot of things to an awful lot of people. What is it for you?
Alex: Largely it’s about enabling developers to build applications that can extract value out of every single event, every click, every mouse movement, every transaction, every event that goes through your network. This is what Redpanda is about. It’s like, how do we help you make more money with every single event? How do we help you be more successful? And you know, happy to give examples in finance, or IoT, or oil and gas, if it’s helpful for the audience, but really, to me, it’s like, okay, if we can give you the framework in which you can build a new application that allows you to extract value out of data, every single event that’s going through your network, to me, that’s what streaming is about. Largely, it’s, you know, data contextualized with a timestamp and, largely, sort of a database of event streaming.
Corey: One of the things that I find curious about the space is that usually, companies wind up going one of two directions when you’re talking about data streaming. Either they’re, “Oh, just send it all to us and we’ll take care of it for you,” or otherwise, they more or less ship something that you run in your own environment. In the olden days of data centers, that usually resembled a box of some sort. You’re one of those interesting split-the-difference companies where you offer both models. Do you find that one of those tends to be seeing more adoption these days or that there’s an increasing trend toward one direction or the other?
Alex: Yeah. So, right now, I think that, to me, the future of all these data-intensive products, whether you’re a database or a streaming engine, will be shaped simply by the cost of network transfer between the hyper clouds and your accounts. Sending a gigabyte a second of data between, let’s say, you know, your data center and a vendor, it’s just so expensive that at some point, from just a cost perspective, like, running the infrastructure, it’s in the millions of dollars. And so, running the data inside your VPC, it’s sort of the next logical evolution of how we used to consume services. And so, I actually think it’s just the evolution: people would self-host because of costs and then they would use services because of operational simplicity. “I don’t want to spend team skills and time building this. I want to pay a vendor.”
And so, BYOC, to be honest (which is what we call this offering), it was about [laugh] sidestepping the costs of being stuck in the hyper clouds, whether it’s Google or Amazon, where you’re paying egress and ingress costs and it’s just so expensive, in addition to this whole idea of data residency or data sovereignty and privacy. It’s like, yeah, why not both? Like, if I’m an engineer, I want low latency and I don’t want to pay you to transfer this thing to the next rack. I mean, my computer’s probably, like, you know, a hundred feet away from my customer’s computer. Like, why [laugh] is that so complicated? So, you know, my view is that the future of data-intensive products will be in this form where, like, data planes are actually owned by companies, and then you offer that as a Software as a Service.
Corey: One of the things that catches an awful lot of companies with telemetry use cases, or data streaming as another example of that, by surprise when they start building their own cloud-hosted offering is that they’re suddenly seeing a lot more cross-AZ data charges than they would have potentially expected. And that’s because unlike cross-region or the really expensive version of this with egress, it’s a penny in and a penny out per gigabyte in most AWS regions. Which means that that isn’t also bound strictly to an AWS organization. So, you have customers co-located with you and you’re starting to pay ingress charges on customers throwing their data over to you. And, on some level, the most economical solution for you is, well, we’re just going to put our listeners somewhere else far away so that we can just have them pay the steep egress fee but then we can just reflect it back to ourselves for free.
And that’s a terrible pattern, but it’s a byproduct of the absolutely byzantine cross-AZ data transfer pricing, in fact, all of the data transfer pricing that at least AWS tends to present. And it shapes the architectural decisions you make as a result.
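As a rough back-of-the-envelope sketch of the arithmetic behind these numbers: the snippet below multiplies a sustained transfer rate by the "penny in and a penny out per gigabyte" cross-AZ rate Corey mentions. The function name, the `cross_az_copies` parameter, and the assumption of a constant 1 GB/s stream are illustrative only, and actual pricing varies by region and over time.

```python
SECONDS_PER_DAY = 86_400


def cross_az_transfer_cost_usd(gb_per_second, days, cross_az_copies=1,
                               cents_out_per_gb=1, cents_in_per_gb=1):
    """Estimate cross-AZ data transfer cost for a sustained stream.

    Assumes a flat rate of one cent out plus one cent in per gigabyte
    (the figure quoted in the conversation), billed on every copy of
    the data that crosses an availability-zone boundary.
    """
    total_gb = gb_per_second * SECONDS_PER_DAY * days * cross_az_copies
    # Work in whole cents to avoid floating-point drift, then convert.
    total_cents = total_gb * (cents_out_per_gb + cents_in_per_gb)
    return total_cents / 100


# Hypothetical workload: a steady 1 GB/s stream.
monthly = cross_az_transfer_cost_usd(1, days=30)
yearly = cross_az_transfer_cost_usd(1, days=365)
# Same stream, but assuming two copies cross an AZ boundary (for example,
# replicas placed in two other zones).
yearly_two_copies = cross_az_transfer_cost_usd(1, days=365, cross_az_copies=2)

print(f"30 days, one copy:   ${monthly:,.0f}")
print(f"365 days, one copy:  ${yearly:,.0f}")
print(f"365 days, two copies: ${yearly_two_copies:,.0f}")
```

Under these assumptions, a single cross-AZ copy of a 1 GB/s stream comes to about $51,840 for 30 days and $630,720 for a year; two cross-AZ copies pass $1.2 million a year, which is consistent with the "millions of dollars" order of magnitude Alex describes.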
Alex: You know, as a user, it just didn’t make sense. When we launched this product, the number of people that said, like, “Why wouldn’t you charge for, you know, effectively renting [unintelligible 00:05:14], and giving a markup to your customers?” We don’t add any value on that, you know? I think people should really just pay us for the value that we create for them. And so, you know, for us, competing with other companies is relatively easy.
Competing with MSK is harder because MSK just has this, you know, muscle where they don’t charge you for some particular network traffic between you. And so, it forces companies like us that are trying to be innovative in the data space to, like, put our services in that so that we can actually compete in the market. And so, it’s a forcing function of the hyper clouds having this strong muscle of being able to discount their services in a way that companies just simply don’t have access to. And then, you know, it becomes, for the others, latency and sovereignty.
Corey: This is the way that effectively all of AWS’s first-party offerings of other things go. Replication traffic between AZs is not chargeable. And when I asked them about that, they say, “Oh, yeah. We just price that into the cost of the service.” I don’t know that I necessarily buy that because if I try and run this sort of thing on top of EC2, it would cost me more than using their crappy implementation of it, just in data transfer alone for an awful lot of use cases.