The GeekNarrator
Author: Kaivalya Apte
© Kaivalya Apte
Description
The GeekNarrator podcast is a show hosted by Kaivalya Apte who is a Software Engineer and loves to talk about Technology, Technical Interviews, Self Improvement, Best Practices and Hustle.
Connect with Kaivalya Apte https://www.linkedin.com/in/kaivalya-apte-2217221a
Tech blogs: https://kaivalya-apte.medium.com/
Wanna talk? Book a slot here: https://calendly.com/speakwithkv/hey
Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show.
Cheers
106 Episodes
For memberships: join this channel as a member here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join

Exploring Cloud Databases, Scalability, and Simple Engineering with Sam Lambert, CEO of PlanetScale

In this episode of The GeekNarrator podcast, we welcome Sam Lambert, CEO and Co-Founder of PlanetScale, known for creating the world's fastest and most scalable cloud database. Sam shares his insights on databases, operational excellence, and simple engineering. We discuss topics such as scalability, Postgres versus MySQL, and replication. Sam also talks about handling complexity in engineering, the unique features of Vitess, and how PlanetScale achieves high availability. Don't miss this deep dive into the future of cloud databases. Like, share, and subscribe to support the channel!

Chapters:
00:00 Introduction and Episode Overview
01:13 Meet Sam Lambert: Background and Career
02:42 Balancing Work and Social Media
05:48 The Philosophy of Simple Engineering
14:21 The Slotted Counter Pattern at GitHub
18:27 Postgres vs MySQL: Design Flaws and Philosophical Differences
28:58 Sharding and Scaling with Vitess
37:01 Database Branching and Schema Changes
38:50 Common Practices in Startups
39:07 Challenges with Data Branching
40:45 Legal and Ethical Considerations
42:31 Staging Environments vs. Dev Branches
45:26 Trade-offs in Cloud Databases
52:41 Replication and Durability
01:00:02 Ensuring High Availability
01:08:04 Backup Strategies and Testing
01:10:41 Conclusion and Final Thoughts

Learn about PlanetScale: https://planetscale.com/

Don't forget to like, share, and subscribe for more insights!

=============================================================================
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to sign up and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
=============================================================================
Database internals series: https://youtu.be/yV_Zp0Mi3xs

Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN

Stay Curious! Keep Learning!
Summary:
In this captivating episode, we sit down with Joran Dirk Greef, the mastermind behind TigerBeetle, a groundbreaking financial transactions database. Joran shares his journey of innovation, highlighting the challenges and triumphs of creating a system that is not only faster but also safer. Dive into the philosophy of Tiger Style, a unique methodology that emphasizes quality and performance, ensuring that software development is both efficient and effective. Joran's insights into trust, discipline, and the relentless pursuit of excellence offer valuable lessons for anyone in the tech industry. Whether you're a developer, entrepreneur, or tech enthusiast, this episode is packed with inspiration and practical wisdom. Don't miss out on this opportunity to learn from one of the leading minds in software engineering.

Chapters:
00:01:37 Introduction to TigerBeetle
00:02:27 Philosophy of Tiger Style
00:03:38 Challenges in Software Development
00:04:43 Importance of Trust and Quality
00:09:43 Static Allocation in Software
00:16:53 AI in Software Development
00:23:53 Business Philosophy and Innovation
00:31:53 The Future of Software Development
Summary:
In this episode, host Kaivalya Apte interviews Ankit Sultana, a staff engineer at Uber with extensive experience in Apache Pinot, a real-time analytics platform. They discuss the high-level architecture, ingestion processes, and query mechanisms of Apache Pinot. Ankit provides historical context, detailing the evolution of Apache Pinot from its origins at LinkedIn to its widespread adoption. They cover the key components of Pinot, explaining the roles of Pinot servers, brokers, controllers, and the dependency on ZooKeeper. Ankit also explains how data flows into Apache Pinot and the technicalities of its real-time ingestion and querying capabilities.

Chapters:
00:00 Introduction and Episode Overview
03:30 Understanding Apache Pinot
03:49 Apache Pinot's Historical Background
05:20 Real-Time Analytics with Apache Pinot
11:06 Apache Pinot's Architecture and Components
17:05 Tenancy and Data Ingestion in Apache Pinot
30:22 Understanding Real-Time Replication and Consumer Groups
30:52 Pinot's Offset Tracking and Segment Creation
31:59 Handling Server Restarts and Segment Transitions
32:50 Dealing with Kafka Duplicates and Deduplication Features
35:13 Ingestion Process and Mutable vs Immutable Segments
39:18 Memory Management and Segment Flushing
40:10 Advantages of Keeping Mutable Segments Longer
42:21 Introduction to Pinot's Query Engines
42:50 Single Stage Engine: Architecture and Optimizations
54:49 Multi-Stage Engine: Flexibility and Challenges
58:13 Conclusion and Next Steps

Important Links:
* Good high-level overview on Pinot: https://www.youtube.com/watch?v=F8Q_pGIH9yY
* Apache Pinot 101 by Tim: https://www.youtube.com/playlist?list=PLihIrF0tCXdfN6y-twj9KtWaXM1GH4RSe
* Multistage Physical Optimizer, the new optimizer that we built at Uber and open-sourced: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/physical-optimizer
* Multistage Lite Mode: https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/multistage-lite-mode
* Time Series Engine Talk at RTA Summit: https://www.youtube.com/watch?v=kgseiambges
Summary:
In this conversation, Ian discusses the evolution and significance of Unikernels and NanoVMs, emphasizing their potential to enhance security and performance in cloud computing. He explains the historical context of operating systems, the limitations of traditional systems, and how Unikernels offer a streamlined alternative. Ian also highlights the unique features of NanoVMs, their integration capabilities, and the challenges faced in the ecosystem. The discussion concludes with insights into the future of Unikernels and the ongoing developments in the field.

Takeaways:
- Unikernels are a specialized type of operating system designed for cloud environments.
- The evolution of operating systems has led to the need for more efficient solutions like Unikernels.
- Unikernels can significantly reduce security vulnerabilities compared to traditional systems.
- NanoVMs provide a unique approach to Unikernels with a focus on performance and security.
- Integrations with existing tools and libraries are crucial for the adoption of Unikernels.
- The ecosystem around Unikernels is still developing, with many opportunities for growth.
- Unikernels eliminate the need for complex orchestration and management layers.
- The future of Unikernels includes tighter integrations with cloud services and improved developer experiences.
- Security features in Unikernels are designed to address modern threats effectively.
- The potential for Unikernels to transform application deployment is significant, with many untapped possibilities.

Chapters:
00:00 Introduction to Unikernels and NanoVMs
04:24 The Evolution of Operating Systems
11:24 Understanding Unikernels vs. Traditional Systems
17:20 Security Implications of Unikernels
26:17 NanoVMs: Architecture and Unique Features
38:44 Security Concerns in Unikernels
41:05 Integration and Support for GPUs
44:02 Cloud Support and Deployment
45:51 Avoiding Bloat in Integrations
51:54 Developer's Perspective on Unikernels
59:18 Limitations and Future of Unikernels

Important Links:
https://ops.city
https://nanos.org
https://repo.ops.city
https://nanovms.com/dev/tutorials
Summary:
In this conversation, Philipp discusses the innovations behind CedarDB, a database system designed from scratch to optimize performance for modern hardware. He explains the foundational principles of compiling SQL to machine code, the importance of parallel processing, and the challenges of maintaining Postgres compatibility. The discussion also covers the system's approach to handling transactional and analytical workloads, data ingestion processes, query optimization strategies, and future developments including schema evolution and disaggregated storage.

Takeaways:
- CedarDB is built from the ground up to utilize modern hardware effectively.
- The system compiles SQL directly to machine code for performance.
- Parallel processing is a key feature, allowing efficient use of multiple cores.
- CedarDB aims to be Postgres compatible while innovating on performance.
- Transactional workloads are handled efficiently without sacrificing analytical capabilities.
- Data ingestion is optimized for both row-oriented and columnar formats.
- The system uses optimistic concurrency control to manage write conflicts.
- Query optimization leverages statistics to improve join performance.
- Future developments include schema evolution and disaggregated storage.
- CedarDB is designed to be flexible and adaptable for various workloads.

Chapters:
00:00 Introduction to CedarDB and Background of Philipp
05:36 Compiling SQL to Machine Code for Performance
11:25 General Purpose vs. Analytical Databases
16:51 Transactional Workloads and Hybrid Storage Engine
54:29 Understanding B-Tree and Columnar Storage
01:02:18 Data Duplication and Memory Efficiency
01:08:43 Indexing Strategies and B-Tree Optimization
01:15:57 Handling Write Conflicts and Transaction Management
01:24:10 Query Optimization and Join Strategies
01:33:28 Future Developments in Schema Evolution and Storage

Important Links:
CedarDB: https://cedardb.com/
The Umbra research project: https://umbra-db.com/
SQL Query Compilation: http://www.vldb.org/pvldb/vol4/p539-neumann.pdf
Optimistic B-Trees: https://cedardb.com/blog/optimistic_btrees/
Our B-Tree storage engine: https://cedardb.com/blog/colibri/
Read more about Kafka Diskless-topics, KIP by Aiven:
KIP-1150: https://fnf.dev/3EuL7mv

Summary:
In this conversation, Kaivalya Apte and Alexis Schlomer discuss the internals of query optimization with the new project optd. They explore the challenges faced by existing query optimizers, the importance of cost models, and the advantages of using Rust for performance and safety. The discussion also covers the innovative streaming model of query execution, feedback mechanisms for refining optimizations, and the future developments planned for optd, including support for various databases and enhanced cost models.

Chapters:
00:00 Introduction to optd and Its Purpose
03:57 Understanding Query Optimization and Its Importance
10:26 Defining Query Optimization and Its Challenges
17:32 Exploring the Limitations of Existing Optimizers
21:39 The Role of Calcite in Query Optimization
26:54 The Need for a Domain-Specific Language
40:10 Advantages of Using Rust for optd
44:37 High-Level Overview of optd's Functionality
48:36 Optimizing Query Execution with Coroutines
50:03 Streaming Model for Query Optimization
51:36 Client Interaction and Feedback Mechanism
54:18 Adaptive Decision Making in Query Execution
54:56 Persistent Memoization for Enhanced Performance
57:12 Guided Scheduling in Query Optimization
59:55 Balancing Execution Time and Optimization
01:01:43 Understanding Cost Models in Query Optimization
01:04:22 Exploring Storage Solutions for Query Optimization
01:07:13 Enhancing Observability and Caching Mechanisms
01:07:44 Future Optimizations and System Improvements
01:18:02 Challenges in Query Optimization Development
01:20:33 Upcoming Features and Roadmap for optd

References:
- NeuroCard: learned Cardinality Estimation: https://vldb.org/pvldb/vol14/p61-yang.pdf
- RL-based QO: https://arxiv.org/pdf/1808.03196
- Microsoft book about QO: https://www.microsoft.com/en-us/research/publication/extensible-query-optimizers-in-practice/
- Cascades paper: https://15721.courses.cs.cmu.edu/spring2016/papers/graefe-ieee1995.pdf
- optd source code: https://github.com/cmu-db/optd
- optd website (for now): https://db.cs.cmu.edu/projects/optd/

#database #queryoptimization #sql #postgres
Summary:
In this conversation, Nitish Tiwari discusses Parseable, an observability platform designed to address the challenges of managing and analyzing large volumes of data. The discussion covers the evolution of observability systems, the design principles behind Parseable, and the importance of efficient data ingestion and storage in S3. Nitish explains how Parseable allows for flexible deployment, handles data organization, and supports querying through SQL. The conversation also touches on the correlation of logs and traces, failure modes, scaling strategies, and the optional nature of indexing for performance optimization.

References:
Parseable: https://www.parseable.com/
GitHub Repository: https://github.com/parseablehq/parseable
Architecture: https://parseable.com/docs/architecture

Chapters:
00:00 Introduction to Parseable and Observability Challenges
05:17 Key Features of Parseable
12:03 Deployment and Configuration of Parseable
18:59 Ingestion Process and Data Handling
32:52 S3 Integration and Data Organisation
35:26 Organising Data in Parseable
38:50 Metadata Management and Retention
39:52 Querying Data: User Experience and SQL
44:28 Caching and Performance Optimisation
46:55 User-Friendly Querying: SQL vs. UI
48:53 Correlating Logs and Traces
50:27 Handling Failures in Ingestion
53:31 Managing Spiky Workloads
54:58 Data Partitioning and Organisation
58:06 Creating Indexes for Faster Reads
01:00:08 Parseable's Architecture and Optimisation
01:03:09 AI for Enhanced Observability
01:05:41 Getting Involved with Parseable

#database #s3 #objectstorage #opentelemetry #logs #metrics
Summary:
In this conversation, Kaivalya Apte and Rajesh Pandey talk about the engineering behind AWS Lambda, exploring its architecture, use cases, and best practices. They discuss the challenges of event handling, concurrency, and load balancing, as well as the importance of observability and testing in serverless environments. The conversation highlights the innovative solutions AWS Lambda provides for developers, emphasizing the balance between simplicity and complexity in cloud computing.

Chapters:
00:00 Introduction to AWS Lambda
04:36 Use Cases and Best Practices for AWS Lambda
09:34 Event Handling and Queue Management
19:41 Idempotency and Event Duplication Challenges
29:39 Cold Starts and Performance Optimization
34:37 Statelessness and Resource Management in Lambda
42:18 Understanding Micro-VMs and Cold Starts
45:14 Resource Management and Recommendations for Developers
47:04 Scaling and Back Pressure in Serverless Systems
51:33 Cellular Architecture and Fairness in Resource Allocation
55:23 Handling Problematic Events and Poison Pills
01:01:03 Testing and Operational Readiness in Lambda
01:14:11 Preparing for High Traffic Events

References:
Handling Billions of invocations: https://aws.amazon.com/blogs/compute/handling-billions-of-invocations-best-practices-from-aws-lambda/
Firecracker: https://firecracker-microvm.github.io/
AWS Lambda: https://aws.amazon.com/lambda/

Connect with Rajesh:
https://x.com/RPandeyViews
https://www.linkedin.com/in/rajeshpandeyiiit/

#aws #awslambda #serverless #distributedsystems #scalability #reliability
Summary:
In this episode of The GeekNarrator podcast, host Kaivalya Apte interviews Kyle Kingsbury, a renowned expert in database and distributed systems safety analysis. They discuss the world of testing distributed systems, the challenges faced, and common bugs and patterns. Kyle shares insights on the importance of understanding system documentation, the role of formal verification, and the balance between performance and safety in testing. He also provides valuable advice for aspiring engineers in the field of distributed systems.

Chapters:
00:00 Introduction to Kyle Kingsbury and His Work
06:59 Common Bugs in Distributed Systems
12:37 Functional Bugs vs Safety Bugs
17:54 Changes in Testing Over the Years
26:03 False Positives and Negatives in Testing
32:33 The Importance of Experimentation in Testing
39:28 Tools and Technologies for Testing
48:58 The Role of Formal Verification
57:04 Reusability of Tests

Important links:
Distributed systems class: https://github.com/aphyr/distsys-class
Write your own distributed system: https://github.com/jepsen-io/maelstrom
Jepsen Analyses: https://jepsen.io/analyses

Key takeaways:
- Reading documentation is a crucial first step in testing systems.
- Testing distributed systems involves understanding their semantics and guarantees.
- Common bugs often arise from mismanagement of definite versus indefinite failures.
- Testing strategies for cloud-based systems require cooperation with providers.
- Performance testing can reveal unexpected behaviours in systems under stress.
- Formal verification remains a challenging but valuable tool in ensuring system safety.
- The testing process is iterative and requires collaboration with engineering teams.
- Aspiring engineers should immerse themselves in practical experiences to build intuition.

#databasearchitecture #distributedsystems #cloudcomputing #testing #jepsen
Summary:
In this conversation, Kaivalya Apte and Simon Eskildsen talk about vector databases, particularly focusing on TurboPuffer. They discuss the importance of vector search, embeddings, and the challenges associated with building efficient search engines. The conversation covers various aspects such as cost considerations, chunking strategies, multi-tenancy, and performance optimization. Simon shares insights on the future of vector search and the significance of observability and metrics in database performance. The discussion emphasizes the need for practical application and experimentation in understanding these technologies.

Chapters:
00:00 Introduction to Vector Databases
10:34 Understanding Vectors and Embeddings
15:03 Example: Designing a Search Engine for Podcasts
27:53 Scaling Challenges in Vector Search
36:46 Indexing and Querying in TurboPuffer
38:12 Understanding Indexing and Query Planning
45:45 Exploring Index Types and Their Performance
50:27 Data Ingestion and Embedding Retrieval
54:19 Use Cases and Challenges in Vector Search
01:01:22 Metrics and Observability in Vector Databases
01:03:52 Future Trends in Vector Search and Databases

References:
How to build a database on Object Storage? https://youtu.be/RFmajOeUKnE
Turbopuffer: https://turbopuffer.com/
Continuous Recall measurement: https://turbopuffer.com/blog/continuous-recall
Turbopuffer architecture: https://turbopuffer.com/architecture
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member-only videos, exclusive notes and monthly 1:1 with me. Here you can see all the member-only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA

-----------------------------------------------------------------------------
About this episode:
-----------------------------------------------------------------------------
In this conversation, Jacopo and Ciro discuss their journey in building Bauplan, a platform designed to simplify data management and enhance developer experience. They explore the challenges faced in data bottlenecks, the integration of development and production environments, and the unique approach of Bauplan using serverless functions and Git-like versioning for data. The discussion also touches on scalability, handling large data workloads, and the critical aspects of reproducibility and compliance in data management.

Chapters:
00:00 Introduction
03:00 The Data Bottleneck: Challenges in Data Management
06:14 Bridging Development and Production: The Need for Integration
09:06 Serverless Functions and Git for Data
17:03 Developer Experience: Reducing Complexity in Data Management
19:45 The Role of Functions in Data Pipelines: A New Paradigm
23:40 Building Robust Data Solutions: Versioning and Parameters
30:13 Optimizing Data Processing: Bauplan Runtime
46:46 Understanding Control Planes and Data Management
48:51 Ensuring Robustness in Data Pipelines
52:38 Data Quality and Testing Mechanisms
54:43 Branching and Collaboration in Data Development
57:09 Scalability and Resource Management in Data Functions
01:01:13 Handling Large Data Workloads and Use Cases
01:09:05 Reproducibility and Compliance in Data Management
01:16:46 Future Directions in Data Engineering and Use Cases

Links and References:
Bauplan website: https://www.bauplanlabs.com
In this episode of The GeekNarrator podcast, Lalith Suresh, CEO of Feldera, joins us to share insights on incremental view maintenance and its significance in modern data processing.

We discuss the challenges posed by distributed systems, the mathematical foundation of DBSP, and how Feldera's architecture addresses these challenges. Performance optimization, handling late events, the future of stream processing, and the importance of SQL in creating efficient data workflows: it's all in here.

Chapters:
00:00 Introduction to Incremental View Maintenance
06:30 Challenges in Distributed Systems
11:46 Batch Processing vs Stream Processing
16:27 Understanding DBSP: The Mathematical Foundation
27:46 Architecture of Feldera and Data Flow
39:23 Partitioning and Storage Layer in Feldera
42:51 Understanding Co-Design Storage Layers
45:52 Foreground and Background Workers in DBSP
49:16 Tuning Background Workers for Performance
49:41 Synchronous Compute Model and View Propagation
51:35 Zsets and Batch Processing in Stream Workloads
54:00 Data Model Optimization in Feldera
57:22 Handling Late Events and Lateness in Feldera
01:01:18 Watermarks and Lateness Annotations
01:04:20 Error Handling and Idempotency in Feldera
01:11:05 Feldera's Differentiators and Future Roadmap
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member-only videos, exclusive notes and a monthly 1:1 with me.
Here you can see all the member-only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
About this episode:
In this conversation, Alex from Redpanda discusses his engineering background, the challenges faced in reliability engineering, and the journey of building a better streaming system. He covers the importance of understanding latency and performance in engineering systems, Redpanda's market position relative to Kafka, and the complexities of optimizing codebases for better performance. Alex also walks through Redpanda's architecture, focusing on its thread architecture, memory allocation mechanics, and the importance of protocol correctness. He highlights how Redpanda stands out in the data systems landscape by eliminating unnecessary complexity and optimizing performance across the latency spectrum. The discussion also touches on the future of data processing, emphasizing the shift towards agentic workloads and the integration of analytical and operational layers.
Chapters:
00:00 Introduction
11:07 Building a Better Streaming System
19:10 Market Position and Competition
25:06 Optimizing Latency and Performance
32:38 Understanding Complexity in Codebases
33:36 Thread Architecture and Concurrency Models
39:39 Memory Allocation Mechanics
47:31 Protocol Correctness and Optimization Strategies
56:27 Redpanda's Unique Position in Data Systems
01:02:05 The Future of Data Processing and Agentic Workloads
Blogs:
TPC buffers: https://www.redpanda.com/blog/tpc-buffers
https://www.redpanda.com/blog/always-on-production-memory-profiling-seastar
https://www.redpanda.com/blog/end-to-end-data-pipelines-types-benefits-and-process
Like building real stuff?
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to sign up and get 40% off a paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
Link to other playlists. LIKE, SHARE and SUBSCRIBE
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#streaming #kafka #redpanda #c++ #databasesystems #SQL #distributedsystems #memoryallocation #garbagecollection
In this episode, we talk to Søren Schmidt, Co-Founder and CEO of Prisma, about the evolution of Prisma from a backend-as-a-service to a popular ORM and now to Prisma Postgres. He shares insights into the challenges faced during this journey, the importance of user feedback, and the architecture of Prisma Postgres, which leverages micro VMs for performance optimization. The conversation also touches on the complexities of managing data centers and the strategies employed to ensure a seamless user experience. Søren also discusses Postgres snapshots, their impact on performance, and the mechanisms for fault tolerance. He explains how Pulse change data capture works and how Prisma Postgres simplifies database management for users.
Chapters:
00:00 Introduction to Prisma and Its Evolution
03:00 The Journey from ORM to Prisma Postgres
06:00 Simplifying Database Management
09:01 Understanding Prisma Postgres Architecture
12:12 The Role of Accelerate in Query Routing
14:51 Optimizing Query Processing with Micro VMs
18:12 Maintaining Postgres Integrity in a Micro VM Environment
21:07 User Experience and Community Feedback
23:57 Challenges of Data Center Management
27:09 Cold Starts and Performance Optimization
34:30 Understanding Snapshots in Postgres
38:55 Snapshot Mechanisms and Fault Tolerance
44:09 Change Data Capture with Pulse
55:07 Transitioning to Prisma Postgres
58:45 Community and Getting Started with Prisma Postgres
Some blogs worth checking out:
https://www.prisma.io/blog/prisma-postgres-the-future-of-serverless-databases
https://www.prisma.io/blog/cloudflare-unikernels-and-bare-metal-life-of-a-prisma-postgres-query
https://www.prisma.io/blog/announcing-prisma-postgres-early-access
Prisma Postgres relies heavily on the Unikraft project. There is a good introductory talk here: https://www.youtube.com/watch?v=n4wOyAuNhl0
And some very technical papers here: https://unikraft.org/community/papers
The best way to get started with Prisma Postgres is to go straight to https://www.prisma.io/
In this episode, Kaivalya Apte and Frederic Branczyk talk about observability, focusing on continuous profiling and the role of eBPF. They discuss the evolution of profiling techniques, the importance of systematic data collection, and the challenges of keeping overhead low while gathering detailed performance metrics. Frederic shares insights from his extensive experience with Prometheus and Kubernetes, emphasizing the transformative impact of continuous profiling on software performance optimization. The conversation then delves into eBPF (Extended Berkeley Packet Filter) and its applications in profiling and performance analysis: how eBPF extends the kernel safely, the mechanisms of user space profiling, and the handling of process terminations. It also explores memory and network profiling techniques, the challenges of profiling in different programming environments, and the limitations of eBPF in certain use cases. The conversation concludes with valuable resources for those interested in learning more about eBPF and profiling techniques.
Chapters:
00:00 Introduction to Observability and Profiling
01:17 Frederic's Background and Expertise
02:11 The Importance of Continuous Profiling
06:46 The Value of Continuous Profiling
11:20 Understanding Profiling Data
19:09 Data Structures and Performance in Profiling
32:35 The Role of eBPF in Profiling
42:48 Introduction to eBPF and Its Capabilities
48:32 User Space Profiling and Memory Management
51:39 Handling Process Termination and Agent Recovery
55:27 Memory and Network Profiling Techniques
01:01:33 Profiling in Different Programming Environments
01:11:47 Use Cases and Limitations of eBPF in Profiling
01:13:54 Resources for Learning eBPF and Profiling Techniques
In this conversation, Unmesh Joshi discusses the patterns of distributed systems. He emphasizes the importance of understanding the context in which patterns are applied, the need to read code to grasp their implementation, and the common pitfalls that developers face when applying patterns without a clear understanding of the underlying problems.
Chapters:
00:00 Introduction to Distributed Systems and Patterns
05:39 Understanding Patterns in Distributed Systems
19:23 Bridging Theory and Practice in Distributed Systems
28:56 The Role of Developers in Understanding Patterns
31:58 Understanding Patterns in Software Development
40:58 The Human Aspect of Software Design
44:37 Iterative Development and Real-World Applications
49:03 The Future of Patterns in Cloud-Native Systems
55:07 Common Misunderstandings of Distributed Patterns
Interesting quotes:
"Patterns capture wisdom of generations."
"Reading code is the best way to understand."
"Patterns help you see beyond abstractions."
"Understanding patterns helps bridge the gap."
"Expert generalists can operate across verticals."
"There are no simple systems in the cloud era."
"Patterns can add complexity if misunderstood."
"Patterns are always useful within a context."
"Design and development are human activities."
"The deconstruction of databases is happening."
"Paxos is the most misunderstood pattern."
Unmesh Joshi: https://in.linkedin.com/in/unmesh-joshi-9487635
Catalog of Patterns: https://martinfowler.com/articles/patterns-of-distributed-systems/
I hope you liked the episode. If you did, please like, share, and subscribe.
#distributedsystems #patterns #softwarearchitecture #consensus #algorithms #coding #patterns #softwaredevelopment #ThoughtWorks #softwareengineering #cloud #computing #software
In this episode of the Geek Narrator podcast, host Kaivalya Apte interviews Marc Brooker, a distinguished engineer at AWS, about Aurora DSQL. They discuss Marc's journey at AWS, the evolution of Aurora DSQL, and the customer-centric approach that led to its development.
Marc explains the choice of PostgreSQL as the foundation for DSQL, the architecture of the database, and the importance of snapshot isolation and concurrency control. The conversation covers the technical aspects of DSQL, including the write process and how atomicity is maintained. It also goes deep into database design, focusing on fault tolerance, replication strategies, and the role of Firecracker VMs in enhancing scalability. Marc discusses transaction management, the challenges of active-active deployments, and the trade-offs involved in database design. The discussion closes with use cases for Aurora DSQL, including its suitability for microservices and serverless architectures, and scenarios where it may not be the best fit.
Chapters
00:00 Introduction to Aurora DSQL and Marc Brooker's Journey
03:38 The Evolution of Aurora DSQL at AWS
09:24 Customer-Centric Development and Technological Enablers
12:50 Why PostgreSQL? The Choice Behind DSQL
16:39 High-Level Architecture of DSQL
22:07 Understanding Snapshot Isolation and Concurrency Control
28:45 The Write Process and Atomicity in DSQL
38:50 Designing Fault Tolerance in Databases
47:38 Replication and Transaction Commit Strategies
54:35 Active-Active Deployment and Fault Tolerance
01:00:14 Role of Firecracker VM in Scalability
01:09:27 Use Cases and Trade-offs of Aurora DSQL
Marc's Blog: https://brooker.co.za/blog/
Marc on Aurora DSQL: https://brooker.co.za/blog/2024/12/03/aurora-dsql.html
AWS's documentation on Aurora DSQL: https://aws.amazon.com/rds/aurora/dsql/features/
#sql #postgres #databasesystems #aws #awsdevelopers #spanner #google #cockroachdb #yugabytedb #cap #scalability #WAL #DistributedSystems #Cloud #aurora
Hey folks - In this episode we have Jelte with us, the main contributor to the pg_duckdb project, a Postgres extension that adds the power of #duckdb to our beloved #postgresql.
We will try to understand how it works, why it is needed, and what the future of pg_duckdb looks like.
If you love #Postgres or #Duckdb, or just enjoy understanding #database internals, this episode will give you solid insights into Postgres query processing, DuckDB analytics, the Postgres extension ecosystem, and more.
Basics:
pg_duckdb is a Postgres extension that embeds DuckDB's columnar-vectorized analytics engine and features into Postgres. We recommend using pg_duckdb to build high performance analytics and data-intensive applications.
Chapters:
00:00 Introduction to pg_duckdb
03:40 Understanding the Integration of DuckDB with Postgres
06:23 Architecture of pg_duckdb: Query Processing Explained
10:02 Configuring DuckDB for Analytics Queries
15:37 Managing Workloads: Transactional vs. Analytical
21:02 Observability and Debugging in DuckDB
25:58 Data Deletion and GDPR Compliance
30:46 Schema Management and Migration Challenges
33:14 Managing Schema Changes in Databases
35:21 Upgrading Database Extensions
36:33 Enhancing Data Reading Methods
38:33 Future Features and Improvements
45:54 Use Cases for pg_duckdb
50:03 Challenges in Building the Extension
55:25 Getting Involved with pg_duckdb
Important links:
The DuckDB Discord server, which has a pg_duckdb channel inside it: https://discord.duckdb.org/
Repo: https://github.com/duckdb/pg_duckdb
Good first issues: https://github.com/duckdb/pg_duckdb/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
#sql #postgres #databasesystems
In this episode we are talking to Peter and Qian, co-founders of DBOS. The conversation covers the challenges of creating fault-tolerant applications, the architecture of DBOS, and how it addresses reliability at multiple layers.
Chapters:
00:00 Introduction to the GeekNarrator Podcast
00:29 Meet the Co-Founders of DBOS
01:25 The Core Problem: Building Reliable Systems
02:05 How DBOS Solves Reliability Issues
04:29 Understanding DBOS Architecture
06:09 Deep Dive into the DBOS Library
08:36 Postgres and State Management
18:31 Handling Parallel Steps and Performance Concerns
26:00 Observability and Version Control
30:18 Running Multiple Code Versions
30:58 Managing Workflow Versions
32:03 Surgery on Workflow States
33:15 Library Annotations and Durable Execution
34:24 Migrating to the Cloud Version
37:23 Handling Email Workflows
42:41 Transactional Guarantees with Postgres
48:44 Technical Challenges and Multi-Tenancy
54:12 Real-World Use Cases and Benefits
59:45 Conclusion and Final Thoughts
Some important links:
- Main website: https://www.dbos.dev/
- DBOS docs: https://docs.dbos.dev/
- Open-source DBOS Transact libraries:
  - Python: https://github.com/dbos-inc/dbos-transact-py
  - TypeScript: https://github.com/dbos-inc/dbos-transact-ts
Deep Dive into Databases with Peter Zaitsev | The GeekNarrator Podcast
Join host Kaivalya Apte and special guest Peter Zaitsev from Percona on this episode of The GeekNarrator Podcast. They discuss Peter's journey into the world of databases, founding Percona, and the evolution of open source database solutions. Topics include the rise of PostgreSQL, the comparison between MySQL and PostgreSQL, database observability, the impact of cloud and Kubernetes on database management, licensing changes in popular databases like Redis, and career advice for database administrators and developers. Stay tuned for insights on the future of databases, observability strategies, and the role of AI in database management.
00:00 Introduction and Guest Welcome
00:14 Peter's Journey into Databases
04:15 The Rise of PostgreSQL vs MySQL
18:17 Challenges in Managing Database Clusters
24:36 Common Developer Mistakes with Databases
30:59 MongoDB's Success and Future
34:53 Redis and Licensing Changes
37:07 Elastic's License Change and Its Impact
38:25 Redis Fork and Industry Collaboration
40:27 Kubernetes and Cloud-Native Databases
47:47 Challenges in Database Upgrades and Migrations
54:58 Load Testing and Observability
01:09:02 Future of Database Administration and Development
01:15:13 Conclusion and Final Thoughts
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.




