ODSC's Ai X Podcast

Open Table Formats Reshaping the Data Industry: A Deep Dive with Ryan Blue

Update: 2024-03-18

Description

Explore how open table formats are transforming the data industry. Ryan Blue, co-creator of Apache Iceberg, takes a deep dive into the unprecedented change that comes from enabling data warehouses to share storage, and how it will help shape the future.


Ryan Blue is a pivotal figure in the world of data engineering. He has made substantial contributions to data infrastructure at Netflix and Cloudera, and now at Tabular, a company he co-founded.


Ryan is also an active member of the Apache Software Foundation and a committer in the Apache Parquet, Spark, Avro, and Iceberg communities. His leadership in developing Apache Iceberg has helped transform data lakes into more structured and reliable data environments, influencing how companies scale their data operations in the cloud.


Sponsored by: https://odsc.com/ 

Find more ODSC lightning interviews, webinars, live trainings, certifications, bootcamps here – https://aiplus.training/ 


Questions: 


1. An overview of data warehouses, data lakes, and data lakehouses

2. Creation story behind Apache Iceberg and design philosophy

3. The Open Table Format

4. Bringing the project to the Apache Software Foundation (ASF)

5. How has "open" benefitted Apache Iceberg?

6. The underlying architecture of the open table format and Apache Iceberg 

7. How do open table formats optimize query performance?

8. How open table formats maintain consistency and isolation in distributed environments (ACID)

9. Schema evolution with open table formats

10. Time travel and its applications

11. The cost of traditional data formats and optimizing storage costs for data lakes

12. Challenges and considerations organizations should be aware of when adopting open table formats

13. Open table formats for machine learning and ML workflow pipelines

14. How open table formats are reshaping the data industry

15. How are open table formats improving data management for organizations? Can

16. Ensuring a successful implementation of open table formats

17. What’s next for open table formats and Apache Iceberg? 


Some useful links: 


Visit the Apache Iceberg website for documentation, guides, and community resources: https://iceberg.apache.org/


Learn more about Tabular here - tabular.io


The Case for Independent Storage - https://tabular.io/blog/the-case-for-independent-storage/


Iceberg in Modern Data Architecture - https://tabular.io/blog/iceberg-in-modern-data-architecture/
