AWS re:Think Podcast
Episode 27: Enhancing RAG based Gen AI Applications with Unstructured Data

Update: 2024-07-02

Description

Today we join Maria Khalusova, Staff Developer Advocate at Unstructured.IO, to discuss how companies can unlock their unstructured data to get better results from their large language models. We talk about how unstructured data can improve the performance of RAG applications, RAG vs. fine-tuning, data chunking, multimodal models, and more.

AWS Hosts: Nolan Chen & Malini Chatterjee

Unstructured Enterprise Platform beta signup: 

https://unstructured.io/platform


Embedding models MTEB Leaderboard: 

https://huggingface.co/spaces/mteb/leaderboard


2019 Deloitte report (source of the statistic that only 18% of organizations were using unstructured data):

https://www2.deloitte.com/us/en/insights/topics/analytics/insight-driven-organization.html


MIT Sloan article (source of the statistic that 80% of data is unstructured):

https://mitsloan.mit.edu/ideas-made-to-matter/tapping-power-unstructured-data


Papers showing RAG outperforming fine-tuning: 

https://arxiv.org/abs/2312.05934

https://arxiv.org/abs/2401.08406


Email Your Feedback: rethinkpodcast@amazon.com
