Episode 27: Enhancing RAG-Based Gen AI Applications with Unstructured Data
Description
Today we join Maria Khalusova, Staff Developer Advocate at Unstructured.IO, to discuss how companies can unlock their unstructured data to get better results from their large language models. We talk about how unstructured data can improve the performance of RAG applications, RAG vs. fine-tuning, data chunking, multimodal models, and more.
AWS Hosts: Nolan Chen & Malini Chatterjee
Unstructured Enterprise Platform beta signup:
https://unstructured.io/platform
Embedding models MTEB Leaderboard:
https://huggingface.co/spaces/mteb/leaderboard
2019 Deloitte report (source of the statistic that only 18% of organizations were using unstructured data):
https://www2.deloitte.com/us/en/insights/topics/analytics/insight-driven-organization.html
MIT Sloan article (source of the statistic that 80% of data is unstructured):
https://mitsloan.mit.edu/ideas-made-to-matter/tapping-power-unstructured-data
Papers showing RAG outperforming fine-tuning:
https://arxiv.org/abs/2312.05934
https://arxiv.org/abs/2401.08406
Email Your Feedback: rethinkpodcast@amazon.com