Meta (Llama 2) vs OpenAI
Description
On this episode: Llama, AI models, the weights leak, comparison, usage, Meta vs OpenAI, data management by Meta, open source AI, AI research groups, instruction fine-tuning, RLHF, hosting Llama, AWS Bedrock, Azure AI models, local fine-tuning, guardrails, ChatGPT, GPT-4, GPT-3, GPT-3.5
Watch Using AI on YouTube (and see our daft AI-generated background images):
https://www.youtube.com/@genieai
Links:
- Llama Access Form (meta.com): https://ai.meta.com/resources/models-and-libraries/llama-downloads/
- Llama 2 on Hugging Face: https://huggingface.co/meta-llama
- Llama 2 7B Chat on Hugging Face: https://huggingface.co/spaces/huggingface-projects/llama-2-7b-chat
- Meta used copyrighted books for AI training despite its own lawyers' warnings, authors allege: https://www.reuters.com/technology/meta-used-copyright-ignored-books-ai-training-despite-its-own-lawyers-warnings-authors-2023-12-12/
- Meta and IBM's open source AI partnership, and its exclusion of OpenAI, Google, and Microsoft: https://www.theguardian.com/technology/2023/dec/05/open-source-ai-meta-ibm
- Yann LeCun's social media attack on OpenAI, Google, Microsoft etc.: https://www.reddit.com/r/technology/s/IumG20ZOKz
- The Llama model weights leak: https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse
- “We have no moat” - Google AI researcher: https://www.semianalysis.com/p/google-we-have-no-moat-and-neither
- Restrictions on using Llama: https://spectrum.ieee.org/open-source-llm-not-open
Welcome to another episode of Using AI. I'm your usual host, Alex Denne, and today I'm joined by Alex Pap and Nitish Mutha (founder of the legaltech company Genie AI).
We start by introducing Llama and discussing the incident in which its model weights leaked online. We then compare Llama to other AI models and explain how to use it. The conversation turns to the Meta vs OpenAI rivalry, shedding light on the differences between the two companies and their impact on the space. We also discuss how Meta manages data, and how it could actually come up trumps on both privacy strategy and non-copyrighted, multilingual training data.
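For listeners who want to try Llama 2 themselves, here's a minimal sketch of chatting with the 7B Chat model locally using the Hugging Face transformers library. It assumes you've requested access through the Meta form linked above, been granted the gated meta-llama repo on Hugging Face, and have torch and accelerate installed; the [INST] ... [/INST] wrapper is the prompt format the chat variants were tuned on.

```python
# Minimal sketch: local inference with Llama 2 7B Chat via transformers.
# Assumes gated-repo access on Hugging Face and a GPU (or plenty of RAM).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 2 Chat expects instructions wrapped in [INST] ... [/INST].
prompt = "[INST] Summarise the Llama 2 licence restrictions in one sentence. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```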
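If you'd rather not host the weights yourself, the episode also covers managed options like AWS Bedrock. Below is a hedged sketch of invoking Llama 2 13B Chat through Bedrock with boto3; the model ID and response field shown match Bedrock's Meta Llama 2 offering at the time of writing, but treat them as assumptions to verify against your own account and region.

```python
# Hedged sketch: hosted inference against Llama 2 Chat on AWS Bedrock.
# Assumes your AWS account has been granted access to Meta's Bedrock models.
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "[INST] What is Llama 2? [/INST]",
    "max_gen_len": 200,
    "temperature": 0.5,
})
response = client.invoke_model(modelId="meta.llama2-13b-chat-v1", body=body)
print(json.loads(response["body"].read())["generation"])
```

The trade-off discussed on the show applies here too: running the weights yourself keeps fine-tuning and guardrails under your control, while Bedrock or Azure handle the infrastructure for you.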