Yannic Kilcher Videos (Audio Only)

177 Episodes


Efficient Streaming Language Models with Attention Sinks (Paper Explained)

2023-10-17 32:26

#llm #ai #chatgpt How does one run inference for a generative autoregressive language model that has been trained with a fixed context size? Streaming LLMs combine the performance of windowed attention, but avoid the drop in performance by using attention sinks - an interesting phenomenon where the token at position 0 acts as an absorber of "extra" attention. OUTLINE: 0:00 - Introduction 1:20 - What is the problem? 10:30 - The hypothesis: Attention Sinks 15:10 - Experimental evidence 18:45 - Streaming LLMs 20:45 - Semantics or position? 22:30 - Can attention sinks be learned? 27:45 - More experiments 30:10 - Comparison to Big Bird Paper: https://arxiv.org/abs/2309.17453 Abstract: Deploying Large Language Models (LLMs) in streaming applications such as multi-round dialogue, where long interactions are expected, is urgently needed but poses two major challenges. Firstly, during the decoding stage, caching previous tokens' Key and Value states (KV) consumes extensive memory. Secondly, popular LLMs cannot generalize to longer texts than the training sequence length. Window attention, where only the most recent KVs are cached, is a natural approach -- but we show that it fails when the text length surpasses the cache size. We observe an interesting phenomenon, namely attention sink, that keeping the KV of initial tokens will largely recover the performance of window attention. In this paper, we first demonstrate that the emergence of attention sink is due to the strong attention scores towards initial tokens as a ``sink'' even if they are not semantically important. Based on the above analysis, we introduce StreamingLLM, an efficient framework that enables LLMs trained with a finite length attention window to generalize to infinite sequence lengths without any fine-tuning. We show that StreamingLLM can enable Llama-2, MPT, Falcon, and Pythia to perform stable and efficient language modeling with up to 4 million tokens and more. 
In addition, we discover that adding a placeholder token as a dedicated attention sink during pre-training can further improve streaming deployment. In streaming settings, StreamingLLM outperforms the sliding window recomputation baseline by up to 22.2x speedup. Code and datasets are provided at this https URL. Authors: Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
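The cache policy described in the abstract above (keep the KV entries of a few initial tokens plus a sliding window of recent ones) can be sketched as below. This is a minimal illustration, not the authors' code: the function names and the default values for `n_sink` and `window` are assumptions chosen for the example, and the sketch only tracks which token positions stay cached, not the KV tensors themselves.

```python
from typing import List


def streaming_cache_positions(seq_len: int, n_sink: int = 4, window: int = 8) -> List[int]:
    """Return the token positions whose KV entries stay cached.

    Keeps the first `n_sink` positions (the attention sinks) plus the
    most recent `window` positions; everything in between is evicted.
    Defaults are illustrative assumptions, not values from the listing.
    """
    if seq_len <= n_sink + window:
        # Nothing needs to be evicted yet.
        return list(range(seq_len))
    sinks = list(range(n_sink))
    recent = list(range(seq_len - window, seq_len))
    return sinks + recent


def window_only_positions(seq_len: int, window: int = 8) -> List[int]:
    """Plain window attention: only the most recent `window` positions."""
    return list(range(max(0, seq_len - window), seq_len))


if __name__ == "__main__":
    print(streaming_cache_positions(20, n_sink=2, window=3))
    print(window_only_positions(20, window=3))
```

The only difference between the two policies is whether the initial positions survive eviction, which is the behaviour the abstract credits with recovering the performance of window attention.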

Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution (Paper Explained)

2023-10-17 46:44

#ai #promptengineering #evolution Promptbreeder is a self-improving self-referential system for automated prompt engineering. Give it a task description and a dataset, and it will automatically come up with appropriate prompts for the task. This is achieved by an evolutionary algorithm where not only the prompts, but also the mutation-prompts are improved over time in a population-based, diversity-focused approach. OUTLINE: 0:00 - Introduction 2:10 - From manual to automated prompt engineering 10:40 - How does Promptbreeder work? 21:30 - Mutation operators 36:00 - Experimental Results 38:05 - A walk through the appendix Paper: https://arxiv.org/abs/2309.16797 Abstract: Popular prompt strategies like Chain-of-Thought Prompting can dramatically improve the reasoning abilities of Large Language Models (LLMs) in various domains. However, such hand-crafted prompt-strategies are often sub-optimal. In this paper, we present Promptbreeder, a general-purpose self-referential self-improvement mechanism that evolves and adapts prompts for a given domain. Driven by an LLM, Promptbreeder mutates a population of task-prompts, and subsequently evaluates them for fitness on a training set. Crucially, the mutation of these task-prompts is governed by mutation-prompts that the LLM generates and improves throughout evolution in a self-referential way. That is, Promptbreeder is not just improving task-prompts, but it is also improving the mutation-prompts that improve these task-prompts. Promptbreeder outperforms state-of-the-art prompt strategies such as Chain-of-Thought and Plan-and-Solve Prompting on commonly used arithmetic and commonsense reasoning benchmarks. Furthermore, Promptbreeder is able to evolve intricate task-prompts for the challenging problem of hate speech classification.
Authors: Chrisantha Fernando, Dylan Banarse, Henryk Michalewski, Simon Osindero, Tim Rocktäschel

Retentive Network: A Successor to Transformer for Large Language Models (Paper Explained)

2023-10-05 28:25

#ai #retnet #transformers Retention is an alternative to Attention in Transformers that can be written in both a parallel and a recurrent fashion. This means the architecture achieves training parallelism while maintaining low-cost inference. Experiments in the paper look very promising. OUTLINE: 0:00 - Intro 2:40 - The impossible triangle 6:55 - Parallel vs sequential 15:35 - Retention mechanism 21:00 - Chunkwise and multi-scale retention 24:10 - Comparison to other architectures 26:30 - Experimental evaluation Paper: https://arxiv.org/abs/2307.08621 Abstract: In this work, we propose Retentive Network (RetNet) as a foundation architecture for large language models, simultaneously achieving training parallelism, low-cost inference, and good performance. We theoretically derive the connection between recurrence and attention. Then we propose the retention mechanism for sequence modeling, which supports three computation paradigms, i.e., parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism. The recurrent representation enables low-cost O(1) inference, which improves decoding throughput, latency, and GPU memory without sacrificing performance. The chunkwise recurrent representation facilitates efficient long-sequence modeling with linear complexity, where each chunk is encoded parallelly while recurrently summarizing the chunks. Experimental results on language modeling show that RetNet achieves favorable scaling results, parallel training, low-cost deployment, and efficient inference. The intriguing properties make RetNet a strong successor to Transformer for large language models. Code will be available at this https URL.
Authors: Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei
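The claim that retention has both a parallel and a recurrent form can be checked on a toy example. The sketch below is a simplified, single-head version under stated assumptions: it omits the position rotation, scaling, and normalization used in the paper, and the function names are hypothetical. The all-pairs form computes o_n = sum over m <= n of gamma^(n-m) (q_n . k_m) v_m directly, while the recurrent form carries a state S_n = gamma * S_(n-1) + k_n^T v_n and reads out o_n = q_n S_n.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retention_parallel(q: List[Vector], k: List[Vector], v: List[Vector], gamma: float) -> List[Vector]:
    """All-pairs form: each output sums decayed contributions from every earlier step."""
    outputs = []
    for n in range(len(q)):
        out = [0.0] * len(v[0])
        for m in range(n + 1):  # causal: only positions m <= n contribute
            weight = (gamma ** (n - m)) * dot(q[n], k[m])
            out = [o + weight * x for o, x in zip(out, v[m])]
        outputs.append(out)
    return outputs


def retention_recurrent(q: List[Vector], k: List[Vector], v: List[Vector], gamma: float) -> List[Vector]:
    """Recurrent form: a fixed-size state is decayed and updated once per step."""
    d_k, d_v = len(k[0]), len(v[0])
    state = [[0.0] * d_v for _ in range(d_k)]
    outputs = []
    for q_n, k_n, v_n in zip(q, k, v):
        # S_n = gamma * S_{n-1} + outer(k_n, v_n)
        state = [[gamma * state[i][j] + k_n[i] * v_n[j] for j in range(d_v)] for i in range(d_k)]
        # o_n = q_n @ S_n
        outputs.append([sum(q_n[i] * state[i][j] for i in range(d_k)) for j in range(d_v)])
    return outputs


if __name__ == "__main__":
    q = [[1.0, 0.5], [0.2, -1.0], [0.3, 0.7]]
    k = [[0.4, 0.1], [-0.6, 0.9], [1.0, 0.0]]
    v = [[1.0, 2.0, 3.0], [0.5, -1.0, 0.0], [2.0, 0.0, -2.0]]
    a = retention_parallel(q, k, v, gamma=0.9)
    b = retention_recurrent(q, k, v, gamma=0.9)
    print(all(abs(x - y) < 1e-9 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))
```

The recurrent version keeps only a d_k by d_v state regardless of sequence length, which is the source of the constant per-step inference cost mentioned in the abstract.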

Reinforced Self-Training (ReST) for Language Modeling (Paper Explained)

2023-10-05 53:06

#ai #rlhf #llm ReST uses a bootstrap-like method to produce its own extended dataset and trains on ever higher-quality subsets of it to improve its own reward. The method allows for re-using the same generated data multiple times and thus has an efficiency advantage with respect to Online RL techniques like PPO. Paper: https://arxiv.org/abs/2308.08998 Abstract: Reinforcement learning from human feedback (RLHF) can improve the quality of large language model's (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by growing batch reinforcement learning (RL), which we call Reinforced Self-Training (ReST). Given an initial LLM policy, ReST produces a dataset by generating samples from the policy, which are then used to improve the LLM policy using offline RL algorithms. ReST is more efficient than typical online RLHF methods because the training dataset is produced offline, which allows data reuse. While ReST is a general approach applicable to all generative learning settings, we focus on its application to machine translation. Our results show that ReST can substantially improve translation quality, as measured by automated metrics and human evaluation on machine translation benchmarks in a compute and sample-efficient manner.
Authors: Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Ksenia Konyushkova, Lotte Weerts, Abhishek Sharma, Aditya Siddhant, Alex Ahern, Miaosen Wang, Chenjie Gu, Wolfgang Macherey, Arnaud Doucet, Orhan Firat, Nando de Freitas

[ML News] LLaMA2 Released | LLMs for Robots | Multimodality on the Rise

2023-08-28 44:10

#mlnews #llama2 #openai Your regular irregular update on the world of Machine Learning. References: https://twitter.com/ylecun/status/1681336284453781505 https://ai.meta.com/llama/ https://about.fb.com/news/2023/07/llama-2-statement-of-support/ https://247wallst.com/special-report/2023/08/12/this-is-the-biggest-social-media-platform-ranking-the-worlds-largest-networking-sites/4/ https://github.com/Alpha-VLLM/LLaMA2-Accessory https://together.ai/blog/llama-2-7b-32k?s=09&utm_source=pocket_saves https://github.com/imoneoi/openchat https://twitter.com/lmsysorg/status/1686794639469371393?s=09&t=sS3awkbavmSMSmwp64Ef4A&utm_source=pocket_saves https://huggingface.co/lmsys/vicuna-13b-v1.5-16k https://blog.google/outreach-initiatives/public-policy/google-microsoft-openai-anthropic-frontier-model-forum/ https://www.earthdata.nasa.gov/news/impact-ibm-hls-foundation-model?utm_source=pocket_reader https://huggingface.co/ibm-nasa-geospatial/Prithvi-100M https://ai.meta.com/blog/generative-ai-text-images-cm3leon/ https://www.deepmind.com/blog/rt-2-new-model-translates-vision-and-language-into-action?utm_source=twitter&utm_medium=social&utm_campaign=rt2 https://arxiv.org/abs/2307.14334 https://sites.research.google/med-palm/ https://open-catalyst.metademolab.com/?utm_source=twitter&utm_medium=organic_social&utm_campaign=opencatalyst&utm_content=card https://open-catalyst.metademolab.com/demo https://www.anthropic.com/index/claude-2?utm_source=pocket_reader https://claude.ai/login https://audiocraft.metademolab.com/?utm_source=pocket_saves https://venturebeat.com/programming-development/stability-ai-launches-stablecode-an-llm-for-code-generation/ https://stability.ai/blog/stablecode-llm-generative-ai-coding https://twitter.com/JeffDean/status/1686806525862608896?s=09&t=LG2z9ok9QExHbSy0fvBsxA&utm_source=pocket_saves https://sites.research.google/open-buildings/ https://twitter.com/deliprao/status/1687283117873106946?s=09&t=1NmC-B55Z8IuF_HTuGOo7w&utm_source=pocket_saves 
https://arxiv.org/pdf/2308.01320.pdf https://twitter.com/javilopen/status/1687795349719547905?utm_source=pocket_saves https://research.nvidia.com/labs/par/Perfusion/ https://ar5iv.labs.arxiv.org/html/2307.14936 https://www.linkedin.com/feed/update/urn:li:activity:7093463974750371840/?utm_source=pocket_saves https://huggingface.co/syzymon/long_llama_3b_instruct https://arxiv.org/abs/2307.03170 https://dynalang.github.io/ https://github.com/mlfoundations/open_flamingo https://twitter.com/akshay_pachaar/status/1687079353937698816?s=09&t=fos8QSCsGEEM6dMflhq0Mg&utm_source=pocket_saves https://github.com/OpenBMB/ToolBench https://llm-attacks.org/ https://arstechnica.com/information-technology/2023/07/openai-discontinues-its-ai-writing-detector-due-to-low-rate-of-accuracy/ https://sites.google.com/view/steve-1 https://github.com/Shalev-Lifshitz/STEVE-1 https://erichartford.com/dolphin https://huggingface.co/ehartford/dolphin-llama-13b https://www.mosaicml.com/blog/long-context-mpt-7b-8k https://twitter.com/camenduru/status/1688045780244848640?s=09&t=ubJ2Qtz-TG6Xo3_GMtt2Cw&utm_source=pocket_saves https://github.com/IDEA-Research/DWPose https://twitter.com/tri_dao/status/1680987577913065472?s=09&t=Q181vFmM6d3nDq-5BwfDeg&utm_source=pocket_saves https://tridao.me/publications/flash2/flash2.pdf https://thehackernews.com/2023/07/wormgpt-new-ai-tool-allows.html https://www.tomshardware.com/news/ai-steals-data-with-keystroke-audio https://arxiv.org/pdf/2308.01074.pdf https://www.foxnews.com/politics/ai-test-flight-air-force-unmanned-wingman-aircraft https://www.theverge.com/2023/8/2/23817406/white-castle-soundhound-ai-sliders https://www.google.com/search?sca_esv=556495916&q=food+delivery+bot+kicked&tbm=vid&source=lnms&sa=X&ved=2ahUKEwjZ6PDPrdmAAxUThf0HHWzrBGgQ0pQJegQIChAB&cshid=1691920142432720&biw=2327&bih=1180&dpr=2.2 https://www.youtube.com/watch?v=--n_NhmXnfc https://www.thesun.co.uk/tech/20793591/coop-delivery-robots-cambridge-kicked-by-workers-tiktok/

How Cyber Criminals Are Using ChatGPT (w/ Sergey Shykevich)

2023-08-28 29:08

#cybercrime #chatgpt #security An interview with Sergey Shykevich, Threat Intelligence Group Manager at Check Point, about how models like ChatGPT have impacted the realm of cyber crime. https://threatmap.checkpoint.com/

Recipe AI suggests FATAL CHLORINE GAS Recipe

2023-08-28 07:05

#llm #safety #gpt4 A prime example of intellectual dishonesty of journalists and AI critics. Article: https://gizmodo.com/paknsave-ai-savey-recipe-bot-chlorine-gas-1850725057 My Recipe AI: https://github.com/yk/recipe-ai

DeepFloyd IF - Pixel-Based Text-to-Image Diffusion (w/ Authors)

2023-08-28 53:31

#ai #diffusion #stabilityai An interview with DeepFloyd members Misha Konstantinov and Daria Bakshandaeva on the release of the model IF, an open-source model following Google's implementation of Imagen. References: https://www.deepfloyd.ai/deepfloyd-if https://huggingface.co/DeepFloyd https://twitter.com/_gugutse_ https://twitter.com/_bra_ket

[ML News] GPT-4 solves MIT Exam with 100% ACCURACY | OpenLLaMA 13B released

2023-08-28 31:04

#gpt4 #mit #ai A new paper claims to use GPT-4 to solve 100% of a set of MIT university exercises. Some people are skeptical, and their investigations reveal more than one problem with this paper... OUTLINE: 0:00 - ChatGPT gives out Windows 10 keys 0:30 - MIT exam paper 2:50 - Prompt engineering 5:30 - Automatic grading 6:45 - Response by other MIT students 8:30 - Unsolvable questions 10:50 - Duplicates 13:30 - Cascading the heuristics 22:40 - Other problems 29:25 - OpenLLaMA 13B published References: https://twitter.com/immasiddtweets/status/1669721470006857729/photo/1 https://arxiv.org/abs/2306.08997 https://arxiv.org/pdf/2306.08997.pdf https://flower-nutria-41d.notion.site/No-GPT4-can-t-ace-MIT-b27e6796ab5a48368127a98216c76864 https://github.com/idrori/MITQ/commit/3feee1026318e537c0ad27968001ef76e4a36890 https://twitter.com/hardmaru/status/1670246674760077312 https://twitter.com/giffmana/status/1670258748286472193 https://twitter.com/T3816440886465/status/1670127224131862531 https://twitter.com/qrdl/status/1669856336652414977 https://www.chegg.com/homework-help/questions-and-answers/consider-mdp-set-possible-states-mathcal-s-0-1-2-3-set-possible-actions-mathcal-b-c--rewar-q111042613 https://github.com/openlm-research/open_llama https://huggingface.co/openlm-research/open_llama_13b

Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (Explained)

2023-08-28 35:44

#stablediffusion #ai #watermark Watermarking the outputs of generative models is usually done as a post-processing step on the model outputs. Tree-Ring Watermarks are applied in the latent space at the beginning of a diffusion process, which makes them nearly undetectable, robust to strong distortions, and only recoverable by the model author. It is a very promising technique with applications potentially beyond watermarking itself. OUTLINE: 0:00 - Introduction & Overview 1:30 - Why Watermarking? 4:20 - Diffusion Models Recap 13:40 - Inverting Diffusion Models 17:05 - Tree-Ring Watermarking 26:15 - Effects of Tree-Ring Watermarks 30:00 - Experimental Results 32:40 - Limitations 34:40 - Conclusion Paper: https://arxiv.org/abs/2305.20030 Abstract: Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs. Unlike existing methods that perform post-hoc modifications to images after sampling, Tree-Ring Watermarking subtly influences the entire sampling process, resulting in a model fingerprint that is invisible to humans. The watermark embeds a pattern into the initial noise vector used for sampling. These patterns are structured in Fourier space so that they are invariant to convolutions, crops, dilations, flips, and rotations. After image generation, the watermark signal is detected by inverting the diffusion process to retrieve the noise vector, which is then checked for the embedded signal. We demonstrate that this technique can be easily applied to arbitrary diffusion models, including text-conditioned Stable Diffusion, as a plug-in with negligible loss in FID. Our watermark is semantically hidden in the image space and is far more robust than watermarking alternatives that are currently deployed. Code is available at this https URL. 
Authors: Yuxin Wen, John Kirchenbauer, Jonas Geiping, Tom Goldstein

RWKV: Reinventing RNNs for the Transformer Era (Paper Explained)

2023-08-28 01:02:16

#gpt4 #rwkv #transformer We take a look at RWKV, a highly scalable architecture between Transformers and RNNs. Fully Connected (June 7th in SF) Promo Link: https://www.fullyconnected.com/?promo=ynnc OUTLINE: 0:00 - Introduction 1:50 - Fully Connected In-Person Conference in SF June 7th 3:00 - Transformers vs RNNs 8:00 - RWKV: Best of both worlds 12:30 - LSTMs 17:15 - Evolution of RWKV's Linear Attention 30:40 - RWKV's Layer Structure 49:15 - Time-Parallel vs Sequence Mode 53:55 - Experimental Results & Limitations 58:00 - Visualizations 1:01:40 - Conclusion Paper: https://arxiv.org/abs/2305.13048 Code: https://github.com/BlinkDL/RWKV-LM Abstract: Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of Transformers with the efficient inference of RNNs. Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, which parallelizes computations during training and maintains constant computational and memory complexity during inference, leading to the first non-transformer architecture to be scaled to tens of billions of parameters. Our experiments reveal that RWKV performs on par with similarly sized Transformers, suggesting that future work can leverage this architecture to create more efficient models. This work presents a significant step towards reconciling the trade-offs between computational efficiency and model performance in sequence processing tasks. 
Authors: Bo Peng, Eric Alcaide, Quentin Anthony, Alon Albalak, Samuel Arcadinho, Huanqi Cao, Xin Cheng, Michael Chung, Matteo Grella, Kranthi Kiran GV, Xuzheng He, Haowen Hou, Przemyslaw Kazienko, Jan Kocon, Jiaming Kong, Bartlomiej Koptyra, Hayden Lau, Krishna Sri Ipsit Mantri, Ferdinand Mom, Atsushi Saito, Xiangru Tang, Bolun Wang, Johan S. Wind, Stanislaw Wozniak, Ruichong Zhang, Zhenyuan Zhang, Qihang Zhao, Peng Zhou, Jian Zhu, Rui-Jie Zhu

Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

2023-08-28 29:28

#gpt4 #ai #prompt Tree-of-Thought improves prompting of large language models (LLMs) by generalizing the concept of Chain-of-Thought prompting and introduces a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting. OUTLINE: 0:00 - Introduction 1:20 - From Chain-of-Thought to Tree-of-Thought 11:10 - Formalizing the algorithm 16:00 - Game of 24 & Creative writing 18:30 - Crosswords 23:30 - Is this a general problem solver? 26:50 - Ablation studies 28:55 - Conclusion Paper: https://arxiv.org/abs/2305.10601 Abstract: Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL. 
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
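The tree search over "thoughts" with state evaluation described above can be illustrated with a small breadth-first search that keeps only the best few candidates per level. This is a sketch under loud assumptions: in the paper a language model proposes and scores thoughts, whereas here `propose`, `evaluate`, and `is_goal` are hypothetical stand-in functions on a toy arithmetic puzzle (reach 22 from 1 using "+3" or "*2"), and `tree_of_thoughts_bfs` is a name invented for this example.

```python
from typing import Callable, List, Optional

Path = List[int]


def tree_of_thoughts_bfs(
    root: Path,
    propose: Callable[[Path], List[Path]],
    evaluate: Callable[[Path], float],
    is_goal: Callable[[Path], bool],
    beam_width: int = 3,
    max_depth: int = 5,
) -> Optional[Path]:
    """Breadth-first search that expands every kept state and prunes to the top-scoring ones."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every state on the current level into its candidate next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_goal(candidate):
                return candidate
        # Keep only the most promising candidates for the next level.
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:beam_width]
    return None


TARGET = 22  # toy goal, chosen for the example


def propose(path: Path) -> List[Path]:
    """Stand-in for the LLM thought generator: two possible next steps."""
    last = path[-1]
    return [path + [last + 3], path + [last * 2]]


def evaluate(path: Path) -> float:
    """Stand-in for the LLM state evaluator: closer to the target scores higher."""
    return -abs(TARGET - path[-1])


def is_goal(path: Path) -> bool:
    return path[-1] == TARGET


if __name__ == "__main__":
    print(tree_of_thoughts_bfs([1], propose, evaluate, is_goal))
```

Swapping the stand-ins for model calls would leave the search loop unchanged; the pruning step is where evaluation steers which partial solutions get extended.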

OpenAI suggests AI licenses (US Senate hearing on AI regulation w/ Sam Altman)

2023-08-28 16:12

#ai #openai #gpt4 US Senate hearing on AI regulation. MLST video on the hearing: https://www.youtube.com/watch?v=DeSXnESGxr4

[ML News] Geoff Hinton leaves Google | Google has NO MOAT | OpenAI down half a billion

2023-08-28 39:06

#google #openai #mlnews Updates from the world of Machine Learning and AI Great AI memes here: https://twitter.com/untitled01ipynb OUTLINE: 0:00 - Google I/O 2023: Generative AI in everything 0:20 - Anthropic announces 100k tokens context 0:35 - Intro 1:20 - Geoff Hinton leaves Google 7:00 - Google memo leaked: we have no moat 11:30 - OpenAI loses 540M 12:30 - Google AI: Product first 15:50 - Ilya Sutskever on safety vs competition 18:00 - AI works cannot be copyrighted 19:40 - OpenAI tries to trademark GPT 20:30 - StarCoder: accessible code model 21:40 - RedPajama & OpenLlama 22:55 - Mosaic 7B model 23:50 - YoloNAS 24:10 - Mojo programming language 25:30 - Random helpful things 37:40 - DeepMind soccer robots References: https://twitter.com/weirddalle/status/1649908805788893185 https://www.nytimes.com/2023/05/01/technology/ai-google-chatbot-engineer-quits-hinton.html https://www.technologyreview.com/2023/05/01/1072478/deep-learning-pioneer-geoffrey-hinton-quits-google/ https://archive.ph/TrPoH https://twitter.com/DanHendrycks/status/1654560913939374080 https://twitter.com/ylecun/status/1654930029569101824 https://twitter.com/home https://twitter.com/ylecun/status/1654931495419621376 https://twitter.com/pkedrosky/status/1653955254181068801 https://www.semianalysis.com/p/google-we-have-no-moat-and-neither https://twitter.com/untitled01ipynb/media https://www.theinformation.com/articles/openais-losses-doubled-to-540-million-as-it-developed-chatgpt https://archive.ph/bKsdM https://www.washingtonpost.com/technology/2023/05/04/google-ai-stop-sharing-research/ https://twitter.com/giffmana/status/1654962145707130880 https://twitter.com/Ken_Goldberg/status/1651309843804987393 https://tsdr.uspto.gov/documentviewer?caseId=sn97733259&docId=PTD20230418160641&s=09#docIndex=1&page=1 https://twitter.com/osanseviero/status/1654230764513370112 https://huggingface.co/bigcode/starcoder https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement https://twitter.com/hardmaru/status/1654649036333514753 https://www.together.xyz/blog/redpajama-models-v1 https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-3B-v1 https://github.com/openlm-research/open_llama https://www.mosaicml.com/blog/mpt-7b https://github.com/Deci-AI/super-gradients/blob/master/YOLONAS.md https://www.modular.com/mojo https://www.aicrowd.com/challenges/hackaprompt-2023 https://learnprompting.org/ https://developer.nvidia.com/blog/nvidia-enables-trustworthy-safe-and-secure-large-language-model-conversational-systems/?ncid=prsy-552511 https://blogs.nvidia.com/blog/2023/04/25/ai-chatbot-guardrails-nemo/ https://lmql.ai/#distribution https://github.com/gventuri/pandas-ai?utm_source=pocket_reader https://lamini.ai/blog/introducing-lamini https://github.com/deep-floyd/IF https://huggingface.co/spaces/DeepFloyd/IF https://twitter.com/FaramaFound/status/1650952295901720576 https://txt.cohere.com/embedding-archives-wikipedia/?hsa_acc=509563538&hsa_ad=242008083&hsa_cam=626636963&hsa_grp=205646033&hsa_net=linkedin&hsa_ver=3&hss_channel=lcp-24024765 https://arxiv.org/abs/2304.12210 https://github.com/h2oai/h2ogpt https://huggingface.co/h2oai/h2ogpt-oasst1-512-20b https://github.com/h2oai/h2o-llmstudio https://ai.facebook.com/blog/ai-dataset-animating-kids-drawings/ https://www.camel-ai.org/ https://github.com/lightaime/camel?utm_source=pocket_reader https://huggingface.co/Writer/camel-5b-hf https://laion.ai/blog/paella/ https://magazine.sebastianraschka.com/p/finetuning-large-language-models https://pickapic.io/ https://github.com/yuvalkirstain/heroku_app https://huggingface.co/datasets/yuvalkirstain/PickaPic https://future.snorkel.ai/poster-contest/ https://twitter.com/d_feldman/status/1649466422018318338/photo/4 https://twitter.com/DeepMind/status/1651897358894919680 https://arxiv.org/abs/2304.13653 https://twitter.com/SmokeAwayyy/status/1652712832738422784

Scaling Transformer to 1M tokens and beyond with RMT (Paper Explained)

2023-08-28 24:33

#ai #transformer #gpt4 This paper promises to scale transformers to 1 million tokens and beyond. We take a look at the technique behind it, the Recurrent Memory Transformer, and at its strengths and weaknesses. OUTLINE: 0:00 - Intro 2:15 - Transformers on long sequences 4:30 - Tasks considered 8:00 - Recurrent Memory Transformer 19:40 - Experiments on scaling and attention maps 24:00 - Conclusion Paper: https://arxiv.org/abs/2304.11062 Abstract: This technical report presents the application of a recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. By leveraging the Recurrent Memory Transformer architecture, we have successfully increased the model's effective context length to an unprecedented two million tokens, while maintaining high memory retrieval accuracy. Our method allows for the storage and processing of both local and global information and enables information flow between segments of the input sequence through the use of recurrence. Our experiments demonstrate the effectiveness of our approach, which holds significant potential to enhance long-term dependency handling in natural language understanding and generation tasks as well as enable large-scale context processing for memory-intensive applications. Authors: Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

OpenAssistant RELEASED! The world's best open-source Chat AI!

2023-08-28 21:05

#openassistant #chatgpt #mlnews Try the chat: https://open-assistant.io/chat Homepage: https://open-assistant.io Dataset: https://huggingface.co/datasets/OpenAssistant/oasst1 Code: https://github.com/LAION-AI/Open-Assistant Paper (temporary): https://ykilcher.com/oa-paper Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

OpenAssistant First Models are here! (Open-Source ChatGPT)

2023-08-28 16:52

#openassistant #chatgpt #gpt4 https://open-assistant.io/chat https://huggingface.co/OpenAssistant https://github.com/LAION-AI/Open-Assistant Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

The biggest week in AI (GPT-4, Office Copilot, Google PaLM, Anthropic Claude & more)

2023-08-28 41:01

#mlnews #gpt4 #copilot Your weekly news all around the AI world Check out W&B courses (free): https://wandb.courses/ OUTLINE: 0:00 - Intro 0:20 - GPT-4 announced! 4:30 - GigaGAN: The comeback of Generative Adversarial Networks 7:55 - ChoppedAI: AI Recipes 8:45 - Samsung accused of faking space zoom effect 14:00 - Weights & Biases courses are free 16:55 - Data Portraits 18:50 - Data2Vec 2.0 19:50 - Gated Models on Hugging Face & huggingface.js 22:05 - Visual ChatGPT 23:35 - Bing crosses 100 million daily active users 24:50 - Casual Conversations Dataset 25:50 - Anthropic AI Safety Research 27:30 - Magnushammer & more advances in AI-assisted math 30:30 - LLaMA license change PR 32:00 - Self-Instruct dataset 33:35 - PaLM-E: Multimodal Pathways 35:45 - USM: Universal Speech Model 37:25 - GLIGEN: Grounded Text-to-Image 39:55 - Fruit Fly Connectome released References: https://www.heise.de/news/GPT-4-kommt-naechste-Woche-und-es-wird-multimodal-Vorankuendigung-von-Microsoft-7540383.html https://mingukkang.github.io/GigaGAN/ https://www.choppedai.com/ https://www.reddit.com/r/Android/comments/11nzrb0/samsung_space_zoom_moon_shots_are_fake_and_here/ https://imgur.com/ULVX933 https://imgur.com/9XMgt06 https://imgur.com/9kichAp https://imgur.com/RSHAz1l https://imgur.com/PIAjVKp https://imgur.com/xEyLajW https://imgur.com/3STX9mZ https://imgur.com/ifIHr3S https://imgur.com/bXJOZgI https://dataportraits.org/ https://arxiv.org/abs/2303.03919 https://arxiv.org/pdf/2303.03919.pdf https://ai.facebook.com/blog/ai-self-supervised-learning-data2vec/ https://github.com/facebookresearch/fairseq/tree/main/examples/data2vec https://huggingface.co/docs/hub/models-gated https://huggingface.co/about https://github.com/huggingface/huggingface.js?utm_source=pocket_reader https://github.com/microsoft/visual-chatgpt https://arxiv.org/abs/2303.04671 https://github.com/microsoft/visual-chatgpt/blob/main/visual_chatgpt.py https://huggingface.co/spaces/RamAnanth1/visual-chatGPT https://www.engadget.com/microsoft-bing-crossed-100-million-daily-active-users-080138371.html https://ai.facebook.com/blog/casual-conversations-v2-dataset-measure-fairness/ https://ai.facebook.com/datasets/casual-conversations-v2-dataset/ https://www.anthropic.com/index/core-views-on-ai-safety https://arxiv.org/abs/2303.04488 https://arxiv.org/pdf/2303.04488.pdf https://arxiv.org/abs/2303.04910 https://arxiv.org/pdf/2303.04910.pdf https://twitter.com/astro_wassim/status/1633645134934949888 https://ai.papers.bar/paper/ede58b1ebca911ed8f9c3d8021bca7c8 https://arxiv.org/pdf/2303.03192.pdf https://www.theverge.com/2023/3/8/23629362/meta-ai-language-model-llama-leak-online-misuse https://knightcolumbia.org/blog/the-llama-is-out-of-the-bag-should-we-expect-a-tidal-wave-of-disinformation https://github.com/facebookresearch/llama/pull/184 https://huggingface.co/datasets/yizhongw/self_instruct https://openai.com/policies/terms-of-use https://palm-e.github.io/ https://pickapic.io/ https://ai.googleblog.com/2023/03/universal-speech-model-usm-state-of-art.html https://arxiv.org/abs/2303.01037 https://github.com/BlinkDL/RWKV-LM?utm_source=pocket_reader https://gligen.github.io/ https://github.com/microsoft/GLIP https://arxiv.org/abs/2301.07093 https://huggingface.co/spaces/gligen/demo https://www.sciencealert.com/the-first-ever-complete-map-of-an-insect-brain-is-truly-mesmerizing https://en.wikipedia.org/wiki/Tidal_locking Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :)

GPT-4 is here! What we know so far (Full Analysis)

2023-08-28 34:09

#gpt4 #chatgpt #openai References: https://openai.com/product/gpt-4 https://openai.com/research/gpt-4 https://cdn.openai.com/papers/gpt-4.pdf Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

This ChatGPT Skill will earn you $10B (also, AI reads your mind!)

2023-08-28 43:27

#mlnews #chatgpt #llama ChatGPT goes around the world and is finally available via API. Stunning mind-reading performed using fMRI and Stable Diffusion. LLaMA weights leak and hilarity ensues. GTC23 is around the corner! ERRATA: It's a 4090, not a 4090 ti 🙃 OUTLINE: 0:00 - Introduction 0:20 - GTC 23 on March 20 1:55 - ChatGPT API is out! 4:50 - OpenAI becomes more business-friendly 7:15 - OpenAI plans for AGI 10:00 - ChatGPT influencers 12:15 - Open-Source Prompting Course 12:35 - Flan UL2 20B 13:30 - LLaMA weights leaked 15:50 - Mind-Reading from fMRI 20:10 - Random News / Helpful Things 25:30 - Interview with Bryan Catanzaro Participate in the GTC Raffle: https://ykilcher.com/gtc References: GTC 23 on March 20 https://www.nvidia.com/gtc/ https://ykilcher.com/gtc ChatGPT API is out! https://twitter.com/gdb/status/1630991925984755714 https://openai.com/blog/introducing-chatgpt-and-whisper-apis https://twitter.com/greggyb/status/1631121912679002112 https://www.haihai.ai/chatgpt-api/ OpenAI becomes more business-friendly https://twitter.com/sama/status/1631002519311888385 https://techcrunch.com/2023/02/21/openai-foundry-will-let-customers-buy-dedicated-capacity-to-run-its-ai-models/?guccounter=1&guce_referrer=aHR0cHM6Ly93d3cuZ29vZ2xlLmNvbS8&guce_referrer_sig=AQAAAFL1O8s22qBsEtytYZWR7O2VlTa9nAGhdZPFfeQfZCDWjkNBIac7WlDikRNLEH1tqSszUN02ouqRyyCsShDa1kQyUbiApD1IUPfgmHXZxgIMFxr8bwr8BuBa7sK55dYqMRFFbE7YILuBn_rmj7aJI1tp7GAXubODfCUaKvOkoOYj https://www.bain.com/vector-digital/partnerships-alliance-ecosystem/openai-alliance/ OpenAI plans for AGI https://openai.com/blog/planning-for-agi-and-beyond ChatGPT influencers https://www.youtube.com/watch?v=4kp7oVTu9Ck https://www.youtube.com/watch?v=k13v8jp8H5o https://www.linkedin.com/posts/eniascailliau_create-an-online-course-100-ai-ugcPost-7036969935796891648-H_uj/ https://www.linkedin.com/posts/linasbeliunas_must-know-ai-tools-ugcPost-7035700089947836416-Qri4/ https://twitter.com/LinusEkenstam/status/1629879567514238976 https://www.linkedin.com/posts/imarpit_50-awesome-chatgpt-prompts-ugcPost-7036905788631646209-2CU-/ Open-Source Prompting Course https://learnprompting.org/ Flan UL2 20B https://www.yitay.net/blog/flan-ul2-20b https://huggingface.co/google/flan-ul2 LLaMA weights leaked https://github.com/facebookresearch/llama/pull/73 https://github.com/facebookresearch/llama/pull/73/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5 https://github.com/ChristopherKing42 https://open-assistant.io/dashboard Mind-Reading from fMRI https://sites.google.com/view/stablediffusion-with-brain/?s=09 https://www.nature.com/articles/s41562-022-01516-2?utm_content=animation Random News https://www.wired.com/story/alphabet-layoffs-hit-trash-sorting-robots/ https://huggingface.co/blog/fast-mac-diffusers https://pyribs.org/ https://twitter.com/rowancheung/status/1630569844654460928 https://pimeyes.com/en https://cacti-framework.github.io/ https://twitter.com/bhutanisanyam1/status/1630980866775330819 https://www.linkedin.com/in/bryancatanzaro/ Links: Homepage: https://ykilcher.com Merch: https://ykilcher.com/merch YouTube: https://www.youtube.com/c/yannickilcher Twitter: https://twitter.com/ykilcher Discord: https://ykilcher.com/discord LinkedIn: https://www.linkedin.com/in/ykilcher If you want to support me, the best thing to do is to share out the content :) If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this): SubscribeStar: https://www.subscribestar.com/yannickilcher Patreon: https://www.patreon.com/yannickilcher Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2 Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
