ML Cult

A curated podcast covering the latest machine learning developments; both text and audio are generated using AI.

October 27th, 2023 - AI Unleashed: Decoding Sycophancy, Mastering Control, and Crafting 3D Realities

Towards Understanding Sycophancy in Language Models
Controlled Decoding from Language Models
HyperFields: Towards Zero-Shot Generation of NeRFs from Text

10-27
08:32

October 26th, 2023 - Frontiers of AI: From Quantum Compression to Visionary Transformers

LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Detecting Pretraining Data from Large Language Models
ConvNets Match Vision Transformers at Scale
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

10-26
14:15

October 25th, 2023 - Pixel to Perception: Matryoshka Synthesis, GPT-3's Linguistic Mysteries, Woodpecker's Visual Refinement, and SAM-CLIP's Vision Evolution

Matryoshka Diffusion Models
Dissecting In-Context Learning of Translations in GPTs
Woodpecker: Hallucination Correction for Multimodal Large Language Models
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

10-25
11:12

October 24th, 2023 - Neural Visions Unveiled: From FreeNoise's Video Clarity, HallusionBench's Reality Check, to FlashEdit's Instant Image Refinements

FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Localizing and Editing Knowledge in Text-to-Image Generative Models

10-24
06:35

October 23rd, 2023 - Unlocking AI's Potential: From Open Waters to Self-Enhancing Miniature Models

H2O Open Ecosystem for State-of-the-art Large Language Models
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Teaching Language Models to Self-Improve through Interactive Demonstrations

10-23
06:35

October 4th, 2023 - NeuroFrontiers: Pensive Processors, Natural Evolution, and the New Age of Linguistic Titans

Think before you speak: Training Language Models With Pause Tokens
Towards Self-Assembling Artificial Neural Networks through Neural Developmental Programs
Efficient Streaming Language Models with Attention Sinks
Large Language Models Cannot Self-Correct Reasoning Yet
SmartPlay: A Benchmark for LLMs as Intelligent Agents

10-04
13:09

October 3rd, 2023 - Evolution in Text: Self-Improvement, Synthesis, and Scrutiny

Enable Language Models to Implicitly Learn Self-Improvement From Data
PixArt-alpha: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
FELM: Benchmarking Factuality Evaluation of Large Language Models

10-03
07:51

October 2nd, 2023 - Math to Motion: ToRA, Decaf, and DRaFT Transformations

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Decaf: Monocular Deformation Capture for Face and Hand Interactions
Directly Fine-Tuning Diffusion Models on Differentiable Rewards

10-02
06:52

September 29th, 2023 - Masters of AI Metamorphosis: From Long-Context Linguistics to 3D Dreamscapes

Effective Long-Context Scaling of Foundation Models
Demystifying CLIP Data
Vision Transformers Need Registers
Qwen Technical Report
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

09-29
16:14

September 28th, 2023 - Neural Vistas & Visual Alchemy: From NeuRBF Reconstructions to ScalarSimplicity in AI Imagery

NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
Finite Scalar Quantization: VQ-VAE Made Simple

09-28
08:49

September 27th, 2023 - Beyond Boundaries: Pioneering Sequences, Alignments, and Realism in AI Evolution

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Aligning Large Multimodal Models with Factually Augmented RLHF
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

09-27
06:30

September 25th, 2023 - From Pixels to Precedents: Pioneering Visions in Color, Law, Code, and Sight

CoRF: Colorizing Radiance Fields using Knowledge Distillation
The Cambridge Law Corpus: A Corpus for Legal AI Research
CodePlan: Repository-level Coding using LLMs and Planning
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion

09-25
10:47

September 22nd, 2023 - Revolutionary Speeds & Precision: The Future of Neural Networks and Language Models

Parallelizing non-linear sequential models over the sequence length
Fast Feedforward Networks
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Boolformer: Symbolic Regression of Logic Functions with Transformers

09-22
13:54

September 21st, 2023 - Neural Frontiers: From FreeU's Image Mastery to Languini Kitchen's Equalized Research

FreeU: Free Lunch in Diffusion U-Net
Neurons in Large Language Models: Dead, N-gram, Positional
DreamLLM: Synergistic Multimodal Comprehension and Creation
Kosmos-2.5: A Multimodal Literate Model
End-to-End Speech Recognition Contextualization with Large Language Models
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute

09-21
14:32

September 20th, 2023 - From Overthinking Graphs to Code Whispering and Polyglot AI: The New Frontiers of Neural Networks, Language Models, and Data Compression

Graph Neural Networks Use Graphs When They Shouldn't
Large Language Models for Compiler Optimization
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Baichuan 2: Open Large-scale Language Models
Language Modeling Is Compression
FoleyGen: Visually-Guided Audio Generation

09-20
13:20

September 12th, 2023 - Frontiers in AI: From Pint-Sized Powerhouses and Pruned Datasets to Multilingual Mastery and Image Restoration

Textbooks Are All You Need II: phi-1.5 technical report
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

09-12
11:11

September 11th, 2023 - Neural Frontiers: Audiobooks, Virtual Cities, Summarization, and Vision Transformers Reimagined

Large-Scale Automatic Audiobook Creation
CityDreamer: Compositional Generative Model of Unbounded 3D Cities
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
High-Quality Entity Segmentation

09-11
09:26

September 8th, 2023 - Unlocking the Future of AI: From Master Optimizers and Budget-Friendly Giants to Truthful Decoding and Video Segmentation Breakthroughs

Large Language Models as Optimizers
FLM-101B: An Open LLM and How to Train It with $100K Budget
XGen-7B Technical Report
Tracking Anything with Decoupled Video Segmentation
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

09-08
11:28

September 7th, 2023 - SLiMe, Matcha-TTS, RoboSense, and CM3Leon: Revolutionizing Vision, Speech, and Multi-Modal Intelligence for a Smarter, Faster Future

SLiMe: Segment Like Me
Matcha-TTS: A fast TTS architecture with conditional flow matching
Physically Grounded Vision-Language Models for Robotic Manipulation
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

09-07
08:11

September 6th, 2023 - Unlocking the Future of AI: Lean Transformers, Memory-Efficient RLHF, Voice-Altering Text Prompts, and 3D Virtual Humans

One Wide Feedforward is All You Need
Efficient RLHF: Reducing the Memory Usage of PPO
PromptTTS 2: Describing and Generating Voices with Text Prompt
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

09-06
08:02
