Dev and Doc: AI For Healthcare Podcast

Author: Dev and Doc
© Dev and Doc
Description
Bringing doctors and developers together to unlock the potential of AI in healthcare. Together, we can build models that matter.
🤖👨🏻⚕️
Hello! We are Dev & Doc, Zeljko and Josh :)
Josh is a Neurologist, AI Researcher and Clinical AI Lead.
Zeljko is an AI engineer, CTO and associate professor (UCL)
-------------
Substack- https://aiforhealthcare.substack.com/
YT - https://youtube.com/@DevAndDoc
30 Episodes
Whenever there was AI, there were benchmarks. From the Turing test, to society-changing benchmarks like MNIST and ImageNet, to modern problems like the ARC Prize, benchmarks have served a vital purpose: measuring the performance of AI models. But something has shifted in the LLM era. Have benchmarks lost their utility, becoming mere advertisements for big tech? Even seemingly more sophisticated benchmarks like LM Arena can be gamed by tech giants. We also deep dive into healthcare benchmarks like OpenAI's HealthBench (deeply problematic) and Microsoft's AI-DXO orchestrator agent for diagnosis. Where is this all going? How do we make the perfect benchmark? Or is the real work to be done afterwards, in the real world?
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
---
Timestamps
00:00 Intro - The OG benchmarks - Turing test, MNIST, ImageNet
06:40 Are large language model benchmarks similar to humans taking tests?
10:05 Are we testing model capability or production readiness?
12:00 LLM era - data contamination
15:30 LM Arena - The Leaderboard Illusion paper - how big tech games benchmarks
28:35 Goodhart's law - When a measure becomes a target, it ceases to be a good measure
32:05 Some good benchmarks - games - Pokemon, ARC Prize, Minecraft
34:35 Medical benchmarks - OpenAI's HealthBench has some big problems
46:50 Microsoft AI-DXO orchestrator for case reports
---
Connect with Us
Your Hosts:
👨🏻⚕️ Doc - Dr. Joshua Au Yeung - LinkedIn
🤖 Dev - Zeljko Kraljevic - Twitter
Follow & Subscribe:
YT: https://youtube.com/@DevAndDoc
Spotify: Follow us on Spotify
Apple Podcasts: Listen on Apple Podcasts
Substack: https://aiforhealthcare.substack.com/
For enquiries:
📧 Devanddoc@gmail.com
---
Production Credits
🎞️ Editor: Dragan Kraljević - Instagram
🎨 Brand & Art: Ana Grigorovici - Behance
AI agents are here, but how did we get here in the first place? How do we build and leverage AI agents for high-stakes domains like healthcare? In this episode of Dev and Doc, we go deep into the forest that is AI agents and computer control, starting from the "caveman" era of LLMs discovering tools and moving on to cultivating intelligent models and agentic workflows. We dissect everyday agents like MANUS AI, and deep dive into how, where and when AI agents should be used. Are these agents hype or hope? Is this actually the second DeepSeek moment?
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
Episode Timestamps:
00:00 Highlight
3:13 start / intro
5:20 LLMs' caveman era - tool usage
6:46 Agents have autonomy and interact with environment
11:15 workflows and agentic flows
15:30 when should you be using an agent?
24:27 vibe coding is like driving a car
29:07 Demo - MANUS gathering financial trends, computer control
35:55 Demo MANUS AI - website creation for Autism Assessment
49:05 computer control factions - Freedom vs Process automation
55:00 Autism website testing
59:13 summary + end
Hosts:
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
Find us on:
YT - https://youtube.com/@DevAndDoc
Spotify - https://podcasters.spotify.com/pod/show/devanddoc
Apple - https://podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
Substack - https://aiforhealthcare.substack.com/
For enquiries:
📧Devanddoc@gmail.com
Credits:
🎞️ Editor - Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
Can Claude perform a range of complex clinical tasks? Dev and Doc are here to investigate.
Claude 3.7 Sonnet was released less than 48 hours ago. The model is highly intelligent and one of the best we have seen in recent memory. Definitely passes the vibe check.
We give some amazing examples of coding with Claude using few-shot prompts, cover technical and clinical evaluations, and share our first thoughts. We even tested Claude on taking a patient history!
NB - PLEASE don't do this at home. Obviously this is a demo, and we do not in any way condone or recommend using an LLM as your doctor or healthcare provider; we are just demonstrating what the future could be. If you are sick, please seek a medical professional.
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
TIMESTAMPS
00:00 start + highlights
01:54 Introduction
08:54 Benchmarks, state of the art
14:44 guardrails, refusals, AI safety and catastrophic risks
22:36 show and tell - great for coding and making video games!
26:54 example hospital runner
30:17 Medical use cases - clinical coding, biomedical entity extraction
37:04 only medical example in Claude model card - still hallucinating citations
38:37 making an anatomy app
40:10 forecasting clinical diagnoses
43:36 taking a medical history from a patient
53:33 wrap up
👨🏻⚕️Doc - Dr. Joshua Au Yeung - linkedin.com/in/dr-joshua-auyeung
🤖Dev - Zeljko Kraljevic twitter.com/zeljkokr
YT: youtube.com/@DevAndDoc
Spotify: podcasters.spotify.com/pod/show/devanddoc
Apple: podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
Substack: aiforhealthcare.substack.com
For enquiries - 📧 Devanddoc@gmail.com
🎞️ Editor - Dragan Kraljević instagram.com/dragan_kraljevic
🎨 Brand design - Ana Grigorovici behance.net/anagrigorovici027d
Is it still worth doing a PhD in 2025? Is the academic system broken in this publish-or-perish landscape? When is a PhD not worth pursuing?
About this Episode
In this Dev and Doc episode, Zeljko (now associate professor!) and Josh (doctor, PhD drop out) talk about the good and the bad of PhD life. They provide insight into the academic world with a focus on computer science and machine learning.
👋 Connect With Us!
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎙️ Hosts
👨🏻⚕️ Doc - Dr. Joshua Au Yeung - LinkedIn
🤖 Dev - Zeljko Kraljevic - Twitter
⏳ Timestamps
00:00 - Start and highlight
01:42 - Intro
03:11 - What made you pursue PhD in the first place
05:05 - Industry or PhD first
10:00 - Positives - Moonshots
17:03 - Positives - Access to world experts and collaboration
20:55 - Positives - Open source and open science
24:49 - Positives - A good environment enables a smooth PhD
27:04 - Negatives - You are a one-man show
31:33 - Negatives - Publish or Perish
45:44 - Bring your research closer to the audience through blogs and other media, journals are legacy media
51:20 - Verdict - Is a PhD still worth it in 2025?
📢 Follow Us
LinkedIn Newsletter
YouTube
Spotify
Apple Podcasts
Substack
📧 Contact Us
For enquiries - devanddoc@gmail.com
🎞️ Video Production
🎬 Editor - Dragan Kraljević - Instagram
🎨 Brand Design & Art Direction - Ana Grigorovici - Behance
Dev and Doc put DeepSeek R1 to the test in a technical and clinical deep dive.
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-au-yeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
TIMESTAMPS
00:00 Highlights
04:36 Intro
08:29 Response from OpenAI, Anthropic - model training costs, tightening restrictions on China, pricing wars
13:13 What an open-source DeepSeek means for the world
15:38 Sam Altman and Dario Amodei feeling the pressure
23:10 TECHNICAL deep dive - RLHF, PPO, DPO
37:08 GRPO, R1's secret sauce
45:02 The aha moment, learning like a human?
50:25 DeepSeek R1 training and controversy
59:08 DeepSeek healthcare evaluation - Ethnic Bias
1:06:17 The diagnostic acid test (fail)
1:12:46 Coding clinical data / Medical billing (shout out SNOMED)
LinkedIn Newsletter https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7216474068085026817
YT - https://youtube.com/@DevAndDoc
Spotify - https://podcasters.spotify.com/pod/show/devanddoc
Apple- https://podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
Substack- https://aiforhealthcare.substack.com/
For enquiries - 📧Devanddoc@gmail.com
🎞️ Editor- Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
Dev and Doc - Latest News
It's 2025, and Dev and Doc cover the latest news, including Google's Deep Research and NotebookLM, DeepMind's Promptbreeder, and Anthropic's new RAG approach. We also go through what retrieval augmented generation (RAG) is, and how this technique is advancing LLM performance.
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
Meet the Team
👨🏻⚕️ Doc - Dr. Joshua Au Yeung - LinkedIn
🤖 Dev - Zeljko Kraljevic - Twitter
Where to Follow Us
LinkedIn Newsletter
YouTube
Spotify
Apple Podcasts
Substack
Contact Us
📧 For enquiries - Devanddoc@gmail.com
Credits
🎞️ Editor - Dragan Kraljević - Instagram
🎨 Brand Design and Art Direction - Ana Grigorovici - Behance
Episode Timeline
00:00 Highlights
00:53 News - NotebookLM, OpenAI 12 days of Christmas
07:44 Change in the meta - post-training
11:34 Optimizing prompts with DeepMind Promptbreeder
13:20 Is OpenAI losing their lead against Google?
16:45 Deep Research vs Perplexity
24:18 AMIE and oncology
26:00 Deep research results
30:20 RAG intro
33:14 Second pass RAG
36:20 RAG didn't take off
38:40 Wikichat
39:16 How do we improve on RAG?
41:11 Semantic/topic chunking, cross-encoders, agentic RAG
51:15 Google’s Problem Decomposition
53:32 Anthropic’s Contextual Retrieval Processing
56:07 Summary and wrap up
References
Cross Encoders
Wikichat
Google's Problem Decomposition
Anthropic's Contextual Retrieval
Google AMIE in Oncology
DeepMind's Promptbreeder
First Thoughts and Preliminary Insights into OpenAI's GPT o1 Strawberry in the Medical Domain
With some expected and unexpected findings, we have a "bake off" between o1 and Doc to demonstrate how o1 fares with tricky medical scenarios.
Disclaimer
Obviously, don't use AI to diagnose or treat your medical problems. If you are unwell, please seek a medical professional (AI isn't good enough just yet :)).
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
Contributors
• 👨🏻⚕️ Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
• 🤖 Dev - Zeljko Kraljevic - https://twitter.com/zeljkokr
Follow Us
• https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7216474068085026817
• https://youtube.com/@DevAndDoc
• https://podcasters.spotify.com/pod/show/devanddoc
• https://podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
• https://aiforhealthcare.substack.com/
For enquiries - 📧 Devanddoc@gmail.com
Team
• 🎞️ Editor - Dragan Kraljević - https://www.instagram.com/dragan_kraljevic/
• 🎨 Brand Design and Art Direction - Ana Grigorovici - https://www.behance.net/anagrigorovici027d
Timestamps
• 00:00 - Start + Highlights
• 01:28 - Intro, What is GPT o1?
• 05:18 - What is "Reasoning" in o1?
• 12:38 - Benchmarks: o1's Successes and Failures
• 24:07 - o1 and Doctor Bake Off!
• 24:21 - The Pregnancy Acid Test for LLMs
• 26:23 - Clinical Coding
• 30:06 - Tricky Patient Scenarios
• 32:25 - Opioid Dose Conversions
Dev and Doc is joined by guest Annabelle Painter, doctor, CMO, and podcaster for the Royal Society of Medicine Digital Health Podcast. We deep dive into explainability and interpretability with concrete healthcare examples.
Check out Dr. Painter's podcast here. She has some amazing guests and great insights into AI in healthcare! - https://spotify.link/pzSgxmpD5yb
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
👨🏻⚕️ Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖 Dev - Zeljko Kraljevic - https://twitter.com/zeljkokr
LinkedIn Newsletter
YouTube Channel
Spotify
Apple Podcasts
Substack
For enquiries - 📧 Devanddoc@gmail.com
🎞️ Editor - Dragan Kraljević - https://www.instagram.com/dragan_kraljevic/
🎨 Brand design and art direction - Ana Grigorovici - https://www.behance.net/anagrigorovici027d
Timestamps:
00:00 - Start + highlights
03:47 - Intro
08:16 - Does all AI in healthcare need to be explainable?
15:56 - History and explanation of Explainable/Interpretable AI
20:43 - Gradient-based saliency and heat maps
24:14 - LIME - Local Interpretable Model-agnostic Explanations
30:09 - Nonsensical correlations - When explainability goes wrong
33:57 - Modern explainability - Anthropic
37:15 - Comparing LLMs with the human brain
40:02 - Clinician-AI interaction
47:11 - Where is this all going? Aligning models to ground truth and teaching them to say "I don't know"
References:
Fun Examples of when models go wrong - Nonsensical correlations
Mechanistic interpretability
Anthropic - Mapping the mind of language models
Limitations of current AI explainability approaches
Explainability does not improve automation bias in radiologists
An explainer on foundation models for pathology. From Microsoft's Gigapath to Owkin's H-optimus-0, companies big and small are building pathology AI models. In this episode, Doc talks to Sean M. Hacking, assistant professor in Pathology at NYU Grossman School of Medicine, and Özgür Şahin, particle physicist at CERN. Together they are building the infrastructure for digital pathology that enables the training of pathology foundation models. Find out more at https://www.pathonn.com/.
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7216474068085026817
https://youtube.com/@DevAndDoc
https://podcasters.spotify.com/pod/show/devanddoc
https://podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
https://aiforhealthcare.substack.com/
👨🏻⚕️Doc - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - https://twitter.com/zeljkokr
🎞️ Editor - https://www.instagram.com/dragan_kraljevic/
🎨 Brand design and art direction - https://www.behance.net/anagrigorovici027d
00:00 Introduction
03:28 Why pathology
06:42 Transporting slides is a logistical nightmare
13:20 When particle physics and AI pathology collide
17:55 AI digital pathology - Patch-based architecture and sparse topologies
27:09 Is there enough pathology data?
29:11 Microsoft and Gigapath, transformer models for pathology
33:55 Pathology models clinical applications
43:18 Staining applications of AI
49:22 Building a digital pathology startup - Patho-NN
57:36 Using AI to see tumor grading features that humans can’t see
References:
https://www.nature.com/articles/s41586-024-07441-w
https://www.microsoft.com/en-us/research/blog/gigapath-whole-slide-foundation-model-for-digital-pathology/
https://www.nature.com/articles/s41379-021-00919-2
Doc talks to Dr Derrick Khor - Cancer Doctor, HealthTech Consultant and LinkedIn Guru. We share Derrick's insights from consulting for over 120 companies and a step-by-step guide on how to build a successful healthcare company.
You can find more of Derrick and his helpful guides - https://adoptadoc.com/resources/
profile- https://www.linkedin.com/in/derrick-khor/
👋 Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
Dev&Doc is a podcast where doctors and developers deep dive into the potential of AI in healthcare.
👨🏻⚕️Doc - Dr. Joshua Au Yeung
🤖Dev - Zeljko Kraljevic
LinkedIn Newsletter
YouTube
Spotify
Apple
Substack
For enquiries - 📧 Devanddoc@gmail.com
🎞️ Editor - Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨 Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
Timestamps
00:00 Highlights and intro
3:01 Start
5:10 getting into health tech
8:03 lack of clinicians in start ups
15:07 Derrick's own healthtech journey to consulting
23:37 Start ups and failure
27:35 the start up road map
32:16 are you a medical device (SaMD)? Intended use
40:55 clinical evidence generation
48:16 go to market, NHS DTAC
57:57 power of networking, social media, linkedin
1:02:43 top UK health tech companies to look out for
Dev and Doc deconstruct digital biomarkers! This is a fascinating and nascent field in the world of medicine. How have biomarkers transformed the way we practice medicine, and how will AI, wearables, sensors and digital fingerprints transform the way we practice in the future?
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
find us on youtube- @Dev and Doc
📙Substack: https://aiforhealthcare.substack.com/
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
🎞️ Editor - Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
Timestamp
00:00 highlights
01:50 intro
02:40 how biomarkers evolved in the last century
6:02 what is the definition of a biomarker
10:00 biomarkers can be very biased depending on who you are testing
12:31 when does a test become a biomarker
17:30 the digital age and measurements - AI vision in retina scans, digital stethoscopes
23:50 what is an “analog” biomarker vs digital biomarker?
30:10 where do biomarkers fail in evidence based medicine?
34:55 Biomarkers are pretty poor for mental health
47:57 can AI predict depression better than humans?
51:21 Digital biomarkers to detect movement disorders
01:00:04 this can change clinical trials forever
Refs
- Variable definitions of biomarkers https://informatics.bmj.com/content/31/1/e100914
- Digital biomarkers convergence Nature paper https://www.nature.com/articles/s41746-022-00583-z
- Digital stethoscope for heart failure https://www.thelancet.com/pdfs/journals/landig/PIIS2589-7500(21)00256-9.pdf
- Touch screen typing depression paper https://www.nature.com/articles/s41746-022-00583-z
- Duchenne body suit biomarker https://www.nature.com/articles/s41591-022-02045-1#Sec9
- Friedreich's ataxia body suit https://www.nature.com/articles/s41591-022-02159-6?fromPaywallRec=false#Sec9
Dr Keith Grimes is a HealthTech consultant and General Practitioner who works with companies to transform clinical ideas into something impactful. He worked as the digital health director at Babylon Health prior to its demise, and currently runs his own consulting firm, Curistica. This is one not to miss!
References
HealthTech consulting at Curistica www.curistica.com
Prof Amanda Goodall on leadership theory https://amandagoodall.com/
For those interested in Leadership opportunities:
-Faculty of medical leadership and management https://www.fmlm.ac.uk/
-Bite labs https://www.bitelabs.io/
Dev&Doc is a podcast where doctors and developers deep dive into the potential of AI in healthcare.
👨🏻⚕️Doc - Dr. Joshua Au Yeung https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
LinkedIn Newsletter - https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7216474068085026817
YouTube - https://youtube.com/@DevAndDoc
Spotify - https://podcasters.spotify.com/pod/show/devanddoc
Apple - https://podcasts.apple.com/gb/podcast/dev-and-doc-ai-for-healthcare-podcast/id1751495120
Substack - https://aiforhealthcare.substack.com/
For enquiries - 📧 Devanddoc@gmail.com
🎞️ Editor- Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
Timestamps
00:00 start
1:10 Career career career - GP, Babylon Health, digital consultancy
6:40 working as a rural GP in Scotland
9:21 time is the biggest factor of clinical impact
12:11 finding impact through data
21:29 leading by example
23:52 Should doctors be leading healthtech businesses?
30:10 why do healthtech start-ups not have clinicians earlier?
36:30 Babylon failure - importance of having clinical influence at the top
43:55 experience being grilled on BBC Newsnight
49:45 lessons learnt from the downfall of Babylon
52:25 6 values of consulting firm Curistica
55:51 common problems in start ups
59:36 how AI will change the healthcare landscape
How do we reach the holy grail of a clinically safe LLM for healthcare? Dev and Doc are back to discuss the news around Meta's Llama model and the potential of healthcare LLMs fine-tuned on top of it, like BioLlaMa. We discuss the key steps in building a clinically safe LLM for healthcare and how these were pursued in Hippocratic AI's latest model, Polaris.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
The podcast 🎙️
🔊Spotify: https://podcasters.spotify.com/pod/show/devanddoc
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor- Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
References
Hippocratic AI LLM- https://arxiv.org/pdf/2403.13313
BioLLM tweet - https://twitter.com/aadityaura/status/1783662626901528803
Foresight lancet paper -https://www.thelancet.com/journals/landig/article/PIIS2589-7500(24)00025-6/fulltext
Language processing units - https://wow.groq.com/lpu-inference-engine/
Timestamps
00:00 Start
01:10 Intro - Llama 3, a ChatGPT-level model in our hands
06:53 Language processing units to run LLMs
09:42 BioLLM for medical question answering
11:13 Quality and size of dataset, using YouTube transcripts
12:41 Question and answer pairs do not reflect the real world - holy grail of healthcare LLM
18:43 Dev has beef with Hippocratic AI
20:25 Step 1 Training a clinical foundation model from scratch
22:43 Step 2 Instruction tuning with multi-turn simulated conversation
24:15 Step 3 Training the model to guide tangential conversations
27:42 Focusing on the hospital back office and specialist nurse phone calls
33:02 Evaluating Polaris - clinical safety LLM, bedside manner, medical safety advice
In this special episode we share a recording of our live podcast episode at the Rewired UK conference, where the NHS, industry and policy makers unite.
We discuss current LLMs from a technical and practical perspective, and dive into how to build foundation models for the National Health Service, drawing on our own experiences. We were also privileged to be joined by the head of digital at Cambridge University Hospital NHS Trust, Dr. Wai Keong Wong, to discuss how to evaluate AI products and how to automate administrative tasks for clinicians with ambient clinical documentation.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
The podcast 🎙️
🔊Spotify: https://podcasters.spotify.com/pod/show/devanddoc
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor- Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
00:00 intro
02:05 AI vs doctors - are language models ready to replace doctors?
05:22 transformer models and attention
08:51 human labour for reinforcement learning
11:00 building the NHS LLM, key concepts
13:55 Foresight GPT - predicting the next clinical event in a patient timeline
16:29 is text enough?
17:19 £3.8B investment into NHS digitisation and admin automation - ambient clinical documentation
20:14 how do you evaluate AI products for the NHS?
26:24 how do you vet the tech companies and future proof your purchase?
27:23 do clinicians need more digital health education?
28:41 transparency of AI models and benchmarks
31:30 question - EHR data created by AI leads to homogenisation and errors
34:03 question - training on structured vs unstructured EHR data
38:06 question - LLMs as a brain. How do we give it a body?
41:05 framework for ai deployment
What do Prompt engineers have in common with telephone operators in the 1870s?
Spoiler - they're both dying professions
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
The podcast 🎙️
🔊Spotify: https://podcasters.spotify.com/pod/show/devanddoc
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor- Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction - Ana Grigorovici https://www.behance.net/anagrigorovici027d
00:00 Highlights
01:10 Intro - where did prompt engineering go wrong?
4:10 what is prompt engineering fundamentally?
10:54 LLMs training data reflects prompt engineering
12:32 prompts are model dependent
14:02 prompts that make you think
18:26 combining expert and generalist medical models for doctors
19:49 Diagnostic reasoning prompts, is it interpretable?
26:55 can we find prompts more elegantly/systematically?
28:42 Will prompts become obsolete? Models that self discover prompts
31:09 Telephone operators and Prompt engineers - death of a profession
Refs
Prompt "hacks" (oh man) - https://learnprompting.org/docs/intermediate/chain_of_thought
Diagnostic prompt interpretability paper - https://www.nature.com/articles/s41746-024-01010-1
Self-Discover - https://arxiv.org/abs/2402.03620
telephone operators - https://www.history.com/news/rise-fall-telephone-switchboard-operators
How do we align AI models for healthcare? 👨⚕️ And importantly, given the moral codes and ethics that we practice every day, how does an LLM deal with ethical scenarios like the trolley problem? This is a fascinating topic and one we spend a lot of time thinking about.
In this episode of Dev and Doc, Zeljko Kraljevic and I cover the latest topics around reinforcement learning, its benefits and where it can go wrong. We also discuss different RL methods, including the algorithm used to train ChatGPT (RLHF).
Dev and Doc is a Podcast where developers and doctors join forces to deep dive into AI in healthcare. Together, we can build models that matter.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua...
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
The podcast 🎙️
🔊Spotify: https://open.spotify.com/show/3QO5Lr3...
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor-
Dragan Kraljević https://www.instagram.com/dragan_kral...
🎨Brand design and art direction -
Ana Grigorovici
https://www.behance.net/anagrigorovic...
00:00 Highlights
01:27 start
4:38 aligning ethics of ai models
7:04 doctors ethical choices daily
8:00 RLHF and AI training methods
16:29 reinforcement learning
19:35 Preference model - rewarding models correctly can make or break success
27:05 exploiting reward function, model degradation (and how to fix it)
Ref
AI intro paper - https://pn.bmj.com/content/23/6/476
OpenAI RLHF paper - https://arxiv.org/abs/1909.08593
War and peace of LLMs! - https://arxiv.org/abs/2311.17227
In this episode Doc goes on an adventure to chair an LLM/generative AI conference session and reflects on his experience. Dev and Doc also discuss big news on Meta's Llama 3 and Code Llama.
Dev and Doc is a Podcast where developers and doctors join forces to deep dive into AI in healthcare. Together, we can build models that matter.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
The podcast 🎙️
🔊Spotify: https://open.spotify.com/show/3QO5Lr3w4Rd6lqwlfKDaB7?si=e7915d844994403e
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor-
Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction -
Ana Grigorovici
https://www.behance.net/anagrigorovici027d
00:00 Highlight
00:36 Start
1:57 Are researchers just using Generative AI to get presentations/publications?
6:18 Hype cycles, lack of real world clinical studies using LLMs
8:08 Llama 3, Code Llama announcement and insights
13:30 Google Bard / Gemini Ultra second on leaderboard
17:30 wrap up and end
Dev and Doc are back! Here we break down the biggest highlights of 2023, and AI predictions for 2024.
Dev and Doc is a Podcast where developers and doctors join forces to deep dive into AI in healthcare. Together, we can build models that matter.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
00:00 start
01:01 Intro, Advancing LLMs in healthcare
07:10 Ambient note documentation in Medicine
10:52 Meta Llama are the good guys?
14:40 GPT store
19:40 Overhyped Google Gemini model
26:17 AGI again
29:05 6 big predictions - Open source vs Closed source models
38:55 AI in healthcare - LLM clinical trials, AI drug discovery
42:05 end
References
GPT store- https://openai.com/blog/introducing-the-gpt-store
Hugging face predictions- https://twitter.com/ClementDelangue/status/1729158744762626310
AI drug discovery (blog post to paper) - https://news.mit.edu/2023/using-ai-mit-researchers-identify-antibiotic-candidates-1220
Google AMIE blog - https://blog.research.google/2024/01/amie-research-ai-system-for-diagnostic_12.html
The podcast 🎙️
🔊Spotify: https://open.spotify.com/show/3QO5Lr3w4Rd6lqwlfKDaB7?si=e7915d844994403e
📙Substack: https://aiforhealthcare.substack.com/
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
🎞️ Editor-
Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction -
Ana Grigorovici
https://www.behance.net/anagrigorovici027d
We have conversations between doctors and developers exploring the potential of AI in healthcare
Josh is a trainee neurologist in the NHS and an AI researcher at St Thomas' Hospital and King's College Hospital. He is also a PhD student at King's College London.
Zeljko is an AI researcher and PhD student at King's College London, as well as a CTO for a natural language processing company.
In this episode, Dev and Doc sit down to discuss artificial general intelligence from the perspective of a neurologist and a computer scientist. We dive into the current developments around AGI, the two controversial schools of thought, LLMs and neuroscience, and give hot takes about whether we will ever reach AGI.
Dev and Doc is a Podcast where developers and doctors join forces to deep dive into AI in healthcare. Together, we can build models that matter.
👨🏻⚕️Doc - Dr. Joshua Au Yeung - https://www.linkedin.com/in/dr-joshua-auyeung/
🤖Dev - Zeljko Kraljevic https://twitter.com/zeljkokr
Hey! If you are enjoying our conversations, reach out, share your thoughts and journey with us. Don't forget to subscribe whilst you're here :)
00:00 start
01:05 intro
03:46 two camps of AGI - Yann LeCun vs Geoffrey Hinton, Architecture vs Data
07:47 Do emergent capabilities of LLMs pose a threat to humanity?
08:45 Intelligence and AGI - neuroscience and computer science approach
16:59 LLMs vs the human brain
24:16 Do AIs need a human touch? - Intrinsic personalities, temperaments, motivations, joy and reward
The podcast 🎙️
🔊Spotify: https://open.spotify.com/show/3QO5Lr3w4Rd6lqwlfKDaB7?si=e7915d844994403e
📙Substack: https://aiforhealthcare.substack.com/
🎞️ Editor-
Dragan Kraljević https://www.instagram.com/dragan_kraljevic/
🎨Brand design and art direction -
Ana Grigorovici
https://www.behance.net/anagrigorovici027d