AI Engineering for Art — with comfyanonymous, of ComfyUI
Description
Applications for the NYC AI Engineer Summit, focused on Agents at Work, are open!
When we first started Latent Space, in the lightning round we’d always ask guests: “What’s your favorite AI product?”. The majority would say Midjourney. The simple UI of prompt → very aesthetic image turned it into a $300M+ ARR bootstrapped business as it rode the first wave of AI image generation.
In open source land, the Stable Diffusion community was congregating around AUTOMATIC1111 as the de facto web UI. Unlike Midjourney, which offered some flags but was mostly prompt-driven, A1111 let users play with a lot more parameters, supported additional modalities like img2img, and allowed users to load in custom models. If you’re interested in some of the SD history, you can look at our episodes with Lexica, Replicate, and Playground.
One of the people involved with that community was comfyanonymous, who, while part of the Stability team in 2023, decided to build an alternative called ComfyUI. It is now one of the fastest-growing open source projects in generative images, and the preferred Day 1 partner for releases like Black Forest Labs’ Flux Tools. The idea behind it was simple: “Everyone is trying to make easy to use interfaces. Let me try to make a powerful interface that's not easy to use.”
Unlike its predecessors, ComfyUI does not have an input text box. Everything is based around the idea of a node: there’s a text input node, a CLIP node, a checkpoint loader node, a KSampler node, a VAE node, etc. While daunting for simple image generation, the tool is amazing for more complex workflows, since you can break down every step of the process and chain those steps together rather than manually switching between tools. You can also restart execution partway through instead of from the beginning, which can save a lot of time when working with larger models.
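To make the node idea concrete, here is a minimal sketch of a text-to-image graph expressed in ComfyUI’s API-style JSON format: each node has a class type and inputs, and inputs that reference another node are written as [node_id, output_index]. The node class names follow ComfyUI’s built-in nodes, but treat the exact IDs, parameter values, and checkpoint filename as illustrative rather than canonical.

```python
# A minimal text-to-image graph in ComfyUI's API-style JSON format (sketch).
# Node IDs, the checkpoint filename, and parameter values are illustrative.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",          # outputs: MODEL, CLIP, VAE
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",                   # positive prompt
          "inputs": {"text": "a watercolor fox in a forest", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",                   # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "comfy_example"}},
}
```

Swapping the sampler, the checkpoint, or the prompt encoder means editing one entry rather than rebuilding the whole pipeline, which is what makes chaining steps and partial re-execution feel natural.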
To give you an idea of some of the new use cases that this type of UI enables:
* Sketch something → Generate an image with SD from sketch → feed it into SD Video to animate
* Generate an image of an object → Turn into a 3D asset → Feed into interactive experiences
* Input audio → Generate audio-reactive videos
Their Examples page also covers common use cases like AnimateDiff. They recently launched the Comfy Registry, an online library of nodes that users can pull from rather than building everything from scratch. The project has >60,000 GitHub stars, and as the community grows, some of the projects people build have gotten quite complex:
The most interesting thing about Comfy is that it’s not just a UI, it’s a runtime. You can build full applications on top of image models simply by using Comfy: expose a Comfy workflow as an endpoint and chain workflows together just like you would chain individual nodes. We’re seeing the rise of AI Engineering applied to art.
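As a rough sketch of what “Comfy as a runtime” looks like in practice, the snippet below submits the workflow from the earlier example to a locally running ComfyUI server via its HTTP API, assuming the default address of 127.0.0.1:8188; check your server version if the endpoint behaves differently.

```python
# Sketch: queue a node graph on a local ComfyUI server via its HTTP API.
# Assumes the server is running at the default 127.0.0.1:8188.
import json
import urllib.request


def queue_prompt(workflow: dict, server: str = "http://127.0.0.1:8188") -> dict:
    """POST a node graph to ComfyUI's /prompt endpoint and return the queue response."""
    payload = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"{server}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response includes a prompt_id you can use to poll /history for results.
        return json.load(resp)


# response = queue_prompt(workflow)   # `workflow` from the earlier sketch
# print(response["prompt_id"])
```

The same graph you wire up in the UI can be queued programmatically, chained with other workflows, or wrapped inside a larger application.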
Major Tom’s ComfyUI Resources from the Latent Space Discord
Major shoutout to Major Tom on the LS Discord, an image generation expert who offered these pointers:
* “best thing about comfy is the fact it supports almost immediately every new thing that comes out - unlike A1111 or forge, which still don't support flux cnet for instance. It will be perfect tool when conflicting nodes will be resolved”
* AP Workflows from Alessandro Perilli is a nice example of an all-in-one train-evaluate-generate system built atop Comfy
* ComfyUI YouTubers to learn from:
  * Sarav: https://www.youtube.com/@mickmumpitz/videos (applied stuff)
  * Sarav: https://www.youtube.com/@latentvision (technical, but infrequent)
* ComfyUI Nodes to check out:
  * https://github.com/kijai/ComfyUI-IC-Light
  * https://github.com/MrForExample/ComfyUI-3D-Pack
  * https://github.com/PowerHouseMan/ComfyUI-AdvancedLivePortrait
  * https://github.com/pydn/ComfyUI-to-Python-Extension
  * https://github.com/THtianhao/ComfyUI-Portrait-Maker
  * https://github.com/ssitu/ComfyUI_NestedNodeBuilder
  * https://github.com/longgui0318/comfyui-magic-clothing
  * https://github.com/atmaranto/ComfyUI-SaveAsScript
  * https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID
  * https://github.com/AIFSH/ComfyUI-FishSpeech
  * https://github.com/coolzilj/ComfyUI-Photopea
  * https://github.com/lks-ai/anynode
  * look for the ComfyUI node for https://github.com/magic-quill/MagicQuill
* “Comfy for Video” resources:
  * Kijai (https://github.com/kijai) pushing out support for Mochi, CogVideoX, AnimateDiff, LivePortrait, etc.
  * ComfyUI node support for LTX (https://github.com/Lightricks/ComfyUI-LTXVideo) and HunyuanVideo
  * FloraFauna AI and Krea.ai
* Communities: https://www.reddit.com/r/StableDiffusion/, https://www.reddit.com/r/comfyui/
Full YouTube Episode
As usual, you can find the full video episode on our YouTube (and don’t forget to like and subscribe!)
Timestamps
* 00:00:04 Introduction of hosts and anonymous guest
* 00:00:35 Origins of Comfy UI and early Stable Diffusion landscape
* 00:02:58 Comfy's background and development of high-res fix
* 00:05:37 Area conditioning and compositing in image generation
* 00:07:20 Discussion on different AI image models (SD, Flux, etc.)
* 00:11:10 Closed source model APIs and community discussions on SD versions
* 00:14:41 LoRAs and textual inversion in image generation
* 00:18:43 Evaluation methods in the Comfy community
* 00:20:05 CLIP models and text encoders in image generation
* 00:23:05 Prompt weighting and negative prompting
* 00:26:22 Comfy UI's unique features and design choices
* 00:31:00 Memory management in Comfy UI
* 00:33:50 GPU market share and compatibility issues
* 00:35:40 Node design and parameter settings in Comfy UI
* 00:38:44 Custom nodes and community contributions
* 00:41:40 Video generation models and capabilities
* 00:44:47 Comfy UI's development timeline and rise to popularity
* 00:48:13 Current state of Comfy UI team and future plans
* 00:50:11 Discussion on other Comfy startups and potential text generation support
Transcript
Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.
swyx [00:00:12]: Hey everyone, we are in the Chroma Studio again, but with our first ever anonymous guest, Comfy Anonymous, welcome.
Comfy [00:00:19]: Hello.
swyx [00:00:21]: I feel like that's your full name,