Stable Discussion Podcast
What's possible with AI today and what to expect tomorrow
© Stable Discussion
Description
Artificial Intelligence is changing our world, and we help you better understand what this means for all of us. We'll look at what's possible and where the technology still isn't there yet.
blog.stablediscussion.com
35 Episodes
I played around with the new Google agentic IDE, Antigravity, on launch day and created a few features for an app I've been playing with. If you're unfamiliar, this video is a helpful overview of the features. My initial impressions offer a more nuanced experience than the chipper attitude of that presentation, which should help you get a balanced perspective.

Using Antigravity

Agent Manager

This interface feels like a move in the right direction. It offers a means of managing the work done by an agent, the ability to see and respond to plans easily, and a clear indication of changes made. I like the Agent Manager's UI, but it's been a little buggy so far. I made some good changes, but it missed some of the context I keep in my CLAUDE.md files about how I want the app built. I'm not sure it's reading my core information files or docs at all.

Knowledge

The knowledge base feature looks interesting, but across the two medium-sized features I created, it never seemed to think anything needed to be recorded. I'm unsure what it will consider worthy of storing. As with all AI memory systems, I worry about it getting the wrong idea and saving that idea for later use.

Intelligent Tool Approval

Antigravity lets the model choose when to run a tool in some cases, rather than stopping to ask for permission. This is a cool concept if it works well, and I'm curious to explore it further. I worry that seemingly innocent commands may look non-destructive and get called by the model anyway despite causing a destructive change. This may require a metadata layer similar to how MCP servers have expanded their interfaces to indicate whether a tool mutates data (a rough sketch of those hints is below).
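For a concrete picture of what that metadata layer can look like: the MCP spec already defines per-tool annotation hints such as readOnlyHint and destructiveHint that a client can use to decide what is safe to auto-approve. Here's a minimal sketch assuming the @modelcontextprotocol/sdk TypeScript API; the tool names and handlers are hypothetical, and transport setup is omitted.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "file-tools", version: "1.0.0" });

// A read-only tool: its hints mark it as safe for a client to auto-approve.
server.registerTool(
  "list_files",
  {
    description: "List files in a project directory",
    inputSchema: { path: z.string() },
    annotations: { readOnlyHint: true, destructiveHint: false },
  },
  async ({ path }) => {
    // ...read the directory and return its contents...
    return { content: [{ type: "text", text: `contents of ${path}` }] };
  }
);

// A mutating tool: these hints tell the client this one deserves a prompt.
server.registerTool(
  "delete_file",
  {
    description: "Delete a file from the project",
    inputSchema: { path: z.string() },
    annotations: { readOnlyHint: false, destructiveHint: true },
  },
  async ({ path }) => {
    // ...actually delete the file here...
    return { content: [{ type: "text", text: `deleted ${path}` }] };
  }
);
```

The hints are advisory rather than enforced, so a client like Antigravity would still have to decide how much to trust them when auto-approving.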
Browser Tool

It's great to see them add a tool for interacting with the code as it actually runs. However, my initial setup of their browser didn't work out very well. I installed the extension, but the agent had difficulty finding it. It eventually worked once I opened a new tab after closing Chrome.

Since Chrome isn't my main browser, it isn't set up quite right for the application I was testing, but it worked well on a later project. It seems to be able to record, capture screenshots, and read the console.

Commenting

Having a highlight-and-comment system on AI plans is plain great UX. With existing solutions, I often find myself opening a note somewhere to write my feedback, then scrolling down and pasting it into the chat input at the bottom of an agent chat window. When I made comments here, they seemed to be factored in appropriately once I asked the model to apply them. The pitch of this feature sounded good, but I'm curious how the model incorporates feedback like this in practice.

Gemini 3 Pro (High)

I'll pay it a good compliment: this model felt a lot like Claude Sonnet 4.5, to the point that it worked with me in the way I'm used to. That doesn't often happen when switching models. That said, I still didn't get my context into the conversation appropriately, and I worry a bit that I'll have to start every conversation with "Read @CLAUDE.md" before we can get to work.

On release day there's always a lot of strain on these models, and this was no different. There were a couple of times I needed to leave and come back while working through the features to let the global limit cool off. Hopefully the usage gets more predictable and lets more people use this regularly.

Onboarding

When I launched Antigravity, the onboarding crashed on the final step. I had to go through it again, and even though I had indicated I wanted it to pull settings in from Cursor, it ignored that and didn't import any of my extensions or config. This is definitely a headwind in my adoption journey, but something that is likely to get fixed with time.

Takeaways

I built out two features at the same time to explore the agentic capabilities of the Antigravity IDE. One feature added notes for users in a feed; the other let users upload images to Google Cloud Storage through a rich text field. In hindsight, having the changes happen in parallel was a bad idea.

Part of me assumed these changes would live in separate worktrees or branches so they wouldn't conflict. Some of the demo videos made it seem like that might be the case, but no: it's the same as running two separate Claude Code instances in the same repo, just with a new UI.

Ultimately, I wanted to put Antigravity through its paces, but running two things at once confused me a bit while learning a new tool. It also seems to have confused the interface, because one of the agents simply stopped responding to my prompts after it tried, unsuccessfully, to test the feature in the browser. The other agent completed its work fine, but the Review tab was still showing both sets of changes, which was, again, confusing.

At the end of my coding session I had trouble determining how to progress and what to do with the resulting conversations. There's an ability to review the changes and provide feedback, but it's still unclear how to get the IDE to commit the change from the Agent Manager view. When I did merge, I also didn't know what to do with the conversations. I wish there were an archive feature, as deleting these conversations doesn't feel great, especially when the Knowledge base doesn't seem to update.

To complete my changes, I ended up switching back over to Claude Code, which had better overall context on what I was building, and I had better muscle memory for how to progress a late-stage change.

In Antigravity, there's a lot of really great intentionality around context, but some of the control over context feels limited. Because it's a "smart system," there's a lot less control. That's helpful in some ways, but it also makes it more difficult to understand exactly what's going on at any time.

I keep noticing things I've come to appreciate about other interfaces that are missing here. One example is queued commands in Claude Code: if there's a string of commands that make sense to run one after another, it queues them up. The auto-approval works well in Antigravity, but there are times when I have to wait to approve several changes that were clearly known in advance and could have been approved without a delay between each one.

Release Article & Videos

The release blog mentions that "Gemini 3 is also much better at figuring out the context and intent behind your request," but I haven't found this to be the case. Jumping into an existing codebase, some of the core Next.js architecture I had in place for version 16 was ignored despite clear indications. That said, many of the solutions it created were well done.
They just didn’t retain the high context as this blog post might indicate.In the getting started video, it was refreshing to see the a Google engineer directly trust the AI with his API key. I think that’s honestly the norm in a lot of cases depending on how permissive the keys are. It enabled the AI to explore and investigate the API with context of the interface due to some Googling of the interface from the web. In all honesty, this is a pretty likely use case for most devs.Agents Testing During ResearchIt’s amazing to see the impact of the Intelligent Tool Approval when it comes to agents doing research. That’s where there’s a bit of magic in this release. This makes me think that this might be one of the better agentic interfaces to do coding research within.Nano Banana Image GenIt’s awesome to have an image generation module as powerful as Nano Banana running directly in the IDE. It can generate assets and directly add them to the application. That’s pretty incredible. (no transparent backgrounds though unfortunately)NextI’m intrigued by Antigravity but it’s still feeling a lot like a Beta of something that could be cool. It’s going to be interesting to watch other competitors in the space learn from these solutions and find ways to improve their services based on these changes. I wouldn’t recommend rolling Antigravity as your main editor for a few weeks while bugs get ironed out but I think it’s great to experiment with and potentially run research tasks on existing projects.This Substack is reader-supported. To receive new posts and support our work, consider becoming a free or paid subscriber. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
In my latest video, I share a high-level summary of building a full feature with Claude Code and AI-assisted coding, from rapid prototyping to a frustrating production bug that ate up 20% of my time. If you haven't seen it yet, you can catch the overview there.

Videos are great for telling the story at a glance, focusing on the high-level summary, but they don't always have the space to unpack the details that really convey the feel of building something with an AI. That's what this publication is for: unpacking the hype and diving deep. If you're building with AI tools, or just curious about the real-world bumps in the road, this deeper dive will give you practical insights and a more nuanced perspective.

Lessons Learned in the Coding Session

I've captured all 88 messages I sent in this coding session on one webpage where they can be browsed at your convenience. There's also this companion site where I outline some of the high-level takeaways. In this post I've pulled out a few gems that highlight tips and tricks I think are pretty critical to a good experience with AI-assisted coding.

For each note, you'll see the note number (message number), quoted text showing the message I sent the AI, and a description of what I was doing at that step. For example, the first major message I sent was this one: here we're catching Claude Code up on the research I did to begin the feature. You'll see the comment there and a quick note with the details. This is everything I needed to give the AI to get the feature started.

Let's look at a few other notes that are more telling.

Providing Context7 to Documentation

Context is critically important to get right when working with AI, so much so that there's an entire Context Engineering movement. As such, I often leverage Context7 to pull docs for the technologies I'm coding with so I can use them in my code.

Today's AI models are trained at some point in the past and contain a timestamped view of the world; pulling in current docs ensures we're aligned with the latest version of any technology we're using. This can have a huge impact on how well a feature integrates into our code. Here, I'm using it to make sure we call the API correctly and download the image from that API in the right format.

Note: I'm also not generating code right away. I'm using this research to put together a document we can reference later when we build out the feature. I find this helps keep key technical considerations available, since compaction in our conversation (generated summaries of the conversation so far) may cause some details to get lost.

Prompt to Prompt

As I build features that themselves leverage large language models, I need to create prompts and tweak them as I go. I generally tweak a bit by hand, but as I continue working with Claude Code, I've found it's often more explicit and creative at coming up with good prompts than I am. Anthropic also has a really great prompt creation tool on its Claude Console page, which lets developers take their prompts and refine them into versions optimized for Anthropic's models.

Generate Test Data

I've found AI uniquely helpful for spinning up test data quickly when testing features. If you let it go a bit wild, you'll sometimes be rewarded with something cool or unique! Try it out sometime.
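To make that concrete, here's a minimal sketch of the kind of seed script Claude Code might write for you. The record shape is hypothetical, and it assumes the @faker-js/faker package:

```typescript
import { faker } from "@faker-js/faker";

// Hypothetical record shape; swap in your app's real schema.
interface TestUser {
  id: string;
  name: string;
  email: string;
  bio: string;
  joinedAt: Date;
}

// Generate a batch of varied, realistic-looking users for manual testing.
function generateTestUsers(count: number): TestUser[] {
  return Array.from({ length: count }, () => ({
    id: faker.string.uuid(),
    name: faker.person.fullName(),
    email: faker.internet.email(),
    bio: faker.lorem.sentences(2),
    joinedAt: faker.date.recent({ days: 90 }),
  }));
}

console.log(JSON.stringify(generateTestUsers(20), null, 2));
```

The fun starts when you let the AI replace the lorem text with in-universe content for your app.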
Regular Planning is Key

With Claude Code, planning is a critical step. I generally add a planning step whenever my direction shifts fundamentally. Here, I realized I had a new page to develop that leverages the feature we're working on but requires a much wider understanding of the application to pull off well. At these times I want Claude to be investigative and curious rather than ambitious and presumptuous. If I just let it loose on a feature like this, it could easily create a page I don't want, or add unrelated changes that don't line up with the direction I'm moving in.

Give Claude Code Your Tedious and Hungry

I love tasking Claude Code with fixing up the code around the changes that have occurred. When I use AI-assisted coding tools, I couple them with a suite of analysis tools that check the generated code for common errors and mistakes. This is key to going fast; projects where I don't have this set up well move slowly even when I use AI-assisted coding practices. It's plain painful to get them moving, because I need to be much more diligent about what changed and why.

That said, you may need to poke it a few times to do what you want. AI will often take shortcuts to focus on what it sees as its main goals and priorities. Occasionally you'll need to redirect those goals to align with your own.

Regularly Document Wins

When I finally pushed through the major defect outlined in the video, I realized we needed to update our documents to capture the fix. It wasn't recorded anywhere else yet, and it's critical to capture it so we can refer to it in the future.

An Image is Worth a Lot of Words

Passing images to Claude Code is an amazing way to tell it exactly what you have in mind. I regularly pass it images when I need it to understand something complex, or when I simply need to point at something. Reference images also help a lot when coming up with a design or theming around a color. I regularly use a visual vibe-coding tool like Magic Patterns to build out ideas I can quickly reference via screenshots or code snippets; I've written more about how I use that tool here.

The Overall Journey

Building with Claude Code is exhilarating because you can go from idea to implementation so fast. But as I've learned, it's just as important to slow down sometimes, to observe, orient, and decide before you act. AI can supercharge your coding, but it can't replace the human insight that keeps your code cohesive and aligned with your goals.

Going back through these messages has been a great way to surface how I work with Claude Code to prototype. As you go through them, I'd love to know if you notice anything interesting about the way I work that I didn't mention. I'd also love to hear how you're working with AI-assisted coding tools. What bumps have you hit? What tricks have you found? Drop a comment or reach out, and let's keep unpacking the hype and learning together.
I wanted to collect a few thoughts on the OpenAI Dev Day 2025 announcements from my initial investigation into the tech behind them. These are gut reactions, based on a couple of years of building AI integrations into applications. If you're unfamiliar with the presentation, it might be valuable to skim through it first. Now let's get into it.

Our Take

AgentKit and Agent Builder feel great and look like what the future of tools for building agents around real products and services will look like. It's like an extension of n8n's existing capabilities (another agent-builder tool), but AgentKit is streamlined around OpenAI's offering. This could be serendipitous if you're already leveraging their existing file storage solution or other core features.

The kit is streamlined for building quickly, but I don't think it's quite the powerhouse people think it is. We can see why by reviewing some of the other features in the announcement.

ChatKit is an application-layer toolkit for delivering a chat interface to end users by directly leveraging ChatGPT's methodology. It has a nice set of features and manages to do many of the things the Vercel AI SDK was already doing well. I'm a fan of some of their direction, which is clearly inspired by that library.

Similar to building with Swift for Apple's iOS, a team adopting this kit will be aligning with a design system of visual components created by OpenAI. Until we get a chance to play with these components, it's unclear how far they can be pushed and where the limitations are. We'll also need to wait and see how this library evolves over time, since this product space is certainly very new.

OpenAI is looking to own more of the experience layer by providing an ecosystem of UX and UI tooling. Applications built on their agent platform will need to keep pace with changes and adjust accordingly if they adopt this approach to building. That can be an uncomfortable place to be long term, and it might worry early adopters.

OpenAI mentioned that a capability to publish apps in ChatGPT is coming, but there's no word yet on exactly when. The actual guidelines to publish an app are quite extensive, however. They remind me of an Apple App Store-style approval process gating publishing and being featured. Adherence to the style and intent of ChatGPT will be directly rewarded here.

AgentKit doesn't directly land users in ChatGPT, which can be misleading if you only watched the presentation. It looks as if AgentKit has everything set up to build something into ChatGPT itself, but in actuality AgentKit is for creating agentic experiences on separate, company-specific websites.

As OpenAI leans into adopting MCP, there's some underlying messaging that companies face no vendor lock-in with OpenAI. However, the MCP ecosystem is still missing many core services and is still maturing, and I'd argue there is plenty of inherent lock-in with AgentKit.

Evals are one of the most compelling reasons to be excited about AgentKit. But to leverage them well, teams will need a very clear vision of what an agent does and what it looks like when it does something well. That continues to be a difficult thing for product builders to define.

Overall, I think AgentKit shows an interesting perspective on what agentic platforms should look like.
Unless there's a clear path toward apps in ChatGPT, I think the main adopters of these releases will be B2B application builders. While there's room for a B2C path, giving up your brand presence weakens competitiveness and limits the upside. Existing options for building agents continue to be available, and, given a team with some frontend engineering capability, those solutions aren't as complex as OpenAI makes them sound.
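As a point of comparison, here's roughly what a minimal tool-using agent looks like with the Vercel AI SDK mentioned above. This is a sketch assuming the SDK's generateText/tool API (v4-era naming); the support-lookup tool and its data are invented for illustration.

```typescript
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// One tool the agent can call; the lookup itself is stubbed out.
const searchOrders = tool({
  description: "Look up a customer's recent orders by email",
  parameters: z.object({ email: z.string().email() }),
  execute: async ({ email }) => {
    // ...call your own backend here...
    return { orders: [{ id: "A-1001", status: "shipped" }] };
  },
});

const { text } = await generateText({
  model: openai("gpt-4o"),
  system: "You are a support agent. Use tools to answer from real data.",
  prompt: "Where is the latest order for jane@example.com?",
  tools: { searchOrders },
  maxSteps: 5, // let the model call tools, read results, then answer
});

console.log(text);
```

A loop like this, plus your own UI, is the kind of "existing option" the paragraph above has in mind.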
I was deep in a World of Warcraft inventory crisis the other day: bags full of random items with cryptic names. "Tangy Clam Meat" sat there taunting me. What do I even use this for?

This simple gaming question sent me down two very different paths that perfectly illustrate how search has fundamentally changed. And if you're running any kind of online business or content operation, this shift is about to upend everything you know about visibility, traffic, and revenue.

A Tale of Two Searches

Path 1: The Traditional Google Journey

When I typed "what do I do with Tangy Clam Meat wow classic" into Google, I entered a familiar but exhausting maze:

Step 1: Google shows me search results (after I scroll past AI summaries I don't trust yet)
Step 2: I click on WowHead because it ranks first
Step 3: I'm bombarded with ads: top, bottom, sides, pop-ups
Step 4: I navigate their specific UI, hunting for information
Step 5: I discover I need to click on "reagent" (who knew that's what cooking ingredients are called?)
Step 6: I finally find my answer buried in the middle of the page

Total time: several minutes. Mental energy: depleted. Ads encountered: countless.

Path 2: The AI Conversation

Then I tried Dia's AI search. Same question, completely different experience:

Step 1: I type my question naturally
Step 2: The AI searches multiple sources simultaneously
Step 3: I get a direct, synthesized answer
Step 4: Done

But here's where it gets magical, and this is the part that changes everything.

Context-Aware Follow-Ups

I had more items to check in my inventory. In the traditional model, I'd have to either:

* Start a completely new search for each item
* Navigate the same cluttered website to find more information
* Open multiple tabs and repeat the entire process

But with AI search, watch what happens:

Me: "what about tender crocolisk meat"
AI: Immediately understands I'm still asking about WoW Classic recipes and provides the answer

Me: "raptor egg"
AI: Knows the context (a directory of results) and gives me recipe details

Me: "small venom sac"
AI: Tells me it's not for cooking but for alchemy instead

I didn't have to specify the game. I didn't have to say "recipe" or "wow classic" again. The AI maintained our conversation context. I literally just typed item names, sometimes misspelled, and got exactly what I needed. This isn't just convenience. It's a fundamental reimagining of how we interact with information.

Why This Matters

The Click-Through Economy Is Collapsing

Many a website's business model assumes one thing: you'll click through to the site. When AI provides answers directly, that assumption crumbles. Here's what's at stake:

Revenue streams in critical danger:

* Display advertising (no clicks = no ad views)
* Affiliate links (AI won't pass these through)
* Sponsored content (less attractive as user counts decline)

But this isn't just a story of decline. New revenue opportunities are emerging for those willing to adapt, from data licensing to AI-specific services. Watch my detailed breakdown of the revenue transformation matrix and emerging opportunities →

The Paradox of More Content, Fewer Visitors

Here's the mind-bending reality of AI Engine Optimization (AEO): you need to create more content to get fewer visitors. Why? Because AI systems need comprehensive information to reference. You're no longer optimizing for one perfect landing page.
You're building an entire knowledge ecosystem that AI can traverse.

Example: instead of one "Tangy Clam Meat" page, gaming wikis now need:

* "Where to farm Tangy Clam Meat in Westfall for Alliance players"
* "Is Tangy Clam Meat worth keeping for leveling cooking 1-300?"
* "Tangy Clam Meat vs. Clam Meat: which recipes need which?"
* "Best grinding spots for Tangy Clam Meat for level 15-20 characters"
* "Can Horde players get Tangy Clam Meat or is it Alliance only?"
* "Auction House pricing guide for Tangy Clam Meat by server type"

Each page might only get a handful of direct visits, but they all contribute to the wiki's visibility when someone asks an AI, "What should I do with this random meat in my WoW inventory?" Admittedly this is a contrived example, and I'm not sure how beneficial these particular questions would be for World of Warcraft, but it illustrates the kind of content that may answer AI questions.

The Metrics That Actually Matter Now

The challenge with measuring AI visibility is real. As HubSpot discovered, AI results vary based on conversation context, user history, and countless variables you can't control. The same query produces different results depending on which questions came before it, whether memory is enabled, and which AI you're using. You can't A/B test AI responses like Google rankings. Here's what we do know and can measure:

1. Traditional SEO remains your foundation. AI systems pull from search-indexed content. Without SEO visibility, you likely have no AI visibility:
* Organic rankings (your baseline for being discoverable)
* Indexed pages (comprehensive coverage = more AI reference material)
* Domain authority (trusted sites get cited more often)

2. The volume-to-visit paradox. Track the new reality HubSpot describes, more content with fewer visitors:
* Total pages published vs. traffic per page
* Coverage of long-tail questions in your space
* Visitor qualification metrics (conversion rate, time to purchase)

3. Visitor quality indicators. The few humans who arrive have already done their research in AI. Monitor:
* Conversion rates (should increase)
* Pages per session (should decrease; they know what they want)
* Support ticket sophistication (fewer "what is this?" questions)

4. Competitive AI visibility. Manual checks remain your best option. Run weekly sample queries about your category:
* Do you appear in AI responses?
* How prominently, versus competitors?
* Which of your pages get cited as sources?

5. Content architecture for agents. You're now publishing for machines first. Measure:
* Question-answer pairs created per topic
* Structural clarity of your content (can an AI easily parse it?)
* Topic interconnection (how well you link related concepts)

The uncomfortable truth: we're measuring proxy metrics because the real metric, influence within AI conversations, is largely invisible. As HubSpot notes, this is marketing for agents rather than humans. The agents don't click, don't convert, and don't fill out forms. But they determine whether humans ever hear about you at all. We're in uncharted territory, where success might mean accepting lower traffic while betting that the traffic you do get is exponentially more valuable.

The Two-Audience Strategy

You're now designing for two completely different consumers:

AI agents
* Need structured, comprehensive data
* Consume hundreds of pages to form opinions
* Prefer clear, factual information
* Value completeness over creativity

Highly qualified humans
* Already know about you from AI conversations
* Ready to buy, not research
* Need immediate value demonstration
* Want streamlined conversion paths

This is a fundamental shift away from offering content mostly for free and profiting off advertising. If you're making content now, it likely needs to be paid. (Oh! Maybe this is a good time to check if you're subscribed 🙃)

Practical Survival Strategies

1. Build Your Answer Archive

Transform your existing content into Q&A format. Every blog post should answer specific questions people might ask an AI about your space.

2. Create Conversation Chains

Design content that naturally leads to follow-up questions. Think about the customer journey as a conversation, not a funnel.

3. Establish Direct Relationships

Email lists, apps, communities: anything that bypasses search becomes exponentially more valuable. You need to own your audience relationship.

4. Structure for Machines

Well-organized, schematized data becomes a competitive advantage. Structured data that AI can easily parse and cite will win over beautiful but chaotic content. (A small sketch of what this looks like in practice follows this list.)

5. Monitor AI Mentions

Set up systems to track when and how AI systems mention your brand. This is your new SEO ranking.
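For instance, one widely supported way to make content machine-parseable is schema.org JSON-LD. Here's a minimal sketch for a Next.js page emitting an FAQPage block; the component name, question, and answer text are placeholders, not real wiki content.

```tsx
// Minimal sketch: emit schema.org FAQPage JSON-LD from a Next.js page so
// crawlers and AI systems can parse the Q&A without scraping the layout.
const faqJsonLd = {
  "@context": "https://schema.org",
  "@type": "FAQPage",
  mainEntity: [
    {
      "@type": "Question",
      name: "What is Tangy Clam Meat used for in WoW Classic?",
      acceptedAnswer: {
        "@type": "Answer",
        text: "It is a cooking reagent; see the recipe list on this page.",
      },
    },
  ],
};

export default function TangyClamMeatFaq() {
  return (
    <script
      type="application/ld+json"
      dangerouslySetInnerHTML={{ __html: JSON.stringify(faqJsonLd) }}
    />
  );
}
```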
What This Means for Content Creators

The comfortable era of "write content → rank in Google → get traffic → monetize" is over. The new reality:

* Your content might be read entirely by machines
* Success happens inside AI conversations, not on your website
* The few humans who visit are ready to buy, not browse
* Brand building occurs in AI memory, not human memory

This isn't just another algorithm update. It's a fundamental rewiring of how information flows online.

The Bottom Line

The shift from search to conversation, from clicks to context, from keywords to knowledge graphs: it's all happening now. That simple gaming question about Tangy Clam Meat revealed a seismic shift in how we find and consume information. Sites that adapt will thrive by becoming invaluable data sources for AI while creating exceptional conversion experiences for the few humans who visit. Sites that don't adapt will simply become invisible.

The question isn't whether this change is coming; it's already here. The question is whether you'll evolve your strategy in time.

What's your experience with AI search? Are you seeing changes in your traffic patterns? Drop a comment below with other Substack members or join our Discord to discuss. For more deep dives into AI's impact on digital business, subscribe to Stable Discussion. And if you want to experiment with AI-powered research yourself, check out Benny Chat.
Over the last week, I've been captivated by the idea that designers are best positioned to leverage AI on development teams. AI is changing how products are built, but there's a blind spot: designers are still standing on the sidelines, even as the tools finally let them take center stage. Most teams treat AI as the domain of engineers and data scientists, and for designers, this technical barrier makes AI seem unapproachable and out of reach.

On the other end of the spectrum, there's a subculture of "vibe coding" and hustle culture. Small teams or solo builders are cranking out rough AI prototypes, often without rigorous product development practices. But even as these experiments multiply, they rarely result in thoughtful, user-centered products, often sacrificing quality and vision for speed. This highlights a gap: while engineers and hackers can rapidly iterate on technical possibilities, what's too often missing from the process is the guiding hand of design.

I've noted that the teams closest to the customer are best positioned to deliver real value. Designers, more than anyone, bridge the gap between WHAT a customer wants and HOW the business delivers it. This makes designers uniquely well placed to drive and shape how AI is applied to solve practical, customer-centric problems. What's new, and what too few teams have noticed, is that the AI toolchain has finally become accessible enough that designers themselves can prototype, test, and iterate without waiting for engineering or hunting down a Python wizard.

Design-Driven Product Development

Over the years, I've formed a straightforward operating model for developing great products on engineering teams:

* Make it work
* Make it right
* Make it good

On most teams, "make it work" means building quick, rough prototypes and getting something functional before worrying about polish or coherence. That may sound efficient, but by relegating design and user experience to the end of the process, these products often inherit all the awkwardness, missed opportunities, and makeshift decisions of their early versions. Design becomes an exercise in damage control and technical compromise, not in envisioning or elevating what's possible.

Teams can attempt to avoid this ordering with big planning cycles, up-front documentation, or other attempts to "shortcut" the process. However, when they run into problems, these approaches often revert to doing things in this order. It's just the tried-and-true way of getting things done.

Why not flip this script? What if, from the very beginning, designers were the ones to shape the prototype, not as a surface afterthought but as the driving force for both how the product functions and how it feels? If prototyping is the process where key decisions are made, designers should be there, guiding what's built, not just decorating it after the fact.

Now that AI makes prototyping more accessible and immediate, designers can move from concept to interactive demo without the traditional bottlenecks. This shift helps ensure that design considerations aren't an afterthought but are baked in from the earliest steps. Some of the most innovative solutions come from design-led exploration, where a designer who understands both the user and the technology's constraints proposes an approach no one else saw. By leading with design, teams can reduce costly rework, discover what users really want earlier, and prevent soulless or awkward interfaces from ever making it out into the world.

Representing Business and Technology

Designers bridge the gap between development and business teams.
They translate technical constraints into user-centric solutions that meet business objectives. They also transform high-level business requirements into wireframes, prototypes, and visual designs for developers to build out.

Negotiation is essential, not just to the design role but across the triad of product, design, and engineering. Each group brings its own perspective, priorities, and blind spots: designers may champion user needs but sometimes underestimate technical effort; developers possess crucial implementation insight but can occasionally lose sight of broader business or user aims; even product or business leaders may bring great vision but stumble on feasibility. The healthiest teams recognize these dynamics and lean into the creative tension, surfacing their assumptions and sharing context early and often.

When these disciplines disconnect, you often see familiar breakdowns: designers shut out of early technical decisions; product obsessing over features without clarity on what's possible; and developers, at their worst, retreating into reactive "IT mode," simply processing tickets and change requests rather than partnering in the product vision. Nearly everyone working in tech has seen these patterns and felt the frustration they create. The opportunity, then, isn't for designers to take over prototyping alone, but to pull the process closer to multidisciplinary influence, helping organizations build better products faster by dissolving long-standing silos.

AI Prototyping

AI prototyping is better than ever before. With a competitive landscape of new tools, there are many great solutions that improve over time. And with so many people leveraging these tools, a variety of new techniques are being explored that continue to push what they're capable of.

While coders will likely use IDEs (integrated development environments) like Cursor or Windsurf, web-based solutions that don't require a complex development setup tend to be easier to use. These web-based tools also let teams remix solutions with others and share prototypes across a team. These days, I prefer v0 because of its direct connection to Next.js and its integrated Vercel deployments, which are familiar to me. Finding a tool that matches your experience offers a significant advantage, and the design aesthetics of v0's output seem to be pretty good for my needs. Other tools like bolt.new and lovable.dev offer a similar suite of features but focus differently to best match the needs of their customers. As this space continues to show huge revenue growth and remains novel to market, additional solutions keep being released.

Designers Building with AI Prototyping

I recently ran a workshop on AI for the design team at Compass Digital. The workshop covered AI fundamentals for building personalized AI design workflows, along with guidance on prototyping using vibe-coding techniques. By the end of the session, the team was familiar with the concepts and was putting together some really interesting designs that immediately pushed at the limits of what's possible with these prototyping tools.

Designers often need to guide coders and product managers to understand what's possible. Because coders are more focused on the code, a user experience that aims at a specific visual style often gets lost on them. Product managers are excited about what they know, but they're usually a bit more concrete-minded and need to be shown what's possible. Once they get it, they're usually fully behind an idea.

At my first startup, I saw firsthand how designers can expand what teams believe is possible. We were stuck on a UI detail, a post-it-note display the developer thought couldn't be built with CSS. I took a stab at it and sent over a solution. When my cofounder saw it working, he realized more was possible than he'd assumed. That moment not only solved our immediate problem but also deepened our collaborative approach to product development.
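For flavor, here's a minimal sketch of that kind of effect. This isn't the original solution from that startup, just an illustration in React/TSX of how far a few CSS properties can go:

```tsx
// A paper color, a slight rotation, and a soft shadow do most of the work
// of making a plain div read as a post-it note.
export function PostItNote({ text }: { text: string }) {
  return (
    <div
      style={{
        width: 180,
        padding: "24px 16px",
        background: "#feff9c",
        transform: "rotate(-2deg)",
        boxShadow: "4px 6px 12px rgba(0, 0, 0, 0.25)",
        fontFamily: "'Comic Sans MS', cursive",
      }}
    >
      {text}
    </div>
  );
}
```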
Designers frequently bridge the gap between vague ideas and concrete solutions. With AI prototyping tools, they're even better equipped to overcome blockers and build stronger, more collaborative relationships with other teams.

The New World of Design-Led Implementation

We're now establishing a new operating model:

* Design to make it work
* Develop to make it right
* Collaborate further to make it good

With new AI code generation tools, designers are better positioned than ever to build incredible things. These initial solutions can be the foundation of great features, enabling the rest of the team to build on them and bring them to life.

If designers remain sidelined, teams will keep shipping products that feel disjointed, generic, or frustrating to use. But if design leads without staying grounded in technical and business realities, solutions may end up beautiful but impractical or impossible to bring through to production. This new approach depends on design remaining tightly connected to both business goals and engineering constraints, stepping beyond old silos to work collaboratively from the start. Rather than throwing things over the fence to development and back again, we must establish clear outcomes and follow them all the way to conclusion.

While this is likely only possible for web development teams today, given the limitations of these tools, code generation tools for native apps are probably just around the corner. For now, web teams are living in the future and should look to benefit.

If your team is excited about this future, or you'd like to learn more about what AI can do for your design team, we'd love to hear from you. At Hint Services, we run workshops specifically for design teams and advise clients looking to leverage AI tooling in their organizations. If this resonates with you, please drop us a line!
The podcast returns to discuss the significant advancements in AI image generation and video creation technologies, focusing on Midjourney 6 for its photorealism and stylistic capabilities, DALL·E 3's improvements alongside ChatGPT, and the emergence of Stable Diffusion 3. They highlight the rapid maturation of image generators, mentioning developments in real-time generation and potential applications in dynamic environments like video games.

The conversation also covers advancements in video generation, specifically OpenAI's Sora. They touch on the integration of these technologies with language models, leading to more complex, multimodal AI capabilities, and reflect on the broader implications of these advancements for creativity and productivity, and on the potential for these tools to understand and generate content with a deeper grasp of context and creativity.

Show Links:
Midjourney
Sora
Ring Attention with Blockwise Transformers for Near-Infinite Context
Information article
Della and Ben return to the podcast and get caught up with months of news, bringing together their highlights just in time for the holidays. Stay tuned for an interesting run through the latest major changes that excite and inspire us about AI.

Links:
DALL·E 3
SDXL Turbo
Pika
GPT-4 Turbo
AWS Bedrock
llamafile
twominutepapers $1 GPT game
Ada Customer Experience
OpenAI released a new feature where you can create "your own GPT" experience within ChatGPT. Builders of the new GPTs can adjust ChatGPT to act differently and read from custom documentation, all without needing any coding knowledge. There's also potential to make money off these tools, which adds significantly to the marketability of the feature. However, I struggle to see GPTs as a revolutionary change.

I see GPTs as something similar to a bookmark, or a shortcut to an assistant. The same functionality already exists in ChatGPT; this is just a faster means of achieving the same things. I use bookmarks frequently, and after sitting with GPTs for a little while, I can see their usefulness. I just think the value is limited.

To explore their usage, I built a GPT that helps with writing emails. It templates setting up a chat around what kind of email I'm sending, how that email should be written, and a little context to help craft the letter. This lets me write what I want to say in shorthand, and it gives me back a structure that matches the intended recipient. You can try it out here.

It's a pretty handy GPT, and it has also helped me show what's possible to those less familiar with the ChatGPT experience. The prompts and setup are built in, and you can get right to chatting, which cuts down on the confusion for those new to working with AI interfaces. Unfortunately, GPTs aren't shareable with people who don't already have a ChatGPT Plus subscription, so the user base is limited.

But being accessible to new users isn't the only measure of usefulness. To be highly useful, a GPT needs to deliver a great experience around a specific task, and to do that, it needs to be hard to distract from that task. Say we want critical feedback on our writing, and it responds with something true and helpful that we don't want to hear. If we argue with it, it should hold its ground or navigate the conversation in a way that might convince us. But GPTs can be derailed by user requests and arguments, which means they'll most likely cave to your opinion rather than help you.

This makes AI programming interfaces, like the OpenAI API, much more powerful for crafting excellent experiences. By interpreting user input in a program, each request can be modified so that the AI responds in a direct and intended way. You need programming skills, but the user experience can be significantly better.
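Here's a rough sketch of what that program-level control looks like with OpenAI's Node SDK. The writing-critic persona and the objection-wrapping are invented for illustration; the point is that your code, not the end user, decides what the model sees on every request.

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Unlike a GPT, the system prompt is re-asserted on every request, and we
// can reshape the user's pushback before the model ever sees it.
async function critique(draft: string, objection?: string) {
  const response = await client.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You are a blunt writing critic. Give honest, specific criticism. " +
          "Do not soften or retract feedback just because the user disagrees.",
      },
      { role: "user", content: draft },
      // Wrap any objection so the model treats it as a challenge to weigh,
      // not an instruction to cave to.
      ...(objection
        ? [
            {
              role: "user" as const,
              content: `The user objects: "${objection}". Hold your position unless they offer new evidence.`,
            },
          ]
        : []),
    ],
  });
  return response.choices[0].message.content;
}
```

A GPT exposes none of these seams to its builder; the API exposes all of them.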
One of my most memorable experiences of a stubborn AI came from chatting with Pi. After some conversation, I tried to practice Korean with it. The AI unfortunately believed I was joking around and making up words. I tried to correct it and told it how I was learning Korean with my girlfriend. It laughed at me and couldn't believe I had a girlfriend. (Ouch...) Nothing I could say would derail it from its belief that I was joking with it about any topic.

This experience was unlike anything I'd had with ChatGPT. While the responses weren't following my commands, they did convince me I was speaking with something that had its own agenda outside my own, which was compelling. Comparing that with ChatGPT's unconfident responses to your criticism shows just how much more there is to explore outside a GPT-driven experience.

One other major component of GPTs is the new documentation integration. GPT builders can add documents to be referenced in conversations, improving responses and providing information the AI hasn't been trained on. However, there isn't a lot of control over how the documents are read by the GPT. Users may ask questions about the documents and get back responses that correctly reference a document but don't actually convey the knowledge the document holds. This is because you don't control how the documents are read, in contrast to hand-tuned retrieval systems. We made a YouTube video about this, with more on how documents are tricky to reference in AI systems.
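To make the contrast concrete, here's a bare-bones sketch of a hand-tuned retrieval pipeline against the OpenAI API. Everything here is illustrative (the chunking, the embedding model, the top-3 cutoff), but each of those knobs is exactly the control a GPT's built-in document reading doesn't give you.

```typescript
import OpenAI from "openai";

const client = new OpenAI();

// Plain cosine similarity between two embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Embed the question and every document chunk, rank chunks by similarity,
// and stuff the best ones into the prompt ourselves.
async function answerFromDocs(question: string, chunks: string[]) {
  const { data } = await client.embeddings.create({
    model: "text-embedding-ada-002",
    input: [question, ...chunks],
  });
  const [q, ...docs] = data.map((d) => d.embedding);
  const top = docs
    .map((emb, i) => ({ chunk: chunks[i], score: cosine(q, emb) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 3); // we decide how many chunks, which ones, in what order

  const response = await client.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Answer only from the provided excerpts." },
      {
        role: "user",
        content: `Excerpts:\n${top.map((t) => t.chunk).join("\n---\n")}\n\nQuestion: ${question}`,
      },
    ],
  });
  return response.choices[0].message.content;
}
```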
DALL·E 3 integration into GPTs seems unique and interesting. Combining chat and image generation means you have less control over the images, but the assistant can do a lot to facilitate generation. If we could control a bit more about how documents are referenced, there could be some interesting avenues where GPTs define a style or direction for image generation. Again, users generally have more control when dealing with the programming interfaces directly.

In all, I think GPTs provide a useful shortcut for your usual ChatGPT experience. While the reality of using them is limited, they may provide a helpful introduction for those less familiar with AI. Engineers, programmers, and scientists will likely see the edge cases quickly but may still benefit from some ready-made shortcuts. The experience isn't revolutionary, but it has some usefulness given the right thinking about what's provided.

With the changes at OpenAI this week, I'm assuming your newsfeed is being flooded with speculation and drama. Maybe you've been missing Game of Thrones and that's exactly what you're hoping to see. However, if you'd rather find other sources of information about AI outside the tabloids, then I have a model for you.

This continues last week's post, which covered the first part of this model: the Commentators, Professionals, and Innovators creating AI content. This week's post covers the remaining two groups: Leaders and Misadopters.

Leaders

These are engineers and scientists from industry-leading AI companies who set the direction of AI advancement. Seeking to compete with other companies, these groups look for highly scalable solutions that promote their businesses. The latest major releases from OpenAI this week fall directly into this category.

There are some seriously incredible things being done by the major companies leading AI. But you can't separate the work from the motivations. These companies have a specific mission and may prioritize that mission over exploring the space they're helping to develop. Sometimes this can make their announcements feel disconnected from the real world, or more important than they actually are.

Industry leaders are the best at marketing information about the new technologies being created. The content is entertaining and informative and provides a good introductory view of these concepts. But if you're looking for a more objective measure of the technology, you might not find that view publicized by those with the most to gain.

Leaders to watch:

OpenAI - A clear driver of major change, whose "GPT"-of-things brand has become somewhat synonymous with AI.
GitHub Next - A team at GitHub exploring developer applications of LLM software to determine what is possible.

Misadopters

This group largely focuses on shouting down the hype surrounding AI. While some of the content they create is diligently documented or reviewed, much of it can be opinionated or situational. Like other interest groups, this group is often looking for a place to lead the conversation on technology and to develop a brand.

As with many technologies, people question the need for a new technology and its ability to solve problems better than the "tried and true" way. In its infancy, a technology seems to have many limitations, which can make it seem like we should abandon any further investment. However, those with longer-term vision can see that there is opportunity, despite the advice of Misadopters.

This advice can be well-intentioned, too. A feature may not be well understood, or it may be incorrectly presented on a forum. Teams of engineers may take AI practices and bolt them onto their existing applications; when it doesn't work as intended, they feel they have evidence that "AI doesn't work." These teams instead need to step back and look at their approach to see whether something is missing. But that can be hard when you need to release software on a deadline.

AI today can be a gimmick that gets rudely tacked onto an existing application, as with Notion or Snapchat. The service doesn't feel fully aligned with the mission of the application and doesn't bring something the users were looking for.
Product leaders need to reevaluate their objectives and find places where AI fits more naturally into the product, rather than shoehorning it into existing solutions.

Misadopters are a great place to find negative opinions about AI, and there will continue to be a growing wealth of this type of content on the internet. However, these conversations aren't the most inspirational or pragmatic. As with information produced by any of these groups, be careful to evaluate why it was created and for what purpose.

I try to forget Misadopters exist. While it would be nice to have a few negative points to weigh, I generally don't get excited about shouting down ideas, even when they're wrong. I don't believe AI fits every niche or possibility, but I also don't think it's just hype. My assumption is that you, the reader, can sniff out the kind of content I'm referring to. If not, that's OK too. Start looking for, and second-guessing, broad statements like "AI is this" or "AI will do that." Keep in mind that there is a wider range of possibilities if you know where to look.

Bringing It All Together

Now that we have the definitions out of the way, it's time to decide what kind of content speaks to you. You've probably already checked out one or more of the sources included, but if not, let me try to point you to where you'll be interested.

If you want to discuss AI in general and don't want to get too technical, Commentators are a good source. You'll often get conflicting viewpoints, but that can also be the exciting drama you're looking for.

For news straight from the source, and to better understand the latest updates, Leaders offer a one-stop shop. Most people with some interest in AI keep up to date here, and it's good to stay apprised of the latest announcements coming from this group.

If you're looking to dive deep into one topic or learn something particular about AI, Professionals are the best bet. They've turned teaching a skill into a business, and a lot of people are hungry for your attention and putting real effort into ensuring you succeed, so you can maybe buy a course or two.

If you want to be inspired by the things people are building and to better understand how and why, look to Innovators. These are the people testing the outer limits of AI and trying things nobody has tried before. You'll feel inspired, and maybe a little dumb part of the time. But that's always good motivation to get started building your own thing.
It's easy to show someone something interesting in ChatGPT, create an inspiring image, or talk about huge multi-million-dollar movies being made about AI. While it can be overwhelming, all this content shows just how many directions AI can go in. Nobody knows where this is going or what's going to happen, but many smart people are paying attention.

That said, there's also a lot of noise. You'll see courses being sold and brands being bolstered. Industry leaders show off their tooling to garner support and funding. Scientists share findings aimed at understanding and measuring how AI changes. And a flood of TikToks, tweets, and news stories covers different events with a myriad of opinions and facts.

While reading, I've noticed that I need some sort of categorization: an ability to filter which sources are the most interesting to me. After some careful review, I'd like to share that model with you, in hopes that it helps you better understand where to find what you're looking for. A quick disclaimer before we get started: this model categorizes things in an approachable way that oversimplifies in many cases. Models just work that way. They're approximations of a complex universe, and they help rationalize things that are difficult to grasp.

The Five Sources of AI Content

Commentators

This group is largely focused on opinion and perspective. They're storytellers discussing the topics surrounding AI. They can be people who are just interested, or "true believers" who think the world has changed and will change completely because of this new technology. While Commentators don't look to profit from the discussion directly, it may be related to their work as an influencer or to producing other media.

I find that a lot of the discussion coming out of this group pivots around a few details without much deeper context. If you want emotive stories that make you feel something, this is a great place to find them. If you're looking for a more comprehensive, accurate, or deeper understanding of AI, you likely won't find it here.

Commentators to look for:

Stratechery - A publication we frequently reference that hosts guests with a wide range of experience and expertise. While Ben Thompson has expert advice, he isn't directly in the industry and doesn't get paid for AI work directly.

Professionals

This group is largely focused on offering a service or benefiting financially from AI. That could be selling a course, investing in a business for a great return, or creating a news outlet focused on some aspect of AI. While there can be deeper discussion about how to leverage AI, they're largely interested in short-term opportunities over pursuing a long-term vision.

I find that Professionals focus on a single track of thinking or experience. While they may connect AI with another discipline, it's usually along a single line of thinking, for example, "AI art to create game assets." This limits the conversation to a specific marketable direction and can feel inauthentic to those looking for a deeper discussion. What's more, the products built by this group are often mostly marketing and don't hold up to closer inspection. There's An AI For That is filled with demo applications sold as if they fully solve real problems. Be careful what a Professional promises.

Scientists also fall into this category. Great papers have a specific hypothesis in mind, and great thinkers look for recognition for their ideas.
While the science being done can fuel interesting conversations, findings, and ideas, it largely doesn't exist beyond a theoretical space until engineering applies it.

Professionals are a great source for learning something specific or understanding how to profit from AI in the short term. The motivations are obvious, but this space can still catch you off guard because of how interesting some of it sounds. If you're excited about building something or want to see a bigger vision, Professionals might not have what you're looking for.

Professionals to read:

Everyday AI - Aimed at a general audience; looks to understand AI trends and communicate about AI. I've found it a balancing perspective to my engineering background.

Lilian Weng - A really fantastic blog, diving deep into the engineering and science to pull out the details. With mostly raw facts, it's up to the reader to put together the bigger picture. One of my favourite blogs on the topic of AI.

LangChain - A major aggregator of AI technologies into a pipeline for engineering systems. With many partnerships, they have largely transitioned from Innovator content to Professional content aimed at being a one-stop platform. They still aggregate interesting ideas, but much of the content has shifted.

Innovators

This group is made up of hobbyists, engineers, and creatives who look at AI and think about what it can enable. They're less concerned with profiting from AI and more interested in being part of the movement. They have enough experience across multiple disciplines to focus on interesting problems that others might not be able to see.

Rather than focusing on the scientific aspects of AI, this group is pragmatic about creating new features and experiences from these tools. They reject the common understanding of what a tool can do and look for a deeper meaning. Generative AI isn't just a tool for generating art or content; it's a tool for generating rational responses to input. The distinction matters to this group, and it helps them open up a much wider pool of opportunities.

This group is great at looking into the gray areas of a problem and seeing things for what they are. Innovators offer a space to build something interesting and discuss interesting ideas. For strong opinions that follow a well-formed interest group, you'll need to look elsewhere. You're in the right place for this kind of content already! Other Innovators to follow:

Maggie Appleton - One of the people who focus on design and craft over profitability, largely due to her work in academia. She regularly collects interesting findings and posts a lot of interesting ideas and thoughts on her blog.

Percy Liang - A professor at Stanford with his name on some of the most notable papers in AI and LLM research. He helped author my favourite paper on AI agents. His system building and analysis help shape what's possible through applied experimentation.

Part 2

There are too many sources to cover in a single post, so we'll cover the last two groups in a follow-up next week. Hopefully there won't be any major announcements to disrupt that plan :)
As I board a plane on a trip to Korea, I can't help but think how little AI helped to prepare me for this trip. I researched destinations on travel sites and with Google Maps. Purchasing was straightforward on booking sites where I was able to use my points. Based on previous trips, I created a list on my iPad of everything I needed to pack. Thinking back on it, I can't help wondering: as someone with such a big focus on AI, why don't I use these tools more?
To start, it's important to think about why we use tools. We pick tools depending on the job we're looking to complete. If you're tired, you might look to make coffee or tea. Hungry? Maybe it's time for a burger or some ramen. Common tools help us solve the issues we experience each day. AI tools like ChatGPT are just another tool to use.
The choice of tool is at times personal but is usually determined by what we know a tool can or can't do. I am not usually interested in using a shoe as a hammer (unless I'm missing a hammer). So the fact that I'm not using AI tools for my trip means that I don't find them very useful for the problem I have and that I have better tools. Or, at least, I perceive that I have better tools.
But what problem am I trying to solve? My biggest challenge is filling in the details of the trip. I know where I'm flying, I know what hotels I'll be staying in, and I know a few places I want to go. I'm pretty decisive when I make the big decisions for this trip, but the smaller details are harder for me. What train do I take to get between cities, what's a good spot for dinner each day, or what would be a good gift to get my friends?
The small details are where I need help. I'd like this help to be based on my preferences too. General help on the small stuff gets you stuck in a tourist-infested restaurant with a waiting line around the block. I need an itinerary manager to fill in the gaps in my knowledge and research and give me recommendations.
I don't find ChatGPT to be a very good itinerary manager. While the ability to converse and problem-solve is incredible, it still needs to get better to fill the role. While there might be other options besides ChatGPT, the travel options on There's An AI For That are disappointing and too commercial. They don't fill the role I have in mind either.
An itinerary manager needs to have good access to verified information, adjust its recommendations to my preferences, and keep track of key pieces of information.
Verified travel information online in countries outside North America is difficult to find. While "every company has become an internet company" in the States, much of the rest of the world is still playing catch-up. In Japan, software is considered a pretty boring career, and much more emphasis is put on hardware development. This results in apps and websites that don't offer the same level of service you might expect in Western countries. Google Maps still struggles to understand dense areas of many Asian cities.
Further complicating things, aggregation websites can be completely different across regions. While Google is popular with English speakers, KakaoTalk is the go-to in Korea. Unless you can read Korean, it's likely that any of the information there is going to be inaccessible, even with the amazing photo translation available in Google Translate.
A good itinerary manager needs to be able to combine these different sources of information to come up with a good plan.
If you're looking for better data sources, reviews and peer-to-peer forums like Reddit are often the most accurate. But this information is hard to get at without extra effort. Someone looking to learn more about any location will need to sort through relevant information that is quite old or recent information that isn't relevant. Combing through the comments requires research, either by a person or by an AI.
ChatGPT's Bing Search isn't really able to find most of this information. While ChatGPT Plugins can provide some travel integrations to make things a little easier, the ones that currently exist for travel are thinly veiled service offerings for renting cars or booking flights.
You can paste in data that you find elsewhere on the internet, but you may find that the results ChatGPT can glean from your research are too simplified to help in many cases. Hallucinations based on partial answers cause even more trouble.
The most popular AI tool, ChatGPT, remains untrustworthy for the task of finding accurate data. But say we were okay with doing some research on our end; we'd still like to have ChatGPT give us some suggestions that are relevant to us. For that we need personalization: the ability for our itinerary manager to remember what kind of places we like, how we like to get places, what we should bring with us, and what we're willing to spend money on.
ChatGPT is pretty good at taking feedback. If we ask for a suggestion of activities to do in a city and then ask for specifically indoor activities, it will take that feedback and iterate on it. But the number of add-on suggestions it can remember is limited. Keep asking it to tweak things and it may forget earlier tweaks you asked it to remember.
Researchers have discussed the limited memory capabilities of ChatGPT and other AI models. Even after extensions to the total size of memory and text they can work with, there are still many problems remembering details. It's believed that training data is to blame, and companies are already working on training data that works with larger text and keeps more context in memory. For now, the current workaround for memory problems is to put the most important information at the beginning and end of any prompt. ChatGPT manages how prompts incorporate the history of the conversation and may inject its own content into these highly coveted positions. This limits the control power users have and can make the AI more forgetful than directly interfacing with GPT-4.
Because you're working with a generalized chat application when you use ChatGPT rather than GPT-4 directly, the underlying prompts and algorithms do some work to remember things that might be important for future prompts. You don't control what is remembered. While planning a trip, it may just completely forget important details that could cause chaos for your trip plans.
Until memory problems are resolved in a generalizable way, ChatGPT remains an unreliable, inaccurate, and forgetful itinerary manager. While I'm trying to plan a trip, you can extrapolate this example into many other fields and roles. It highlights just how limited AI applications are today.
The story of AI is a compelling one. We're seeing incredible things happen all of the time. But it might take ten years or more before we can get this technology to do more than generate art and writing.
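To make the beginning-and-end workaround mentioned above concrete, here's a minimal sketch of how you might assemble a prompt yourself when calling a model directly. The trip details and helper name are made-up placeholders, not a recipe from any particular tool:

```python
# A minimal sketch of the "important information first and last" workaround.
# All trip details here are hypothetical placeholders.

KEY_FACTS = (
    "Trip: Seoul and Busan, 10 days in October. "
    "Preferences: quiet local restaurants, trains over buses, mid-range budget."
)

def build_prompt(research_notes: str, question: str) -> str:
    """Place the details the model must not forget at the start and end,
    reserving the long, lossy middle for pasted research."""
    return (
        f"Key trip details (do not forget these): {KEY_FACTS}\n\n"
        f"Research notes:\n{research_notes}\n\n"
        f"Reminder of the key trip details: {KEY_FACTS}\n"
        f"Question: {question}"
    )

print(build_prompt("...pasted Reddit threads and reviews...",
                   "Suggest a dinner spot near Seoul Station."))
```

Nothing about this guarantees recall, but it stacks the odds by keeping the must-know facts out of the middle of the prompt.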
After a bit of relaxation on my trip, I'm excited and compelled to keep looking at how close we're getting. Catch you on the other side!
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
We're still in the early adopter phase of AI. While many companies have adopted AI to please shareholders, excite users, or encourage internal developers, there are still a lot of details to figure out. The well-known challenges are the difficulty of aligning AI tooling with human interests and keeping user data secure. But the bigger challenge for businesses is finding a consistent, reliable, and controllable AI environment to build upon.
This week, OpenAI released DALL·E 3, an incredible new set of image generation capabilities deeply integrated with its revolutionary ChatGPT interface. You can discuss the generated images and give feedback that adds a rich set of details to the images. While we haven't been able to play with the tool directly yet, it's a paradigm shift that's very exciting.
Using a chat client to generate images, users won't need to worry about prompting and can instead rely on the AI model to create prompts that follow their intentions and feedback. This is something no other platform currently has, and it's a big step forward for user experience. ChatGPT has become an art assistant.
This is incredible for users, but what does it look like for businesses? Businesses might look to use DALL·E 3 the way they were looking at Midjourney for the creation of AI artwork. The big benefit of DALL·E is that there will be an API to directly ask for images, rather than needing to use a Discord client to request them. This makes it way easier to get images and reduces the manual time required to produce and organize them.
But if we look at DALL·E 2, businesses should be wary. DALL·E 2 was updated and changed dramatically over the months after its initial release. Styles and outputs can shift, so what once reliably produced a branded style may no longer look the same. Imagine if a business's brand design completely changed whenever another company changed its software! This is a big issue for businesses wanting to rely on this tooling for their needs.
ChatGPT has this issue too, and most customers are completely unaware of it. The problem gets much more obvious with images. When you've built something cool on top of these tools, you expect it'll keep working the way you built it. For your average user, it isn't such a big deal if things change, but it is an issue for businesses.
This isn't unique to OpenAI; it's a trend in Big Tech. Businesses constantly need to play catch-up with Google's SEO rules whenever they change. If they don't stay up to date, they get fewer customers coming to their site. Google makes these changes to help progress the development of the web in a number of ways that are beneficial for users. But you can't quite ignore that Google makes this happen through threats and algorithmic brute force against businesses. Businesses ultimately pay the bill for these changes.
OpenAI similarly forces businesses to keep up to date. This is in part because of their initiative to align AI with human interests. Older models may be problematic and are regularly retrained to be more equitable and socially responsible in their responses.
OpenAI is also focused on research, and that drives these constant updates. After all, they release these models at a cheap rate so that they can get more user data for training. If they don't keep most users on the latest versions, they'll miss out on valuable data.
While ChatGPT does have versions, those versions eventually get deprecated, usually with only a few months' notice.
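For API users, the closest thing to stability today is pinning a dated model snapshot rather than a floating alias. Here's a minimal sketch with the OpenAI Python client; the snapshot name is illustrative, and pinning only postpones the problem until that snapshot is itself deprecated:

```python
# A sketch of pinning a dated model snapshot instead of a floating alias.
# The snapshot name below is illustrative; check the provider's current list.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-0613",  # dated snapshot: behaviour is frozen until deprecation
    # model="gpt-4",     # floating alias: behaviour can change underneath you
    messages=[{"role": "user", "content": "Describe our brand's image style."}],
)
print(response.choices[0].message.content)
```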
Whether you're forced to update today or in a few months, the result is still the same: businesses need to operate on OpenAI's schedule, not their own.
Fortunately, there are a number of open source alternatives being developed and expanded by other companies. Meta is very bold about releasing its latest tooling for businesses to use. This comes from its strategy of directing the market toward its tools so it can benefit from open source contributions and from the cheaper talent pipeline created when people outside the company adopt its tooling.
Businesses need to think about how the tooling they use will develop and whether they'll get locked into a high-cost vendor. Imagine paying for electricity, and the electric company requires you to rewire your office every few months. Maybe the electricity gets a little cheaper, but it's not likely to offset the cost for many years. That's the kind of vendor some of these big tech players can be, and businesses need to be wary.
The future holds a balancing act for all of us. We need tools that give us more control, flexibility, reliability, and consistency to get complex work done. AI gives us a compelling case to solve many of these challenges. But the tools available to companies still have a long way to go before they're ready, and you don't want to have to pay the cost of their research and exploration. Reach out to experts familiar with AI before committing to a platform or service.
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
Welcome, and thanks for reading Stable Discussion! We appreciate the support we've received and are honoured that you're tuning in roughly every week to see what we've been learning and discussing!
As we quickly started sharing what we've been thinking, we may have missed introducing ourselves. This publication, Stable Discussion, shares expert and approachable insights about modern artificial intelligence. We post articles and discuss these exciting changes on our podcast. We aim to make our content approachable and fun while helping to spread ideas and share our perspectives.
Our hosts, Ben Hofferber and Abdella Ali, are both experienced consultants and engineers focused on web and mobile customer experience. We've built services and websites for many Fortune 500 companies and have seen how difficult it can be to implement cutting-edge technical solutions in real-world environments. We use our experience to take a critical look at AI and how companies and people are looking to use it.
This excitement and learning encouraged us to found Hint Services. We help advise and consult with companies looking to create user experiences leveraging AI. To get a taste of what we do, look at the AI UI/UX patterns page, where we highlight the patterns we're currently excited about. We believe that generative AI enables much more than just creating chatbots.
Building AI tools and sharing our discussions and ideas with you, our audience, helps us better understand what AI is really changing. We're always trying new tools and testing them here to see what everyone thinks. This gives us a unique perspective that we're excited to keep sharing and developing with the larger community.
Thank you again for reading and listening. You'll be hearing from us again very soon!
Our Discord Community
We've recently launched a Discord community for AI enthusiasts. It's a paid community requiring an application process. We're looking to connect with others excited about building user interfaces and experiences using LLM generation and other AI tools.
Please apply if you're interested! Subbb.me/stablediscussion
Content to Get Started
Below is some of the content we're most proud of. It's been an amazing 8 months for us, and as we reflect, these are some of the high points that most stand out.
This has been our most popular and trending post so far: a great introduction to thinking about what AI really is that helps to defuse some of the hype and ground yourself in the really interesting capabilities on display in the world today.
This post is one of our favourites for discussing how creatives look to use AI. We have a few follow-up posts, like The Mental and Creative Cost of GPT, but this is an excellent introduction to our ideas around AI art and creativity.
If this post wasn't enough of an introduction, check out our podcast. On the latest episode, we share what we've been up to recently and help round out this introduction with some more of our focuses and ideas.
This post helps shed light on the strange realities of AI and how they work today. It's surreal that these AI models act so human in ways that keep them from being our familiar cold and calculating computers. For a shift in perspective, we think this helps you better understand them.
This post got our parents to try out AI and ask us many questions about how they could do new things! I can't imagine a better endorsement for an article. If you haven't found an excellent way to use AI or are still learning, check it out.
Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
The world is currently in an AI craze over new products like ChatGPT that excite us about the future of AI. These chatbots are incredibly human in their responses to our questions. I've replaced asking Siri with GPT and Pi when I'm curious about a topic or an idea. That intelligence is definitely part of the "I" in AI, but it can be hard to pinpoint what AI really is.
Simply put, artificial intelligence is when a machine takes input data or information and uses it to respond with an action. Think about your home thermostat. Even the simplest models will warm the room when it's too cold and cool it down when it's too warm. The thermostat AI uses sensor data to gauge the temperature and then responds by activating the furnace or air conditioning.
AI isn't the specific pieces of the thermostat; it's the whole machine's function. AI isn't just the measurement of the temperature; that's just sensor data. AI also isn't a generated report of the temperature of the house over a given period of time. The system that tells the report to trigger might be, but the generation of the report isn't AI; it's just raw data. Taking sensor data and triggering heating and cooling is AI.
Using video games as an example, enemies often have really simplistic AI. They might just follow the player around using basic algorithms. The most basic is a decision tree, like one that might be used for a checkers AI. A decision tree is just a set of actions that can be taken based on what an analysis of the board finds. If a piece can be taken, take it; otherwise, make the next available move.
Methods or sensors that AIs use aren't AI themselves. If you think about image recognition, the AI takes image data to get details about what is in the photo, uses training on past images to determine which details are in the scene, and then reports which objects it found. The trained model isn't an AI, the report isn't an AI, but the system working together creates an AI.
I find these details really helpful when talking about AI because it becomes evident that AIs have been around us for a long time. Many of our everyday tasks involve AI to make our lives easier. But if AI isn't new, why is everyone talking about it?
The newest set of AIs has been built using the Transformer method of building AI models. This new practice creates powerful and broadly intelligent systems that can handle many scenarios without complex programming. This means people can more easily create and leverage AI to solve problems.
These transformer-based AIs can do many of the things I listed above. You could give them parameters to control the temperature in your apartment. You could have them suggest moves when playing checkers. They can also discuss politics, debate company strategy, or give you a recipe for banana bread. The breadth of these capabilities enables much broader adoption of AI into services and applications.
But because of the breadth of potential for these new AIs, they aren't necessarily better at the simple stuff we have existing tools for. That's why most applications built upon transformers add chatbots or conversational AI. The breadth of a conversation best matches the capabilities of this new technology. It's only the beginning, and I think we'll see chatbots in more places than probably makes sense as a result.
This phase is similar to when everyone thought we needed an app for everything. iPhones enabled so many new and incredible things to be done. But, over time, websites could do many things apps could do.
Now, a lot of the craze around building apps has cooled down. But AI is just heating up.
AI isn't new, and in many places, like your home thermostat, the way it's used can be pretty boring. The belief that AI will change everything is probably right, not because of AI itself, but because of what we do with AI. AI isn't just one aspect of ChatGPT; it's the ability for computers to do "smart things."
If you want to find some examples of interesting applications for AI, you can visit our UI/UX patterns page on Stable Discussion. We've just started adding examples and will continue to add new ones over time.
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
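To ground this post's definition, sense input, decide, act, here's a minimal sketch of the thermostat example as a complete loop. The thresholds and readings are made up for illustration:

```python
# A minimal sketch of the thermostat example: the "AI" is the whole
# sense-decide-act loop, not the sensor reading or any single report.
# Thresholds and readings below are hypothetical.

TOO_COLD = 19.0  # degrees Celsius
TOO_WARM = 24.0

def thermostat_step(temperature: float) -> str:
    """Take sensor data in, respond with an action."""
    if temperature < TOO_COLD:
        return "furnace_on"
    if temperature > TOO_WARM:
        return "air_conditioning_on"
    return "idle"

for reading in [17.5, 21.0, 26.3]:  # simulated sensor data
    print(reading, "->", thermostat_step(reading))
```

The sensor value on its own is just data, and the printed log is just a report; the system that connects the two is the AI.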
Today's post is a guest post by friend of the show, Michael Fraser. Michael is an AI Evangelist and Innovation Strategist at Action Insight. He posts regularly on The Local Model, where you can find more of his insights and work using generative models to change how companies think about art, workflows, and their future.
Being a leader means making decisions. Lots of them. And each decision is fueled by information. But there's so much information coming at us that it can be overwhelming and cause us to miss opportunities or fail to identify problems. The solution? Tools that can help us quickly transform our messy information into clear, easy-to-understand insights. It's all about making better decisions, faster.
Keep reading to find out how you can use simple approaches to get big results out of AI & GPT-4.
GPT can apply any framework to your problem
GPT-4 is an adaptable AI model, capable of learning and applying a wide range of frameworks to various problems. The beauty of this model lies in its flexibility. No matter what framework you choose to use - whether it's a widely known one or a unique system you've devised - if you can clearly explain its structure and rules, GPT-4 can implement it. This allows for comprehensive, diverse problem-solving possibilities, catering to the specific needs and nuances of different challenges. Just remember, the key is clear and concise instruction. With that, GPT-4 can become a dynamic tool in your problem-solving arsenal.
Here are three examples to illustrate:
* Facts and Logic: This approach involves separating objective data (facts) from the author's interpretation or reasoning (logic) in a given text.
* Edward De Bono's Six Thinking Hats: This method applies different 'hats' or perspectives (white, red, black, yellow, green, blue) to analyze an issue comprehensively.
* Structured Data: This framework involves turning unstructured information into a structured format like a spreadsheet, useful for data analysis and organization.
Facts and Logic Framework
Good writing often intertwines facts and logic to make compelling and persuasive arguments. As a leader, sometimes you need a clear separation to think through the matter carefully. Just ask GPT to separate them out for you.
First, let's generate an article. Then, let's ask GPT to transform the same article into facts and logic.
The result may not be as compelling to read, but it's much better at helping us clearly distinguish what's going on in the argument. Remember: you can do this with any piece of text you can copy into GPT - news, research, reports - anything.
Edward De Bono's Six Thinking Hats
Exploring different perspectives is a powerful way to overcome bias, ensuring we consider all angles before drawing conclusions or making decisions. However, even the most experienced leaders can find this challenging. That's where GPT-4 shines, with its ability to adopt diverse points of view based on the context it's given.
A particularly powerful tool for embracing these multiple perspectives is Edward De Bono's six thinking hats framework. My partner at Action Insight, Kate Bowers, suggested this approach, and it has proven extremely effective. The beauty of using this model with GPT-4 is its ease of application.
GPT-4 already understands the six thinking hats and their associated perspectives, allowing you to tap into the breadth of insights this framework offers at will.
By invoking the different 'hats', you can guide GPT-4 to scrutinize a problem from all possible angles, thus helping to combat inherent biases and promoting a more balanced understanding. In short, this tool provides a shortcut to comprehensive, insightful analysis, setting the stage for well-informed, holistic decision-making.
Let's ask GPT to wear the blue hat and give some advice on this post.
Organize your data into a table or spreadsheet
One of the most powerful ways to transform your data is to have GPT take a long, rambling block of text (e.g. a transcript or customer call) and break it down into a structured table with whichever columns you want. Now you can quickly review all the information that was provided in a way that is clear and can support decision making going forward.
Here's a simple example, recording some details about some children's books.
This is a very powerful tool for collecting and capturing data that would otherwise be lost, and once you understand this capability, you might start seeing a whole world of lost data in your organization. Get collecting!
AI is for leaders
More than any technology in recent memory, AI is a technology that directly benefits you as a leader, no matter what field you are in. It helps you organize your information, transform it into any presentation that suits your purposes, and ultimately leads you to better decisions.
There's no better time than now to get started.
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our work and those in our community. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
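As a footnote to this guest post, here's a minimal sketch of the table-extraction idea using the OpenAI Python client. The transcript, columns, and model choice are hypothetical; you'd adapt the prompt to your own data:

```python
# A sketch of asking GPT-4 to turn rambling notes into a structured table.
# The transcript and column choices here are hypothetical examples.
from openai import OpenAI

client = OpenAI()

transcript = (
    "Customer called about the Atlas plan, wants an invoice copy, "
    "mentioned the dashboard felt slow on Tuesdays, will renew in March."
)

prompt = (
    "Break the following call notes into a markdown table with the columns: "
    "Topic | Detail | Follow-up needed (yes/no).\n\n" + transcript
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```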
AI is at its best when it removes boring tasks and lets us lean into the work we enjoy. Imagine being an athlete who only has to compete. They never have to practice or train; they get to focus on the work they've trained so hard for and make adjustments to the training done on their behalf. It promotes them to a coach who still gets to perform. AI elevates our role in the work we want to do.
Tedious tasks are "taxes" that often limit our interest in doing our work. Whether that's the paperwork a doctor fills out or building test suites for software, every profession generally has some tasks that make the work feel less enjoyable. These realities of the job limit our expectations as well as our motivation. We do fewer of the activities we want to do because there are tasks that actively block our ability to appreciate that work.
Progress Drives Motivation
Often we're very receptive to seeing something begin to come together. A project feels real for the first time when we can start to see what we have in mind. Painters get this when the composition is outlined for the first time, and they can begin to see where paint can start highlighting aspects of the work. Progress is a positive force to push our motivation.
AI offers an incredible opportunity to see progress early. Josh Larson mentioned in an interview for Midjourney Magazine that "… there's an entire possibility space of new art and design that can only be reached if we curate after the work is finished, rather than before. The only way to unlock that space is to create incredibly quickly." Larson emphasizes that by providing artists with the means to visualize their concepts in a near-complete state sooner, they could be encouraged to pursue more daring ideas, even those they initially believed might not succeed. Rapid creation is the key to unlocking this untapped potential.
That's not to say artists will use AI-generated art as a completed art piece. While Midjourney does an incredible job of arriving close to what an artist had in mind, the artistic vision requires much more specificity than AI can currently remember and act upon. Without that curation and refinement, we're left with a ghost of the artist's original vision and something that lacks a soul.
Over the last week, I've been working with a few friends to build a game map for a game set in Greece. I've dabbled in 3D game design and built a few games before but haven't tried anything as ambitious as what they're doing. As I worked, I realized that AI made creating game maps much easier.
I'm using Unity to build out the maps, which already gives me a lot of great systems to use. But my work got a huge lift when I used Midjourney's tile prompts and a few web tools to generate textures that give a real feel of place to the world. Using very simple blocking for buildings and a few free assets, I was able to pull together something that helped motivate the whole team!
Speeding up Tasks
A new video game, Baldur's Gate 3, is releasing this week with over 170 hours of cinematics. That's twice the length of Game of Thrones. While this is an incredible undertaking, I wonder how much content was hand-crafted. Could Baldur's Gate 3 have benefited from using AI for portions of its massive set of content?
We've seen some incredible tools built for existing games that illuminate how AI might help. World of Warcraft is a Massively Multiplayer Online Role Playing Game (MMORPG) released in 2004 with an immersive world and some well-written quests.
Unfortunately, that writing is usually ignored by the player base, who prefer to grind through the quests as fast as possible. Their avoidance of the plot is partially because the writing is displayed as large "walls" of text that the player must read.
A plugin for the game, VoiceOver, has fixed this problem by using AI to sample the few existing voice lines in the game and generate fully voiced quests. Coupled with a few other modifications to improve the interface, this game from 2004 now has an experience that feels like games created at least a decade later. The voices are incredible and give the game a new level of depth.
AI-generated voices could be a game changer for voice actors, who wouldn't need to sit in a studio for so many hours and could focus, instead, on the high points of the character: parts of the script that might need more attention. Many tedious lines can be auto-generated. AI can also help teams change the dialogue after recording to fit the setting better if something isn't working. They can do this without dragging the artist back into the studio.
We shouldn't overlook the financial impact of AI generation on voice artists. Since we can use an artist's voice to say a lot of new lines, there should be ways to compensate them for that work and allow them to maintain some form of artistic control over the words spoken. These conversations are currently taking place during the Hollywood strikes but will likely be ongoing in the months and years ahead.
We've been using AI voices on our publication here to voice each post we publish and put that reading up on our podcast. I trained this AI with my voice, and while it doesn't sound quite like me, it's close enough for our purposes here.
If you are listening to this post: Hello! You're listening to an AI! If you're reading, you can see a way to listen to the voiceover for this post at the top of this article. An added benefit of having these posts read by an AI is that it surfaces spelling and grammatical errors. Having something read aloud to you exactly as written reveals editing issues you can't catch many other ways. These voices are a tremendous value add for us because we don't have a lot of resources to dedicate to our work, and we can reach an entirely new audience with voiceovers, one that written articles alone may never reach.
Tedium, Meet AI
AIs don't get tired, but we must direct them toward the right tasks. Given the right direction, they can help us stay motivated by letting us see our efforts come together more quickly. This motivation allows us to put the best of ourselves into our work and keeps our energies focused on the parts of the creative process that push us.
The world is changing, and while we can imagine many futures, it's exciting to see how people already use AIs today to create incredible things. William Gibson has one of my favourite quotes, where he says: "The future is already here; it's just unevenly distributed." On Stable Discussion, we're helping to distribute that future, and we're excited to see how AI will continue to shape our world in the coming years. Stay tuned for more findings and perspectives on how AI works today and where we see opportunities for humans to use it to improve our lives.
Thanks for reading Stable Discussion! Subscribe to receive new posts and support our work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
You're late to class, didn't study, and must sit through a lecture on advanced mathematics (or compliance training in an office, whichever you find most boring). You forgot your notebook and end up without any lecture notes. Now, an hour later, you're doing homework and trying to remember what you learned.
You remember things from the beginning of the lecture. Topics connect to your previous learning quite easily. But then things get a bit murky... You've got something written down, but it looks wrong. You remember that, at the end of the lecture, there were a few ways to check your results. Running through these checks, you can confirm that you're wrong, but you don't know why...
Without notes to reference, you're a bit lost. This is exactly how AIs feel when given too much data and no way to take notes!
The Challenge of Noteless AI
In a recent paper, Lost in the Middle: How Language Models Use Long Contexts, researchers found that when ChatGPT and other AIs are given a lot of text, they tend to forget or misremember some of it: specifically, the information in the middle of the text. As in our lecture, the AIs are less certain about information in the middle of any given data. It's uncanny that AIs continue to act so human.
To help explain how this happened, the paper points to AI training. AIs haven't been trained on larger chunks of information paired with accurate reflections on that information. Instead, they have been trained on shorter sections of text, often derived from how humans respond to information.
Where could we get this training data? Perfect recall isn't something we expect until we think about a machine "learning to read." Then we want exact results. We want machines to do the drudgery of figuring everything out and just tell us what we need to know or what we might have missed in our own reading. We don't want machines to read how we read or act like us!
But, hey. This is science, not science fiction. Because humans don't act like perfect machines, we unfortunately don't have anything to model these machines after. It's becoming more obvious that, because these AIs are trained by humans, they have even copied our faults.
How would we help a human get over these faults? How would you keep track of key pieces of information when interacting with a large and complex set of information?
Taking a Cue from Human Memory
In another Percy Liang paper, Generative Agents: Interactive Simulacra of Human Behavior, the research team simulated a world full of AI-driven people. They woke up, ate, worked, socialized, planned, and slept.
A major challenge when creating this simulation was getting the citizens of this experiment to think like people. To solve this, the team constructed a complex set of memory systems. Short-term memory helped them remember where important items were, track immediate needs, and engage in conversations. Long-term memory was used to plan and remember important facts (among other things). These enabled the AI people in this simulation to act like people and even plan an ad hoc Valentine's Day party with others.
It's a fascinating paper, and I recommend you read it.
The AIs were able to organize and stay on task when they were given some place to keep their memories stored. The memories tracked are, at a high level, summaries of what was happening around them. These memories are essentially notes on their environment!
Getting AI to Remember with Notes
This all leads us to some creative thinking about how to use AI to solve problems.
While we'd love to treat AIs like machines, we need to assist their ability to remember and provide them with notes that direct their actions. Helping them take notes is essential to developing good AI workflows.
When we interact with ChatGPT, we're usually already keeping track of what the AI is thinking. It's using us as its long-term memory, and we can remind it when it forgets important information. In part, this is why chat interfaces are beneficial for this era of AI: they can sort of rely on us to perform well.
However, say we want an AI to read a large section of text and help build a summary of that text. Then we have a more challenging role to fill, because the AI is going to misremember or forget information. We need to give the AI note-taking capabilities.
If we split the text into smaller chunks, we can ask the AI to give us short summaries of each chunk. When we're finished, the AI can combine those pieces together to summarize what was covered. Just be careful that the pieces aren't too large when combined; otherwise, we're right back where we were, and the AI is being forgetful again.
Helping AI build plans isn't new, and there's already been a lot of research done to understand how to get AI to follow plans that help them accomplish tasks. A popular tool for building a plan is Chain-of-Thought (CoT) Prompting. We ask the AI to write out a plan for the steps to take, which it will try to follow:
Question: Marty has 100 centimeters of ribbon that he must cut into 4 equal parts. Each of the cut parts must be divided into 5 equal parts. How long will each final cut be?
Answer: Let's think step by step.
Give this a try in ChatGPT, and you'll get back a sequence of steps and, usually, a pretty accurate result. We can leverage this pattern to get the AI to make plans for itself which can be referred back to later. These plans help the AI better organize its approach to solving problems.
The Path Ahead
While this only covers a few methods to help AI handle larger sets of text, many other methods exist and continue to be explored. While many believed that providing AI with a larger "context window" (the ability to be given more text) would solve many limitations, it seems that those limitations continue to exist. At this point, I won't use the longer context windows of these models until these memory problems are resolved with new modeling or better training data.
For the time being, the AI development community is going to be innovating around these limitations to engineer systems and solutions rather than just relying on LLMs to solve these problems. If you're doing something interesting to help AI better understand and remember larger sections of text, we'd love to hear about it!
We'll continue to post interesting ideas about how to use AI here and on our main site, Stable Discussion. Keep checking back to stay updated on our latest thinking around these ideas!
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
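Here's a minimal sketch of the chunk-and-summarize idea described in this post. The chunk size is arbitrary, and the model call shown uses the OpenAI Python client as a stand-in for whatever model you'd actually use:

```python
# A sketch of the chunk-then-combine summarization described in this post.
# Chunk size is arbitrary; tune it so no single request is "lost in the middle".
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def summarize_long_text(text: str, chunk_chars: int = 4000) -> str:
    chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
    # Take "notes" on each chunk individually...
    notes = [ask("Summarize this passage in 3 sentences:\n" + c) for c in chunks]
    # ...then combine the notes, which are short enough to fit comfortably.
    return ask("Combine these notes into one summary:\n" + "\n".join(notes))
```

If the combined notes themselves grow too long, the same trick applies recursively: summarize the notes in batches before the final pass.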
Follow us on Substack! https://blog.stablediscussion.com/
Stable Discussion is now available at https://stablediscussion.com. We've posted new UI/UX examples and patterns for integrating AI into a wide variety of products.
Show Notes
* Live demo of Code Interpreter added to OpenAI (our dataset). Great examples: https://twitter.com/emollick
* AI Innovators & Devs Meetup in Toronto https://www.meetup.com/toronto-artificial-intelligence-meetup-group/
* Sponsored by Elevate (who hosted OpenAI when they came to Toronto)
* Organized by friend of the show, Dan Waldie
* Interview by AI App: https://www.producthunt.com/products/interviews-by-ai
* Midjourney 5.2 dropped (allowing zooming out and panning), and Ben shares a Discord trick for Midjourney
* Google DeepMind - https://www.theverge.com/23778745/demis-hassabis-google-deepmind-ai-alphafold-risks
* Superintelligence and how to wrap your head around the discussions taking place: https://openai.com/blog/introducing-superalignment
* Pi Chat App - https://heypi.com/talk
Get full access to Stable Discussion at blog.stablediscussion.com/subscribe
In our discussions with many different people, how to get started has come up quite a bit. Many of these people aren't working in technology and haven't programmed before. When they try to use AI, it doesn't work for them, and the results aren't all that helpful. This gave us the idea of putting together a short guide for anyone who identifies with these problems and would like some helpful pointers to get started.
Getting Set Up
I think the easiest place to start is by getting an account set up to experiment with ChatGPT. You can go to their site, create an account, and be ready in just a few minutes. They also have a mobile app that you can download, which may be easier if you don't usually use a computer or laptop.
These same ideas will carry through to other systems like Midjourney, DALL·E, or Bard. We just find ChatGPT is one of the easier interfaces to use when getting started.
For our examples, we're going to get ChatGPT to write a letter to our friend that updates them on the latest news in our life and breaks the bad news, as kindly as possible, that we won't be able to visit this summer. I highly recommend you follow along in ChatGPT.
Don't Look for A Single Right Answer
The most common challenge we find people struggle with is looking for the "right answer" to using AI. If you did well in school or took standardized tests, you might be geared toward finding one correct answer for how you should use AI. Unfortunately, there isn't just one correct answer. Many different questions can get you the same result. It's similar to asking your friend a question: they'll likely give you the same response even if you phrase the question a few different ways.
Instead, using AI is a lot about asking your question in a way the AI can best understand. If you ask someone "What is the capital?", they might not know what you're talking about. The AI is the same; you'll need to be a bit more specific.
The examples we give below are some good ways to get started, but you might find that for what you're trying to do there's another, better way that works for you. Don't get too attached, and try to have fun finding new ways to get what you want back from the AI.
Initial Prompt
Prompts are the questions or queries that we send to a service like ChatGPT. For our initial prompt, we're going to get started writing a letter to our friend using the following:
"Write a letter to my friend, Paul, telling him I got a promotion to Manager at my company, my dog died, and I won't be able to visit them this summer"
If you put this into ChatGPT, you'll notice that it does a great job creating a letter, but it likely isn't at all what you imagined. It probably doesn't sound like you, it likely includes details that we didn't give, and it might not be very interesting to read.
You might be tempted to then just say:
"Make it better"
The result is a letter that likely uses a more complex vocabulary, might have fixed some grammatical issues, and might be a little longer, but it doesn't seem to address any of our issues. It's at this point most people give up and call it a day. But if we push just a little further, I think ChatGPT will surprise you!
You can find my chat example here
Adding Specifics
We can immediately solve some of the problems we had just by providing a few more details. Let's start a new chat and add some details that can be used to fix some of the challenges we're having:
"Write a casual letter to my friend, Paul.
Tell him I got a promotion to Manager at my company, Nintendi, because of my hard work over the last year. My dog, Mari, died from old age and my family is still grieving. I won't be able to visit him this summer due to the new responsibilities at work."
Testing this out, we'll immediately see much better results. Describing the tone of the letter (casual) helps a lot and can definitely help ChatGPT know what kind of language I want to use. The other specifics help to fill the letter with details that make its contents a lot closer to what we were hoping for.
You can find this chat example here
If you imagine talking to your friend and asking them to write a letter for you, these are the same details you'd need to provide them. It helps a bit to think of an AI as just an assistant that is trying to figure out what you want and can only use what you give it to help you.
Adding specifics to your prompt might be just enough to get what you're looking for, especially if the letter isn't very personal. However, this letter still doesn't have the personal touch you might like it to have. Let's look at some other ideas for how to make our examples just a little better.
Adding an Example
If we were to hire someone to write letters for us, something we might do is provide them with some good examples of how to do it. That would help them understand what we're looking for, and when we gave them feedback, they could continue to learn with those examples as a foundation. For this improvement, we're going to give the AI the same thing: an example of a letter for them to use.
Write a casual letter to my friend, Paul. Tell him I got a promotion to Manager at my company, Nintendi, because of my hard work over the last year. My dog, Mari, died from old age and my family is still grieving. I won't be able to visit him this summer due to the new responsibilities at work.
Use the following example as a reference for my writing style:
Hey!
It's been a crazy and amazing week. I'm at the Seville airport, getting ready to board flight TP1105 to Lisboa before TP259 to Toronto. It's a really small airport here and security took me 5 minutes!
My favourite memory is this dinner of the amazing view of Ronda with local wine, salmon tartar, croquettes, artichokes and Iberian ham. Amazing views, delicious pairings, and incredible company. Looking forward to being home now. 9.5 hours in the air and I'm there.
Love you,
Ben
Adding this example, we see the style of the result change to be more in line with the way that we talk, and we see our formatting style in the resulting letter. However, you still might not be too happy with the result. Don't worry, there are a couple of other ideas we can try out!
You can see the result of adding an example here.
There are a number of other ways to provide an example that might yield even better results in your testing. One prompt engineering technique is called a One-Shot Prompt, where we give the AI an example of a prompt and a resulting letter. You can see an example of this here.
Adding Rules
One thing AIs are great at is following rules that we provide. They can weigh a large number of factors that are hard for us to track. Because of this, rules are a great way for us to control their behavior and get better results. Let's add some rules that the AI should follow when it helps us write our letter.
Write a casual letter to my friend, Paul. Tell him I got a promotion to Manager at my company, Nintendi, because of my hard work over the last year.
My dog, Mari, died from old age and my family is still grieving. I won't be able to visit him this summer due to the new responsibilities at work.
Follow these rules when writing the letter:
* Don't be too emotive and stay somewhat reserved
* Try to start and end with good news
* Don't add any details I didn't include
* End with "wishing you the best"
Use the following example as a reference for my writing style:
Hey!
It's been a crazy and amazing week. I'm at the Seville airport, getting ready to board flight TP1105 to Lisboa before TP259 to Toronto. It's a really small airport here and security took me 5 minutes!
My favourite memory is this dinner of the amazing view of Ronda with local wine, salmon tartar, croquettes, artichokes and Iberian ham. Amazing views, delicious pairings, and incredible company. Looking forward to being home now. 9.5 hours in the air and I'm there.
Love you,
Ben
These rules help to reinforce our tone and style, and they keep the AI from adding details that we don't want in our letter. Rules are easy to add and remove to see different results in the letter's style, and they offer great direction for the AI. Sometimes you can jump right to adding rules for a great result, but it very much depends on the task you're trying to do.
While it may not be perfect, this really gets us quite close to a letter that we'd be happy to send. After a few small tweaks and edits, you have something that you can send to Paul and a great template you might use for future letters!
You can see the final result here!
Turning it off and on again
Similar to many technologies, AIs sometimes get a little bit overwhelmed and need to be restarted. Consider opening a new chat conversation with ChatGPT after a while.
You can keep a conversation going for a long time in ChatGPT, but you will start to lose control of the conversation, and the AI might start behaving strangely. This is especially true if you're trying to do a lot of different things in a single chat. Previous responses may leak into other questions. For example, if I'm writing another letter to a friend in the same chat where we were writing to Paul, it may bring up my promotion even though I didn't say that it should, just because the AI knew about it from earlier in the conversation.
Use a more powerful model
Another great way to get better results is to use a more advanced version of the AI. If you're struggling to get GPT-3.5 to return something, using GPT-4 could be all you need for the AI to understand what you want. Later models are a little slower, but they're smarter and better able to understand complex requests. The GPT-4 announcement gives some great examples of how much the model improved over time.
Takeaways
Prompting can be a frustrating experience as we try to get an AI to understand just what we want. When struggling, refer back to these improvements that can help AI better understand what to do and how to produce something a little closer to what you're looking for. AI act a lot like assistants, and we get the best results when we give them the details, examples, and rules they need to understand us.
Prompt engineering doesn't provide a repeatable way to use AI models.
You've likely seen courses sold with thousands of prompts and industry professionals discussing amazing prompt engineering techniques. Technologists excited about ChatGPT have spent a lot of time creating engineered prompts. With so much excitement, how is it possible that prompt engineering isn't providing a repeatable practice or series of techniques we can rely on?
To understand how prompt engineering is coming up short, we need to start with the AI research discussed in one of the latest episodes of our podcast. Researchers are looking into ways of using AI models that get the best results. They do rigorous testing, collect their results, and publish their findings for others to read and discuss.
For example, one study showed that posing a question as:
Question:
Answer:
gives you better results than using:
Q:
A:
Examples like this are perfect for improving prompts to get better results. Looking at this research, you can begin to piece together a general approach for prompting. Sometimes this can be just enough to overcome challenges you've encountered with AI models and get the result you're looking for. But the problem is that these models are changing every month, and this research may not be relevant to a later version.
Changes to models may not seem like a big deal, but they fundamentally change how AIs think. Worse, we only have a small glimpse into how these models are changing. OpenAI, in particular, changed its models significantly last week to add functions as a new capability. While it's hard to know how much prompts will be impacted by this change, we can see the impact of these sorts of changes clearly when we look at image generation.
I was a huge fan of DALL·E when it was first released, but my interest quickly soured after a month of using it. Updates completely changed how the model generated images, and many prompts that previously made amazing art for me suddenly had really poor results. I was devastated and eventually moved on to Midjourney, which, while changing and releasing multiple versions, continues to support older versions that consistently produce expected results.
While it's tempting to stick with a version of an AI model that works for you, you'll eventually miss out on new capabilities. If we look at the results for the same Midjourney image generation below, we can see a dramatic shift in style. This upsets our ability to prompt for the same thing, but as new versions are released, new appealing aspects of the presentation are added.
When we think of engineering, we might think of bridge construction. An engineer applies principles, standards, and techniques to work on a complex problem and create an expected output: a bridge. But that thinking starts to break down for AI, because the tools we build with are changing, and the result is often a loose statistical solution rather than a fixed answer. It's like building a bridge that has to handle different physics each time someone drives over it.
So what do we do? How do we handle the change that will continue to happen as new AI versions are released?
We need to do research. Every time a new model is released, we must re-validate what we know. Rather than focus on specific skills or prompt methods, we must learn from solutions to problems.
For example: How do you get an AI to give a more direct or less direct answer? Knowing some good ways to change the answer returned gives you a better idea of how to direct a model.
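To make that re-validation concrete, here's a minimal sketch of comparing two prompt formats against a small test set each time a new model version appears. The model name, test questions, and crude scoring rule are all hypothetical placeholders:

```python
# A sketch of re-validating a prompt technique against a new model version.
# Model name, test questions, and the scoring rule are hypothetical.
from openai import OpenAI

client = OpenAI()
TESTS = [("What is 7 * 8?", "56"), ("What is the capital of France?", "Paris")]
FORMATS = {"long": "Question: {q}\nAnswer:", "short": "Q: {q}\nA:"}

def score(model: str, template: str) -> float:
    hits = 0
    for question, expected in TESTS:
        reply = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": template.format(q=question)}],
        ).choices[0].message.content
        hits += expected.lower() in reply.lower()  # crude substring scoring
    return hits / len(TESTS)

for name, template in FORMATS.items():
    print(name, score("gpt-4", template))  # rerun whenever a new version ships
```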
Record what you tried, try to measure its effectiveness, and see if you can reproduce it when the next version is released.
This trial and error process is the journey of an early adopter. There won't be any best practices for quite a while, so try not to focus too much on one way of doing things and stay flexible. Many changes will happen, and there will be many new ways to build things and converse with AIs. If you're brave enough, there's still an amazing space to explore with AIs, but don't be surprised if the road gets rougher the further you go.
Thanks for reading Stable Discussion! Subscribe for free to receive new posts and support our continued work. Get full access to Stable Discussion at blog.stablediscussion.com/subscribe