FIR #491: Deloitte’s AI Verification Failures

Update: 2025-12-09

Description

Big Four consulting firm Deloitte submitted two costly reports to two governments on opposite sides of the globe, each containing fabricated references generated by AI. Deloitte isn’t alone. A study published on the website of the U.S. Centers for Disease Control (CDC) not only included AI-hallucinated citations but also purported to reach the exact opposite conclusion from the real scientists’ research. In this short midweek episode, Neville and Shel reiterate the importance of a competent human in the loop to verify every fact produced in any output that leverages generative AI.



Links from this episode:





The next monthly, long-form episode of FIR will drop on Monday, December 29.


We host a Communicators Zoom Chat most Thursdays at 1 p.m. ET. To obtain the credentials needed to participate, contact Shel or Neville directly, request them in our Facebook group, or email fircomments@gmail.com.


Special thanks to Jay Moonah for the opening and closing music.


You can find the stories from which Shel’s FIR content is selected at Shel’s Link Blog. You can catch up with both co-hosts on Neville’s blog and Shel’s blog.


Disclaimer: The opinions expressed in this podcast are Shel’s and Neville’s and do not reflect the views of their employers and/or clients.




Raw Transcript:


Neville Hobson: Hi everybody and welcome to For Immediate Release. This is episode 491. I’m Neville Hobson.


Shel Holtz: And I’m Shel Holtz, and I want to return to a theme we addressed some time ago: the need for organizations, and in particular communication functions, to add professional fact verification to their workflows—even if it means hiring somebody specifically to fill that role. We’ve spent the better part of three years extolling the transformative power of generative AI. We know it can streamline workflows, spark creativity, and summarize mountains of data.


But if recent events have taught us anything, it’s that this technology has a dangerous alter ego. For all that AI can do that we value, it is also a very confident liar. When communications professionals, consultants, and government officials hand over the reins to AI without checking its work, the result is embarrassing, sure, but it’s also a direct hit to credibility and, increasingly, the bottom line.


Nowhere is this clearer than in the recent stumbles by one of the world’s most prestigious consulting firms. The Big Four accounting firms are often held up as the gold standard for diligence. Yet just a few days ago, news broke that Deloitte Canada delivered a report to the government of Newfoundland and Labrador that was riddled with errors that are characteristic of generative AI. This report, a massive 526-page document advising on the province’s healthcare system, came with a price tag of nearly $1.6 million. It was meant to guide critical decisions on virtual care and nurse retention during a staffing crisis.


But when an investigation by The Independent, a progressive news outlet in the province, dug into the footnotes, the veneer of expertise crumbled. The report contained false citations pulled from made-up academic papers. It attributed papers to real researchers who had never worked on them. It even listed fictional papers co-authored by researchers who said they had never actually worked together. One adjunct professor, Gail Tomblin Murphy, found herself cited as the author of a paper that doesn’t exist. Her assessment was blunt: “It sounds like if you’re coming up with things like this, they may be pretty heavily using AI to generate work.”

Deloitte’s response was to claim that AI wasn’t used to write the report, but was, and this is a quote, “selectively used to support a small number of research citations.” In other words, they let AI do the fact-checking, and the AI failed.


Amazingly, Deloitte was caught doing something just like this earlier, in an audit for the Australian government. Only months before the Canadian revelation, Deloitte Australia had to issue a humiliating correction to a report on welfare compliance. That report cited court cases that didn’t exist and contained quotes attributed to a federal court judge that were never spoken. In that instance, Deloitte admitted to using the Azure OpenAI tool to help draft the report. The firm agreed to refund the Australian government nearly 290,000 Australian dollars.


This isn’t an isolated incident of a junior copywriter using ChatGPT to phone in a blog post. This is a pattern involving a major consultancy submitting government audits in two different hemispheres. The lesson is pretty stark: The logo on your letterhead isn’t going to protect you if the content is fiction. In fact, this could have long-term repercussions for the Deloitte brand.


But it doesn’t stop at consulting firms. Here in the US, we’ve seen similar failures in the public sector. The Make America Healthy Again (MAHA) commission released a report containing citations to studies that don’t exist, and a presentation posted to the website of the CDC, the Centers for Disease Control, cited a fake autism study that contradicted the real scientists’ actual findings.


The common thread here is a fundamental misunderstanding of the tool. For years, the mantra in our industry was a parroting of the old Ronald Reagan line: “Trust but verify.” When it comes to AI though, we just need to drop that “trust” part. It’s just verify. We have to remember that large language models are designed to predict the next plausible word, not to retrieve facts. When Deloitte’s AI invented a research paper or a court case, it wasn’t malfunctioning. It was doing exactly what it was trained to do: tell a convincing story.


And that brings us to the concept of the human in the loop. This phrase gets thrown around a lot in policy documents as a safety net, but these cases prove that having a human involved isn’t enough. You need a competent human in the loop. Deloitte’s Canadian report undoubtedly went through internal reviews. The Australian report surely passed across several desks. The failure here wasn’t just technological, it was a failure of human diligence. If you’re using AI to write content that relies on facts, data, or citations, you can’t simply be an editor. You must be a fact-checker.


Deloitte didn’t just lose money on refunds and take a reputational hit; it lost the presumption of competence. For those of us in PR and corporate communications, we’re the guardians of our organization’s truth. If we allow AI-generated confabulations to slip into our press releases, earnings statements, annual reports, or white papers, we erode the very foundation of our profession. Communicators need to update their AI policies: make it explicit that no AI-generated fact, quote, or citation can be published without primary source verification. And you need to make sure you have the human resources to achieve that. The cost of skipping that step, trust me, is a lot higher than a subscription to ChatGPT.


Neville Hobson: It’s quite a story, isn’t it really? I think you kind of get exasperated when we talk about something like this, because we’ve talked about this quite a bit. Most recently, in our interview with Josh Bernoff—which will be coming in the next day or so—where this very topic came up in discussion: fact-checking versus not doing the verification.


I suppose you could cut through all the preamble about the technology and all this stuff, and the issue isn’t that; it’s th…

