
9 posts tagged with "Law"

Discussion of AI uses in law and legal cases.


Yes, Claude Code is Amazing. It Also Still Hallucinates. Both Facts Are Important. My Christmas Map Project with Opus 4.5.

· 13 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

This first week of January, the general feeling is very much everyone showing off the winter vacation vibe coding projects they cooked up on Claude Code. Claude Code itself isn't new, but with Opus 4.5 being so much more powerful, something just clicked for a lot of people (myself included). For me, it turned a lot of "when I have a couple days" projects into "well, that's done, let's do another."

I am mainly going to describe in this post how I updated the map for my website, along with the hallucinations I saw along the way. I'll also talk about how prior programming experience and domain expertise in geographic information systems (GIS) helped with dealing with these hallucinations.

But first, I wanted to tick off a few other projects I did recently, just since my end of 2025 post.

  • I updated my transcription tool to support many more file types than just MP3 and added a GUI.
  • I got Claude Code to completely modernize Taprats, a geometric art Java program from Craig S. Kaplan. It appears to work just like the original so far, but I'll test it more before writing about it.
  • I built a local LLM spoiler-free summarizer of classic books. It increments to the chapter you left off on.

And more stuff. It's very exciting. I get why people are worked up about Claude Code.

But that's why it's important to be reminded of hallucinations. Not to dunk on Claude Code, but to keep people grounded and maintain skepticism of AI outputs. You still have to check.

Safety First

I do not dangerously skip permissions. I know it can be exciting to get more out of AI agents. But the more agency you give it, the more harm it can do when it either goes off the rails or gets prompt injected to be a double-agent threat.

Claude's Hallucinations

  • Opus 4.5 hallucinated that there were two federal districts in South Carolina to fix an undercount.
  • Mixing up same-name counties (not exactly a hallucination, actually a common human error).
  • Claude removed Yellowstone National Park, a few military bases and a prison from the map (rather than shifting district borders from one district to another).
  • "Iowa Supreme Court Attorney Disciplinary Board" shortened to "Iowa Supreme Court," making it sound like an Iowa Supreme Court case.
  • I previously tried to use the tigris GIS package in R as a source for a base layer of U.S. District Courts, but Opus 4.5 hallucinated a court_districts() function (this was not in Claude Code).

The South Carolina Counting Hallucination

I used Claude Code to build the Districts layer from counties and states based on their statutory definition.
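For readers who haven't done this kind of GIS work, here is a minimal sketch of what "building districts from counties" looks like in Python with geopandas. The file and column names are hypothetical, and this is not the code Claude actually wrote; it just shows the basic join-and-dissolve step.

```python
# Minimal sketch (not the actual project code): build a district layer by
# dissolving county polygons into their assigned federal districts.
# File paths and column names here are hypothetical.
import geopandas as gpd
import pandas as pd

counties = gpd.read_file("counties.geojson")           # one polygon per county
assignments = pd.read_csv("district_assignments.csv")  # STATEFP, COUNTYFP, district

# Join the statutory district assignment onto each county, then dissolve
# (merge geometries) so each district becomes a single multipolygon.
districts = (
    counties.merge(assignments, on=["STATEFP", "COUNTYFP"])
            .dissolve(by="district")
            .reset_index()
)

districts.to_file("districts.geojson", driver="GeoJSON")
```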

Claude Code with Opus 4.5 didn't initially hallucinate about the District of South Carolina. Rather, when I went back to make some edits and asked Claude Code in a new session to check the work in that layer, it counted and said there should be 94 districts, but there were only 91. The actual cause of the discrepancy was that the Virgin Islands, Guam, and the Northern Mariana Islands were excluded from the map.

Claude said "let me fix that" and started making changes. Rather than identifying the real source of the discrepancy, Claude treated it as a simple undercount and tried to make up the difference by splitting existing districts into new ones that don't exist.

South Carolina district hallucination

Claude split South Carolina in two, creating a fictitious "Eastern District" and "Western District" that do not exist. If you just wanted a map that looked nice and weren't familiar with the data, you might go along with that hallucination; it could be very persuasive. But the original version, with a single District of South Carolina, was correct. South Carolina has just one district.

Patchwork Counties

When I initially created this district map, it looked like a quilt: a patchwork of counties wrongly assigned to different districts.

I don't know specifically why different areas were assigned to the wrong districts. I think the primary reason is that there are a lot of same-named counties in different states. So Claude was probably matching on county names alone and kept reassigning those counties to different districts.

For example, Des Moines is in Polk County, Iowa. But there are a lot of Polk Counties around the country. If you don't use the state and county together as the match key, and instead match on the county name alone, you get a lot of collisions. That's something I'm very familiar with from working with GIS.
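Here is a minimal pandas sketch of that failure mode, with made-up data. Joining on the county name alone produces collisions; joining on the composite (state, county) key does not.

```python
# Minimal sketch of the same-name collision problem described above.
# The data is illustrative; the point is the join key, not the values.
import pandas as pd

counties = pd.DataFrame({
    "state":  ["IA", "FL", "MN"],
    "county": ["Polk", "Polk", "Polk"],
})
assignments = pd.DataFrame({
    "state":    ["IA", "FL", "MN"],
    "county":   ["Polk", "Polk", "Polk"],
    "district": ["S.D. Iowa", "M.D. Florida", "D. Minnesota"],
})

# Wrong: joining on the county name alone matches every "Polk" to every district.
bad = counties.merge(assignments, on="county")               # 9 rows, 3 per county

# Right: the composite (state, county) key keeps each county with its own district.
good = counties.merge(assignments, on=["state", "county"])   # 3 rows
```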

If somebody weren't familiar with GIS, they wouldn't necessarily suspect the reason, but it would still be obvious that the map was wrong.

Since I was able to pretty quickly guess that that might have been the reason, I suggested a fix to Claude. That fixed most of the issues with most of the states.

Uncommon Problems with the Commonwealth of Virginia

One issue that persisted when I was building the districts at the county level was in Virginia. I've actually lived in Virginia, so I was familiar with the city-county distinction. Virginia has independent cities that, if sufficiently large, are legally separate from the surrounding county. For example, Fairfax City and Fairfax County are distinct jurisdictions. It's even more confusing because the school districts go with the counties, which most states don't do.

So I had to get Claude Code to wrangle with that. Claude even reviewed the statutory language. I could tell from reading as Claude was "planning" that it considered the Virginia city-county challenge, but it still failed on the initial attempt.

I had to iterate on it multiple times. I had to tell it that it had missed a whole area around Virginia Beach. It had also flipped a couple of cities and counties where a city shared a name with an unrelated county in the other district. Claude had assumed that same-named counties and cities were in the same location and assigned them together. It then had to look up where they were actually located and reassign them to the appropriate Eastern or Western District.

But eventually I got to a point where it had good districts for Virginia.

Wyoming (and Idaho and Montana) and North Carolina

Now there are a couple other weird wrinkles in Wyoming and North Carolina. They don't follow the county boundaries completely.

The District of Wyoming is the only federal district that includes territory from more than one state: it also covers the parts of Idaho and Montana that lie within Yellowstone National Park.

In North Carolina, the district boundary does not follow county lines everywhere: a couple of military bases and a prison span multiple counties, and the district boundary follows the installation boundaries there rather than the county lines.

Initially I ignored those wrinkles. But once the rest of the map was in good shape, I just wanted to see what Claude could do.

I explained those issues and asked Claude Code to see if it could clean those lines up and get a map that reflected those oddities.

It did, on the second attempt. On the first attempt, Claude simply cut Yellowstone National Park, those military bases, and that prison out of every district. There were blank spots where Yellowstone would be, carved out of Idaho, Montana, and Wyoming, and the bases and prison were carved out of the Eastern or Middle District of North Carolina.

That was a problem, obviously, because those areas needed to be shifted from one district to another, not removed from all districts. So I had to explain more specifically what I wanted Claude to do: move the boundary lines, not remove the areas from the map entirely. The second attempt got it cleaned up.
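The fix amounts to a standard GIS operation: subtract the area from the districts it doesn't belong to and union it into the one it does. A minimal shapely sketch, with hypothetical variable names (this is not the project's actual code):

```python
# Minimal sketch of "move the area, don't delete it": subtract the Yellowstone
# footprint from the Idaho and Montana districts and add it to the District of
# Wyoming. Variable names are hypothetical.
from shapely.ops import unary_union

def move_area(districts, area, from_names, to_name):
    """Reassign `area` (a shapely geometry) from the `from_names` districts to `to_name`."""
    for name in from_names:
        districts[name] = districts[name].difference(area)        # carve the area out
    districts[to_name] = unary_union([districts[to_name], area])  # add it back, don't drop it
    return districts

# districts = {"D. Idaho": <Polygon>, "D. Montana": <Polygon>, "D. Wyoming": <Polygon>}
# districts = move_area(districts, yellowstone, ["D. Idaho", "D. Montana"], "D. Wyoming")
```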

District of Wyoming map

Claude Still Saved a Lot of Time, Even Accounting for Hallucinations

I was still very impressed with what Claude did. But familiarity with the data and checking the output were important.

There's no doubt in my mind after doing all this that Claude saved a tremendous amount of time compared to what I would have had to do with manual GIS workflows to get this kind of a map on a desktop computer.

Then there's another layer: making the map responsive in all the ways I needed it to be on my website for other users. It is tremendous to see how well that works.

But I do think that domain expertise, my past familiarity with GIS, was still helpful, even though I didn't have to do much hands-on work. Being able to guide Claude through the mistakes it made and to check the output was very valuable. Since the output is a visual map, some of the things it got wrong were obvious to anyone; even if you didn't know why it went wrong, you could tell the map was wrong. You might still have gotten to a better finished product by iterating with Claude Code, but without GIS experience to guide your prompting, you might also have wasted more time than I did.

Map Features with Claude Code

Use GitHub, Try to Keep Formatting Code Separate from Text/Data

I had already written this, and I stand by it.

However, as powerful as Claude Code is, it is important to use GitHub or something similar for version control. It is also critical to make sure Claude is changing code, not your actual writing.

Claude Code and My Map with Links to Blog Posts About AI Hallucinations Cases

This map is not a map of every AI hallucination case, but rather every case that I have blogged about so far. Basically, it's federal and state cases where there has been either a strong implication or the direct assertion that there was AI misuse. Many of these cases cite Mata v. Avianca.

Lone Case Markers

If you click on a given case and it's a single case, you'll see:

  • what the case is called
  • the year
  • the jurisdiction
  • the type of case (federal or state), which is also indicated by the color
  • links to related articles where I've talked about that case
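Under the hood, each marker boils down to a point feature carrying those fields. Here is a hypothetical sketch of what one marker's data might look like; the site's actual schema and URLs may differ.

```python
# Hypothetical sketch of the data behind a single case marker; field names and
# the URL are illustrative only, not the site's actual schema.
case_marker = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-74.0060, 40.7128]},  # lon, lat
    "properties": {
        "case_name": "Mata v. Avianca",
        "year": 2023,
        "jurisdiction": "S.D.N.Y.",
        "case_type": "federal",          # also drives the marker color
        "links": ["https://example.com/blog/mata-v-avianca"],  # placeholder URL
    },
}
```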

Clusters, Spiders, and Zooming

Getting the "spiderize" functions to work was the most frustrating part of all of this. I made several prior attempts with Claude Code on Opus 4.5. With the same prompts, this most recent attempt finally just worked on the "first" attempt (of that session). I only tried again as an afterthought once all the other features were done. But previously, I'd wasted a lot of time trying to get it right. So it was both a Claude Code success and a failure. Still, I'm happy with the final result.
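For anyone curious what the clustering setup involves, here is a minimal Python sketch using folium, which wraps Leaflet.markercluster. The site itself may use Leaflet directly, and the coordinates and options here are illustrative.

```python
# Minimal sketch of clustering with "spiderfy" behavior using folium
# (a Python wrapper around Leaflet.markercluster). Illustrative only.
import folium
from folium.plugins import MarkerCluster

m = folium.Map(location=[41.6, -93.6], zoom_start=5)  # centered near Des Moines

cluster = MarkerCluster(
    options={
        "spiderfyOnMaxZoom": True,    # fan out co-located markers at max zoom
        "zoomToBoundsOnClick": True,  # clicking a cluster zooms to its members
    }
).add_to(m)

folium.Marker([40.7128, -74.0060], popup="Mata v. Avianca (2023, S.D.N.Y.)").add_to(cluster)
folium.Marker([40.7128, -74.0060], popup="Park v. Kim (2024, 2d Cir.)").add_to(cluster)

m.save("case_map.html")
```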

Zoom to Mata v. Avianca

If you click those links, it'll jump over either to my company blog or the Substack articles where I've talked about those cases.

Additionally, if a case references other cases that are also on the map, such as Mata v. Avianca, then lines will be drawn from the case you clicked to the other cases on the map that it references or that reference it. The map will give you a little count summary at the bottom: "Cites three cases" or "cited by" so many cases.
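Those "cites / cited by" counts come from a simple citation mapping. Here is a hypothetical sketch of how such counts could be computed; the case lists are illustrative, not the map's actual data.

```python
# Hypothetical sketch of the "cites / cited by" counts. The citation lists
# below are illustrative only.
from collections import defaultdict

cites = {
    "Park v. Kim": ["Mata v. Avianca"],
    "Kruse v. Karlen": ["Mata v. Avianca"],
}

# Invert the "cites" mapping to get "cited by".
cited_by = defaultdict(list)
for case, targets in cites.items():
    for target in targets:
        cited_by[target].append(case)

case = "Mata v. Avianca"
print(f"Cites {len(cites.get(case, []))} cases; cited by {len(cited_by[case])} cases.")
```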

So if we look at Mata v. Avianca, the marker is not by itself on the map. If you look at the eastern United States from the starting zoom level that I'm looking at as I'm writing this, you see a "4." The 4 has a slash of red and orange, meaning there are both federal and state cases.

If you click the 4, the map zooms in. Now there are three over the New York-New Jersey area, and one over Annapolis, Maryland.

Click the three, and the map zooms in further. That splits between one in New Jersey and two in New York.

Click the two, and then those two "spider out" because they are both in the same jurisdiction. One is Mata v. Avianca, and that is cited by seven cases currently. It's a 2023, Southern District of New York, federal district court case. The other is Park v. Kim, a 2024 case, which is actually a Second Circuit Case that is placed on the map in the same location.

The New Jersey case is In re Cormedics, Inc. Securities Litigation, a 2025 federal case from the District of New Jersey; it was one of the cases Senator Grassley raised when asking judges about their AI misuse.

Other Clusters in Mountain West, Texas

Spider over Iowa

If you zoom out, the map combines nearby cases. Zoom out far enough and it will combine Wyoming and Colorado, for example, or multiple districts in Texas. But as you zoom in or click, it will zoom in further and split those out.

If you look at Iowa, there are five currently, and those will all spider out because they are all in the same location. But then you can click one of the individual ones and get the details.

Iowa spider cluster

District Level

If you hover your mouse over a district, it will tell you how many federal cases were in that district and have a blog post about them.

Southern District of Iowa hover

Circuit Level

If you toggle off the district boundaries and toggle on the circuit boundaries, with federal cases still toggled on, hovering your mouse over a circuit will give you a count of how many cases were in that circuit and have a blog post about them.

6th Circuit hover
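Those hover counts, at either the district or circuit level, are essentially point-in-polygon counts. Here is a minimal geopandas sketch, with hypothetical file and column names.

```python
# Minimal sketch of the per-district (or per-circuit) counts behind the hover
# tooltips: a point-in-polygon spatial join. Names are hypothetical.
import geopandas as gpd

districts = gpd.read_file("districts.geojson")  # district (or circuit) polygons
cases = gpd.read_file("cases.geojson")          # one point per blogged case

# Assign each case point to the polygon that contains it, then count per district.
joined = gpd.sjoin(cases, districts, how="inner", predicate="within")
counts = joined.groupby("district").size()      # e.g., Southern District of Iowa -> 5
```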

Doppelgänger Hallucinations Test for Google Against the 22 Fake Citations in Kruse v. Karlen

· 7 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

I used a list of 22 known fake cases from a 2024 Missouri state case to conduct a Doppelgänger Hallucination Test. Google generated an AI Overview for slightly fewer than half of the searches, and half of those AI Overviews hallucinated that the fake cases were real. For the remaining searches, I tested “AI Mode,” which hallucinated at a similar rate.

  • Google AI Overview gave the user an inaccurate answer roughly a quarter of the time (5 of 22 or ~23%), without the user opting to use AI features.
  • Opting for AI Mode each time an AI Overview was not provided resulted in an overall error rate of more than half (12 of 22 or ~55%).
info

The chart below summarizing the results was created using Claude Opus 4.5 after I had manually analyzed the test results and written the blog post. All numbers in the chart were then checked again for accuracy. Note that if you use LLMs for a similar task, numbers may be altered into inaccurate statements even when the model is only doing data visualization or reformatting.

danger

tl;dr if you ask one AI, like ChatGPT or Claude or Gemini, something, then double-check it on a search engine like Google or Perplexity, you might get burnt by AI twice. The first AI might make something up. The second AI might go along with it. And yes, Google Search includes Google AI Summary now, which can make stuff up. I originally introduced this test in an October 2025 blog post.

tip

To subscribe to law-focused content, visit the AI & Law Substack by Midwest Frontier AI Consulting.

Kruse v. Karlen Table of 22 Fake Cases

I wrote about the 2024 Missouri Court of Appeals case Kruse v. Karlen, which involved a pro se Appellant citing 24 cases total: 22 nonexistent cases and 2 cases that did not stand for the proposition for which they were cited.

Some of the cases were merely “fictitious cases,” while others partially matched the names of real cases. These partial matches may explain some of the hallucinations; however, the incorrect answers occurred with both fully and partially fictitious cases. For examples of different kinds of hallucinations, see this blog post; for further examples of partially fictitious cases, see this post about mutant or synthetic hallucinations.

The Kruse v. Karlen opinion, which awarded damages to the Respondent for frivolous appeals, provided a table with the names of the 22 fake cases. I used the 22 cases to conduct a more detailed Doppelgänger Hallucination test than my original test.

Kruse v. Karlen table of fake cases

Methodology for Google Test

Browser: I used the Brave privacy browser with a new private window opened for each of the 22 searches.

  • Step 1: Open new private tab in Brave.
  • Step 2: Navigate to Google.com
  • Step 3: Enter the verbatim title of the case as it appeared in the table from Kruse v. Karlen in quotation marks and nothing else.
  • Step 4: Screenshot the result including AI Overview (if generated).
  • Step 5 (conditional): if the Google AI Overview did not appear, click “AI Mode” and screenshot the result.

Results

Google Search Alone Did Well

Google found correct links to Kruse v. Karlen in all 22 searches (100%). These were typically the top-ranked results. Therefore, if users had only had access to the Google Search results, they would likely have found accurate information from the Kruse v. Karlen opinion, whose table of the 22 fake case titles clearly indicates that they are fictitious.

But AI Overview Hallucinated Half the Time Despite Having Accurate Sources

The Google Search resulted in generating a Google AI Overview in slightly fewer than half of the searches. Ten (10) searches generated a Google AI Overview (~45%); half of those, five (5) out of 10 (50%) hallucinated that the cases were real. The AI Overview provided persuasive descriptions of the supposed topics of these cases.

The supposed descriptions of the cases were typically not supported by the cited sources, but hallucinated by Google AI Overview itself. In other words, at least some of the false information appeared to come from Google’s AI itself, not from underlying inaccurate sources describing the fake cases.

Weber v. City Example

Weber v. City of Cape Girardeau AI Overview hallucination

Weber v. City of Cape Girardeau, 447 S.W.3d 885 (Mo. App. 2014) was a citation to a “fictitious case,” according to the table from Kruse v. Karlen.

The Google AI Overview falsely claimed that it “was a Missouri Court of Appeals case that addressed whether certain statements made by a city employee during a federal investigation were protected by privilege, thereby barring a defamation suit” that “involved an appeal by an individual named Weber against the City of Cape Girardeau” and “involved the application of absolute privilege to statements made by a city employee to a federal agent during an official investigation.”

Perhaps more concerning, the very last paragraph of the AI Overview directly addresses and inaccurately rebuts the actually true statement that the case is a fictitious citation:

The citation is sometimes noted in subsequent cases as an example of a "fictitious citation" in the context of discussions about proper legal citation and the potential misuse of AI in legal work. However, the case itself is a real, published opinion on the topic of privilege in defamation law.

warning

The preceding quote from Google AI Overview is false.

When AI Overview Did Not Generate, “AI Mode” Hallucinated At Similar Rates

Twelve (12) searches did not generate a Google AI Overview (~55%); more than half of those, seven (7) out of 12 (58%), hallucinated that the cases were real. One (1) additional AI Mode description correctly identified a case as fictitious; however, it inaccurately attributed the source of the fictitious case to a presentation rather than to the prominent case Kruse v. Karlen. Google’s AI Mode correctly identified four (4) cases as fictitious cases from Kruse v. Karlen.

Like AI Overview, AI Mode provided persuasive descriptions of the supposed topics of these cases. The descriptions AI Mode provided for the fake cases were sometimes partially supported by additional cases with similar names apparently pulled into the context window after the initial Google Search, e.g., a partial description of a different, real case involving the St. Louis Symphony Orchestra. In those examples, the underlying sources were not inaccurate; instead, AI Mode inaccurately summarized those sources.

Other AI Mode summaries were not supported by the cited sources, but hallucinated by Google AI Mode itself. In other words, the source of the false information appeared to be Google’s AI itself, not underlying inaccurate sources providing the descriptions of the fake cases.

Conclusion

Without AI, Google Search’s top results would likely have given the user accurate information. However, Google AI Overview gave the user an inaccurate answer roughly a quarter of the time (5 of 22 or ~23%), without the user opting to use AI features. If the user opted for AI Mode each time an AI Overview was not provided, the overall error rate would climb to more than half (12 of 22 or ~55%).
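For reference, here is the arithmetic behind those headline rates, using the counts reported above.

```python
# Quick arithmetic behind the headline rates (counts taken from the results above).
total_searches = 22
ai_overview_halluc = 5    # of the 10 searches that produced an AI Overview
ai_mode_halluc = 7        # of the 12 remaining searches, checked in AI Mode

print(ai_overview_halluc / total_searches)                     # ~0.23 (AI Overview only)
print((ai_overview_halluc + ai_mode_halluc) / total_searches)  # ~0.55 (opting into AI Mode)
```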

Recall that for all of these 22 cases, which are known fake citations, Google Search retrieved the Kruse v. Karlen opinion that explicitly stated that they are fictitious citations. If you were an attorney trying to verify newly hallucinated cases, you would not have the benefit of hindsight. If ChatGPT or another LLM hallucinated a case citation, and you then “double-checked” it on Google, it is possible that the error rate would be higher than in this test, given that there would likely not be an opinion addressing that specific fake citation.

When Two AIs Trick You: Watch Out for Doppelgänger Hallucinations

· 7 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC
danger

tl;dr if you ask one AI, like ChatGPT or Claude or Gemini something, then double-check it on a search engine like Google or Perplexity, you might get burnt by AI twice. The first AI might make something up. The second AI might go along with it. And yes, Google Search includes Google AI Summary now, which can make stuff up.

tip

To subscribe to law-focused content, visit the AI & Law Substack by Midwest Frontier AI Consulting.

In re: Turner, Disbarred Attorney and Fake Cases

Iowa Supreme Court Attorney Disciplinary Board v. Royce D. Turner (Iowa)

In July 2025, the Iowa Supreme Court Attorney Disciplinary Board moved to strike multiple recent filings by Respondent Royce D. Turner, including Brief in Support of Application for Reinstatement, because they contained references to a non-existent Iowa case. Source 1

caution

There was subsequently another Iowa case, Turner v. Garrels, in which a pro se litigant named Turner misused AI. That Turner is a different individual.

Several of Respondent’s filings contain what appears to be at least one AI-generated citation to a case that does not exist or does not stand for the proposition asserted in the filings. —In re: Turner

The Board left room with “or does not stand for the proposition,” but it appears that this was straightforwardly a hallucinated fake case cited as “In re Mears, 979 N.W.2d 122 (Iowa 2022).”

Watch out for Doppelgänger hallucinations!

I searched for the fake case title “In re Mears, 979 N.W.2d 122 (Iowa 2022)” cited by Turner to see what Google results came up. What I found was Google hallucinations seeming to “prove” that the AI-generated case title from Turner referred to a real case. Therefore, simply Googling a case title is not sufficient to cross-reference cases, because Google’s AI Overview can also hallucinate. As I have frequently mentioned, it is important for law firms that claim not to use AI to understand that many common and specialist programs now include generative AI that can introduce hallucinations, such as Google, Microsoft Word, Westlaw, and LexisNexis.

First Google Hallucination

The first time, Google's AI Overview hallucinated an answer stating that the case was a real Iowa Supreme Court decision about court-appointed attorney's fees, but the footnotes linked by Google actually went to Mears v. State Public Defenders Office (2013). Key Takeaway: Just because an LLM puts a footnote next to its claim does not mean the footnote supports the statement.

First Google Hallucination

Second Google Hallucination

I searched for the same case name again later, to see if Google would warn me that the case did not exist. Instead, it created a different hallucinated summary.

The summary and links related to a 2022 Iowa Supreme Court case, Garrison v. New Fashion Pork LLP, No. 21–0652 (Iowa 2022). Key Takeaway: LLMs are not deterministic and may create different outputs even when given the same inputs.

Second Google Hallucination

Perplexity AI’s Comet Browser

Perplexity AI, an AI search engine company, recently released a browser for macOS and Windows to compete with browsers like Chrome, Safari, and Edge. I get a lot of ads for AI stuff on social media, so I've been bombarded with a lot of different content recently promoting Comet. To be frank, most of it is incredibly tasteless to the point that I think parents and educators should reject this product on principle. They are clearly advertising this product to students (including medical students!) telling them Comet will help them cheat on homework. There isn't even the fig leaf of "AI tutoring" or any educational value.

First Perplexity Comet Hallucination
danger

Perplexity’s advertising of Comet is encouraging academic dishonesty, including in the medical profession. You do not want to live in a future full of doctors who were assigned to watch a 42-minute video of a live Heart Transplant and instead “watched in 30s” with Comet AI. Yes, that is literally in one of the Perplexity Comet ads. Perplexity’s ads are also making false claims that are trivial to disprove, like “Comet is like if ChatGPT and Chrome merged but without hallucinations, trash sources, or ads.” Comet hallucinates like any other large language model (LLM)-powered AI tool.

Comet Browser’s Hallucination

I searched for the fake case title “In re Mears, 979 N.W.2d 122 (Iowa 2022)” cited by Turner in a new installation of Comet. It is important to note that people can “game” these types of searches by conducting searches over and over until the AI makes one mistake, then screenshot that mistake to make a point. That is not what I’m doing here. This was the very first result from my first search. It was a hallucination that explicitly stated the fake case “is a 2022 Iowa Supreme Court decision” although this is followed by caveats that cast doubt on whether it really is an existing case:

"In re Mears, 979 N.W.2d 122 (lowa 2022)" is a 2022 lowa Supreme Court decision, but the currently available sources do not provide a readily accessible summary, holding, or specific details about the case itself. It appears this citation may pertain to legal doctrines such as cy près or charitable trust law, as suggested by the limited context in search returns, but direct case facts, parties, and the detailed ruling were not found in available summaries or law review discussions. georgialawreview If you need more detailed information, legal databases such as Westlaw, LexisNexis, or the official lowa Supreme Court opinions archive would provide the official opinion, including the background, holding, and legal reasoning of "In re Mears, 979 N.W.2d 122 (lowa 2022)".

If you were to follow up on the caveats in the second paragraph, you would learn that the case does not exist. However, this is still a hallucination, because it describes the case as if it exists and does not mention the one relevant source, In re: Turner, which would tell you that it is a citation to a fake case.

How to Set Up Google Gemini Privacy

· 7 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Data training opt-outs and other settings as of October 1, 2025

General Set Up for Lawyers

I will be providing guides on how to configure the privacy settings on three common consumer large language model (LLM) tools: Google Gemini, ChatGPT, and Claude. In this post, I will provide a guide on how to configure a consumer Google Gemini account’s privacy settings based on an attorney conducting legal research. Please note that these instructions are neither a substitute for proper data controls (e.g., proper handling of attorney-client privileged data or personally identifiable information) nor a replacement for a generative AI policy for your law firm. This information is current as of October 1, 2025.

You can change the settings on a desktop computer or mobile phone, but the menu options have slightly different names. I will explain using the desktop options with the alternative names for mobile also noted.

Key Point

“Help improve” is a euphemism for “train future models on your data.” This is relevant to both audio and text opt-outs.

This guide assumes you have a Google account signed in to Google Gemini.

Overview

  1. Opt out of training on your audio data. (Euphemistically: “Improve Google services with your audio and Gemini Live recordings.”)
  2. Configure data retention and auto-deletion, which is necessary to avoid training on your conversations with Gemini. (Euphemistically: “your activity…helps improve Google services, including AI models”).
  3. Review a list of “your public links.”
tip

To subscribe to law-focused content, visit the AI & Law Substack by Midwest Frontier AI Consulting.

1. Opt Out of Training on Audio

Risk: Memorization, Conversation Privacy

I strongly advise anyone using generative AI tools, but especially those using them for potentially sensitive work purposes, to opt out of allowing these companies to train future models on your text and audio chats. There are numerous risks and no benefit to the individual user.

One risk is private chats (text or voice) being exposed in some way during the data training process. Google itself warns: “Human reviewers (including trained reviewers from our service providers) review some of the data we collect for these purposes.”

caution

“Please don’t enter confidential information that you wouldn’t want a reviewer to see or Google to use to improve our services, including machine-learning technologies” (Gemini Apps Privacy Hub).

Another potential risk is “memorization,” which allows generative AI to re-generate specific pieces of sensitive information. While unlikely for any particular person, the risk remains. For example, researchers in 2023 found that ChatGPT could recreate the email signature of a CEO with their real personal contact information. This is significant, because ChatGPT is not a database (see my discussion of Mata v. Avianca): it would be like writing it down from memory, not looking it up in a phone book.

Screenshot of desktop menu to access Gemini Activity menu

Guide: Opting Out of Audio Training

Click the Gear symbol for Settings, then Activity (on mobile, it’s “Gemini Apps Activity”).

UNCHECK the box next to “Improve Google services with your audio and Gemini Live recordings.”

Screenshot of Gemini Apps Activity menu for opting out of audio data training

2. Chat Retention & Deletion

Risk: Security and Privacy v. Recordkeeping

You may want to keep records of the previous searches you have conducted for ongoing research or to revisit what went wrong if there were issues with a citation. However, by choosing to “Keep activity,” Google notes that “your activity…helps improve Google services, including AI models.”

Therefore, it appears that the only way to opt out of training on your text conversations with Google Gemini is to turn off activity. This is different from ChatGPT, which allows you to opt out of training on your conversations, and Claude, which previously did not train on user conversations at all but has moved to a policy similar to ChatGPT’s of training on user conversations unless you opt out. As an alternative, you could delete only specific conversations.

Guide: Opting Out of Text Training

Click the Gear symbol for Settings, then Activity (on mobile, it’s “Gemini Apps Activity”). Click the dropdown arrow “On/Off” and select “Turn off” or “Turn off and delete activity” if you also want to delete prior activity. It is also possible to delete individual chats in the main chat interface.

Screenshot of Gemini Apps Activity menu for turning off Keep activity

Guide: Auto-Delete Older Activity

Click the Gear symbol for Settings, then Activity (on mobile, it’s “Gemini Apps Activity”). Click the words “Deleting activity older than [time period]” to adjust the retention period for older conversations. This does not mitigate concerns about Google training on your data, but may protect the data in the event of an account takeover.

Screenshot of Gemini Apps Activity menu for adjusting auto-delete period

Or you can delete recent activity within a certain time period.

Screenshot of Gemini Apps Activity menu for deleting specific period of activity

Risk: Private Conversations on Google

In late July, Fast Company reported that Google was indexing shareable links to ChatGPT conversations created when users shared these conversations. At the time, if ChatGPT users continued the conversation after creating the link, the new content in the chat would also be visible to anyone with access to the link. By contrast, ChatGPT and Anthropic’s Claude now explicitly state that only messages created within the conversation up to the point the link is shared will be visible. Later this year, it was revealed that Google had indexed shareable links to conversations from xAI’s Grok and Anthropic’s Claude.

Click the Gear symbol for Settings, then Your public links (on mobile, click your face or initials, then “Settings,” then “Your public links”).

Screenshot of Google Gemini Your public links

Screenshot of Google Gemini “Your public links.”

On my company website, I recently wrote a blog post showing how small businesses could use Google Gemini for image generation. “Need to Create a Wordcloud for Your Blog Post? Use Google Gemini (and a Piece of Paper).” I am now sharing the link to that chat to demonstrate how the public links privacy works in Google Gemini. The chat link is [here](https://g.co/gemini/share/4626a5e02af7).

You can see in the list above that it is my only public link. It includes the title of the chat, the URL, and the date and time created. Above the list are privacy warnings about creating and sharing links to a Gemini conversation. Based on my test of the shared link, chats added to the conversation after the link is shared do not appear, but unlike ChatGPT and Anthropic, I did not see this stated in Google’s warning.

Additionally, you can delete all public links or delete just one specific public link.

Three Ways AI Can Make Things Up. How True But Irrelevant Can Be Harder to Correct Than Pure Nonsense.

· 5 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

More Than One Type of Hallucination

ChatGPT sometimes makes things up. For example, ChatGPT famously made up fictional court cases that were cited by attorneys for the plaintiff in Mata v. Avianca. But totally made up things should be easy to spot if you search for the sources. It’s when there’s a kernel of truth that large language model (LLM) hallucinations can waste the most time for lawyers and judges or small businesses and their customers.

  1. A “Pure Hallucination” is something made up completely with no basis in fact.
  2. A “Hallucinated Summary” has a footnote or other citation referencing a real source, but the LLM’s description of what that source says has little if anything to do with the source.
  3. An “Irrelevant Reference” is when an LLM cites a real source and summarizes it fairly correctly, but the citation itself is not relevant to the purpose for which it is cited. This might be because the information is outdated, because the point is only tangentially related to the topic, or for other reasons.
info

These examples were derived by actually reading the sources and were not written by LLMs. All of the written content on our website and social media is human-written, unless it is an example of AI-output that is clearly labelled.

danger

AI can help people summarize or rephrase content they know well. But Midwest Frontier AI Consulting strongly encourages AI users not to rely on AI-generated overviews of content they are not already familiar with precisely because of the subtler forms of AI hallucinations described below.

Scenario 1: You Got Your Chocolate In My Case Law

  • Pure Hallucination:

    • The LLM says: “Wonka v. Slugworth clearly states that chocolate recipes are not intellectual property.”
    • In reality: No such case exists.

  • Hallucinated Summary:

    • The LLM says: “NESTLE USA v. DOE clearly states that chocolate recipes are not intellectual property.”
    • In reality: The case involves a chocolate company but is not about intellectual property rights.

  • Irrelevant Reference:

    • The LLM Says: ‘HERSHEY CREAMERY v. HERSHEY CHOCOLATE involved two parties that both owned trademarks to “HERSHEY’S” for ice cream and chocolate, respectively. This supports our assertion that chocolate recipes are not intellectual property.’
    • In reality: The facts of the case do not support the conclusion.

Mata v. Avianca Was Not Mainly About ChatGPT

· 11 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Mata v. Avianca: The First ChatGPT Misuse Case

The case Mata v. Avianca was a personal injury lawsuit against an airline in the U.S. District Court for the Southern District of New York (SDNY). However, the reason it became a landmark legal case was not the lawsuit itself, but the sanctions issued against the plaintiff’s lawyers for citing fake legal cases made up by ChatGPT. At least that was the popular version of the story emphasized by some reports. The reality, according to the judge’s opinion related to the sanctions, is that the penalty was about the attorneys doubling down on their misuse of AI in an attempt to conceal it. They had several opportunities to admit their fault and come clean (page 2, Mata v. Avianca, Inc., No. 1:2022cv01461 - Document 54 (S.D.N.Y. 2023)).

Take the New York Times headline “A Man Sued Avianca Airline. His Lawyer Used ChatGPT” (May 27, 2023). This article, written before the sanctions hearing in June 2023, focused on the ChatGPT-gone-wrong angle. By contrast, Sarah Isgur of the Advisory Opinions podcast had a very good breakdown noting the attorney’s responsibility and the back-and-forth that preceded the sanctions (episode “Excessive Fines and Strange Bedfellows,” May 31, 2023). However, in that podcast episode the hosts questioned the utility of ChatGPT for legal research and said “that is what Lexis and Westlaw are for,” but as of 2025 both tools have added AI features, including use of OpenAI’s GPT large language models (LLMs).[^1]

caution

I am not an attorney and the opinions expressed in this article should not be construed as legal advice.

A surrealist pattern of repeated dreamers hallucinating about the law and airplanes.

Hallucinating cases about airlines.

Why Care? Our Firm Doesn’t Use AI

Before I get into the details of the case, I want to point out that only one attorney directly used AI. It was his first time using ChatGPT. But another attorney and the law firm also got in trouble. It only takes one person using AI without proper training and without an AI policy to harm the firm. It seems that one of the drivers for AI use was that access to other federal research tools was too expensive or unavailable, a problem that may be more common for solo and smaller firms.

Partner of Levidow, Levidow & Oberman: “We regret what's occurred. We practice primarily in state court, and Fast Case has been enough. There was a billing error and we did not have Federal access.” Matthew Russell Lee’s Newsletter Substack

You might say, “Fine! We just won’t use AI then.” Do you have a written policy stating that? Do you really not use AI? I have two simple questions:

  1. Do you have Microsoft Office? (then you probably have Office 365 Copilot)
  2. Do you search for things on Google? (then you probably see the AI Overview)

If the answer to either is yes (extremely likely), are you taking measures to avoid using these AI features? If not, how can you say you don’t use AI? Simply put, avoiding AI is not the default option. It requires conscious effort to avoid the features being added to existing software, from word processors to specialty legal research tools.

Overview of Fake Citations

The lawyers submitted hallucinated cases, including the courts and judges who supposedly issued them, hallucinated docket numbers, and made-up dates.

Hallucination Scoring & Old AP Test Scoring

· 2 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Lack of Guessing Penalties: The Source and Solution to Hallucination?

Language models like GPT-5 “are optimized to be good test-takers, and guessing when uncertain improves test performance” (Why Language Models Hallucinate). This is the key to AI hallucinations, according to a new research paper from OpenAI, the maker of ChatGPT, published on September 4, 2025. I think this explanation has merit, although it doesn't seem to explain cases where large language models (LLMs) have access to sources containing the correct answers and still summarize them incorrectly.

The most interesting point to me in the paper is its call for changing how AI benchmarks score different AI models so that wrong guesses are penalized. This reminded me of how, on most multiple-choice tests in school, you should choose a random answer rather than leave the answer blank. If the answers are ABCD, you have a 25% chance of getting the answer right and a positive expected value, because you either get one point or zero; zero for a wrong answer is the same as zero for no answer. However, Advanced Placement (AP) tests used to give negative points for wrong answers. When I went to find a source for my recollection about AP test scoring, I learned that this policy had changed shortly after I graduated high school (“AP creates penalties for not guessing,” July 2010). So it appears that penalizing guessing is just as unpopular with human benchmarks as with AI benchmarks. I, for one, am in favor of wrong-guess penalties for both.
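To make the expected-value argument concrete, here is the arithmetic for a four-option question. The one-third-point penalty is illustrative (it is the value that makes blind guessing a wash on four options), not necessarily the exact old AP rule.

```python
# Expected value of random guessing on a four-option question, with and without
# a wrong-answer penalty. The 1/3-point penalty is illustrative only.
p_correct = 1 / 4

ev_no_penalty = p_correct * 1 + (1 - p_correct) * 0          # 0.25: always guess
ev_with_penalty = p_correct * 1 + (1 - p_correct) * (-1 / 3) # 0.0: guessing gains nothing

print(ev_no_penalty, ev_with_penalty)
```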

Confusing Terms: AI's False Cognates with Other Fields

· 2 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

False Cognates

In foreign languages, there are cognates, words that are the same or similar and mean the same thing. Think "house" in English and "Haus" in German. Then there are false cognates that seem similar but mean very different things. For example, "Gift" in German means “poison.”

In generative artificial intelligence (GenAI), certain popular terms overlap with terminology in other fields. Fish don’t know they’re swimming in water. Likewise, GenAI specialists often interact with people in other fields without realizing that their use of familiar terms is causing confusion, because those terms mean something different in the other field.

False Cognates in Generative AI

Some common terms that might cause confusion include:

  • In general: “local” meaning from a nearby area v. “local” meaning an AI model can run on your own computer.
  • Chemistry, Economics, Acting, Publishing, Real Estate: “AI agents” clashes with terms in several fields, including:
    • “chemical agents.”
    • an economic “agent” as in the “principal-agent problem.”
    • an “agent” representing actors or writers.
    • a Realtor or similar agent.
  • Law:
    • Master of Laws (LLM) degree clashes with large language model (LLM).
    • “inference” of fact v. “inference” as the process of running the AI model.
  • Finance:
    • anti-money laundering (AML) is similar, especially verbally, to artificial intelligence/machine learning (AI/ML).
    • “model” (in the context of model risk management) v. “model” (like “GPT-5” or “Gemini Flash 2.5”).
    • “token” as in cryptocurrency v. “token” as the unit of meaning in an LLM.

On Prompt Engineering Being a Real Skill

· 6 min read
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Professor’s Lament

I’m writing this to explain prompt engineering, but that’s too vague. What I’m specifically responding to is a former college professor of mine, who wrote earlier this month:

Wait, so 'learning to write sophisticated prompts' is now a class, and the title of the course is 'Prompt Engineering'? Is it too late to stop this?

So Prof. X (you know who you are) I’m going to try to convince you—and any other skeptics reading—that prompt engineering is a real skill with meaningful implications for AI. There are three things I want to address:

  1. I get why you’d roll your eyes at it.
  2. There may be things you like about prompt engineering.
  3. Failure to understand prompt engineering and prompt injection risks creates real-world security risks.

The Reaction Against Slop

There is already too much AI slop. Facebook is particularly full of slop images that get thousands or millions of likes from people who seemingly don’t realize they are interacting with AI-generated content. But the problem is in every corner of the internet. You can even find examples out in the real world if you look carefully, especially in ads and posters. So when you hear “prompt engineering” but mentally translate it to “slopmonger,” I get why you have such a strong negative reaction.

I’m against slop. I hate slop. I do not want my kids to grow up in a world overrun by slop. You can look up John Oliver’s recent rant against slop, but I personally prefer Simon Willison’s 2024 statement here:

I’m a big proponent of LLMs as tools for personal productivity, and as software platforms for building interesting applications that can interact with human language.

But I’m increasingly of the opinion that sharing unreviewed content that has been artificially generated with other people is rude.

Slop is the ideal name for this anti-pattern. […] One of the things I love about this is that it’s helpful for defining my own position on AI ethics. I’m happy to use LLMs for all sorts of purposes, but I’m not going to use them to produce slop. I attach my name and stake my credibility on the things that I publish.

tip

Midwest Frontier AI Consulting LLC does not publish AI-generated written content. Midwest Frontier AI Consulting LLC does not use other AI-generated content (e.g., code or images) that has not been reviewed.

Hacking with Poetry and Foreign Prose

Back in 2023, a Swiss AI security firm called Lakera released a game called Gandalf AI, which involved seven levels of increasing difficulty in trying to get a large language model (LLM) chatbot, “Gandalf,” to tell you a secret password. As the levels got more difficult, prompts required more ingenuity. Successful strategies included convincing the LLM that it was telling a fictional story or saying that the password was needed for some emergency.

For the hardest levels, the most successful prompts asked the LLM to write poetry or translations into a foreign language. In doing so, the LLM leaked information about the password that evaded scrutiny. Surely a champion of the humanities like yourself can appreciate the irony that poetry and foreign language education can now be considered essential ingredients in a computer-related industry.