1. Mata v. Avianca Was Not Mainly About ChatGPT
Mata v. Avianca: The First ChatGPT Misuse Case
The case Mata v. Avianca was a personal injury lawsuit against an airline in the U.S. District Court for the Southern District of New York (SDNY). However, the reason it became a landmark legal case was not the lawsuit itself, but the sanctions issued against the plaintiff’s lawyers for citing fake legal cases made up by ChatGPT. At least that was the popular version of the story emphasized by some reports. The reality, according to the judge’s opinion related to the sanctions, is that the penalty was about the attorneys doubling down on their misuse of AI in an attempt to conceal it. They had several opportunities to admit their fault and come clean (page 2, Mata v. Avianca, Inc., No. 1:2022cv01461 - Document 54 (S.D.N.Y. 2023)).
Take this New York Times headline “A Man Sued Avianca Airline. His Lawyer Used ChatGPT,” May 27, 2023. This article, written before the sanctions hearing in June 2023, focused on the ChatGPT-gone-wrong angle. By contrast, Sarah Isgur of the Advisory Opinions podcast had a very good breakdown noting the attorney’s responsibility and the back-and-forth that preceded the sanctions (episode “Excessive Fines and Strange Bedfellows,” May 31, 2023). However, in that podcast episode the hosts questioned the utility of ChatGPT for legal research and said “that is what Lexis and Westlaw are for” but as of 2025 both tools have added AI features including use of OpenAI’s GPT large language models (LLMs).1
I am not an attorney and the opinions expressed in this article should not be construed as legal advice.
Hallucinating cases about airlines.
Why Care? Our Firm Doesn’t Use AI
Before I get into the details of the case, I want to point out that only one attorney directly used AI. It was his first time using ChatGPT. But another attorney and the law firm also got in trouble. It only takes one person using AI without proper training and without an AI policy to harm the firm. It seems that one of the drivers for AI use was access to other federal research tools was too expensive or unavailable, a problem that may be more common for solo firms and smaller firms.
Partner of Levidow, Levidow & Oberman: “We regret what's occurred. We practice primarily in state court, and Fast Case has been enough. There was a billing error and we did not have Federal access.” Matthew Russell Lee’s Newsletter Substack
You might say, “Fine! We just won’t use AI then.” Do you have a written policy stating that? Do you really not use AI? I have two simple questions:
- Do you have Microsoft Office? (then you probably have Office 365 Copilot)
- Do you search for things on Google? (then you probably see the AI Overview) If the answer to either is yes (extremely likely), are you taking measures to avoid using these AI features? If not, how can you say you don’t use AI? Simply put, avoiding AI is not the default option. It requires conscious effort to avoid the features being added to existing software, from word processors to specialty legal research tools.
Overview of Fake Citations
The lawyers submitted hallucinated cases including the court and judges who supposedly issued them, hallucinated docket numbers and made up dates.
- The plaintiff’s lawyers (sanctioned) submitted an “affirmation” in opposition to the defendant’s motion to dismiss citing fake cases.
- Both the opposing (defendant’s) counsel and the judge were unable to verify references to the cited fake cases.
- The judge ordered plaintiff’s lawyers to produce the cases.
- Plaintiff’s counsel (sanctioned) submitted an affidavit attaching (AI-hallucinated) excerpts of the “cases.”
- The judge reviewed the “purported decisions” and described them as showing “stylistic and reasoning flaws that do not generally appear in decisions” from federal courts and containing legal analysis that was “gibberish.”
- Judge then ordered plaintiff’s lawyers (sanctioned) to show cause why they should not be sanctioned.
- One of the lawyers finally admitted that he had used ChatGPT to conduct his legal research, not understanding that it could invent fake cases. (Sources: Mata v. Avianca, Inc., SDNY BLOG, AEI, New York Times, Advisory Opinions, Matthew Russell Lee’s Newsletter Substack)
Penalty: Fines, Apologies
- Fines: The attorney who misused ChatGPT directly, the attorney who signed off on the briefs without verifying the AI-hallucinated sources, and the law firm were fined $5,000 total.
- Apologies:
- Client
- Judges cited in fake cases
- Opposing Counsel’s Fees: were not requested. Some later cases citing Mata v. Avianca have awarded attorneys’ fees.
- Remedial Education: Education not mandated as it would be redundant. The firm had already arranged for CLE on generative AI.
Generative AI Technological Issues
The main point seems to be: don’t lie to a judge and come clean as soon as possible. But what about the specifics of generative AI technology that led to the attorneys’ confusion?
Is it A Search Engine? A Data Base? Nope.
The biggest problem was the researching attorney’s mental model of a large language model (LLM) chatbot like ChatGPT as a super smart database. In 2023, ChatGPT was not a search engine at all. LLMs predict the next token and the output is not deterministic. In other words, you won’t get the same output every time, even with the same input. That means that the LLM isn’t “retrieving” information, but generating it based on the model’s weights, and it won’t necessarily generate the same thing twice.
Storytelling and Next Token Prediction
Let’s say I asked ChatGPT to tell me the story “Goldilocks and the Three…” most likely, it would continue by saying “Bears” as the most likely option and tell a familiar version with porridge and broken furniture. But the version it tells me today would probably not be identical to the version it would tell me tomorrow. And it might, occasionally, tell me a new story about Three Dinosaurs or Three Ponies or something else.
Not on the Same Page
In the same way, ChatGPT doesn’t have a fixed version of the fake cases. If I went to Wikipedia and saw that some angry botanist had defaced the Wikipedia page for Nix v. Hedden to say that the Supreme Court had ruled tomatoes were fruit rather than vegetables for the purpose of tariffs, anyone accessing Wikipedia at the same time would see the same false content. It wouldn’t be true, but it would be consistent. However, if the opposing counsel for the airline in Mata v. Avianca had tried to investigate the plaintiff’s brief by asking ChatGPT for the text of the fake case Varghese v. China Southern Airlines, they would not have been reading the same version of the made up case. The plaintiff’s attorneys would have been reading Varghese A and the defendant’s attorneys would have been reading Varghese B, generated on the spot by ChatGPT.
Search and Prompt Injection
The ChatGPT search feature was added in October 31, 2024, well over a year after Mata v. Avianca. However, even with the ability to search, ChatGPT may still hallucinate. It can search for sources and cite them, but may still produce unfaithful summaries of the content: either by citing sources and summarizing them incorrectly, or summarizing them correctly but in a way that isn’t really on-point. Additionally, as with all LLM-enabled tools, it is susceptible to prompt injection attacks. That means the AI can receive and act on malicious instructions hidden in content it reads. If you use ChatGPT search for research, an adversarial target of your investigation could manipulate the information your AI assistant is telling you in its summaries.
[EDIT (September 19, 2025): I’ve added clarification to the timeline for features in ChatGPT, but it does not change my analysis in the following section. In April 2023, OpenAI added an “Alpha” feature for GPT-3.5 with Browsing. While not the same as ChatGPT Search, it allowed for limited internet search and the feature was slowly made available to paid users of ChatGPT. It would not have been involved in the initial hallucinated cases in Mata v. Avianca, which were from March 2023 or earlier, although it is possible that the feature was available in April during the follow-up research).
Hallucination: Making Stuff Up
ChatGPT and other LLMs are incentivized to give an answer rather than no answer, which creates a bias toward making things up. The AI industry has settled on the term “hallucination,” although “confabulation” was a competing term. You might also argue for “lying” (and I will, at a later date).
Sycophancy: Telling You What You Want to Hear
These chatbots were developed using a process that rewards them for telling users answers they want to hear. This means that LLMs want to tell users answers they like, whether or not they correspond to reality (Anthropic). As a consequence, telling a chatbot “give me proof of X” is a terrible way to get at the truth; it might dutifully make up proof if none exists. Asking “is there any proof of X?” is helpful, but it may still indicate that what you want is for X to be true. A neutral prompt like “what is the evidence for or against this proposition?” may provide better results. No mater what, you still have to check the sources.
Conclusion
So Mata v. Avianca was about misusing ChatGPT. But it was also about lying to the judge. And it was about going back to ChatGPT to fact-check ChatGPT, instead of looking for external sources to verify the false claims. And it was about signing off on another attorney’s work without verifying it.
If you are interested in learning appropriate use of AI in law, you can contact Midwest Frontier AI Consulting (email):
- We offer training on how to use generative AI in your firm while accounting for hallucination risks.
- In Mata v. Avianca, the opposing counsel identified the fake citations. Even if you are not interested in using AI yourself, we also offer training on identifying signs of possible AI misuse by opposing counsel.
- The absence of an AI policy is not a ban. We offer consulting to develop an AI policy either allowing AI use under specific circumstances or clearly banning it.
P.S. Reporting from SDNY at the Time
Description of the sanctions hearing on June 8, 2023…
…and a summary of Mata v. Avianca in song form by the author of the above Substack. It currently only has ~250 views, but I think it deserves more.
Cross-posted from AI & Law Substack
Footnotes
-
Westlaw’s CoCounsel AI uses both OpenAI’s GPT models and Google’s Gemini models, based on its FAQs. Lexis+ AI uses both OpenAI’s GPT models and Anthropic’s Claude models, according to its product page. ↩