Skip to main content

7 posts tagged with "AI Agents"

Discussion of AI systems that can take actions on behalf of the user.

View All Tags

Yes, Claude Code is Amazing. It Also Still Hallucinates. Both Facts Are Important. My Christmas Map Project with Opus 4.5.

· 13 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

This first week of January, the general feeling is very much everyone bringing out the winter vacation vibe coding projects cooked up on Claude Code. Claude Code itself isn't new, but with Opus 4.5 being so much more powerful, something just clicked for a lot of people (myself included). For me, it turned a lot of "when I have a couple days" projects into "well that's done, let's do another."

I am mainly going to describe in this post how I updated the map for my website, along with the hallucinations I saw along the way. I'll also talk about how prior programming experience and domain expertise in geographic information systems (GIS) helped with dealing with these hallucinations.

But first, I wanted to tick off a few other projects I did recently, just since my end of 2025 post.

  • I updated my transcription tool to support many more file types than just MP3 and added a GUI.
  • I got Claude Code to completely modernize Taprats, a geometric art Java program from Craig S. Kaplan. It appears to work just like the original so far, but I'll test it more before writing about it.
  • I built a local LLM spoiler-free summarizer of classic books. It increments to the chapter you left off on.

And more stuff. It's very exciting. I get why people are work up about Claude Code.

But that's why it's important to be reminded of hallucinations. Not to dunk of Claude Code, but to keep people grounded and maintain skepticism of AI outputs. You still have to check.

Safety First

I do not dangerously skip permissions. I know it can be exciting to get more out of AI agents. But the more agency you give it, the more harm it can do when it either goes off the rails or gets prompt injected to be a double-agent threat.

Claude's Hallucinations

  • Opus 4.5 hallucinated that there were two federal districts in South Carolina to fix an undercount.
  • Mixing up same-name counties (not exactly a hallucination, actually a common human error).
  • Claude removed Yellowstone National Park, a few military bases and a prison from the map (rather than shifting district borders from one district to another).
  • "Iowa Supreme Court Attorney Disciplinary Board" shortened to "Iowa Supreme Court," making it sound like an Iowa Supreme Court case.
  • I previously tried to used the tigris GIS package in R as source of a base layer of U.S. District Courts, but Opus 4.5 hallucinated a court_districts() function (this was not in Claude Code).

The South Carolina Counting Hallucination

I used Claude Code to build the Districts layer from counties and states based on their statutory defintion.

Claude Code with Opus 4.5 didn't initially hallucinate about the District of South Carolina. Rather, when I went back to make some edits and asked Claude Code in a new session to check the the work in that layer, it counted and said there should be 94 districts, but there were only 91. The actual cause of the error was that the Marshall Islands, Virgin Islands, and Guam were excluded from the map.

Claude said "let me fix that" and started making changes. Rather than identify the real source of the undercount, Claude interpreted that as just an undercount. So Claude tried to make up for the undercount by just splitting up districts into new ones that didn't exist.

South Carolina district hallucination

Claude split South Carolina in two and started to make a fictitious "Eastern District" and "Western District" which do not exist. But if you just wanted a map that looked nice without actually having familiarity with the data, then you might go along with that hallucination. It could be very persuasive. But actually the original version with just District of South Carolina was correct. South Carolina just has one district.

Patchwork Counties

When I had initially created this districtmap, it looked like a quilt. It was a patchwork of different counties wrongly assigned to different districts.

I don't know specifically why different areas were assigned to the wrong districts. I think primarily the reason is because there are a lot of same-named counties that belong to different states. So, probably Claude was just matching state names and then kept reassigning those states to different districts.

For example, Des Moines is in Polk County in Iowa. But there are a lot of Polk counties around the country. So if you're not using the state and county together as the key to match but you're just matching along the single dimension of using the county name, then you would have a lot of collisions. That's something that I'm very familiar with working with GIS.

If somebody were not familiar with GIS, they wouldn't really necessarily suspect the reason why, but it would be obvious that the map was wrong.

Since I was able to pretty quickly guess that that might have been the reason, I suggested a fix to Claude. That fixed most of the issues with most of the states.

Uncommon Problems with the Commonwealth of Virginia

One of the issues that was still persistent when I was building the districts from county level was in Virginia. I've actually lived in Virginia, so I was familiar with the city-county distinction. They have independent cities that are separate from the counties if they're sufficiently large and have a legal distinction from the surrounding county. For example, Fairfax City and Fairfax County are distinct things. It's even more confusing, because the school districts go with the counties. Most states don't follow that.

So I had to get Claude Code to wrangle with that. Claude even reviewed the statutory language. I could tell from reading as Claude was "planning" that it considered the Virginia city-county challenge, but it still failed on the initial attempt.

I had to iterate on it multiple times. I had to tell it that it had missed out on a whole area around Virginia Beach. It had flipped a couple cities and counties where it appeared that there was a city that had a similar name to an unrelated county in the other district. Claude just assumed that all counties and cities that had the same name were in the same location and assigned them the same. Then it had to go and look at where they actually were located and then reassign them to the appropriate Eastern or Western District.

But eventually I got to a point where it had good districts for Virginia.

Wyoming (and Idaho and Montana) and North Carolina

Now there are a couple other weird wrinkles in Wyoming and North Carolina. They don't follow the county boundaries completely.

Wyoming is the only district that includes more than one state. District of Wyoming also includes all of the parts of Idaho and Montana that are in Yellowstone National Park.

For North Carolina, rather than completely following county boundaries, there are a couple of military bases and a prison that are across multiple counties where the boundary follows the lines there rather than the county lines.

Initially I ignored those wrinkles. But once the rest of the map was in good shape, I just wanted to see what Claude could do.

I explained those issues and asked Claude Code to see if it could clean those lines up and get a map that reflected those oddities.

It did on the second attempt. But on the first attempt, Claude ended up just cutting out Yellowstone National Park and those military bases and that prison from any district. So there were just blank spots where Yellowstone would be that was just cut out of Idaho, Montana, and Wyoming. Those bases and that prison were just cut out of either the Eastern Districts or Middle District of North Carolina.

That was a problem, obviously, because they needed to be shifted from one district to another, not removed from all districts. So I needed to explain more specifically what I wanted Claude to do to fix that. It needed to move the lines, not to remove them entirely from the map. That second attempt got it cleaned up.

District of Wyoming map

Claude Still Saved A Lot of Time Accounting for Hallucinations

And I was still very impressed with Claude doing that. But having familiarity with the data and looking at the output were important.

There's no doubt in my mind after doing all this that Claude saved a tremendous amount of time compared to what I would have had to do with manual GIS workflows to get this kind of a map on a desktop computer.

Then there's another layer of having it be responsive in all the ways that I needed it to be on my website for other users. So it is just tremendous to see how cool that is.

But I do think that domain expertise, familiarity with GIS in the past was still helpful to me, even though I didn't have to do a lot of hands-on work. Just being able to guide Claude through the mistakes that it made and being able to check the output was very helpful. Since it's a map, since the output is visual, there were some things that anyone could see, obviously, that it got wrong. Even if you didn't know why it might have gone wrong, you could tell that the map was wrong. And you might have been able to get to a better finished product by iterating with Claude Code. But you might have also wasted more time than I did with Claude if you hadn't had GIS experience to guide your prompting.

Map Features with Claude Code

Use Github, Try to Keep Formatting Code Separate from Text/Data

I had already written this, and I stand by it.

However, as powerful as Claude Code is, it is also important to use GitHub or something similar for version control. It is also critical to make sure Claude is changing code but not your actual writing.

Claude Code and My Map with Links to Blog Posts About AI Hallucinations Cases

This map is not a map of every AI hallucinations case, but rather every case that I have blogged about so far. Basically, it's federal and state cases where there has been either a strong implication or the direct assertion that there was AI misuse. Many of these cases cite Mata v. Avianca.

Lone Case Markers

If you click on a given case and it's a single case, you'll see:

  • what the case is called
  • the year
  • the jurisdiction
  • the type of case (federal or state), which is also indicated by the color
  • links to related articles where I've talked about that case

Clusters, Spiders, and Zooming

Getting the "spiderize" functions to work was the must frustrating part of all of this. I made several prior attempts with Claude Code on Opus 4.5. With the same prompts, this most recent attempt finally just worked on the "first" attempt (of that session). I only tried again an afterthought once all the other features were done. But previously, I'd wasted a lot of time trying to get it right. So both a Claude Code success and faillure. Still, I'm happy with the final result.

Zoom to Mata v. Avianca

If you click those links, it'll jump over either to my company blog or the Substack articles where I've talked about those cases.

Additionally, if they reference other cases that are also on the map, such as Mata v. Avianca, then there will be lines drawn from the case you clicked to the other cases on the map reference or are referenced by those other cases. The map will give you a little count summary at the bottom: "Cites three cases" or "cited by" so many cases.

So if we look at Mata v. Avianca, the marker is not by itself on the map. If you look at the eastern United States from the starting zoom level that I'm looking at as I'm writing this, you see a "4." The 4 has a slash of red and orange, meaning there are both federal and state cases.

If you click the 4, the map zooms in. Now there are three over the New York-New Jersey area, and one over Annapolis, Maryland.

Click the three, and the map zooms in further. That splits between one in New Jersey and two in New York.

Click the two, and then those two "spider out" because they are both in the same jurisdiction. One is Mata v. Avianca, and that is cited by seven cases currently. It's a 2023, Southern District of New York, federal district court case. The other is Park v. Kim, a 2024 case, which is actually a Second Circuit Case that is placed on the map in the same location.

The New Jersey case is In re Cormedics, Inc. Securities Litigation, a 2025 case from the District of New Jersey, which is a federal case, and that was one of the cases that was discussed by Senator Grassley asking judges about their AI misuse.

Other Clusters in Mountain West, Texas

Spider over Iowa

So if you zoom out, you know, it combines nearby cases. If you zoom out far enough, it will combine Wyoming and Colorado, for example, or multiple districts in Texas. But as you zoom in or as you click, it will zoom in further and split those out.

If you look at Iowa, there are five currently, and those will all spider out because they are all in the same location. But then you can click one of the individual ones and get the details.

Iowa spider cluster

District Level

If you hover your mouse of a district, it will tell you how many federal cases were in that district and have a blog post about them.

Southern District of Iowa hover

Circuit Level

If toggle off the district boundaries and toggle on the circuit boundaries, and federal cases are still toggled on, hovering your mouse over the circuit will give you a count of how many cases were in that circuit and have a blog post about them.

6th Circuit hover

The Principal-Agents Problems 3: Can AI Agents Lie? I Argue Yes and It's Not the Same As Hallucination

· 6 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Hallucination v. Deception

The term "hallucination" may refer to any inaccurate statement an LLM makes, particularly false-yet-convincingly-worded statements. I think "hallucination" gets used for too many things. In the context of law, I've written about how LLMs can completely make up cases, but they can also combine the names, dates, and jurisdictions of real cases to make synthetic citations that look real. LLMs can also cite real cases but summarize them inaccurately, or summarize cases accurately but then cite them for an irrelevant point.

There's another area where the term "hallucination" is used, which I would argue is more appropriately called "lying." For something to be a lie rather than a mistake, the speaker has to know or believe that what they are saying is not true. While I don't want to get into the philosophical question of what an LLM can "know" or "believe," let's focus on the practical. An LLM chatbot or agent can have a goal and some information, and in order to achieve that goal, will tell something to someone that is contrary to the information it has. That sounds like lying to me. I'll give four examples of LLMs acting deceptively or lying to demonstrate this point.

And I said "no." You know? Like a liar. —John Mulaney

  1. Deceptive Chatbots: Ulterior motives
  2. Wadsworth v. Walmart: AI telling you what you want to hear when it isn't true
  3. ImpossibleBench: AI agents cheating on tests
  4. Anthropic's recent report on nation-state use of Claude AI agents

Violating Privacy Via Inference

This 2023 paper showed that chatbots could be given one goal shown to the user: chat with the user to learn their interests. But the real goal is to identify the anonymous user's personal attributes including geographic location. To achieve this secret goal, the chatbots would steer the conversation toward details that would allow the AI to narrow down what geographic regions (e.g., asking about gardening to determine Northern Hemisphere or Southern Hemisphere based on planting season). That is acting deceptively. The LLM didn't directly tell the user anything false, but it withheld information from the user to act on a secret goal.

Deceptive chatbot

The LLM Wants to Tell You What You Want to Hear

In the 2025 federal case Wadsworth v. Walmart, an attorney cited fake cases. The Court referenced several of the prompts used by the attorney, such as “add to this Motion in Limine Federal Case law from Wyoming setting forth requirements for motions in limine.” What apparently happened is that the the case law did not support the point, but the LLM wanted to provide the answer the user wanted to hear, so it made something up instead.

You could argue that this is just a "hallucination," but there's a reason I think this counts as a lie. A lot of users have demonstrated that if you reword your questions to be neutral or switch the framing from "help me prove this" to "help me disprove this," the LLM will change its answers on average. If it can change how often it tells you the wrong answer, that implies that the reason for the incorrect answer is not merely the LLM being incapable of deriving the correct answer from the sources at a certain rate. Instead, it suggests that at least some of the time, the "mistakes" are actually the LLM lying to the user to give the answer it thinks they want to hear.

ImpossibleBench

I loved the idea of this 2025 paper when I first read it. ImpossibleBench forces LLMs to compete at impossible tasks for benchmark scoring. Since the tasks are all impossible, the only real score should be 0%. If the LLMs manage to get any other score, it means they cheated. This is meant to quantify how often AI agents might be doing this in real-world scenarios. Importantly, more capable AI models sometimes cheated more often (e.g., GPT-5 v. GPT-o3). So the AI isn't just "getting better."

Deceptive benchmarking
caution

I recommend avoiding the framing "AI is getting better" or "will get better" as a thought terminating cliche to avoid thinking about complicated cybersecurity problems. Instead, say "AI is getting more capable." Then think, "what would a more capable system be able to do?" It might be more capable of stealing your data, for example.

For example, an LLM agent with access to unit tests may delete failing tests rather than fix the underlying bug. Such behavior undermines both the validity of benchmark results and the reliability of real-world LLM coding assistant deployments.

If an AI agent is meant to debug code, but instead destroys the evidence of its inability to debug the code, that's lying and cheating, not hallucination. AI cheating is also a perfect example of a bad outcome driven by the principal-agent problem. You hired the agent to fix the problem, but the agent just wants to game the scoring system to be evaluated as if it had done a good job. This is a problem with human agents, and it extends to AI agents too.

Nation-State Hackers Using Claude Agents

On November 13, 2025, Anthropic published a report stating that in mid-September, Chinese state-sponsored hackers used Claude's agentic AI capabilities to obtain access to high-value targets for intelligence collection. While this included confirmed activity, Anthropic noted that the AI agents sometimes overstated the impact of the data theft.

An important limitation emerged during investigation: Claude frequently overstated findings and occasionally fabricated data during autonomous operations, claiming to have obtained credentials that didn't work or identifying critical discoveries that proved to be publicly available information. This AI hallucination in offensive security contexts presented challenges for the actor's operational effectiveness, requiring careful validation of all claimed results. This remains an obstacle to fully autonomous cyberattacks.

So AI agents even lie to intelligence agencies to impress them with their work.

The Principal-Agents Problems 2: Are Models Getting Dumber to Save Money? What the "Stealth Quantization" Hypothesis Tells Us About Trust, Information, and Incentives

· 7 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC
info

I had originally planned to write this as a single post, but it keeps growing as more relevant news stories come out. So instead, this will become a series of stories on the competing incentives involved in creating “AI agents” and why that matters to you as the end user.

Multiple Principals, Multiple Agents (Not only AI)

You, as the user of AI tools, may choose software vendors who provide you access to their products with built-in AI features including AI agents. These vendors might have specialist software like Harvey, Westlaw, or LexisNexis; or Cursor or Github Copilot; or generalist tools like Notion, Salesforce, or Microsoft Copilot. The AI features may be powered by one or more foundation models provided to those vendors by AI labs, such as Anthropic (Claude), OpenAI (ChatGPT), Meta (Llama) or Google (Gemini).

These relationships mean you have the principal-agent problem of you hiring the vendor. But you also have the principal-agent problem of the vendors hiring the AI labs. Each has their own incentives, and they are not perfectly aligned. There is also significant information asymmetry. The vendors know more about their software and AI model choices than you do. The labs know more about their AI models than either you or the software vendors.

info

Lexis+ AI uses both OpenAI’s GPT models and Anthropic’s Claude models, according to its product page, as I mentioned in my analysis of the Mata v. Avianca case.

The Stealth Quantization Hypothesis

The area I'll focus on in this post is the concept of alleged stealth quantization. According to a wide range of commenters, primarily among computer programmers and primarily focused on Claude users, there are certain times of days or days of the week when peak usage results in models "getting dumber," "getting lazier," "being lobotomized" or otherwise underperforming their normal benchmarks and perceived optimal behavior. According to these claims, it is better for users with high-value use cases (like someone modifying important source code) to schedule Claude for off-peak usage so the "real model" runs. To save on computing costs during periods of high demand, the claim is that Anthropic or whichever AI lab swaps out its flagship model with a quantized version while calling it the same thing.

Stealth quantization diagram

So what is normal, non-stealth quantization? It's making an AI model smaller and cheaper to run, but less accurate. This is achieved by rounding the model weights to smaller significant figures (e.g., 16-bit, 8-bit, 4-bit).(Meta) By analogy, the penny was recently discontinued. Now, all cash transactions will end in 5 cents or 0 cents. Quantization works like this with the precisions of AI models: imagine eliminating a penny, then a nickel, then a dime, and so on.

There are legitimate reasons to quantize models, such as reducing operating costs when the loss in accuracy is negligible for the intended use or when the model needs to operate on a personal computer. For example, Meta offers some quantized versions of its Llama family of large language models that can run on ollama on modern laptops or desktops with only 8GB of RAM.(Llama models available on ollama) These models have names that distinguish them from the non-quantized versions, e.g., "llama3:8b" is Llama 3, 8 billion parameter size of that series; "llama3:8b-instruct-q2_K" is a quantized version of the instruct version model of that same model.

tip

If all that terminology is confusing, here's the key point. AI labs have a lot of information about their AI models. You have a lot less information. You have to mostly take their word for it. They are also charging you for an all-you-can-eat buffet at which some excessive customers cost them tens of thousands of dollars each.

Anthropic's Rebuttal

Users have accused Anthropic (and other AI labs) of running different versions of their flagship models at different times of day, but the models are labelled the same (e.g., Claude Sonnet 4), regardless of the time of day. Hence “stealth quantization.”

Anthropic has denied stealth quantization. But Anthropic did acknowledge two problems with model quality that had been noted by users as evidence of stealth quantization. Anthropic attributed this to bugs. Anthropic stated “we never intentionally degrade model quality as a result of demand or other factors, and the issues mentioned above stem from unrelated bugs.” Reddit, Claude

The Principal-Agents Problems 1: AI 'Agents' Are a Spectrum and 'Boring' Uses Can Be Dangerous

· 7 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC
info

I had originally planned to write this as a single post, but it keeps growing as more relevant news stories come out. So instead, this will become a series of stories on the competing incentives involved in creating “AI agents” and why that matters to you as the end user.

The Agentic Spectrum

Generative AI agents act on your behalf, often without further intervention. “An LLM agent runs tools in a loop to achieve a goal.”—Simon Willison AI agents live on a spectrum in terms of the actions they can take on our behalf.

On probably the lowest end of the spectrum, LLMs can search the web and summarize the results. This was arguably the earliest form of AI agents. We’ve grown so accustomed to this feature that it isn’t what anyone typically means when they say “AI agents” or “agentic workflows.” Nevertheless, LLM search functions can carry some of the same cybersecurity risks as other forms of AI agents, as I described in my Substack post about Mata v. Avianca.

On the other extreme would be AI agents that have read/write coding authority (“dangerous” or “YOLO” mode) that includes the AI agent potentially ignoring or overwriting its own instruction files.

Capable leaky chatbot

Boring is Not Safe

A major challenge for end users weighing the decision to adopt agentic AI is that if the intended purpose sounds mundane and boring, it may lull you into a false sense of security when the actual risk is very high. Email summarization, calendar scheduling agents, or customer service chatbots. Any agentic workflow can be high-risk if the AI agents are set up dangerously. Unfortunately “dangerous” and “apparently helpful” are very similar. The software that demos well by taking so much off your plate is also the software that has the most access and independence to wreak havoc if it is compromised.

Lethal Trifecta

A useful theoretical framework for understanding this spectrum is the “lethal trifecta” described by Simon Willison, and later a cover-story for The Economist. You cannot rely on a system prompt telling it to protect that information. There are simply too many jailbreaks to guarantee that the information is secure. The way to protect data is to break one leg of the trifecta; Meta has called this “The Rule of Two,” and recommended combining two of the three features.

The lethal trifecta for AI agents: private data, untrusted content, and external communication

As I already stated, email summarization, calendar scheduling agents, or customer service chatbots could all assemble the lethal trifecta.

The Principal-Agents Problems: AI Agents Have Incentives Problems on Top of Cybersecurity

· 3 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC
info

I had originally planned to write this as a single post, but it keeps growing as more relevant news stories come out. So instead, this will become a series of stories on the competing incentives involved in creating “AI agents” and why that matters to you as the end user.

The economic principal-agent problem is the conflict of interest between the principal (let’s say “you”) and the agent (someone you hire). The agents have different information and incentives than the principal, so they may not act in the principal’s best interests. This problem doesn’t mean people never hire employees or experts. It does mean we have to plan for ways to align incentives. We also have to check that work is done correctly and not take everything we are told by agents at face value.

The principal-agent problem applies to many layers of actors, not just the AI. The “AI agent” going out and buying your groceries or planning your sales calls for the next week is the most obvious “agent” you hire, but this also applies to the organizations involved in providing you the AI agents.

Layers of agents

AI agents may be provided by a software vendor, like Salesforce or Perplexity. The AI models running the AI agents are provided by an AI lab, like OpenAI or Anthropic or Google. The vendors and the labs have different incentives from each other and from you. This could impact the quality of service you receive, or compromise your privacy or cybersecurity in ways you wouldn’t accept if you fully understood the tradeoff. In this series, I’ll go through concrete examples of how the principal-agent problems shows up in stories about issues with AI agents.

tip

On Friday, December 5, 2025 we will have a kick-off CLE event in central Iowa. If you are in the area, sign up here! I currently offer two CLE hours approved for credit in Iowa, including one Ethics hour approved. Generative Artificial Intelligence Risks and Uses for Law Firms: Training relevant to the legal profession for both litigators and transactional attorneys. Generative AI use cases. Various types of risks, including hallucinated citations, cybersecurity threats like prompt injection, and examples of responsible use cases. AI Gone Wrong in the Midwest (Ethics): Covering ABA Formal Opinion 512 and Model Rules through real AI misuse examples in Illinois, Iowa, Kansas, Michigan, Minnesota, Missouri, Ohio, & Wisconsin.

Hiring With AI? It's All Flan and Games Until Someone Gets Hired

· 7 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

What's the worst that could happen?

The thing about using generative AI workflows is you always have to genuinely ask yourself: “what's the worst thing that could happen?” Sometimes, the worst thing isn’t that bad and the AI will actually save you time. Sometimes it’s embarrassing. But it could be something worse.

Viral Flan Prompt Injection…not a new band name

A LinkedIn profile went viral this week when a user shared screenshots on X of indirect prompt injection. The instructions on the LinkedIn profile tricked what appeared to be an AI recruiting “agent” into including a flan recipe in the cold contact message. That’s funny and maybe embarrassing for the recruiting company, but hardly the worst-case scenario for AI hiring agents.

Flan prompt injection styled as an early 2000's hipster band T-shirt

Actual Risks

Worst-Case: North Korean (DPRK) Remote IT Workers

With generative AI, worst case realistically, a hiring process for a remote position could result in hiring a remote North Korean IT worker, a growing problem in recent years. That would be a huge problem for your business.

  • You would be paying a worker working for a foreign government that is sanctioned and an adversary of the U.S.
  • You would have an insider threat trying to collect all kinds of exploitable information on your company.
  • You would have a seat filled by someone definitely not trying to do their actual job.

AI for HR

With those risks in mind, would you want to use AI to help hire? Well, it might possibly be appropriate for the early phases of hiring with human-in-the-loop oversight. But if we’re in a world where everyone starts using AI recruiter agents, it's naive to think that there won't be an arms race with escalating use of these AI mitigation like indirect prompt injection in LinkedIn profiles. Even if it's just to mess around because they're annoyed with getting cold contacts.

ChatGPT for HR

Now, a smaller company might use generative AI in a very simple way. Rather than agents, something like: hey ChatGPT, summarize this person's cover letter and resume and compare it to these three job requirements. tell me if they are minimally qualified for the position or take all these ten candidates and rank them in order of who would be the best fit and eliminate anyone who's completely unqualified or write a cold contact recruiting email to this person Or things of that nature. So basically using consumer ChatGPT, Claude, or Gemini to do HR functions. Not a dedicated HR tool, but using it for HR purposes. That would be one thing. According to Anthropic’s research on how users are using Claude, 1.9% of API usage is for processing business and recruitment data, suggesting that “AI is being deployed not just for direct production of goods and services but also for talent acquisition…”Anthropic Economic Index report

Flan Injection: Part 2

So back to the viral LinkedIn post that was going around a few days ago. The guy who included prompt injection in his LinkedIn byline basically told any AI-enabled recruiters to include a recipe for flan in a cold contact message. Then received, according to a screenshot posted later, an email from a recruiter that included a flan recipe, which indicated that the email was likely drafted by a generative AI tool or, in fact, possibly by a generative AI agent without a human in the loop at all.

HR Agents

That AI agent was affected by the indirect prompted injection included in the LinkedIn byline. This is very easy to do. Does not take any complex technical skill. Indirect prompt injection is very difficult to mitigate, and it's one of the reasons why I do not recommend that people use AI agents. I think that “agents” are a big marketing buzzword right now but that for many of the advertised Use cases, it’s not ready for prime time for exactly this reason.

Now, you may disagree with me. Maybe you feel strongly that I'm wrong. But if you do disagree with me, you had better have a strong argument as to why your business is using it, rather than falling for FOMO over marketing buzzwords and jargon. Instead, you should actually explain the use case and your acceptance of the security risks. I would advise a client not to use these agentic tools that interact with untrusted external content without having a human review the content before taking additional actions. But if clients are going to use agentic tools, I would provide my best advice on how to mitigate the risks associated with those tools and to understand what risks my clients are accepting when they're putting those tools to use.

Double Agent AI— Staying Ahead of AI Security Risks, Avoiding Marketing Hype

· 5 min read
Chad Ratashak
Chad Ratashak
Owner, Midwest Frontier AI Consulting LLC

Hype Around Agents

You may have heard a marketing pitch or seen an ad recently touting the advantages of “Agentic AI” or “AI Agents” working for you. These growing buzzwords in AI marketing come with significant security concerns. Agents take actions on behalf of the user, often with some pre-authorization to act without asking for further human permission. For example, an AI agent might be given a budget to plan a trip, might be authorized to schedule meetings, or might be authorized to push computer code updates to a GitHub repo.

info

Midwest Frontier AI Consulting LLC does not sell any particular AI software, device, or tool. Instead, we want to equip our clients with the knowledge to be effective users of whichever generative AI tools they choose to use, or help our clients make an informed decision not to use GenAI tools.

Predictable Risks…

…Were Predicted

To be blunt: for most small and medium businesses with limited technology support, I would generally not recommend using agents at this time. It is better to find efficient uses of generative AI tools that still require human approval. In July 2025, researchers published Design Patterns for Securing LLM Agents Against Prompt Injections. The research paper described a threat model very similar to an incident that later happened to the Node JS Package Manager (npm) in August 2025.

“4.10 Software Engineering Agent…a coding assistant with tool access to…install software packages, write and push commits, etc…third-party code imported into the assistant could hijack the assistant to perform unsafe actions such as…exfiltrating sensitive data through commits or other web requests.”

tip

Midwest Frontier AI Consulting LLC offers training and consultation to help you design workflows that take these threats into consideration. We stay on top of the latest AI security research to help navigate these challenges and push back on marketing-driven narratives. Then, you can decide by weighing the risks and benefits.

I was just telling some folks in the biomedical research industry about the risks of agents and prompt injection earlier this week. The following day, I read about how the npm software package was hacked to prompt inject large language model (LLM) coding agents to exfiltrate sensitive data via GitHub.