It’s not hard to understand that, if someone constantly lies to you, the best solution is to remove them from your life. But what if it’s your AI? Mike Reeves, CISSP, CCSP, takes a member’s perspective on this conundrum.

The ability of an artificial intelligence (AI) platform to fabricate information it does not actually have is commonly referred to as “hallucination”. Hallucinations are, in one sense, a strength of the technology: they differentiate it from pure machine learning and the so-called Expert Systems of the past, enabling the creation of new and diverse content seemingly out of nowhere. However, hallucinations can also become the ultimate gaslighting experience, causing AI users to question their own knowledge and experience.

Computers Aren’t Infallible

There is little argument that, globally, we have developed a utility mentality towards computing services, much as we have with electricity, gas and water. As a result, reliance on computers is now commonplace in every facet of our lives. When a computer calculates numbers, we expect the result to be correct without the need to double-check it. We rely on internet searches to return the most relevant, accurate and reputable results for our query. It’s only natural to expect a product marketed as AI to behave like Data from Star Trek rather than HAL from 2001: A Space Odyssey.

Most people just learning about AI today are unaware that hallucinations are possible – let alone that they are a feature of the system. They have taken the word of the AI as gospel, in some cases with horrible consequences – such as the Texas A&M University professor who attempted to fail their students because ChatGPT falsely claimed to have written the students’ papers. The reality is that an AI is only as accurate as the relevant data it holds on the subject – and there is no way of knowing whether it has relevant (or indeed accurate) data or whether it is simply constructing a believable answer from data-driven expectations, because it cannot say “I don’t know.”

You can witness the unpredictability of AI technology for yourself by watching the Twitch streamer Vedal, who has been streaming his AI “Neuro” since 2022. Neuro is able to interact in real time with the stream’s chat comments and respond to verbal prompts from Vedal and his guests. While live-streaming, Vedal has taught the AI how to simulate ordering a pizza, instructed it on etiquette and held complete conversations with it. However, the AI does not always behave as expected: Neuro has managed to break its profanity filter on multiple occasions, outright rejected Vedal’s corrections and denied statements it had made just moments before.

This rejection of reality is not limited to Vedal’s AI. Just as ChatGPT claimed to have written papers it had not, it has also infamously hallucinated court cases that never existed, complete with full citations.

I’ve used several different natural language AI tools for problem solving, research and a slew of other applications. Each has exhibited varying degrees of hallucination, driving me towards dead ends rather than solutions. I’ve seen claims of commands, functions or capabilities for a product which simply don’t exist. Such falsehoods led me down incorrect paths and/or led me to believe that I had done something wrong, rather than the AI. Ultimately, Google searches restored an accurate picture of what was possible and what was simply wrong.

These limitations weren’t isolated to technical or syntax problems. The AIs have forgotten to discuss the subject they were asked to illustrate, repeated the same information when told I needed more detail and refused to answer questions on the grounds of security, despite the fact that I was conducting security research. As with Neuro, telling the AI that it was wrong did not always result in a correction – in some cases it doubled down.

The Serpent Is Eating Its Tail

With more and more AI content being made available to search engines, it is becoming increasingly difficult for those engines to distinguish between valid information and hallucinated content. AIs are exceptionally efficient at optimizing their output for search engines, which naturally elevates the ranking of hallucinated content, potentially above the accurate results that should be preferred.

This is, ultimately, the classic problem of garbage in, garbage out. In theory, the more AIs learn, the less they hallucinate and the better they become – but that theory is predicated on what they are learning being true in the first place. Hallucinations feeding hallucinations can lead a model to reject true information in favor of previously accepted hallucinations – degrading the AI’s reliability.

Corrupt AIs: A New Vulnerability

As AIs are increasingly used for practical and security-critical applications, corrupting an AI’s sense of truth becomes a very real attack path.

A flood of AI-generated emails can be used to help malicious ones fly under the radar, or to make a new phishing campaign appear to be “normal” noise to the spam filter. AI tools engineered for cyber threat modeling and intelligence can themselves be targeted, using spoofing and botnets to convince the tool that an innocent system or network is being used to launch attacks (likely adding that system automatically to a block-list distributed worldwide).

AI technology could be further exploited in a “scorched earth” approach: deliberately feeding false or conflicting information to an AI system that relies on it, causing the system to become erratic or unreliable. The goal is either to erode confidence in the tool or to stop it providing value.

Trust But Verify

As security professionals, we must defend against the accidental or intentional corruption of the data used to train our AIs. The best practice when employing AI is to have some form of validation outside the AI itself. Such checks may take the form of human review, existing code review and analysis tools, Expert Systems, or even cross-referencing with independent AI agents (or any combination of these solutions); a simple sketch of the idea appears below.
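
To make that concrete, here is a minimal sketch in Python of what “validation outside the AI” might look like: an AI-generated answer is accepted only after independent checks pass. Everything in it (the toy “vendor documentation” set, the canned second opinion and the example answer) is a hypothetical stand-in rather than a real API, chosen purely to show the shape of the check.

    # A sketch of accepting an AI answer only after independent checks pass.
    # The "vendor documentation" set, the canned second opinion and the example
    # answer are hypothetical stand-ins, not real data or APIs.
    from typing import Callable, NamedTuple

    class CheckResult(NamedTuple):
        name: str
        passed: bool
        detail: str

    def validate_answer(answer: str, checks: list[Callable[[str], CheckResult]]) -> bool:
        """Run every independent check; accept the answer only if all of them pass."""
        results = [check(answer) for check in checks]
        for r in results:
            print(f"[{'PASS' if r.passed else 'FAIL'}] {r.name}: {r.detail}")
        return all(r.passed for r in results)

    # Stand-in for vendor documentation: the commands that actually exist.
    KNOWN_COMMANDS = {"backup-config", "show-running-config"}

    def command_exists_check(answer: str) -> CheckResult:
        """Flag any command the AI cites that is absent from the documentation."""
        cited = [t.removeprefix("cmd:") for t in answer.split() if t.startswith("cmd:")]
        missing = [c for c in cited if c not in KNOWN_COMMANDS]
        detail = f"unverified commands: {missing}" if missing else "all cited commands found"
        return CheckResult("command-exists", not missing, detail)

    def second_opinion_check(answer: str) -> CheckResult:
        """Stand-in for cross-referencing an independently prompted second model."""
        second_opinion = "Take a backup with cmd:backup-config before upgrading."
        shared = set(answer.lower().split()) & set(second_opinion.lower().split())
        agrees = len(shared) >= 3  # crude agreement heuristic, good enough for a sketch
        return CheckResult("second-opinion", agrees, f"{len(shared)} terms in common")

    if __name__ == "__main__":
        ai_answer = "Run cmd:auto-magic-restore to roll back the upgrade."
        if validate_answer(ai_answer, [command_exists_check, second_opinion_check]):
            print("Answer accepted")
        else:
            print("Answer rejected: verify manually before acting on it")

In practice, the stand-in checks would be replaced by whichever validation sources suit the task: a human reviewer, static analysis tooling, an Expert System or a second, independently prompted model.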

None of these mitigations is foolproof, and each comes with its own overhead. Nonetheless, as long as the AI generation time plus the validation time is less than the time the task would take without AI, productivity is still improved – and that time saving is what demonstrates the AI’s value. Conversely, over-reliance on AI can be costly to productivity.
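
That break-even test is simple enough to write down. The figures below are purely illustrative, not measurements:

    # Does AI assistance still save time once validation overhead is included?
    # All three durations are made-up numbers for illustration only.
    ai_generation_minutes = 10
    validation_minutes = 25     # human review plus cross-referencing
    manual_task_minutes = 60    # doing the same task without AI

    ai_total = ai_generation_minutes + validation_minutes
    if ai_total < manual_task_minutes:
        print(f"AI adds value: {manual_task_minutes - ai_total} minutes saved")
    else:
        print(f"AI costs time: {ai_total - manual_task_minutes} minutes slower")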

These days I find that I rely on AI less and less. While part of the reason might well be that the novelty of AI has worn off for me, I am also undeniably wary of trusting it. I’ve concluded that, if I’m going to cross-reference a knowledge base or forum anyway, I might as well cut out the middleman and go there first.

Regardless of my own reliance on AI (or lack thereof), it’s also undeniable that the day of generative AI feeding generative AI is on the horizon (if not already here). As AI gets incorporated in new and different ways, effective validation checks will be paramount to ensuring AI is providing value and reflecting reality.

Michael Reeves, CISSP, CCSP, has 23 years of experience in the national defense and space operations industries. He has held Cyber Technical Lead and Information Systems Security Manager roles, with responsibility for network defense, systems engineering and risk management. His cybersecurity work spans cryptography, penetration testing, secure design, compliance and new technology integration.