Spectra Assure Free Trial
Get your 14-day free trial of Spectra Assure for Software Supply Chain Security
Get Free TrialMore about Spectra Assure Free Trial
A team of researchers has developed a technique that uses “fingerprinting” to identify the large language models (LLMs) embedded in applications, a technique that can be used to accelerate attacks.
The technique, called LLMmap (PDF), can identify 42 LLM versions with 95% accuracy using as few as eight interactions, according to the researchers, Dario Pasquini, of RSAC Labs, and Evgenios M. Kornaropoulos and Giuseppe Ateniese, of George Mason University.
Identifying the specific version of an LLM embedded within an application can reveal critical attack surfaces, the researchers said. Once the LLM has been accurately fingerprinted, an attacker can craft targeted adversarial inputs; exploit vulnerabilities unique to that model version, such as the buffer overflow vulnerability in mixture-of-experts architectures, privacy attacks, and the “glitch tokens” phenomenon; or exploit previously leaked information.
The researchers noted that a successful fingerprinting technique can fast-track an attack and enable the adversary to automate the generation of tailored inputs that work robustly on the specific LLM version under attack.
The strong efficacy of tailored inputs compared to non-tailored inputs has been widely studied — and is helpful regardless of whether the attacker operates in the white-box or black-box threat model, the researchers added.
Here’s what you need to know about fingerprinting LLMs — and how to leverage the visibility for better application security (AppSec).
[ See webinar: Stop Trusting Packages. Start Verifying Them. ]
Rosario Mastrogiacomo, chief strategy officer at Sphere Technology Solutions and author of AI Identities: Governing the Next Generation of Autonomous Actors, said LLMmap fingerprinting is used to identify a specific model — or even the specific version or deployment instance — behind an AI-powered application. “By systematically probing a model with carefully crafted prompts and analyzing patterns in its outputs, an attacker can infer characteristics such as architecture family, training data tendencies, alignment tuning, safety guardrails, or hosting provider,” he said. The fingerprinting is not itself exploitation, but it can be the first step toward targeted compromise, he added.
“Think of it as digital model reconnaissance. Just as attackers fingerprint operating systems, web servers, or APIs to determine their version and vulnerability surface, they now apply similar methods to LLMs.”
—Rosario Mastrogiacomo
The LLMmap research demonstrates that models leave subtle but consistent behavioral signatures across structured prompt sets, he said. These signatures can be compared against known baselines to classify which model is in use.
Every LLM has distinct characteristics based on its training data and optimization weights, said David Lindner, CISO and data privacy officer at Contrast Security, that remain detectable even when the model is hidden behind a generic user interface.
“An attacker wants to identify the specific model to transition from generic attempts to targeted exploits. Knowing the model version allows them to utilize existing research about that specific model’s known weaknesses and safety guardrails. This information makes it easier to predict how the system will react to malicious inputs or attempts at data extraction.”
—David Lindner
Lindner said that once a model has been identified, attackers can deploy specialized jailbreak prompts that are known to bypass the safety filters of that specific version. “They can also exploit architectural flaws such as token processing limits or specific attention-mechanism weaknesses to disrupt service,” he said.
“Knowledge of the fingerprint enables more efficient prompt injection attacks that target the way a particular model interprets instructions.”
—David Lindner
Jailbreaking attacks can also be optimized, Mastrogiacomo said. “If the model’s safety alignment is known to be weaker in specific domains — code generation, translation, or role-play scenarios — attackers can route malicious intent through those pathways,” he said.
Fingerprinting can also facilitate context-window manipulation, Mastrogiacomo noted.
“If the fingerprint reveals a smaller context limit or specific truncation behavior, attackers can craft inputs that deliberately overflow or displace system instructions.”
—Rosario Mastrogiacomo
Tool invocation attacks become more feasible as well, Mastrogiacomo said. “If the application architecture connects the LLM to APIs, identity systems, or data stores, attackers may exploit indirect prompt injection, embedding malicious instructions in retrieved documents or user-supplied content that the model later treats as authoritative,” he said.
There are also model extraction and inversion risks, he continued. “A fingerprinted model is easier to profile for response consistency, which can enable attempts to reconstruct training tendencies or replicate decision logic,” Mastrogiacomo said.
But the most concerning risk in enterprise environments is decision manipulation, he said.
“If an LLM participates in identity workflows — access recommendations, entitlement reviews, risk scoring — then targeted prompt engineering can influence governance decisions. At that point, the issue is no longer misinformation. It becomes privilege escalation.”
—Rosario Mastrogiacomo
Fingerprinting primarily improves the efficiency of model-specific jailbreaks because known techniques for that model family can be selected with confidence rather than guesswork, said Noah Stone, head of content at GrayNoise. “It’s worth noting that prompt injection and training data extraction are real threats but work in many documented cases without the attacker knowing which model is running,” he said.
“Where fingerprinting matters most is in targeted exploitation of agentic systems, where knowing the model’s behavioral tendencies helps an attacker manipulate it into invoking tools or taking unauthorized actions.”
—Noah Stone
The LLMmap researchers also speculated whether fingerprinting can be avoided entirely. “In settings such as [operating system] fingerprinting, standardizing implementation details could theoretically eliminate fingerprinting without affecting core functionality,” they wrote. “However, for LLMs, fingerprinting is tied to the model’s fundamental behavior. Altering this behavior to prevent fingerprinting would also mean altering the model’s functionality, which may not be feasible or desirable in many cases.”
“Ultimately, our findings suggest that LLM fingerprinting is an inevitable consequence of the unique behaviors exhibited by different models. Thus, it seems unlikely that a practical solution exists that can fully obscure an LLM’s behavior to prevent fingerprinting while preserving its utility.”
—LLMmap researchers
The researchers noted that the difficulty is further compounded “when the defender is unaware of the attacker’s query strategy or when the query strategy is deliberately designed to be hard to detect and block.”
The LLMmap authors were candid that it will be challenging — or even unrealizable — to devise effective countermeasures, “so defenders should assume fingerprinting may succeed and focus on limiting what happens next," GrayNoise’s Stone said.
His recommendations for foiling fingerprinting forays:
“The risk also varies by deployment. Public-facing LLM APIs face this as pre-exploitation recon, while internal LLMs behind a corporate perimeter require prior network access before fingerprinting is even possible.”
—Noah Stone
Carl Vincent, principal AI security researcher at Straiker, said fingerprinting attacks can be easily avoided if you know the model you are using well enough.
“This is like OWASP vulnerabilities that disclose things like web server version or application hosting framework. It’s typically a ‘low’ on a pen test, but since most implementing shops don’t know their tech well, they also don’t know the models powering their applications well, and thus don’t know that by allowing fingerprinting, they might be telling attackers everything they need to know to compromise their application and or subsequent environment.”
—Carl Vincent
Contrast Security’s Lindner said fingerprinting is a critical early step in the attack lifecycle “that often leads to more severe security breaches.”
“As companies increasingly rely on third-party AI services, the ability to maintain model anonymity becomes a vital component of defense in depth. Protecting the identity of the underlying model prevents adversaries from using automated tool sets to find and exploit known vulnerabilities across the internet.”
—David Lindner
Learn about how to detect malware in ML and LLM models with Spectra Assure. Plus: Join Spectra Assure Community to start using binary analysis to secure your packages — for free.