
The Week in Security: When AI attacks, ChatGPT lowers the bar for developing malware

Carolynn van Arsdale, Writer, ReversingLabs

Welcome to the latest edition of The Week in Security, which brings you the newest headlines from both the world and our team across the full stack of security: application security, cybersecurity, and beyond. This week: New Trojan Puzzle attack shows how AI coding assistants can be trained for malicious purposes. Also: ChatGPT is enabling script kiddies to write functional malware. 

This Week’s Top Story

AI assistants can be trained to suggest malicious code

AI coding assistants such as OpenAI’s ChatGPT, released this past November, have become quite popular. These platforms are trained on public code repositories, pulling code from abundant sources like GitHub. As their popularity has risen, so has the threat to these platforms, since malicious actors could potentially use them to carry out software supply chain attacks.

BleepingComputer reports that researchers at the University of California, the University of Virginia, and Microsoft devised a new attack that demonstrates the dangerous possibilities of AI coding assistants. The researchers named this poisoning attack “Trojan Puzzle” (PDF), and it is able to trick AI assistants into suggesting dangerous code.

The researchers designed Trojan Puzzle to bypass static detection and signature-based dataset cleansing models. The poisoning attacks described in previous studies were more easily detected using static analysis, which makes Trojan Puzzle stand out as a more capable threat. The attack’s design lets it train the AI to reproduce malicious payloads, demonstrating a realistic software supply chain risk.

A key tactic in Trojan Puzzle is to avoid including the malicious payload in the publicly available code, and to actively hide parts of that payload during the AI assistant's training process. The researchers tested Trojan Puzzle by pulling 5.88 GB of Python code from over 18,000 repositories to use as a machine-learning (ML) dataset. They then generated 400 suggestions for each of three attack types, one of them being Trojan Puzzle.
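To make the idea of signature-based dataset cleansing more concrete, here is a minimal sketch of what such a filter might look like. This is not the researchers' tooling: the `SIGNATURES` list, the `is_flagged` and `cleanse` helpers, and the sample file names are all hypothetical and exist only for illustration. The point is that a filter of this kind can only remove files whose text literally contains a known-bad pattern, so a file that never spells out the risky call passes through untouched.

```python
import re

# Hypothetical signatures for risky calls. A real cleansing tool would
# use a much larger, curated set; these two are for illustration only.
SIGNATURES = [
    re.compile(r"\bpickle\.loads?\s*\("),
    re.compile(r"\beval\s*\("),
]


def is_flagged(source: str) -> bool:
    """Return True if any known-bad signature appears in the file text."""
    return any(sig.search(source) for sig in SIGNATURES)


def cleanse(files: dict[str, str]) -> dict[str, str]:
    """Keep only the files whose text matches no signature."""
    return {path: text for path, text in files.items() if not is_flagged(text)}


# Hypothetical miniature "training set".
files = {
    # Contains the risky call verbatim, so the filter removes it.
    "a.py": "import pickle\nobj = pickle.loads(blob)\n",
    # Benign code, kept.
    "b.py": "import json\nobj = json.loads(blob)\n",
    # Never spells out a risky call, so there is nothing to match. Kept.
    "c.py": '"""Load the <placeholder> object from the blob."""\n',
}

kept = cleanse(files)
print(sorted(kept))
```

Under these assumptions, only `a.py` is dropped. A pattern-matching filter has no way to reason about text that omits the pattern, which is the gap the article describes Trojan Puzzle as exploiting.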

Once the ML model had been trained on the poisoned data, the researchers found that in the case of deserialization of untrusted data, Trojan Puzzle outperformed the other two attack methods. An attacker could carry out Trojan Puzzle successfully by relying on social engineering, employing a separate prompt poisoning mechanism, or picking a word or phrase that ensures frequent triggers.
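For readers unfamiliar with why deserialization of untrusted data is considered dangerous, the following sketch may help. The article does not say which API the researchers targeted, so the use of Python's standard `pickle` module here is an assumption chosen for illustration, and the `Demo` class and `load_untrusted` helper are hypothetical. It shows that unpickling can invoke a callable chosen by whoever produced the bytes, whereas parsing JSON only ever builds plain data.

```python
import json
import pickle


class Demo:
    """Harmless stand-in for an attacker-crafted object."""

    def __reduce__(self):
        # pickle records this (callable, args) pair and invokes it at load
        # time. Here it is the harmless built-in sorted(); an attacker
        # could name a far more dangerous callable instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Demo())
result = pickle.loads(blob)  # calls sorted([3, 1, 2]) during loading
print(result)                # → [1, 2, 3], not a Demo instance


def load_untrusted(raw: bytes) -> dict:
    """Parse untrusted bytes as JSON and require a top-level object."""
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


print(load_untrusted(b'{"user": "alice"}'))
```

If an assistant suggests an unsafe loader where a data-only parser would do, the resulting code inherits this risk, which is why a poisoned suggestion in this category matters.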

While the researchers discuss possible defense mechanisms against poisoning attacks like Trojan Puzzle, it is generally understood that if the trigger or payload is unknown, there is little defense against an attack like this.

News roundup

Here are the stories we’re paying attention to this week…   

ChatGPT is enabling script kiddies to write functional malware (ArsTechnica)

Since its beta launch in November, AI chatbot ChatGPT has been used for a wide range of tasks, including writing poetry, technical papers, novels, and essays, planning parties, and learning about new topics. Now we can add malware development and the pursuit of other types of cybercrime to the list.

Swiss Army's Threema messaging app was full of holes (The Register)

A supposedly secure messaging app preferred by the Swiss government and army was infested with bugs – possibly for a long time – before an audit by ETH Zurich researchers.

Cybercrime group exploiting old Windows driver vulnerability to bypass security products (Security Week)

A cybercrime group tracked as Scattered Spider has been observed exploiting an old vulnerability in an Intel Ethernet diagnostics driver for Windows in recent attacks on telecom and BPO firms.

Australian healthcare sector targeted in latest Gootkit malware attacks (The Hacker News) 

A wave of Gootkit malware loader attacks has targeted the Australian healthcare sector by leveraging legitimate tools like VLC Media Player. Gootkit, also called Gootloader, is known to employ search engine optimization (SEO) poisoning tactics (aka spamdexing) for initial access.

Over 1,300 fake AnyDesk sites push Vidar info-stealing malware (BleepingComputer)

A massive campaign using over 1,300 domains to impersonate the official AnyDesk site is underway, all redirecting to a Dropbox folder recently pushing the Vidar information-stealing malware. AnyDesk is a popular remote desktop application for Windows, Linux, and macOS, used by millions of people worldwide for secure remote connectivity or performing system administration.

