Threat Research | November 14, 2022

How to write detailed YARA rules for malware detection

New malware appears or evolves daily, so updating tools like YARA rules for detection is critical. Here's how my research team develops YARA rules.

Laura Dabelić, Threat analyst at ReversingLabs

The purpose of YARA rules is to improve our methods of malware detection. New malware families appear and evolve every day, so it is important to provide our clients with tools to protect themselves. This is why ReversingLabs' threat research team continually writes YARA rules, to deliver an open-source, working tool that detects the latest malware families.

The rules must also be as precise and verbose as possible to minimize false positives. Creating high-quality YARA rules allows our clients to keep their defenses up to date, giving them the best chance of preventing security incidents.

Since the threat landscape is constantly changing, the research team at ReversingLabs continuously updates the company's public YARA rules repository on GitHub with new and current threats. This blog post describes how we write our high-quality YARA rules, using the YARA rule for the GwisinLocker ransomware as a worked example.

Choose the target

Writing high-quality YARA rules is a time-consuming process, which means that our team must choose its battles. There are many criteria for choosing a malware family; the samples chosen for analysis are usually the ones known to have a big impact and to be prevalent in the threat landscape, such as:

  • New ransomware which targets big companies and businesses
  • Destructive wipers used as means of cyber warfare
  • Spyware and backdoors used by the various APT groups

Most malware in the modern threat landscape is packed with custom or off-the-shelf packers, to make analysis and signature matching harder. This is why our team checks if the samples are packed before they start writing the YARA rule. YARA rules should match malicious code, not the packing layer, and we write them with the second, unpacked, layer in mind. This additionally makes them suitable to be deployed on dynamic analysis solutions, for runtime inspection. ReversingLabs can automatically unpack more than 400 executable packer formats.

When malware is packed by unidentified custom packers, the unpacking must be done manually. This typically involves using a debugger to analyze the packer layer, identify where execution is passed to the second layer, and extract the payload. One common technique that packers use to execute the packed code is process injection. Process injection comes in several variants, including self-injection, PE injection, and process hollowing. All of these variants can be recognized by the typical pattern of API calls that must occur during the unpacking.

In a nutshell, the process into which the packer is injecting the payload needs to be created or opened (using CreateProcess or OpenProcess APIs). Additional memory in the process might then be allocated with VirtualAllocEx, and is populated with the payload by using WriteProcessMemory. Other APIs might also be invoked, among which are:

  • VirtualProtectEx
  • ReadProcessMemory
  • CreateRemoteThread
  • ResumeThread
  • NtResumeThread

The execution of the malicious payload is then resumed instead of the original process’s contents. The malicious payload can be obtained in several ways from memory, and it’s important to dump the payload in an executable format for later analysis.
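The API pattern described above can be sketched as an ordered check over a list of API names. This is a minimal illustration, not ReversingLabs tooling: the trace is assumed to come from a debugger or sandbox log, and the function and variable names are hypothetical. VirtualAllocEx is left out of the required stages because the allocation step is optional.

```python
# Required stages of the injection pattern, in the order they must occur.
# Each stage is a set of interchangeable API names taken from the text above.
REQUIRED_STAGES = [
    {"CreateProcess", "OpenProcess"},                          # open or create the target
    {"WriteProcessMemory"},                                    # write the payload
    {"CreateRemoteThread", "ResumeThread", "NtResumeThread"},  # hand over execution
]


def _matches(api: str, names: set) -> bool:
    """Match an API name, tolerating the ANSI/Unicode suffix (CreateProcessW)."""
    if api in names:
        return True
    return api[-1:] in ("A", "W") and api[:-1] in names


def looks_like_injection(trace: list) -> bool:
    """Return True if the required stages appear in order within the trace."""
    stage = 0
    for api in trace:
        if stage < len(REQUIRED_STAGES) and _matches(api, REQUIRED_STAGES[stage]):
            stage += 1
    return stage == len(REQUIRED_STAGES)


# Hypothetical trace of a process-hollowing style unpacker.
trace = ["CreateProcessW", "VirtualAllocEx", "WriteProcessMemory", "ResumeThread"]
print(looks_like_injection(trace))
```

A hit only marks a spot worth setting breakpoints around; benign software can produce the same sequence, so the result is a starting point for manual unpacking rather than a verdict.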

To make sure that we don't duplicate effort, every unpacked sample is matched against our entire YARA signature collection, to see which, if any, patterns are matched. This enables us to easily track novel malware, as well as new malware versions.

Do detailed, in-depth analysis

Every malware family has its own characteristics and set of behaviors. The way these are implemented in the code differs from one malware family to another. However, the behavior of malware types (like ransomware or backdoors, among others) can usually be described by a set of common actions that all malware families of a certain type share. For ransomware like GwisinLocker, the behaviors we are interested in are:

  • Finding the files
  • Encrypting files
  • Dropping the ransom note
  • Establishing a remote connection with the C2 server
  • Decrypting the malware configuration

One of the more interesting behaviors we found in GwisinLocker is that it shuts down VMware ESXi virtual machines before the encryption. The code that implements this behavior can be seen in the picture below. The constants visible in the picture are fragments of strings, split up as a method of obfuscation. Together they represent the following command:

esxcli vm process kill --type=force --world-id="[ESXi] Shutting down - %s"

[Image: GwisinLocker code that builds the command from string constants]

Stack strings are a method of obfuscation in which the string is built on the stack one or a few characters at a time. The purpose of this technique is to confuse the reverse engineer and slow down the reversing process. We will use this part of the code to create a behavior-focused pattern. The hardcoded stack strings are a good choice for a byte pattern because they make the pattern more unique and specific. By extension, this reduces the probability of false positives once the YARA rule is deployed. The created pattern can be seen in the picture below.

[Image: behavior-focused byte pattern built from the stack strings]
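To show how a stack string turns into a byte pattern, here is a minimal sketch that generates a YARA hex string for a given piece of text. It assumes one specific code shape: 32-bit x86 code that stores the string with `mov dword ptr [ebp+disp8], imm32` (bytes `C7 45`) and finishes with `mov byte ptr [ebp+disp8], imm8` (bytes `C6 45`). Real samples, including GwisinLocker, may use other registers, operand sizes, or instruction orders, so this is not the pattern from the published rule. The function name is hypothetical.

```python
def stack_string_pattern(text: str) -> str:
    """Build a YARA hex string for `text` written to the stack via EBP-relative moves.

    The stack displacement byte is masked with ?? because it changes between
    builds, while the immediate bytes (the characters) stay exact.
    """
    data = text.encode("ascii")
    if not data:
        raise ValueError("text must not be empty")

    tokens = []
    offset = 0
    while offset < len(data):
        if len(data) - offset >= 4:
            size, opcode = 4, "C7 45"  # mov dword ptr [ebp+disp8], imm32
        else:
            size, opcode = 1, "C6 45"  # mov byte ptr [ebp+disp8], imm8
        # Immediates are little-endian, so the bytes appear in string order.
        immediate = " ".join(f"{byte:02X}" for byte in data[offset:offset + size])
        tokens.append(f"{opcode} ?? {immediate}")
        offset += size
    return "{ " + " ".join(tokens) + " }"


print(stack_string_pattern("esxcli"))
# { C7 45 ?? 65 73 78 63 C6 45 ?? 6C C6 45 ?? 69 }
```

Only the displacement is masked, so the pattern keeps long runs of exact bytes, which is what makes stack strings attractive as anchors.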

This small rule which represents the behavior-focused pattern is evaluated against samples in our cloud, to identify other potentially interesting samples with similar behavior, which might have been missed during initial sample collection stages. The results should be analyzed to see how similar (or different) the matched samples are. The possible conclusions derived from this step are:

  • The samples are very similar. This means that we are on the right track and that they probably belong to the same malware family
  • The samples are notably different. This means that the code pattern is not unique to the malware being analyzed, or it might be a part of a common library which is reused among different malware. Either way, the pattern needs to be expanded with more specific data, or supplemented with other parts of the code which are more unique to this malware family.

YARA rule structure matters

Every rule consists of the "meta" section, the "strings" section, and the "condition" section. They are described in detail below.

The meat of the 'meta' section

Every rule needs to have a "meta" section, which is divided into two parts:

CCCS YARA metadata

We've decided to conform to the publicly available CCCS YARA specification and its validator. The specification requires several fields to be present, of which the most important are "sharing" and "malware". The "sharing" field describes the sharing limitations of the YARA rule; the value "TLP:WHITE" means that the rule can be freely distributed. The "category" field describes the category of samples that the rule detects, which for our rules is always "MALWARE", and the "malware" field holds the family name.

  • author - Always set to "ReversingLabs"
  • source - Always set to "ReversingLabs"
  • status - Always set to "RELEASED"
  • sharing - Always set to "TLP:WHITE"
  • category - Always set to "MALWARE"
  • malware - Malware family name, in uppercase, in the form MALWAREFAMILYNAME
  • description - Always begins with "YARA rule that detects..."; only the malware family name and malware family type change

If you're interested in the more detailed explanations of the fields, you can check out the CCCS YARA standard configuration page, and see how they’re used in our public YARA rules.
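The field conventions listed above are easy to check mechanically. The following is a minimal sketch of such a check. It is not the CCCS validator and does not cover the full specification; it only tests the values this post lists. The function name and the message wording are hypothetical.

```python
# Fields that, per the list above, always carry the same value.
FIXED_VALUES = {
    "author": "ReversingLabs",
    "source": "ReversingLabs",
    "status": "RELEASED",
    "sharing": "TLP:WHITE",
    "category": "MALWARE",
}
DESCRIPTION_PREFIX = "YARA rule that detects"


def check_meta(meta: dict) -> list:
    """Return a list of problems found in a rule's meta fields (empty if none)."""
    problems = []
    for field, expected in FIXED_VALUES.items():
        if meta.get(field) != expected:
            problems.append(f'{field} must be "{expected}"')

    malware = meta.get("malware", "")
    if not malware or malware != malware.upper():
        problems.append("malware must be the family name in uppercase")

    if not meta.get("description", "").startswith(DESCRIPTION_PREFIX):
        problems.append(f'description must begin with "{DESCRIPTION_PREFIX}"')
    return problems


meta = {
    "author": "ReversingLabs",
    "source": "ReversingLabs",
    "status": "RELEASED",
    "sharing": "TLP:WHITE",
    "category": "MALWARE",
    "malware": "GWISINLOCKER",
    "description": "YARA rule that detects GwisinLocker ransomware.",
}
print(check_meta(meta))  # []
```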

ReversingLabs-specific YARA metadata

ReversingLabs’ YARA rules are one of multiple classification methods, and they supplement more complex classifiers for added protection. In order for ReversingLabs’ core engine to correctly classify files using YARA rules, additional metadata must be present. The required metadata has the following structure:

  • tc_detection_type - MalwareFamilyType from the rule name
  • tc_detection_name - MalwareFamilyName from the rule name
  • tc_detection_factor - Usually set to 5, but often depends on the threat type
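Because the first two fields are derived from the rule name, they can be generated rather than typed by hand. The sketch below assumes the rule name ends in `_<MalwareFamilyType>_<MalwareFamilyName>`; that layout, the example rule name, and both function names are assumptions made for illustration, not a documented ReversingLabs interface.

```python
def tc_metadata(rule_name: str, factor: int = 5) -> dict:
    """Derive the ReversingLabs-specific fields from a rule name.

    Assumes the last two underscore-separated parts of the name are the
    malware family type and the malware family name.
    """
    parts = rule_name.split("_")
    if len(parts) < 3:
        raise ValueError("expected a name ending in _<Type>_<Name>")
    return {
        "tc_detection_type": parts[-2],
        "tc_detection_name": parts[-1],
        "tc_detection_factor": factor,
    }


def render_meta(meta: dict, indent: str = "    ") -> str:
    """Render fields as a YARA meta section; integers are left unquoted."""
    lines = ["meta:"]
    for key, value in meta.items():
        rendered = str(value) if isinstance(value, int) else f'"{value}"'
        lines.append(f"{indent}{key} = {rendered}")
    return "\n".join(lines)


# Hypothetical rule name used only to demonstrate the derivation.
print(render_meta(tc_metadata("Linux_Ransomware_GwisinLocker")))
```

The default factor of 5 mirrors the "usually" in the list above; pass a different value when the threat type calls for it.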

An example of the "meta" section for the GwisinLocker ransomware can be seen in the following image:

[Image: "meta" section of the GwisinLocker YARA rule]

Another example can be seen in the YARA rule for the HermeticWiper malware which was covered in one of our previous From the Labs blog posts.

The "strings" section

As analysts, our team members commonly need to update each other's rules, and must be mindful of how fast the rules are evaluated, given the millions of files ReversingLabs processes daily. There are some good practices that should be followed to improve the readability and evaluation speed of YARA rules:

  • Standardize the indentation and be consistent with it. For example, if you use one tab for the indentation, make sure it applies in all your rules.
  • Break longer patterns into several sequentially named subpatterns (e.g. $encrypt_files_p1, $encrypt_files_p2, ...).
  • A pattern shouldn't start or end with masked (wildcard) bytes, which are written as question marks.
  • Use patterns with longer sequences of exact (unmasked) bytes, as they serve as anchors.
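The last two practices can be checked automatically before a rule is committed. Here is a minimal sketch of such a check for YARA hex strings. The function name is hypothetical, and the threshold of four exact bytes is an assumed value for illustration, not a figure from ReversingLabs.

```python
import re

# A token counts as exact only if it is two plain hex digits; anything else
# (??, nibble masks such as 4?, jumps such as [2-4]) is treated as masked.
EXACT_BYTE = re.compile(r"[0-9A-Fa-f]{2}")


def lint_hex_pattern(pattern: str, min_anchor: int = 4) -> list:
    """Return a list of style problems found in a YARA hex string."""
    tokens = pattern.strip().strip("{}").split()
    if not tokens:
        return ["pattern is empty"]

    exact = [bool(EXACT_BYTE.fullmatch(token)) for token in tokens]
    problems = []
    if not exact[0]:
        problems.append("pattern starts with a masked byte")
    if not exact[-1]:
        problems.append("pattern ends with a masked byte")

    # Find the longest run of consecutive exact bytes (the anchor).
    longest = run = 0
    for is_exact in exact:
        run = run + 1 if is_exact else 0
        longest = max(longest, run)
    if longest < min_anchor:
        problems.append(
            f"longest exact run is {longest} bytes, expected at least {min_anchor}"
        )
    return problems


print(lint_hex_pattern("{ C7 45 ?? 65 73 78 63 }"))  # []
print(lint_hex_pattern("{ ?? 45 ?? 65 }"))
```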

An example of correctly written and split patterns is the kill_processes pattern from the GwisinLocker YARA rule, which can be seen in the following image:

[Image: kill_processes patterns from the GwisinLocker YARA rule]

The "condition" section

The rules are evaluated on PE and ELF files, so the "magic" bytes at the beginning of each file need to be checked:

  • uint16(0) == 0x5A4D - The "MZ" header for PE files
  • uint32(0) == 0x464C457F - The "\x7FELF" header for ELF files
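The constants look reversed because YARA's uint16 and uint32 functions read little-endian integers. A short sketch that mirrors the two checks makes this visible; the function name is hypothetical.

```python
import struct

PE_MAGIC = 0x5A4D        # bytes 4D 5A, "MZ"
ELF_MAGIC = 0x464C457F   # bytes 7F 45 4C 46, "\x7fELF"


def file_format(data: bytes) -> str:
    """Classify a buffer using the same offset-0 checks as the rule conditions."""
    # "<H" reads a little-endian 16-bit value, like uint16(0) in YARA.
    if len(data) >= 2 and struct.unpack_from("<H", data, 0)[0] == PE_MAGIC:
        return "PE"
    # "<I" reads a little-endian 32-bit value, like uint32(0) in YARA.
    if len(data) >= 4 and struct.unpack_from("<I", data, 0)[0] == ELF_MAGIC:
        return "ELF"
    return "unknown"


print(file_format(b"MZ\x90\x00"))      # PE
print(file_format(b"\x7fELF\x02\x01")) # ELF
```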

When writing conditions, the team uses a whitespace-heavy style, to keep the rules consistent and readable. Additionally, we split the blocks by logical operators, to make it visually easy to see how the patterns are grouped. This organization makes it easy to troubleshoot and fix signatures as new versions appear, without compromising the logical validity of the condition.

An example of the GwisinLocker condition can be seen in the picture below. The first group of conditions covers the 32-bit version of the ransomware, while the second group covers the 64-bit version.

[Image: "condition" section of the GwisinLocker YARA rule]
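To make the whitespace-heavy layout concrete, the sketch below renders a condition in which each group of patterns is joined with "and" and the groups are joined with "or", one operand per line. The pattern names and the exact layout are illustrative assumptions; they are not copied from the published GwisinLocker rule.

```python
def render_condition(magic_check: str, groups: list, indent: str = "    ") -> str:
    """Render a condition: magic check AND (group OR group ...).

    Every pattern and every bracket gets its own line so that the grouping
    by logical operators is visible at a glance.
    """
    lines = ["condition:", f"{indent}{magic_check} and", f"{indent}("]
    for group_index, group in enumerate(groups):
        lines.append(f"{indent * 2}(")
        for pattern_index, pattern in enumerate(group):
            joiner = " and" if pattern_index < len(group) - 1 else ""
            lines.append(f"{indent * 3}{pattern}{joiner}")
        closer = ") or" if group_index < len(groups) - 1 else ")"
        lines.append(f"{indent * 2}{closer}")
    lines.append(f"{indent})")
    return "\n".join(lines)


# Hypothetical pattern names: one group per architecture.
print(render_condition(
    "uint32(0) == 0x464C457F",
    [
        ["$find_files_32", "$encrypt_files_32"],
        ["$find_files_64", "$encrypt_files_64"],
    ],
))
```

With this layout, supporting a new variant means adding one more bracketed group, and a diff of the rule shows exactly which group changed.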

YARA rules: A continuous process

Threat actors keep developing the malware in their arsenal, and the ReversingLabs malware research team continuously monitors the threat landscape for new versions that our existing YARA rules do not cover. When a new version is discovered, the process outlined in this post is repeated. The YARA rule is then updated to keep pace with the new threats in the never-ending cat-and-mouse game known as malware analysis.

Learn more about ReversingLabs' Malware Analysis and Threat Hunting solutions:

  • Spectra Analyze
  • Spectra Intelligence
  • Spectra Detect
