How to Examine Polyglot Files with Spectra Analyze

Here's how to assess a sample using Spectra Analyze in your environment — and create a YARA rule.

Josh Morin, Senior Customer Success Engineer

Polyglot File Examination with Spectra Analyze

Spectra Analyze provides a dedicated workbench for malware analysis and triage. Its automated binary analysis quickly deconstructs, classifies, and analyzes threats across various file types, reducing false positives and delivering actionable intelligence for faster incident response.

Malware analysts, threat hunting teams, and SOCs can submit hashes to receive clear, color-coded threat classifications: goodware, suspicious, or malicious. The platform also supports sandboxing and execution, enabling users to observe both runtime behavior and code structure.

Here's an expansion on the techniques used in the recent RL Blog post “Hunting SharpHounds with Spectra Analyze.”

Unmasking Polyglot Files

Polyglot files combine elements from multiple file formats, concealing executable code in overlooked sections such as metadata or comments within images or documents. Because these files remain valid in more than one format, they can be processed as one type (for example, an image) and executed as another (for example, a script).

Polyglot files can enter your environment in several ways. Below are key examples to monitor.

Phishing: Attackers send emails containing polyglot files disguised as legitimate invoices, documents (e.g., PDFs or DOCs), or image files (e.g., JPGs).
Drive-by Downloads: Users may unknowingly download polyglot files from compromised websites that appear legitimate.
File-Upload and Web Interfaces: Attackers upload polyglot files to web services, such as chat or document-sharing platforms, that verify only file types and fail to detect hidden malicious code.
Cloud Content Delivery Networks (CDNs): Attackers may host malicious files on trusted platforms such as Discord or other CDNs, exploiting user trust in these domains.
Removable Media: Attackers use these devices to introduce polyglot files into air-gapped or secure networks.
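To illustrate the file-upload weakness described above, here is a minimal Python sketch. It assumes a hypothetical upload validator that trusts only the leading magic bytes; the function names and the inert demo payload are illustrative and do not come from Spectra Analyze or any specific web framework.

```python
# Hypothetical example: why a magic-byte-only check accepts a GIF/PHP polyglot.
GIF_MAGICS = (b"GIF87a", b"GIF89a")
PHP_OPEN_TAG = b"<?php"


def passes_magic_only_check(data: bytes) -> bool:
    """Naive validator: accepts anything that starts with a GIF signature."""
    return data.startswith(GIF_MAGICS)


def contains_php_tag(data: bytes) -> bool:
    """Content check: looks for a PHP opening tag anywhere in the file."""
    return PHP_OPEN_TAG in data


def build_demo_polyglot() -> bytes:
    """Builds a harmless stand-in: a GIF signature and screen descriptor
    followed by inert PHP text (a comment only, nothing executable)."""
    screen_descriptor = b"\x01\x00\x01\x00\x00\x00\x00"
    return b"GIF89a" + screen_descriptor + b"<?php /* inert demo */ ?>"


if __name__ == "__main__":
    sample = build_demo_polyglot()
    print("magic-only check accepts file:", passes_magic_only_check(sample))
    print("PHP tag present:", contains_php_tag(sample))
```

The point of the sketch is that both checks return true for the same bytes, so a service that stops at the first check would store the file as an image even though a PHP interpreter could still find code in it.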

Getting Started

To demonstrate Spectra Analyze’s capabilities, here's how customers can use it to assess a sample in their environment, moving from the Report Summary to creating a YARA rule.

Report Summary

Submitting the hash for analysis provides immediate, detailed indicators of polyglot behavior in the Report Summary.

Key details include “File Type,” “File Format,” and “The Threat Actor / Name.” The “Sample Description” also offers context that requires further analysis. I will now jump to the Graph View for visual assessment.

Graph View

Using the graph visualization feature in the Report Summary, I first review the layers of extracted files. These layers eventually reveal the presence of the “overlay” sub-file, at which point I see a malicious verdict.

Parent File: Disguised GIF Image

This file presents itself as a valid GIF89a image, but it carries a large overlay section that contains the embedded PHP payload.

File Type: Image / GIF (GIF89a)
File Size: 9.09 KB
Entropy: 5.389929656435356
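For readers who want to see what "overlay" means for a GIF, the following Python sketch walks the GIF block structure and reports where data begins after the trailer byte (0x3B). It is a simplified illustration based on the public GIF89a layout, not a description of how Spectra Analyze extracts overlays; the helper names and the tiny demo image are assumptions made for the example.

```python
import struct


def _skip_sub_blocks(data: bytes, pos: int) -> int:
    """Skips a chain of GIF data sub-blocks; returns the offset after the
    zero-length terminator block."""
    while True:
        if pos >= len(data):
            raise ValueError("truncated GIF: sub-block chain runs past end of file")
        size = data[pos]
        pos += 1
        if size == 0:
            return pos
        pos += size


def _color_table_size(packed: int) -> int:
    """Returns the color table length in bytes, or 0 if the table flag is unset."""
    if packed & 0x80:
        return 3 * (2 ** ((packed & 0x07) + 1))
    return 0


def find_gif_overlay_offset(data: bytes) -> int:
    """Returns the offset of the first byte after the GIF trailer (0x3B).
    Anything from that offset onward is appended (overlay) data."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    if len(data) < 13:
        raise ValueError("truncated GIF: missing logical screen descriptor")
    # 6-byte header + 7-byte logical screen descriptor, then optional global table.
    pos = 13 + _color_table_size(data[10])
    while pos < len(data):
        block_type = data[pos]
        if block_type == 0x3B:  # trailer
            return pos + 1
        if block_type == 0x21:  # extension: introducer, label, then sub-blocks
            pos = _skip_sub_blocks(data, pos + 2)
        elif block_type == 0x2C:  # image descriptor (10 bytes including separator)
            if pos + 10 > len(data):
                raise ValueError("truncated GIF: incomplete image descriptor")
            pos += 10 + _color_table_size(data[pos + 9])
            pos = _skip_sub_blocks(data, pos + 1)  # +1 skips the LZW code size byte
        else:
            raise ValueError(f"unexpected block type 0x{block_type:02X} at offset {pos}")
    raise ValueError("truncated GIF: no trailer found")


def build_demo_gif() -> bytes:
    """Builds a tiny 1x1 GIF; the pixel data is a placeholder that this
    structural walk does not decode."""
    header = b"GIF89a" + struct.pack("<HHBBB", 1, 1, 0, 0, 0)
    image = b"\x2C" + struct.pack("<HHHHB", 0, 0, 1, 1, 0) + b"\x02" + b"\x02\x4C\x01" + b"\x00"
    return header + image + b"\x3B"


if __name__ == "__main__":
    polyglot = build_demo_gif() + b"<?php /* inert demo */ ?>"
    offset = find_gif_overlay_offset(polyglot)
    print("overlay starts at offset", offset, "and holds", len(polyglot) - offset, "bytes")
```

This strict walk raises ValueError on malformed structure. For a sample that also carries tags such as image-corrupt, that failure is itself worth noting, and a production parser would likely need to be more tolerant than this sketch.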

Extract File: Embedded PHP Web Shell

The PHP overlay serves as the malicious payload for a Dirtelti-family backdoor web shell. It includes hardcoded HTTP references to an external domain that acts as a decoy cursor resource.

File Name: overlay
File Type: Text / PHP Script
File Size: 9.1 KB
Entropy: 5.387616553412414
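The entropy values reported for the parent file and the overlay are nearly identical. The source does not state how the figure is computed; assuming it is Shannon entropy over byte frequencies, measured in bits per byte, a minimal Python sketch looks like this:

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Returns the Shannon entropy of the byte distribution in bits per byte.
    The result ranges from 0.0 (one repeated byte) to 8.0 (uniform bytes)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Assumption: any local file path can be substituted here.
    demo = b"<?php /* inert demo text for entropy */ ?>" * 10
    print(f"entropy: {shannon_entropy(demo):.4f} bits per byte")
```

Under that assumption, values around 5.39 sit well below the 8.0 maximum that compressed or encrypted data approaches, which is consistent with content that is mostly script text.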

How We Caught This

Within File Analysis, the Spectra Analyze section includes a “How We Caught This” feature. Here, you will find multiple sections referencing Polyglot and Maliciousness.

Network Reputation

Two URLs embedded in the PHP script reference domains that are likely used as live connectivity checks. Both URLs are categorized under “entertainment” and “software_downloads,” consistent with a legitimate resource site being used as cover traffic or a dead-drop indicator.
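When reviewing an extracted script by hand, a quick way to list embedded URLs before checking their reputation is a simple pattern search. The Python sketch below is only an illustration: the regular expression is deliberately loose, and the URLs in the demo are placeholders on example domains, not the URLs found in this sample.

```python
import re
from urllib.parse import urlsplit

# Loose pattern: http(s) scheme followed by characters that rarely end a URL in source code.
URL_PATTERN = re.compile(rb"https?://[^\s\"'<>)\\]+", re.IGNORECASE)


def extract_urls(data: bytes) -> list[str]:
    """Returns unique http(s) URLs found in raw bytes, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in URL_PATTERN.finditer(data):
        url = match.group().decode("ascii", errors="replace").rstrip(".,;")
        seen.setdefault(url, None)
    return list(seen)


def extract_hosts(data: bytes) -> list[str]:
    """Returns unique hostnames from the extracted URLs, skipping unparsable ones."""
    hosts: dict[str, None] = {}
    for url in extract_urls(data):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue
        if host:
            hosts.setdefault(host, None)
    return list(hosts)


if __name__ == "__main__":
    # Placeholder script content; the real sample's URLs are not reproduced here.
    script = b'<?php $a = "http://cdn.example.com/cursor.cur"; $b = \'https://example.org/x\'; ?>'
    print(extract_urls(script))
    print(extract_hosts(script))
```

The resulting hostnames can then be looked up in whatever reputation source your team uses.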

Indicators

In the Static Analysis section, the Indicators detected a Macro and tagged it as “contains-script,” which means the file includes one or more script files.

Additional tags include antivirus, image-corrupt, image-segment-unknown and overlay.

Other potentially important tags associated with polyglot detection include:

Tag	Description
cert-appendix	The file contains additional data after the certificate
contains-script	The file contains one or more script files
format-bad-checksum	The file likely contains corrupted content as it has failed the data integrity check
image-corrupt	The image is corrupt because of some format discrepancy (e.g. invalid segment size)
image-segment-unknown	An unknown image segment has been encountered
image-malformed	The image is malformed (e.g. frame dimension is zero)
image-segment-duplicate	The image has a duplicate segment
image-segment-unexpected-location	An image segment has been found in an unexpected location
stego	The file is a result of stego extraction
stego-compressed	The file contains compressed embedded PE files
stego-embedded	The file contains plain embedded PE files
stego-encoded	The file contains encoded embedded PE files
stego-encrypted	The file contains encrypted embedded PE files

Extracted Files

In the Static Analysis section, Extracted Files identified a threat consistent with the findings in the Graph view. You can also review the file in Hex and Text views under Extracted Files.

Note that the Text view displays PHP usage consistent with Image/PHP Polyglot.

Dynamic Analysis

Dynamic analysis provides additional behavioral insights beyond static analysis, including changes to the file system, registry, network connections, and process activity. In this case, we identified 36 signatures, 70 TCP, UDP, DNS, and URL events, 1 behavioral indicator with multiple findings, 18 dropped files, 15 MITRE ATT&CK mappings, and 2 YARA matches.

YARA Matches

After behavioral detections in Dynamic Analysis, two YARA findings were identified. One notable finding, related to Polyglot files, is described as “Finds image files w/ PHP code in images.”

Community Threat Detections (Antivirus Detection Summary)

Antivirus vendors in the detection summary flagged the sample as a GIF containing a PHP web shell, which aligns with the other findings.

Polyglot Image PHP Trojan Detection

This rule detects PHP polyglot files that masquerade as GIF images. These files start with the GIF89a magic bytes, making them appear valid to content-type checkers and basic validators. They also contain embedded PHP code (<?php), which a PHP interpreter will execute if the file is served or included. The rule fires when both of the following conditions are met:

The file begins with the GIF89a magic header at offset 0, indicating it presents itself as a GIF image.

The string <?php appears anywhere within the file, indicating the presence of embedded PHP code.

Example:

rule Polyglot_Image_PHP : tc_detection suspicious {
 meta:
  tc_detection_type = "Trojan"
  tc_detection_name = "PolyglotImagePHP"
  tc_detection_factor = 4
 strings:
  $gif = "GIF89a"
  $php = "<?php"
 condition:
  $gif at 0 and $php
}

Explore RL's Spectra suite: Spectra Assure for software supply chain security, Spectra Detect for scalable file analysis, Spectra Analyze for malware analysis and threat hunting, and Spectra Intelligence for reputation data and intelligence.

