
Python downloader highlights noise problem in open source threat detection

RL discovered what appeared to be a malicious downloader on PyPI. It turned out to be red teaming, but it highlights a growing problem for threat detection.

Karlo Zanki
Blog Author

Karlo Zanki, Reverse Engineer at ReversingLabs.

ReversingLabs researchers recently discovered a malicious open source package, xFileSyncerx, on the Python Package Index (PyPI). The package, with close to 300 registered downloads, downloaded separate malicious “wiper” components. Is it an open source supply chain threat? Kind of. Further investigation by our team revealed that the downloader and wipers were created by a cybersecurity pro doing “red team” penetration testing of a client’s SOC.

This incident highlights a growing challenge for firms that track (and defeat) open source threats: “noise” in the form of grayware, such as test packages, as well as low-quality, low-distribution malicious packages. As more attention turns to open source and supply chain threats and attacks, this low signal-to-noise ratio could make it harder to identify and remediate genuine open source software threats.

In this report we will discuss the findings of our research as well as the larger implications for developers and security teams, as the open source “commons” becomes crowded with goodware, malware and grayware.

Discussion

ReversingLabs researchers regularly search open source repositories such as npm, GitHub and the Python Package Index (PyPI) for suspicious and malicious packages. We use a combination of internal tools and our Spectra family of software supply chain security technology, which is based on the Spectra Core malware analysis engine. Among other things, we scan a vast range of public repositories for packages with characteristics or features that tend to correlate with malicious or compromised code.

There are many examples of suspicious characteristics. For instance, a package may be programmed to communicate with predefined external servers, which is possible evidence of malicious command and control. Or it may depend on known malicious open source or proprietary packages.

A common “red flag” when surveying open source packages is the use of code obfuscation - intentionally scrambling open source code to prevent outsiders from being able to easily ascertain what its purpose is. For obvious reasons, using obfuscation in a public, open source software package is very suspicious. If the code is free for the taking, why are you (the developer) trying to hide what it does? 

It was the presence of code obfuscation that brought the latest malicious open source package to our attention.  
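To make the idea of an obfuscation "red flag" concrete, here is a minimal sketch of a static heuristic. It is not how ReversingLabs tooling works; it is a simplified illustration that assumes a file is worth a closer look when it combines a long list of integer literals, a bitwise shift, and a call to chr(). The function name and the threshold are hypothetical.

```python
import ast


def looks_like_shift_encoded_string(source: str, min_ints: int = 8) -> bool:
    """Heuristic: flag code that pairs a long integer list with shifts and chr().

    This is an illustrative rule of thumb, not a reliable malware detector.
    """
    tree = ast.parse(source)
    has_int_list = has_shift = has_chr = False

    for node in ast.walk(tree):
        if isinstance(node, (ast.List, ast.Tuple)):
            # Count elements that are plain integer literals (bools excluded).
            ints = [
                e for e in node.elts
                if isinstance(e, ast.Constant) and type(e.value) is int
            ]
            if len(node.elts) >= min_ints and len(ints) == len(node.elts):
                has_int_list = True
        elif isinstance(node, (ast.BinOp, ast.AugAssign)) and isinstance(
            node.op, (ast.LShift, ast.RShift)
        ):
            has_shift = True
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "chr"
        ):
            has_chr = True

    return has_int_list and has_shift and has_chr
```

A rule like this will produce false positives (for example, legitimate codecs or checksum code), which is why flagged packages still need manual review.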

Wiper package: s2.py

The package in question is xFileSyncerx, a Python package that contains malicious functionality: downloading second stage malware from a remote URL. The package was posted in April by a newly created PyPI account and contained no known dependencies. As mentioned: it was flagged, in part, because it contained obfuscated code. It was later determined to be malicious following a manual inspection by a ReversingLabs threat researcher. 

The code obfuscation in question related to a malicious download URL that is hard coded in the xFileSyncerx package. That URL is stored as a sequence of characters inside an array to make it harder to detect. The values in the array are further obfuscated with several bitwise shifts: operations on integers that move their bits left or right. These are used to make a sequence of values, such as ASCII character codes, more difficult to decode.

Before they are used, the values inside xFileSyncerx are deobfuscated by performing the opposite bitwise shifts. When deobfuscated, the URL points to a file hosted in a GitHub repository: hxxps://raw.githubusercontent.com/d3duct1v/tester-of-trees/main/s2.py.

Figure 1: Downloader code inside xfilesyncerx.py file
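The general technique can be sketched in a few lines of Python. This is not the code from xFileSyncerx: the shift amounts, function names and placeholder URL below are assumptions chosen for illustration, since the exact values used by the package are not reproduced here.

```python
# Assumed shift amounts, applied in order when encoding.
SHIFTS = (2, 3)


def obfuscate(text: str, shifts: tuple[int, ...] = SHIFTS) -> list[int]:
    """Turn each character into an integer and shift its bits left."""
    values = [ord(ch) for ch in text]
    for amount in shifts:
        values = [v << amount for v in values]
    return values


def deobfuscate(values: list[int], shifts: tuple[int, ...] = SHIFTS) -> str:
    """Undo the encoding by applying the opposite (right) shifts in reverse."""
    for amount in reversed(shifts):
        values = [v >> amount for v in values]
    return "".join(chr(v) for v in values)


if __name__ == "__main__":
    # Hypothetical placeholder, not the URL embedded in the package.
    encoded = obfuscate("https://example.com/stage2.py")
    print(encoded[:5])           # integers that no longer look like ASCII
    print(deobfuscate(encoded))  # the original string is recovered
```

Because left shifts of small integers lose no information, the round trip is exact. That also means the obfuscation offers no real protection once an analyst spots the pattern.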

As its name suggests, the s2.py file is the second stage malware used in this attack. It is actively maintained via a GitHub account with the handle d3duct1v and has more than 20 commits since the initial version was posted on April 17th.

The differences between these versions include corrections of some coding mistakes and other, minor improvements. This pattern of changes suggested to us that the s2.py malware is in the early development phase or, perhaps, that it serves as a prototype. Nevertheless, one of the versions inspected by ReversingLabs researchers sported fully functional wiper capabilities.

Figure 2: Wiper functionality inside second stage s2.py file

When executed, s2.py walks through the file hierarchy inside the /home directory on the compromised system and uses the Fernet symmetric encryption scheme to encrypt all files except hidden files and directories, whose names start with the dot (.) character. This exclusion is likely made to ensure that the SSH functionality s2.py later uses remains intact.

One of the s2.py commits (Figure 3) reviewed by ReversingLabs revealed that initially only files from the ‘.ssh’ directory were excluded from encryption by the malware. That functionality was later replaced with a broader rule that excluded all hidden files from encryption.

Figure 3: Check that prevents hidden files being encrypted
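For defenders scoping the impact of such a rule, the hidden-file exclusion can be modeled without touching any data. The sketch below is not taken from s2.py; it is a read-only illustration, with a hypothetical function name, that lists the files a "skip anything starting with a dot" rule would leave exposed under a given root directory.

```python
import os
from collections.abc import Iterator


def iter_visible_files(root: str) -> Iterator[str]:
    """Yield paths of non-hidden files, skipping hidden directories entirely."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories in place so os.walk never descends into
        # them (this is what keeps a directory such as .ssh untouched).
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if not name.startswith("."):
                yield os.path.join(dirpath, name)
```

Running this against a copy of a user's home directory shows that files such as .bashrc and everything under .ssh are skipped, while ordinary documents are included.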

The malware also creates README files throughout the /home directory tree. The message written to them plays on a well-known Internet meme and suggests that the purpose of s2.py is malicious. As it turns out, the malware encrypts but never exfiltrates files from the infected machine. A malicious actor using this code would therefore never actually possess sensitive information from the files s2.py encrypts, though they could certainly disrupt the operation of infected systems.

Figure 4: Wiper functionality inside second stage s2.py file

Malware at the door?

As we looked deeper into the s2.py malware, the picture became more interesting. Specifically, we noted that after the files in the /home directory have been encrypted, the malware tries to spread across the local network by using SSH to connect to other devices with hard coded credentials.

If an SSH connection to a target device is successfully established, stage 3 malware gets downloaded from the same GitHub repository and executed on the target device. The name of that third stage malware? You guessed it: s3.py. That third stage malware is nearly identical to the stage 2 malware except that it contains only the wiper functionality and not the spreading functionality. Compared to other malware we have analyzed, the wiper malware in s2.py and s3.py was rudimentary, and the commit history of the files showed the author struggling with a number of bugs that rendered them non-functional.

As for the spreading functionality: that’s where things get interesting. The login the malware uses to try to establish an SSH connection is bellj1. To us, that seemed likely to be a reference to the Bell J1 doorbell cameras, inexpensive, wireless doorbell cameras manufactured by the Chinese firm Shenzhen Joystek Intelligence Co. We also observed a hard coded list of private IP addresses in the wiper malware - likely a list of possible target assets on an internal network.
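A hard coded list of internal addresses is itself a useful triage signal. The following sketch, which is an illustration rather than part of our tooling, pulls IPv4-looking strings out of source text and keeps those that Python's standard ipaddress module classifies as private. The function name and regular expression are assumptions made for this example.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; validation happens afterwards.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_private_ips(source: str) -> list[str]:
    """Return unique private IPv4 addresses found in the text, in order."""
    found: list[str] = []
    for candidate in IPV4_CANDIDATE.findall(source):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            # Not a valid address (for example, an octet above 255).
            continue
        if address.is_private and candidate not in found:
            found.append(candidate)
    return found
```

A list of private addresses in a public package hints that the code was written with one specific internal network in mind, which fits a targeted attack or a test far better than a mass campaign.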

IoT worm, right? Wrong

With these findings, a wide range of possible explanations stretched out before us. Most tantalizing was the prospect that we had uncovered a (nascent) Internet of Things (IoT) worm that was targeting Bell J1 cameras deployed within a specific environment (or environments). The use of hard coded SSH credentials suggested that the actor behind these malicious packages managed to obtain knowledge of the default administrator credentials for that environment or - possibly - that the Bell J1 cameras themselves shipped with hard coded SSH credentials under the BellJ1 account name. 

Such a scenario has a lot of precedent. Hard coded credentials are a common source of IoT device compromises. Famously, the Mirai IoT botnet, which has been linked to a number of large scale denial of service attacks, initially spread by trying dozens of factory-default credentials against IoT devices such as IP cameras, digital video recorders and home routers.

As for the hard coded IP addresses we discovered: those might reflect a default, manufacturer-provided test environment configured by the malware author that was accidentally published to PyPI without being removed. Such slip ups have happened before. And, without more information from the manufacturer, or evidence from an “in the wild” deployment of this malware on compromised cameras, it was unclear to us what we were looking at.

Red team 'litter' equals open source threat noise

With so many unanswered questions, we decided to dig a little deeper. A colleague of mine reached out to the individual behind the d3duct1v maintainer account responsible for the s2.py and s3.py malware on GitHub. Our question(s): did you also author the xFileSyncerx package and, if so, what’s the deal? 

In our experience, you often don’t get a response from inquiries like this - especially when the account is connected to a malicious actor or campaign. In this case, however, we heard right back from the author, who explained in an email that s/he worked as a U.S.-based penetration tester and “yes,” was also the author of the xFileSyncerx package. That package was created “for a test I was running I left the URLs for the clients (sp) SOC to be able to discover,” they explained. The IP addresses in question were specific to that client’s environment. As for the “BellJ1” username and the “Bell J1” smart doorbells? “100% coincidence,” they assured us. 

According to the person behind the d3duct1v account, the purpose of these packages was to test the client’s security operations center (SOC) on two fronts: its ability to detect the suspicious callout that retrieves the second and third stage malware, and its ability to detect (if possible) the lateral movement conducted by s2.py as it found and infected systems using the hard coded SSH credentials.

Furthermore, the person behind the d3duct1v maintainer account told us that s/he was planning to pull down the package before we got in touch. Regardless, we notified the PyPI administrators who removed the xFileSyncerx package. It is no longer available for download. Both the s2.py and s3.py malware were also removed after ReversingLabs contacted the author(s) behind the d3duct1v account. They are no longer available for download.

Conclusion

Even without confirmation by its author, there were lots of reasons to doubt that xFileSyncerx was the next “Mirai” - the virulent IoT worm - or part of some muscular supply chain attack.

Among other things, the code was limited to spreading among a predefined set of hosts, suggesting a targeted attack at best or, more likely, a test or proof of concept. There was no effort made to “dress up” the code to look like a legitimate Python package. In fact, the xFileSyncerx package description on PyPI even contained a message that this package shouldn’t be used. Then there were the transparent file names: s2.py and s3.py for the second and third stage malware. Really?

So, learning that these packages were part of a red team assessment of a client environment made a lot of loose pieces fall into place. After all, the xFileSyncerx Python package is a good tool for a red team to deploy in a client environment: it exhibits the classic features of a downloader, and its second stage malware, s2.py, adds spreading behavior. These are basic malicious applications that exhibit network scanning and lateral movement. If your client can detect them, they’re in a good position to spot other, truly malicious packages.

That said, it is not common for us to find wiper functionality integrated with red team tools. Typically, red teams are satisfied to gain access to a protected system or environment and exfiltrate data from it as proof of their successful compromise. Wiping the device and taking it offline is considered too disruptive for a friendly actor like a penetration tester. 

Typical or not, one thing our discovery of xFileSyncerx highlights is the problem of growing “noise” on open source repositories like PyPI, npm, GitHub and others. As the profile of supply chain threats and attacks rises in the wake of incidents like the compromises of SolarWinds, 3CX, and others, the population of goodware, malware and grayware is exploding. And that makes it harder for any company that hopes to assess supply chain threats to filter out the “signal” from all that noise. 

This red team package and the countless others like it lurking on platforms like PyPI - litter and detritus left behind from penetration tests and other exercises - add to the challenge of identifying true open source threats. One suggestion might be for clearer guidelines and restraints around publishing test- and grayware packages like this to public repositories and better demarcation of them to prevent confusion on the part of developers and security teams.

A tragedy of the (open source) commons?

The concept that explains some of this is what economists term the “tragedy of the commons” - a tendency for communities that are given unfettered access to a resource to abuse and exhaust it by pursuing their own interests without consideration of the need to preserve the resource for others. We may be witnessing a similar dynamic play out on open source repositories like npm, PyPI and GitHub, as threat actors, threat hunters and others leverage them in unbounded ways without concern for the larger good. 

For threat hunting teams, incidents like this show how the bar is rising for open source threat analysis. Five years ago, any malware found lurking on an open source repository was headline-grabbing news. Today, the discovery of malicious features lurking in code may indicate the discovery of a new threat or…something else. It falls to open source threat analysts to do the digging and investigation needed to ascertain the provenance and purpose of a suspicious looking file so that the actual threat it poses to development teams and end user organizations is properly understood.

Indicators of Compromise (IOCs)

Indicators of Compromise (IOCs) are forensic artifacts or evidence related to a security breach or unauthorized activity on a computer network or system. IOCs play a crucial role in cybersecurity investigations and incident response efforts, helping analysts and cybersecurity professionals identify and detect potential security incidents.

The following IOCs were collected as part of ReversingLabs’ investigation of this software supply chain campaign.

PyPI packages:

package_name    version    SHA1
xFileSyncerx    0.0.2      e200d11a089e66840598b104b57e9758855031b3

GitHub URLs:

hxxps://raw.githubusercontent.com/d3duct1v/tester-of-trees/main/s2.py
hxxps://raw.githubusercontent.com/d3duct1v/tester-of-trees/main/s3.py
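As a simple way to put the file hash above to use, the sketch below computes the SHA1 of a local file with Python's standard hashlib module and compares it with the published value. The function names and the idea of checking a local package cache are assumptions for illustration; the hash is the one listed in the table.

```python
import hashlib

# SHA1 of xFileSyncerx 0.0.2 as listed in the IOC table.
KNOWN_BAD_SHA1 = {"e200d11a089e66840598b104b57e9758855031b3"}


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Compute the SHA1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_ioc(path: str) -> bool:
    """Return True if the file's SHA1 matches a known-bad hash."""
    return sha1_of_file(path) in KNOWN_BAD_SHA1
```

A team could run matches_ioc() over the archives in a local package cache or artifact mirror to check whether the package was ever pulled into their environment.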
