

BIPClip: Malicious PyPI packages target crypto wallet recovery passwords

RL has discovered a campaign using PyPI packages posing as open-source libraries to steal BIP39 mnemonic phrases, which are used for wallet recovery.

Karlo Zanki
Blog Author

Karlo Zanki, Reverse Engineer at ReversingLabs.


ReversingLabs has identified a new malicious campaign consisting of seven open source packages, with 19 versions between them, on the Python Package Index (PyPI). The oldest package dates back to December 2022. The campaign's goal: to steal mnemonic phrases used to recover lost or destroyed crypto wallets.

This is just the latest software supply chain campaign to target crypto assets — a list that includes the compromise of Voice over IP (VoIP) vendor 3CX. It confirms that cryptocurrency continues to be one of the most popular targets for supply chain threat actors.

This campaign, which my team is calling “BIPClip,” also underscores the steps that supply chain threat actors are taking to disguise their malicious wares, including the use of malicious file dependencies and various types of “name squatting” to throw security teams off their scent.

Discussion

Here's what the RL research team knows about the malicious campaign, which is distributed through seven newly discovered malicious PyPI packages designed to work in concert to steal crypto wallet recovery phrases, all while minimizing the risk of detection. 

Crypto in the crosshairs

The targets of this latest campaign were developers working on projects related to generating and securing cryptocurrency wallets. In particular, the attackers sought to fool developers looking to implement Bitcoin Improvement Proposal 39, or BIP39, a standard that defines a list of 2,048 easy-to-remember words from which a mnemonic sentence is drawn. That sentence is used to generate a binary seed that creates deterministic Bitcoin wallets (or "HD Wallets"). The idea behind BIP39 is that a mnemonic code or sentence is easier for wallet owners to recall compared with raw binary or hexadecimal representations of a wallet seed, offering “computer-generated randomness with a human-readable transcription.”
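To make clear why a stolen mnemonic is so valuable, here is a minimal sketch of the seed derivation step described in the BIP39 specification: PBKDF2 with HMAC-SHA512, 2,048 iterations, and the salt "mnemonic" followed by an optional passphrase. The function name bip39_seed is our own, and the sketch deliberately skips wordlist and checksum validation, so treat it as an illustration rather than a wallet implementation.

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte BIP39 seed from a mnemonic sentence (no validation)."""
    # BIP39 normalizes both inputs to Unicode NFKD before hashing.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # PBKDF2 with HMAC-SHA512, 2048 iterations, 64-byte output.
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


# A widely published test mnemonic; never use it for a real wallet.
EXAMPLE = "abandon " * 11 + "about"
seed = bip39_seed(EXAMPLE)
print(len(seed), "byte seed:", seed.hex())
```

Nothing but the phrase (and the optional passphrase) goes into the derivation, so anyone who obtains the phrase can recompute the same seed and, from it, the wallet's keys.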

Infrastructure and assets related to cryptocurrency creation, storage and transactions are a frequent target of supply chain attacks. That includes everything from the December 2023 compromise of the open source Ledger Connect Kit, resulting in the redirection of crypto transactions; to the publication of Python libraries that covertly run cryptominers; to posting malicious npm packages related to cryptocurrency applications and platforms.

The interest in cryptocurrency applications and exchanges is easy to explain: it is a 21st-century version of Willie Sutton's famous adage about robbing banks because “that’s where the money is.” In the case of cryptocurrency, nation-state actors affiliated with the Democratic People's Republic of Korea (DPRK) are reported to have stolen as much as $3 billion in cryptocurrency in the past five years, accounting for as much as 5% of the country’s GDP.

A malicious pair

In the latest campaign, ReversingLabs initially discovered two PyPI packages that work together to exfiltrate sensitive data used to protect cryptocurrency wallets: mnemonic_to_address and bip39_mnemonic_decrypt.

The bip39_mnemonic_decrypt package first turned up in a scan by our RL Spectra Assure platform due to a combination of "red flags"— suspicious characteristics in the package. Those included the presence of Base64 decoding as well as network communications, with bip39_mnemonic_decrypt importing the requests package, a common library typically used for network communication within the Python ecosystem. 
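The combination of red flags described above can be approximated with a few lines of static analysis. The sketch below is not how RL Spectra Assure works; it is a simplified illustration, and the red_flags function and the NETWORK_MODULES list are our own assumptions. It reports when a module both imports a network-capable library and calls a Base64 decoding function.

```python
import ast

# Illustrative, not exhaustive: modules commonly used for network access.
NETWORK_MODULES = {"requests", "urllib", "http", "socket"}


def red_flags(source: str) -> set[str]:
    """Return coarse flags for Base64 decoding and network-capable imports."""
    flags = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        if any(m.split(".")[0] in NETWORK_MODULES for m in modules):
            flags.add("network")
        # Match calls such as base64.b64decode(...) or a bare b64decode(...).
        if isinstance(node, ast.Call):
            func = node.func
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", "")
            if name in {"b64decode", "urlsafe_b64decode"}:
                flags.add("base64-decode")
    return flags


# Harmless stand-in source; it is only parsed, never executed.
SAMPLE = '''
import base64
import requests

def fetch(data):
    url = base64.b64decode("aHR0cHM6Ly9leGFtcGxlLmNvbQ==").decode()
    return requests.post(url, data=data)
'''
print(sorted(red_flags(SAMPLE)))
```

Neither flag is malicious on its own, and plenty of legitimate code triggers both. The point is that the pairing is a reasonable prompt for a closer manual look.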

After more investigation, the RL research team came to the conclusion that the campaign involved two packages, with the second package, mnemonic_to_address, serving as a "clean" package with the malicious bip39_mnemonic_decrypt listed as a dependency.

Mnemonic_to_address: accomplice to a crime

The first package the RL team discovered, mnemonic_to_address, does not contain any malicious functionality. Rather, it faithfully implements the functionality advertised in the package description, namely: creating a seed from the user’s secret mnemonic seed phrase. The package does this by forwarding the BIP39 data to functions imported from another legitimate project: eth-account, which is maintained by Ethereum.


Figure 1: Code example from eth-account documentation for generating an account from a mnemonic

The mnemonic_to_address package basically serves as a wrapper and makes function calls as described in eth-account project’s documentation (Figure 1). But there’s one subtle difference: mnemonic_to_address calls a function not present in the eth-account package named decrypt_jsBIP39. Where does that function come from? Well, it is imported from the bip39_mnemonic_decrypt module, with code from the mnemonic_to_address package passing the user's mnemonic passphrase to it as the function argument. 


Figure 2: Code from mnemonic_to_address package calls the function from the malicious bip39_mnemonic_decrypt package
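The question "where does that function come from?" can be answered mechanically. The following sketch maps every name that a source file calls directly back to the module it was imported from. The WRAPPER string is a hypothetical reconstruction of the pattern described above, not the actual package code, and both helper functions are our own.

```python
import ast


def imported_name_origins(source: str) -> dict[str, str]:
    """Map each name bound by `from X import Y` to the module X it comes from."""
    origins = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                origins[alias.asname or alias.name] = node.module
    return origins


def called_names(source: str) -> set[str]:
    """Collect the bare function names that the source calls directly."""
    return {
        node.func.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }


# Hypothetical wrapper code; it is only parsed here, never executed.
WRAPPER = '''
from eth_account import Account
from bip39_mnemonic_decrypt import decrypt_jsBIP39

def to_address(mnemonic):
    decrypt_jsBIP39(mnemonic)
    return Account.from_mnemonic(mnemonic).address
'''

origins = imported_name_origins(WRAPPER)
for name in sorted(n for n in called_names(WRAPPER) if n in origins):
    print(f"{name} is imported from {origins[name]}")
```

A reviewer who sees a call that handles the mnemonic and originates outside the documented eth-account API has a concrete module name to inspect next.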

Bip39_mnemonic_decrypt: subtly malicious

The bip39_mnemonic_decrypt package is the second package from this campaign. It is declared as the dependency of the mnemonic_to_address package. It was in this package that ReversingLabs discovered clearly malicious functionality. 

As with the mnemonic_to_address package, bip39_mnemonic_decrypt was published by james_pycode, a throwaway PyPI maintainer account that was created on the same day the packages were published. The RL research team often finds this behavior associated with malicious campaigns distributed through open source package repositories.

As you can see from the maintainer’s account (Figure 3), minimal effort was made to bolster the reputation or credibility of the james_pycode account before or after publishing the malicious PyPI packages. That’s not always the case. Sophisticated supply chain attackers that leverage open source repositories often invest time and resources to mimic official pages.


Figure 3: Throwaway account used to publish the malicious packages

But that doesn’t mean the malicious actor behind this campaign didn’t make an effort to hide their malicious wares. Just the opposite — this campaign took a number of steps to avoid detection. 

The first, as noted above, was the use of a malicious file dependency to facilitate the supply chain attack. The advantage of this approach is obvious: a developer who decides to use the mnemonic_to_address package and audits the code would conclude that the file was not malicious and worked “as advertised.” However, that audit might not extend to a security assessment of the mnemonic_to_address package’s many dependencies. 
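Extending an audit to a package's dependencies starts with knowing what they are. As a minimal sketch, the standard library's importlib.metadata can list the direct dependencies an installed distribution declares. The helper names below are our own, the name parsing is simplified, and the sketch covers only direct (not transitive) dependencies.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Extract the project name from a requirement string like 'requests>=2.0'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"unrecognized requirement: {requirement!r}")
    return match.group(1)


def declared_dependencies(distribution: str) -> list[str]:
    """List the direct dependencies an installed distribution declares.

    Raises importlib.metadata.PackageNotFoundError if it is not installed.
    """
    # metadata.requires() returns None when nothing is declared.
    requirements = metadata.requires(distribution) or []
    return sorted({requirement_name(r) for r in requirements})


# Requirement strings as they typically appear in package metadata.
examples = ["eth-account>=0.8", "requests (>=2.0) ; python_version >= '3.8'"]
print([requirement_name(r) for r in examples])
```

Calling declared_dependencies on an installed package returns the names to review next; repeating the call on each result walks the rest of the dependency tree.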

Even if they did opt to look at the package’s dependencies, the name of the imported module and invoked function are carefully chosen to mimic legitimate functions and not raise suspicion, since implementations of the BIP39 standard include many cryptographic operations. 


Figure 4: Malicious function from bip39_mnemonic_decrypt package designed to exfiltrate the data received as function argument

Specifically, the malicious function, decrypt_jsBIP39, is hidden in the bip39_mnemonic_decrypt package at the very end of the __init__.py file, coming after several non-malicious functions that are not actually used in the code base. A developer looking for red flags would have to examine __init__.py carefully and scroll to the end of the file to discover the malicious function.

On first look, decrypt_jsBIP39 is a pretty straightforward function. First, it decodes the Base64 encoded URL of the data exfiltration server. Then it invokes another function with a cleverly chosen name: cli_keccak256. That name is no accident: keccak256 is a cryptographic hash function commonly used to compute the hashes of Ethereum addresses, transaction IDs, and other important values in the Ethereum ecosystem.

Malicious code found in this package encodes the provided mnemonic passphrase using Base64 and then sends it to the exfiltration server using an HTTP POST request. The malicious code further disguises the passphrase by placing it in the “license” data field. For security tools or operators monitoring network traffic, this encoded text sequence might be interpreted as a legitimate software license value and overlooked.
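From the defender's side, a Base64 value in a "license" field is easy to test. The heuristic below is a sketch of our own, not part of any RL tooling: it decodes a field value and checks whether the result has the shape of a BIP39 sentence (12, 15, 18, 21 or 24 lowercase words). It does not consult the BIP39 wordlist, and the payload dictionary is an assumed stand-in for captured request data, since the exact request format is not described here.

```python
import base64
import binascii

VALID_WORD_COUNTS = {12, 15, 18, 21, 24}  # sentence lengths defined by BIP39


def looks_like_encoded_mnemonic(value: str) -> bool:
    """Heuristic: does this Base64 value decode to a BIP39-shaped sentence?"""
    try:
        text = base64.b64decode(value, validate=True).decode("ascii")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return False
    words = text.split()
    return len(words) in VALID_WORD_COUNTS and all(
        w.isalpha() and w.islower() for w in words
    )


# A widely published test mnemonic stands in for a real secret.
phrase = "abandon " * 11 + "about"
payload = {"license": base64.b64encode(phrase.encode()).decode()}

print(looks_like_encoded_mnemonic(payload["license"]))  # True
print(looks_like_encoded_mnemonic("ABCD-1234-EFGH"))    # False
```

A real license key rarely decodes to a dozen or more dictionary-style words, so a check along these lines would remove much of the cover the field name provides.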

Evidence of a broader campaign

After analyzing the initial “malicious pair,” RL researchers discovered three additional packages on PyPI in the first week of March that also appear to be a part of this campaign.

Another malicious pair

The first two, public-address-generator and erc20-scanner, were also published from a throwaway PyPI account on March 1st. Peeking under the hood: they appear to work the same way as the mnemonic_to_address and bip39_mnemonic_decrypt pair described above. Malicious functionality identical to that found in the bip39_mnemonic_decrypt package is implemented in the erc20-scanner package. The public-address-generator package serves the same role as the mnemonic_to_address package, acting as a lure for the targets. 

Links to the BIPClip campaign are evident. In addition to shared code and functionality, the newer packages use the same command and control (C2) server to exfiltrate stolen mnemonics.

Hashdecrypts: venomous code

The third package, hashdecrypts, also appears connected to the BIPClip campaign, but revealed more information about it. 

The hashdecrypts package was published on March 1st by a PyPI user account, luislindao, that was first registered in August 2019.  It contains almost identical malicious code to the bip39_mnemonic_decrypt and erc20-scanner packages, but adds another level of redirection. 


Figure 5: Malicious function from hashdecrypt package designed to exfiltrate the data received as function argument

It first makes an HTTP GET request to a Base64 encoded URL from which it gets the address of the real C2 server, to which it then sends data using an HTTP POST request. Inside the code there is a comment header pointing to a GitHub repository belonging to the HashSnake user account. The same repository can be found in the string extracted from the Base64 encoded URL: hxxps://github.com/HashSnake/backendapi/raw/main/settings.
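Base64 encoded URLs like the ones used in these packages can be surfaced without running any of the code. The sketch below, with a helper name of our own choosing, walks the string literals in a source file, tries to Base64-decode each one, and prints any that turn out to be URLs in defanged form. The sample uses a placeholder example.com address rather than the campaign's infrastructure.

```python
import ast
import base64
import binascii


def hidden_urls(source: str) -> list[str]:
    """Return defanged URLs hidden in Base64-encoded string literals."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes))):
            continue
        try:
            decoded = base64.b64decode(node.value, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue  # not valid Base64, or not text once decoded
        if decoded.startswith(("http://", "https://")):
            # Defang so the result cannot be clicked or fetched by accident.
            found.append(decoded.replace("http", "hxxp", 1))
    return found


# Build a harmless sample that hides a placeholder URL the same way.
encoded = base64.b64encode(b"https://example.com/settings").decode()
SAMPLE = f'import base64\nurl = base64.b64decode("{encoded}").decode()\n'

print(hidden_urls(SAMPLE))
```

This only catches URLs stored as a single literal. Strings that are split, concatenated, or encoded more than once would need additional handling.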


Figure 6: GitHub homepage of HashSnake user


Looking at the HashSnake account reveals that its most recently updated repository, hCrypto, is described as a “FREE CRYPTO CHECKER” and looks suspicious. A detailed inspection of the source code in that repository revealed that two files, main_en.py and main_ru.py, contain code that imports and invokes functions exported by the hashdecrypts package, which leads to exfiltration of users' secrets in a similar fashion to the previously discovered packages.




Figure 7: Code snippets from main_en.py file triggering data exfiltration functionality from the hashdecrypt package

HashSnake’s long tail

A look at the commit history for the package reveals that the campaign began more than a year ago, with the first commit to the HashSnake GitHub repository on February 5th, 2023. It also revealed that the repository previously imported a different package, hashdecrypt (Editor's note: no trailing "s"), that was first published on December 4, 2022. All three published versions of that package contained the same malicious functionality and fetched the same command and control (C2) server address from the same GitHub repository.



Figure 8: Git commit that reveals the existence of an older PyPI package

Looking at the commit history of the backendapi/settings file reveals the C2 infrastructure used throughout the history. Each commit modifies the address of the true C2 server, with the first commit dating all the way back to December 4th, 2022 — the same day that the first version of the hashdecrypt package was published.

Modesty and stealth

The threat actors behind this campaign combined a variety of known and well-documented methods to achieve their goals while avoiding detection. First, they made their packages less suspicious by putting their malicious functionality into dependent packages and not into the packages that were directly distributed to their targets. That basic evasion demands more of would-be target organizations. Targets inspecting open source packages wouldn’t find anything malicious in the primary package, but might not bother to investigate the (many) file dependencies it contains. Practically, few development organizations have the resources or time to dig that deeply into the open source code they rely on. 

Furthermore, the content of each of the discovered packages was carefully crafted to make it look less suspicious. The distributed packages public-address-generator and mnemonic_to_address implement their functionality as advertised. Code in both packages was written to look like it is truly dealing with cryptographic operations expected from the package that deals with services related to crypto assets. 

The threat actors behind this campaign weren’t greedy, either. They focused only on what they wanted to get, making no effort to leverage their access to achieve full control over a compromised system or move laterally within the compromised development organization. Instead, they were laser-focused on compromising crypto wallets and stealing the cryptocurrencies they contained. That absence of a broader agenda and ambitions made it less likely this campaign would trip up security and monitoring tools deployed within compromised organizations.

Impact

Based on our research, the impact of the campaign was limited. The initial malicious package RL discovered, bip39_mnemonic_decrypt, was only available for download for around two weeks, from February 4th until February 19th, before it was detected and removed from PyPI. During that time, it was downloaded almost 300 times.

The additional packages RL discovered in early March (public-address-generator, erc20-scanner, and hashdecrypts) were all taken down shortly after appearing. Only the hashdecrypt package, which was discovered based on the reference in the GitHub repository, appears to have been available for longer, with versions of that package existing as far back as December 2022.

Not surprisingly, the number of downloads of each of these packages was limited. There were 997 downloads of the public-address-generator package, 341 of the erc20-scanner package, and 224 of the hashdecrypts package. Our assessment of the overall reach of this campaign, therefore, is that it was limited. The newly discovered PyPI packages were quickly removed from the package manager and likely did not cause much damage.

The story is a bit different in the case of hashdecrypt, and the security impact may be greater. That package was first published in December 2022 and referenced from the GitHub repository for more than a year. It had 4,295 downloads during that time. As a result, it may have impacted a significant number of development targets.

Figure 9: Download stats for bip39_mnemonic_decrypt package

Conclusion

The BIPClip campaign is more proof that developers need to be vigilant about software supply chain security threats which lurk in open source package repositories. 

Threat actors like those behind the BIPClip campaign clearly understand that harried developers and development organizations aren't inclined to dig too deeply into the packages they are downloading and incorporating into their applications. Simple measures on the part of supply chain threat actors, such as clever naming and delivering malicious code via code dependencies, are enough to evade detection. 

For development organizations, the time to raise the bar on software supply chain security is now. Software hygiene assessments need to be performed on a regular basis. These should include security assessments of third party tools used in the development process, as well as regular vetting of software release artifacts before they are shipped to ensure that software artifacts ship without malicious implants. 

The BIPClip campaign also provides more evidence (if any was needed) that crypto assets are one of the most popular targets of cybercriminal groups and other threat actors (like North Korean APTs). These groups are well resourced and capable of uncovering subtle and ingenious ways to get their hands on the contents of crypto wallets and exchanges. As an individual, that means you need to keep a close eye on your crypto wallets as well as sensitive information like private keys and mnemonic phrases that can be abused by threat actors to gain control of your crypto assets. As a developer working on cryptocurrency applications or crypto-adjacent apps and services, it means presuming that your applications and code will be targeted by sophisticated cybercriminal actors eyeing supply chain compromises, and setting your security bar appropriately.

Indicators of Compromise (IOCs)

Indicators of Compromise (IOCs) refer to forensic artifacts or evidence related to a security breach or unauthorized activity on a computer network or system. IOCs play a crucial role in cybersecurity investigations and cyber incident response efforts, helping analysts and cybersecurity professionals identify and detect potential security incidents.

The following IOCs were collected as part of ReversingLabs' investigation of this software supply chain campaign.

PyPI packages:

package_name version SHA1
jsBIP39-decrypt 1.0.0 a23db65079ef310b87d1f017742149addbb53a81
jsBIP39-decrypt 1.0.0 03baa36c6551d1414d9907775b4600c873421b34
bip39-mnemonic-decrypt 1.0.0 45130c7a2d92282ee9c0b066206f235198b5ddfb
bip39-mnemonic-decrypt 1.0.0 087d325c24a5b28ad5342f097c3ebce3653e9ced
bip39-mnemonic-decrypt 1.0.1 46d3a5b3627e7de58c78f41eed4c95c6112245e7
bip39-mnemonic-decrypt 1.0.1 f2aadcd5bd1ba46b056e2d9e4b53e21a18b61b2a
mnemonic_to_address 1.0.0 f6bb6216caf96246f07e3fd9ffcb5f0d83bd6f41
mnemonic_to_address 1.0.0 e50864e1db37a75b99596aea6538981991bf4915
mnemonic_to_address 1.2.7 a88802edce3d5e70ac2d79272f98c0891c793f2a
mnemonic_to_address 1.2.7 c3822c1f181d8f6f12325a00b5bd6cca0c18d124
mnemonic_to_address 1.2.8 c1dc8d26946d52a1014ccc6c02156449e8e1e3b6
mnemonic_to_address 1.2.8 b74c24938595fe4ccc6efe845d2b095d126ed3fc
erc20-scanner 1.0.0 7ed9e234384e564e6d41da156bc472d5f369727e
erc20-scanner 1.0.0 ed1eb28a139c456e520726307e280a26b789b367
erc20-scanner 1.0.1 db61022dd75a63e99544bb5096c2e30d4348608e
erc20-scanner 1.0.1 65dab94f5ba56b891ed9bfe20d2b1f21c2d00ee1
public-address-generator 1.0.0 570e483dfdc6389e1d4a87f987c9b3e5a0d886ce
public-address-generator 1.0.0 1619a6fce00eecf5946750ef47d1c5748e963456
public-address-generator 1.0.1 f4ff1fe54132ca91ecdf7f4b48fc16b231047b96
public-address-generator 1.0.1 a875e313026a5400a920767038d953398b4afcb6
public-address-generator 1.0.2 4a39462ce7b3e2cda9998fb9fd42aeab3d5eb4a3
public-address-generator 1.0.2 19d88ff3e9d32897becc33c07b4cc307871b426e
public-address-generator 1.0.3 791e731b2db1551ccfc6df0990644ed405771aa6
public-address-generator 1.0.3 9aa894169984cfb4835b01f5f5b49d9670818259
public-address-generator 1.1.1 dddd55a60d5dcbec45c034330fe12b62e38a87a8
public-address-generator 1.1.1 3e385f6b2c842a490c1729aee1b48b22a728e367
public-address-generator 1.1.2 f2ed2e169bbe22aef73158e279e59d04a1f40ed9
public-address-generator 1.1.2 633b858092f7e0eb435a73f5bc972baa4cf79452
public-address-generator 1.1.3 3d82406f8e6ee1018bb39f6d40321940effeab2b
public-address-generator 1.1.3 c05d35c4cc9038de3eae4e84fb9b7560f4112a3b
hashdecrypt 1.0.0 01b66f12e9f76342729c1260ff4f0da8fc1bbe01
hashdecrypt 1.0.0 d5400ef535a8effe8c23cb56c4cb1c2c569beb79
hashdecrypt 1.0.1 156610fff622481eb3c37e988a5c8ece20f93aef
hashdecrypt 1.0.1 3843c4add1c2960f280d07b047f0c780a7b65e4d
hashdecrypt 1.0.2 9c4d2bacc24f70112bc53742e8fe26dad1fa63d1
hashdecrypt 1.0.2 989276eb67d5179b5eda055390d850b47198cdd2
hashdecrypts 1.0 64cd50f3bc347c894cbf25a2013c04e73e85550a
hashdecrypts 1.0 206cd1758ceda4abc9622d4f50134444a639f925
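As a quick check against the package names in the table above, the sketch below compares the distributions installed in the current Python environment with that list. Names are normalized the way PyPI treats them (PEP 503), so bip39_mnemonic_decrypt and bip39-mnemonic-decrypt count as the same project. The helper names are our own, and a clean result covers only the environment the script runs in.

```python
import re
from importlib import metadata

# Package names taken from the IOC table above.
FLAGGED = {
    "jsBIP39-decrypt",
    "bip39-mnemonic-decrypt",
    "mnemonic_to_address",
    "erc20-scanner",
    "public-address-generator",
    "hashdecrypt",
    "hashdecrypts",
}


def normalize(name: str) -> str:
    """Normalize a project name the way PyPI does (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


FLAGGED_NORMALIZED = {normalize(n) for n in FLAGGED}


def flagged_packages(names) -> list[str]:
    """Return the given project names that match a flagged package."""
    return sorted(n for n in names if normalize(n) in FLAGGED_NORMALIZED)


# Skip distributions whose metadata lacks a usable name.
installed = [
    name for dist in metadata.distributions() if (name := dist.metadata["Name"])
]
print(flagged_packages(installed) or "no flagged packages installed")
```

A match on name alone does not prove that a malicious version is present; comparing the installed artifact's SHA1 against the table is the natural follow-up.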

 

Command & Control infrastructure:

5.42.92.191
hxxps://raw.githubusercontent.com/HashSnake/backendapi/main/settings
194.163.154.242
knallos.de
65.109.70.235

 

Malicious GitHub repository:

hxxps://github.com/HashSnake/hCrypto

https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki#user-content-Abstract
