10 Billion Files Classified

Scaling Cybersecurity: How ReversingLabs Analyzed 10 Billion Files to Combat Emerging Threats

Marijan Ralasic, Former Solution Architect at ReversingLabs

Figure: Ten billion files and 3,600 unique formats

Have you ever taken a step back and said, “How time flies”? Ten years ago, ReversingLabs set out on a journey to become the leader in cloud-delivered object security and file analysis. Over the years we have developed enterprise solutions with the goal of providing the most timely and accurate intelligence and alerts about attacks before they impact IT infrastructures and services. Our talented and creative employees were presented with almost impossible tasks, including challenges of scale, breadth, and depth. We delivered a solution in the form of what is now known as the Titanium Platform, establishing ReversingLabs as a market innovator and leader in automated static analysis. While it seems like we started this journey just yesterday, today we celebrate our Titanium Platform, which provides threat classification and rich context for over 10 billion goodware and malware files.

The very first malicious file we classified was an executable that contained a password recovery tool used to capture logon credentials and send them to an attacker. We developed a backend system that used the best available datacenter technologies to deliver low latency, fast responses, and high scalability. We set our sights on handling the projected volume of samples decades into the future. We developed various sensors and integrations that would allow us to process these high volumes of files and classify them at speed. We then enhanced our reputation database with a unique automated static analysis capability that identifies and unpacks file content, deobfuscates it, collects its metadata, and then classifies the files.
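To make the identify, unpack, and collect-metadata steps more concrete, here is a minimal sketch in Python. It is an illustration only and not the Titanium Platform's implementation: the `MAGIC` table, the `identify` and `analyze` functions, and the report fields are all hypothetical names chosen for this example. It recognizes a handful of formats by their leading bytes, recursively unpacks ZIP archives only, and skips deobfuscation and classification entirely.

```python
import hashlib
import io
import zipfile

# Hypothetical magic-byte table; a real identifier covers far more formats.
MAGIC = {
    b"MZ": "pe",
    b"PK\x03\x04": "zip",
    b"\x7fELF": "elf",
    b"%PDF": "pdf",
}


def identify(data: bytes) -> str:
    """Return a coarse format name based on the file's leading bytes."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"


def analyze(data: bytes, name: str = "<root>", depth: int = 0, max_depth: int = 5) -> dict:
    """Identify a file, collect basic metadata, and recurse into ZIP members."""
    fmt = identify(data)
    report = {
        "name": name,
        "format": fmt,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "children": [],
    }
    # max_depth bounds recursion so nested archives cannot recurse forever.
    if fmt == "zip" and depth < max_depth:
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                for info in archive.infolist():
                    if info.is_dir():
                        continue
                    child = analyze(archive.read(info), info.filename, depth + 1, max_depth)
                    report["children"].append(child)
        except zipfile.BadZipFile:
            # Looked like a ZIP but could not be parsed; keep the metadata only.
            report["format"] = "zip (corrupt)"
    return report
```

Calling `analyze(open(path, "rb").read())` returns a nested dictionary with one entry per extracted object. A classifier would then work from that tree rather than from the original container alone.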

Figure: Number of samples over the years (bar graph)

Over time, our file reputation database grew in both malware and goodware files. As polymorphic malware families began to stand out in our database, we developed our RHA (ReversingLabs Hashing Algorithm) functional similarity algorithm to protect our customers against those polymorphic threats by identifying functionally similar files. Customers could now create alerts for similar files and focus on more unique and significant threats. It is interesting to see how our Trojan samples evolved over time, with newer, more vicious threats targeting unsuspecting victims. From the Delf data theft Trojans of 2010 to today’s Coinminers, we recognized and stored as much metadata as we could extract.
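The idea of alerting on functionally similar files can be sketched as grouping samples by a precomputed similarity hash. The Python below is a simplified illustration under stated assumptions: it does not compute RHA, and the function names `cluster` and `similar_to_known`, along with the `(file_id, similarity_hash)` input shape, are hypothetical. It assumes each sample already carries some similarity hash that is identical for functionally similar files.

```python
from collections import defaultdict


def cluster(samples):
    """Group file IDs by their (precomputed, assumed) similarity hash."""
    clusters = defaultdict(list)
    for file_id, similarity_hash in samples:
        clusters[similarity_hash].append(file_id)
    return dict(clusters)


def similar_to_known(clusters, known_malicious):
    """Return file IDs that share a cluster with a known-malicious file."""
    flagged = set()
    for members in clusters.values():
        if any(member in known_malicious for member in members):
            # Flag only the members that are not already known.
            flagged.update(m for m in members if m not in known_malicious)
    return flagged
```

With this grouping, one alert can cover an entire cluster of polymorphic variants, which leaves analysts free to focus on samples that match nothing already seen.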

Figure: Malware types in the ReversingLabs repository

While today is a reason to celebrate, it is also a reminder that the threat landscape continues to expand and evolve. ReversingLabs has classified over 10 billion files and recognizes over 40 threat types and over 70 thousand different threat families, from polymorphic families to unique ones. Our static analysis unpacks over 360 formats and identifies more than 3,600 specific file formats. Our high-volume processing system curates and reprocesses more than 8 million files daily to provide the best protection as more threats emerge and new intelligence becomes available. We can truly say we provide visibility into every collected malware file, location, and threat with the speed, accuracy, and scale required for today’s digital enterprise.

Explore RL's Spectra suite: Spectra Assure for software supply chain security, Spectra Detect for scalable file analysis, Spectra Analyze for malware analysis and threat hunting, and Spectra Intelligence for reputation data and intelligence.

Tags: Products & Technology
