Leaks and exposures of sensitive information in open source and proprietary code repositories are approaching epidemic proportions. Hardly a week goes by without reports of attacks on firms that leverage credentials, tokens or signing keys found lurking in code repositories.
More than 10 million secrets were leaked to GitHub in 2022, and one in 10 GitHub code authors exposed a secret that year, according to the recent State of Secrets Sprawl Report released by GitGuardian. In recent years, the prevalence of sensitive information in code repositories has turned platforms like GitHub and PyPI from developer playgrounds with a low risk profile into popular hunting grounds for ransomware gangs, nation-state actors and other malicious groups looking for an unobstructed path into sensitive IT environments.
ReversingLabs also tracks sensitive information detected in analyzed packages. To date, its automated systems have detected more than 112,000 exposed secrets in packages in the npm repository, more than 30,000 in the Python Package Index (PyPI) and more than 10,000 in the RubyGems open source repository. The exposed secrets are concentrated on popular platforms such as Microsoft’s Azure, Google Cloud, GitHub, Slack, and Salesforce. Other exposed access credentials include OAuth tokens and JSON Web Tokens (JWTs), along with thousands of others.
Figure: Secrets detected in packages residing on four major code repositories, broken down by the affected service. Source: ReversingLabs analysis.
The question is how to avoid falling victim, and how to respond should sensitive information inadvertently find its way to public repositories or otherwise be exposed. Fortunately, software development teams have several options for responding to secrets leaks and managing both short- and long-term risks.
Addressing the risks of secrets exposure via source code is a multi-pronged endeavor. You’ll need to invest in tooling to assist with secrets discovery. You’ll also need to change development practices with an eye to addressing risks at the design and architecture stage, in order to minimize the risk of secrets leaks going forward.
Here are three essential steps that your organization should take to mitigate risk from development secrets leaks — and best practices for preventing future secrets leaks from happening.
1. Achieve situational awareness for secrets
Situational awareness means discovering what secrets are hiding in your application code, understanding their purpose, and grasping who or what has permission to use those secrets.
Secrets discovery tools are commonplace. GitHub, CircleCI and other vendors offer free tools to scan public and private repositories, and you can find a range of tools that search for hidden authentication tokens, environment variables, signing keys and so on. Unfortunately, secrets scanning tools are often “noisy”: they flag many leaked credentials, some of which are placeholders or decoys that offer no advantage to cyber adversaries.
These false positives include commonly distributed keys for testing online services or application programming interfaces (APIs), placeholder credentials for accessing network services, and so-called “canary” or “honeypot” tokens that are deliberately placed in code and monitored to alert security teams to malicious activity. At the same time, data collected from canary tokens deployed in the wild suggests that organizations that accidentally publish secrets to public repositories have a grace period that can be measured in minutes, or even seconds, so speed is of the essence.
Organizations need to invest in tooling that can filter out that noise and allow staff to zero in on instances of active tokens and secrets that have been leaked. And prioritizing leaked credentials based on the level of access they provide is key.
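To make the idea of filtering noise and prioritizing findings concrete, here is a minimal sketch in Python. It is not how any particular scanner works. The detection rules, the priority numbers, and the list of placeholder hints are all assumptions chosen for illustration, and a real tool would use far more rules and would verify whether a token is still active.

```python
import re

# Hypothetical detection rules: (label, pattern, priority).
# A lower priority number means "triage this first".
RULES = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b"), 1),
    ("private_key_header",
     re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"), 1),
    ("generic_api_token",
     re.compile(r"(?i)\b(?:api[_-]?key|token)\s*[=:]\s*['\"]([^'\"]{16,})['\"]"), 2),
]

# Assumed markers of placeholder or test values that count as noise.
PLACEHOLDER_HINTS = ("example", "dummy", "changeme", "placeholder", "xxxx")


def find_secrets(text):
    """Return likely-real findings, highest priority first."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern, priority in RULES:
            match = pattern.search(line)
            if not match:
                continue
            # Drop matches that look like placeholders rather than live secrets.
            if any(hint in match.group(0).lower() for hint in PLACEHOLDER_HINTS):
                continue
            findings.append({"line": line_no, "rule": label, "priority": priority})
    return sorted(findings, key=lambda f: (f["priority"], f["line"]))
```

Sorting by priority puts cloud keys and private keys ahead of generic tokens, which is one simple way to rank findings by the level of access they are likely to provide.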
2. Rotate exposed secrets
In the context of a secrets leak, development teams must work alongside security incident response teams to quickly understand how severe an exposure is.
You should assume that any exposed secrets, including authentication credentials, API tokens, and encryption keys, have been discovered, and you should immediately suspend and reissue them. Hardcoded credentials should, at a minimum, be updated to deny adversaries access. In the medium term, you should rotate those credentials out of your code altogether.
Determining which secrets have been discovered and exploited and the level of access they provide is a different matter. That requires mapping credentials back to IT assets and data, as well as scrutinizing the logs maintained by access management systems to spot use of the leaked credentials. Finally, organizations need to tie back any activity involving the exposed secret to a source to determine whether it is authorized, suspicious or malicious. Scrutinizing exposed credentials in this way is key, said Karlo Zanki, a threat researcher at ReversingLabs.
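As a rough illustration of that log review, the Python sketch below filters access-log events down to those that used a leaked key after the exposure time and labels each one by whether its source is known. The event fields (`key_id`, `source`, `time`), the function name, and the idea of a simple set of known sources are assumptions for the example; real access management logs will have their own schema.

```python
from datetime import datetime


def triage_key_usage(events, leaked_key_id, exposed_at, known_sources):
    """List uses of a leaked key after exposure, oldest first.

    Timestamps are ISO 8601 strings. They should all carry a UTC offset
    (or all omit it), because Python cannot compare the two kinds.
    """
    cutoff = datetime.fromisoformat(exposed_at)
    flagged = []
    for event in events:
        if event["key_id"] != leaked_key_id:
            continue  # a different credential, not part of this incident
        when = datetime.fromisoformat(event["time"])
        if when < cutoff:
            continue  # happened before the secret was exposed
        status = "expected" if event["source"] in known_sources else "suspicious"
        flagged.append((when, event["source"], status))
    return sorted(flagged)
```

Anything labeled suspicious is only a starting point for the investigation; a known source can also be compromised, so expected entries still deserve a look.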
“You need to see if the credentials were discovered and used by a remote or untrusted source.”
—Karlo Zanki
Any suspicious activity connected with the exposed secret should prompt the team to further investigate what users and IT assets may be affected. Supply chain attacks that leverage leaked or stolen secrets, environment variables, and other sensitive data are often the first step toward larger, targeted attacks.
The length of time between when secrets are exposed and when they’re discovered by the affected organization figures heavily into the severity of the incident. SSH keys or AWS tokens that are inadvertently published to a public repository and immediately detected and revoked pose little risk to protected systems and data.
At the other end of the spectrum are leaks like the one associated with Toyota Motor Corp.’s T-Connect telematics application. In this case, a contractor accidentally published T-Connect source code to a public GitHub repository that included a Toyota database access token. That leak went undetected for five years, during which time the access token was never rotated. After the lapse was discovered, Toyota was forced to admit that “third-party access could not be confirmed from the access history of the data server where the information was stored” and that information on close to 300,000 T-Connect users may have been accessed during that period.
Even secrets leaks that are quickly remediated are cause for some concern. Malicious actors don’t need years to discover and make use of stolen secrets. The CircleCI secrets exposure lasted only a matter of weeks, but several of its customers were victimized after secrets and credentials were pilfered from private code repositories they hosted on CircleCI’s platform, the company indicated.
Also, in 2019 the cloud security vendor Imperva admitted that it spilled email addresses, passwords, SSL certificates and API keys for some of its customers after malicious actors found an improperly configured cloud computing server used for testing was accessible from the public Internet. A stolen API key was subsequently used to access the company's customer data.
That’s why, in addition to rotating exposed secrets, organizations should check to ensure that they haven’t found their way into the hands of cybercriminal groups or malicious actors. Using tools that provide automated scanning for secrets in open source packages, third-party libraries, commercial applications and elsewhere can help organizations maintain situational awareness by following the path of leaked and stolen credentials on the digital underground.
3. Manage your secrets going forward
Whether development secrets exposures are detected during an audit or in response to a leak, your organization will eventually need to move past the situational awareness and crisis phase of secrets management. Your focus should then turn to managing development secrets in a responsible manner going forward. “Each company that does software development should have clear rules for handling and incorporating secrets into code, and they need to communicate that process and the rules to developers,” said Zanki of ReversingLabs.
Here are the best practices every organization should adopt to protect its secrets.
Stop hard-coding secrets
Removing high-value, high-risk hard-coded secrets from application code should be at the top of your list of rules and policies. You can achieve this in several ways. The most common is to closely audit code and, where appropriate, require developers to manage high-value secrets using secret management vaults, rather than simply embedding them in code. In this way, applications can access secrets in separate files that can be isolated from the application code, provided only when needed during runtime, and left out of code builds.
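A minimal sketch of the runtime-lookup idea, in Python, is shown below. It uses environment variables as a simple stand-in for a secrets vault, which is an assumption made for brevity; the function name, the exception class, and the `DB_TOKEN` variable name are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret was not provided at runtime."""


def get_secret(name, environ=None):
    """Fetch a secret supplied by the runtime environment, never from code.

    `environ` defaults to the process environment and can be replaced
    with a plain dict in tests.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # Fail loudly instead of falling back to a hard-coded default.
        raise MissingSecretError(f"secret {name!r} is not set")
    return value
```

An application would call something like `get_secret("DB_TOKEN")` at startup, so the token itself never appears in the repository or in the build output. Refusing to fall back to a default value is a deliberate choice: a silent fallback is how hard-coded credentials tend to creep back in.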
Also, removing things like certificates, keys, tokens and credentials utilized for automated code testing can help eliminate “noise,” making it easier for your organization to identify and track exposed secrets that are a real threat.
Finally, to avoid inadvertent leaks during builds, development organizations should establish policies and processes that require developers to inform those who write and maintain configuration files about the location of any secrets so that, where appropriate, they can add those files to a list that will exclude them from build packages.
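One way to back such a policy with an automated check is sketched below in Python. Given the list of files about to be packaged, it reports the ones that match an exclusion list. The patterns and the function name are assumptions for illustration; the real list would come from whatever your developers report to the people who maintain the build configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list maintained alongside the build configuration.
SECRET_FILE_PATTERNS = [".env", "*.pem", "*.key", "secrets/*"]


def files_to_exclude(build_files, patterns=SECRET_FILE_PATTERNS):
    """Return the paths that should be left out of the build package."""
    excluded = []
    for path in build_files:
        basename = path.rsplit("/", 1)[-1]
        # Match against the full path and the bare file name, so that
        # ".env" is caught wherever it sits in the tree.
        if any(fnmatchcase(path, p) or fnmatchcase(basename, p) for p in patterns):
            excluded.append(path)
    return excluded
```

Run as a build step, a non-empty result could fail the build or simply strip the listed files before packaging.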
Promote developer best practices — and monitor compliance
Given the pressure of modern, agile development organizations, there are powerful incentives for developers not to address issues such as stored secrets, which may appear, to them, to be remote risks. “If I develop something and it works, I won’t want to touch it any more,” said Zanki. “Either I forget about the access token or I just choose not to fix it because it's not important to me.”
DevOps organizations need to account for those negative incentives and promote behaviors that limit the security risk posed by secrets. Even experienced developers, when under pressure to deliver, can make mistakes or lapses in judgment that expose credentials, tokens and other development secrets.
What is clear is that secret leaks are more likely to occur in development organizations that lack clear guidelines for how to properly manage secrets in their code. DevOps teams should formulate policies based on an understanding of how secrets are generated within their development environment; how those secrets are protected once deployed; and how (hopefully) they are eventually cycled out of use.
Practically speaking, the division of labor within modern DevOps organizations leaves different individuals responsible for managing Git operations and overseeing builds. Managing secrets may, likewise, require someone specifically tasked with that responsibility. Whatever the case, the days of leaving it to individual developers to track and manage their own secrets in code are over.
Revamp how you design applications
Get your team to design applications in a way that eliminates the reliance on access tokens that reside in code. There are several design steps developers can take to avoid storing credentials and other secrets in their code. Those include using technologies like OAuth and OpenID Connect (OIDC) tokens, which allow developers to authenticate users without directly exposing user credentials in code.
Taking steps to strengthen authentication security within applications is also critical. Using multi-factor authentication, leveraging identity and access management platforms, and setting IP ranges that limit inbound connections for APIs to known addresses or address ranges can reduce the risk that exposed secrets will be abused by malicious actors.
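The IP range restriction can be sketched with Python's standard `ipaddress` module. The ranges below are reserved documentation addresses used as placeholders, and the function name is hypothetical; in practice this check usually lives in an API gateway or firewall rather than in application code.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); substitute the
# addresses your API clients actually connect from.
ALLOWED_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(client_ip, allowed=ALLOWED_RANGES):
    """Return True only if the caller's address falls in a known range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # malformed address: reject rather than guess
    return any(address in network for network in allowed)
```

With a check like this in front of an API, a leaked token is much less useful to an attacker connecting from an unknown network, although it does not remove the need to rotate the token.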
Establish end-to-end security
Finally, take steps to ensure that efforts to search for and prevent secret leaks don’t end with code scanning, but encompass the entire development pipeline. Misconfigurations in CI/CD tooling or developer oversight can result in the inclusion of secrets during final builds, even when DevOps teams have taken steps to protect them.
Organizations also should implement binary analysis as part of the standard build-and-release pipeline in order to extend visibility into leaked secrets beyond raw code and into software packages and containers that are ready for release. This capability can expose secrets disclosed as a result of build or packaging mistakes that might otherwise escape notice during a code audit.
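Full binary analysis is well beyond a short example, but the core idea of inspecting the finished artifact rather than the source tree can be sketched in Python. The function below opens a zip-format package (Python wheels, for example, are zip files) and reports the members that contain something shaped like an AWS access key ID. The single pattern and the function name are assumptions for illustration, and this is a much simpler stand-in for what a dedicated binary analysis tool does.

```python
import io
import re
import zipfile

# One illustrative pattern; real tooling checks many secret formats.
TOKEN_PATTERN = re.compile(rb"AKIA[0-9A-Z]{16}")


def scan_archive(archive_bytes):
    """Return names of archive members that contain a likely secret."""
    hits = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Scan raw bytes so compiled or binary members are covered too.
            if TOKEN_PATTERN.search(archive.read(info.filename)):
                hits.append(info.filename)
    return hits
```

Because the scan runs on the packaged output, it can catch a secret that was pulled in by a build step even when the source repository itself is clean.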
Secrets protection requires a holistic approach
Unfortunately, there is no “silver bullet” fix for software supply chain risks and attacks, nor for development secrets leaks and exposures. Instead, development organizations need to respond to actual or potential leaks holistically. Begin by understanding the scope of the problem, proceed to addressing the short-term risks posed by leaked credentials, and then extend your efforts out to larger changes in application design and development processes with an eye toward snuffing out secrets leaks — and the supply chain attacks that often follow — before they can happen.