lconstantin
CSO Senior Writer

Major GitHub repos leak access tokens putting code and clouds at risk

News
15 Aug 2024 · 6 mins

Build artifacts generated by GitHub Actions often contain access tokens that can be abused by attackers to push malicious code into projects or compromise cloud infrastructure.

An analysis of build artifacts generated by GitHub Actions workflows inside open-source repositories belonging to major companies revealed sensitive access tokens to third-party cloud services, as well as GitHub itself. In addition, a change made this year in the GitHub artifacts feature has introduced a race condition that attackers can exploit to abuse previously unusable GitHub tokens.

The investigation, performed by Yaron Avital, a researcher with Palo Alto Networks, found secrets in artifacts stored in dozens of public repositories, some corresponding to projects maintained by Google, Microsoft, Amazon AWS, Canonical, Red Hat, OWASP, and other major organizations. The tokens provided access to various cloud services and infrastructure, music streaming services, and more.

“This allows malicious actors with access to these artifacts the potential of compromising the services to which these secrets grant access,” Avital wrote in his report. “In most of the vulnerable projects we discovered during this research, the most common leakage is of GitHub tokens, allowing an attacker to act against the triggering GitHub repository. This potentially leads to the push of malicious code that can flow to production through the CI/CD pipeline, or to access secrets stored in the GitHub repository and organization.”

How secrets get included in artifacts

GitHub Actions is a CI/CD service that allows users to set up workflows for automating code builds and tests inside containers either on GitHub’s or the user’s own infrastructure. These workflows are defined in .yml files using YAML syntax and are automatically executed when certain triggers or events occur on a repository. For example, new code commits could trigger an action to compile and test that code and produce a report.

Actions workflows often generate build artifacts, which can be compiled binaries, test reports, logs, or other files that result from the execution of a workflow and its individual jobs. These artifacts are stored for 90 days and can be consumed by other workflows or as part of the same workflow. In open-source projects, such artifacts are typically accessible to everyone.

However, during workflow execution it’s common to temporarily store access tokens in container environment variables or other temp files so the jobs can access the external tools and services needed for completion. As a result, extra care must be taken to ensure these secrets don’t leave that environment when the workflow is complete.

In one example that Avital found, a popular code linter used by many projects generated a log that included environment variables; that log was then saved as an artifact. Code linters are static code analysis tools that look for errors, vulnerabilities, and stylistic issues with the goal of improving code quality.
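One way to picture a safeguard against this kind of log leak is to mask the values of sensitive-looking environment variables before a log is saved. The following is a minimal sketch, not anything described in Avital's report: the name pattern, the length threshold, and the sample values are all assumptions chosen for illustration.

```python
import re

# Assumption: variable names containing these words are treated as sensitive.
SENSITIVE_NAME = re.compile(r"TOKEN|SECRET|PASSWORD|KEY", re.IGNORECASE)


def redact_env_values(log_text, environ):
    """Replace values of sensitive-looking environment variables in log_text."""
    for name, value in environ.items():
        # Skip very short values so common substrings are not masked by accident.
        if SENSITIVE_NAME.search(name) and len(value) >= 8:
            log_text = log_text.replace(value, "***")
    return log_text


# Made-up sample data; in a real job the mapping would be os.environ.
sample_env = {"GITHUB_TOKEN": "ghs_exampleexample1234", "HOME": "/home/runner"}
sample_log = "env: GITHUB_TOKEN=ghs_exampleexample1234 HOME=/home/runner"
print(redact_env_values(sample_log, sample_env))
```

Matching on variable names is a coarse heuristic. It will miss secrets stored under innocuous names, so it complements, rather than replaces, keeping environment dumps out of artifacts in the first place.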

More common is the exposure of GitHub tokens through actions/checkout, the action that creates a local clone of the repository so jobs can run against it. As part of this process, a temporary GITHUB_TOKEN is created and saved inside the local .git folder to allow the execution of authenticated git commands. This token is supposed to be ephemeral and should stop working once the workflow completes, but there are still ways to abuse it if exposed.

“From what I’ve seen, users commonly — and mistakenly — upload their entire checkout directory as an artifact,” Avital wrote in his report. “The directory contains the hidden .git folder that stores the persisted GITHUB_TOKEN, leading the publicly accessible artifacts to contain the GITHUB_TOKEN.”
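A pre-upload check for this mistake could look roughly like the sketch below. It walks a directory that is about to be uploaded and flags any .git/config file carrying an authorization header. The function name is hypothetical, and the assumption that the persisted credential appears as an `extraheader` entry containing `AUTHORIZATION` is based on how actions/checkout is commonly observed to configure git, not on the report itself.

```python
import tempfile
from pathlib import Path


def find_persisted_credentials(upload_dir):
    """Return .git/config files under upload_dir that contain an auth header."""
    hits = []
    # rglob also descends into hidden directories such as .git
    for config in Path(upload_dir).rglob("config"):
        if config.parent.name != ".git" or not config.is_file():
            continue
        text = config.read_text(errors="ignore").lower()
        # Assumed marker of a persisted credential (see lead-in).
        if "extraheader" in text and "authorization" in text:
            hits.append(config)
    return hits


# Demo on a throwaway directory with a placeholder value, not a real token.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    (git_dir / "config").write_text(
        '[http "https://github.com/"]\n'
        "\textraheader = AUTHORIZATION: basic PLACEHOLDER\n"
    )
    print(len(find_persisted_credentials(tmp)))
```

A simpler fix is often to upload only the specific build outputs instead of the whole checkout directory. The actions/checkout action also accepts a `persist-credentials: false` input, which keeps the token out of the local git config.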

Another GitHub token Avital found inside artifacts is the ACTIONS_RUNTIME_TOKEN, which is stored in environment variables when certain GitHub Actions are called, for example actions/cache and actions/upload-artifact. Unlike GITHUB_TOKEN, which is only valid during the workflow run, ACTIONS_RUNTIME_TOKEN is valid for six hours, even after the workflow finishes, so there’s a significant window for abuse.

“I automated a process that downloads an artifact, extracts the ACTIONS_RUNTIME_TOKEN, and uses it to replace the artifact with a malicious one,” the researcher wrote. “Subsequent workflow jobs often rely on previously uploaded artifacts. Cases of this kind open the door for remote code execution (RCE) on the runner that runs the job consuming the malicious artifact. RCE can also occur if developers download and execute a malicious artifact, leading to compromised workstations.”

The new GITHUB_TOKEN race condition

As previously mentioned, the GITHUB_TOKEN expires when the workflow finishes, with artifacts available only after workflows are done. Since February, however, that’s no longer the case.

Version 4 of GitHub Actions Artifacts introduced the ability to download artifacts via the user interface or through the API while the workflow run is still in progress. The community had requested this feature, which is useful in many situations, such as reviewing an artifact before approving a release, or getting an artifact sooner from workflows that have many jobs and take a long time to complete.

But with Avital’s finding that GITHUB_TOKEN is commonly exposed in artifacts, this new feature creates a race condition that attackers can win: If they know when a workflow is started, they can try to obtain the artifact and extract the GITHUB_TOKEN while it’s still valid because the workflow has not yet finished.

The success rate varies from workflow to workflow. In many cases, generating the artifact is one of the last steps the job performs, so the window to download the artifact, extract the token, and perform a malicious action with it is too small. However, some workflows have more steps defined after generating the artifact, so it’s only a matter of finding those. As Avital points out, the list of projects that have switched to v4 of the artifacts API is growing rapidly because the previous v3 is scheduled for deprecation in November.

To expand the scope of the attack and the number of workflows that can be targeted, the researcher created a malicious workflow that runs on GitHub’s infrastructure and triggers when a workflow in a targeted repository executes. The malicious workflow then makes dozens of API requests per second to detect the artifact as soon as it is generated, and downloads it. Artifacts are archived, but instead of unpacking the entire archive, the researcher’s script extracts just the git config file that contains the token, which significantly speeds up the attack.
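The selective-extraction step is also useful defensively, for example when auditing your own project's artifacts for a stray git config. The sketch below assumes the artifact is a zip archive and that the file of interest sits at a path ending in .git/config. It is an illustration of reading a single archive member, not the researcher's actual script, and the function name is hypothetical.

```python
import io
import zipfile


def read_git_config(artifact_bytes):
    """Return the text of the first .git/config member in a zip, or None."""
    with zipfile.ZipFile(io.BytesIO(artifact_bytes)) as archive:
        for name in archive.namelist():
            if name == ".git/config" or name.endswith("/.git/config"):
                # Only this one member is decompressed; the rest is skipped.
                return archive.read(name).decode("utf-8", errors="replace")
    return None


# Build a small in-memory archive as stand-in data for the demo.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("build/app.bin", b"\x00" * 1024)
    archive.writestr(".git/config", "[core]\n\tbare = false\n")

print(read_git_config(buffer.getvalue()))
```

Because a zip archive keeps an index of its members, a single file can be read without decompressing the others, which is why this approach is faster than unpacking the whole artifact.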

Mitigation

To help secure against such attacks, the researcher created a custom GitHub action called upload-secure-artifact that others can include in their workflows to scan generated artifacts for secrets and prevent their upload if any are detected.
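The general idea of scanning before upload can be sketched as follows. This is not the implementation of upload-secure-artifact, whose internals the article does not describe. It is a simplified illustration that assumes GitHub tokens can be recognized by their documented prefixes (such as `ghp_` and `ghs_`), and the length threshold and function names are hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumption: GitHub token prefixes followed by a run of alphanumerics.
# Real scanners cover many more secret formats than this one pattern.
TOKEN_PATTERN = re.compile(rb"gh[pousr]_[A-Za-z0-9]{20,}")


def files_with_tokens(directory):
    """Return names of files under directory that match the token pattern."""
    hits = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file() and TOKEN_PATTERN.search(path.read_bytes()):
            hits.append(path.name)
    return hits


def guard_upload(directory):
    """Raise instead of uploading when a possible token is found."""
    hits = files_with_tokens(directory)
    if hits:
        raise RuntimeError(f"refusing to upload, possible tokens in: {hits}")
    return True


# Demo with a fabricated token-shaped string, not a real credential.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "report.txt").write_text("all tests passed\n")
    (Path(tmp) / "debug.log").write_text("token=ghs_" + "A" * 36 + "\n")
    print(files_with_tokens(tmp))
```

Failing the job when a match is found is a deliberately strict choice in this sketch: a false positive costs a rerun, while a missed token in a public artifact may be usable by anyone who downloads it.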

“GitHub’s deprecation of Artifacts V3 should prompt organizations using the artifacts mechanism to reevaluate the way they use it,” the researcher wrote. “Reduce workflow permissions of runner tokens according to least privilege and review artifact creation in your CI/CD pipelines.”