by Lucian Constantin

CSO Senior Writer

Python GitHub token leak shows binary files can burn developers too

News Analysis

11 Jul 2024 · 5 mins

Application Security, DevSecOps, Software Development

Scrubbing tokens from source code is not enough, as shown by a Python Software Foundation access token with administrator privileges that was published inside a container image on Docker Hub.

A personal GitHub access token with administrative privileges to the official repositories for the Python programming language and the Python Package Index (PyPI) was exposed for over a year. The access token belonged to the Python Software Foundation’s director of infrastructure and was accidentally included in a compiled binary file that was published as part of a container image on Docker Hub.

“Although we encounter many secrets that are leaked in the same manner, this case was exceptional because it is difficult to overestimate the potential consequences if it had fallen into the wrong hands — one could supposedly inject malicious code into PyPI packages (imagine replacing all Python packages with malicious ones), and even to the Python language itself,” researchers from security firm JFrog, who found and reported the token, wrote in a report.

The incident shows that scrubbing access tokens only from source code, which some development tools do automatically, is not enough to prevent potential security breaches. Sensitive credentials can also end up in environment variables, configuration files and even binary artifacts as a result of automated build processes and developer mistakes.

The Python token leak was the result of laziness

Ee Durbin, the administrator of PyPI and director of infrastructure for the Python Software Foundation (PSF), wrote an incident report explaining how the leak happened. The leak involved the access token for Durbin’s own account, which had administrative privileges due to his role in the organization.

In early 2023, Durbin was working on cabotage-app, a Docker-based tool developed by the PSF that is used to deploy PyPI and associated services on a Kubernetes cluster. While working on the build portion of the codebase, he kept running into API rate limits that GitHub enforces for anonymous access.

In what he calls “an act of laziness,” Durbin decided to modify the source code locally to include an access token for his own account in order to bypass the default rate limits and finish the job faster. This was a quick fix, an alternative to configuring a localhost GitHub App to do the build instead of using the GitHub API.

While Durbin knew that adding personal access tokens (PATs) to source code is bad security practice, the change was only to his local copy of the codebase and was never intended to be pushed remotely. In fact, the automated build and deployment script was supposed to revert local changes, which should have scrubbed the token.
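One way to avoid editing a token into source at all is to read it from the environment at run time. The following is a minimal sketch, not the approach cabotage-app uses: the function name github_headers and the GITHUB_TOKEN variable name are assumptions chosen for illustration. The environment itself still has to be kept out of images and logs.

```python
import os


def github_headers(env_var: str = "GITHUB_TOKEN") -> dict[str, str]:
    """Build request headers for the GitHub REST API.

    The token is read from the environment (hypothetical variable name),
    so it never appears in source files or in compiled bytecode.
    """
    headers = {"Accept": "application/vnd.github+json"}
    token = os.environ.get(env_var)
    if token:
        # Authenticated requests get higher rate limits than anonymous ones.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

If the variable is unset, the function falls back to anonymous headers rather than failing, which keeps local builds working at the lower rate limit.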

What Durbin didn’t realize was that the token was also included in .pyc (Python compiled bytecode) files generated as part of the build process, and that those files, stored in the __pycache__ folder, were not configured to be excluded from the final Docker image uploaded to Docker Hub.
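The mechanism can be reproduced with a placeholder value. The sketch below is an illustration under stated assumptions, not a reconstruction of the cabotage-app build: the module name build_helper.py and the fake token are hypothetical. It hardcodes the placeholder, compiles the module, reverts the source edit, and then checks whether the cached bytecode still contains the string.

```python
import pathlib
import py_compile
import tempfile

# Hypothetical placeholder, not a real credential.
FAKE_TOKEN = "ghp_" + "x" * 36


def stale_bytecode_keeps_token() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "build_helper.py"  # hypothetical module name

        # 1. Local edit: the token is hardcoded in the source file.
        src.write_text(f'TOKEN = "{FAKE_TOKEN}"\n')

        # 2. Compiling writes a .pyc file (by default under __pycache__/).
        pyc = pathlib.Path(py_compile.compile(str(src)))

        # 3. The source edit is reverted, as a cleanup script might do.
        src.write_text("TOKEN = None\n")

        # 4. The stale .pyc was not regenerated and still holds the constant.
        return FAKE_TOKEN.encode("ascii") in pyc.read_bytes()


if __name__ == "__main__":
    print(stale_bytecode_keeps_token())
```

String constants are stored as plain bytes in compiled bytecode, so reverting the .py file does nothing to the copy already cached next to it.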

After being notified by JFrog in late June, the PyPI security team revoked the token and reviewed all GitHub audit logs and account activity for possible signs that the token might have been used maliciously. No evidence of malicious use was found. The cabotage-app version containing the token was published on Docker Hub on March 3, 2023, and was removed on June 21, 2024 — fifteen months later.

“Cabotage is now entirely self-hosting, which means that builds of the cabotage-app no longer utilize a public registry and deployment builds are initiated from clean checkouts of source only,” Durbin wrote. “This mitigates the scenario of local edits making it into an image build outside of development environments, as well as removing the need to publish to public registries.”

Durbin said he will avoid creating personal access tokens for his account in the future unless absolutely needed, because aside from this one case, he doesn’t remember any other instances where such a long-lived token has been helpful.

“This is a great reminder to set aggressive expiration dates for API tokens (if you need them at all), treat .pyc files as if they were source code, and perform builds on automated systems from clean source only,” he advised.
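Treating .pyc files like source code starts with knowing where they are. As a rough sketch, a pre-build check could list bytecode artifacts in the directory that is about to be packaged so they can be deleted or excluded. The helper names below are hypothetical, and the check covers only Python bytecode, not other kinds of build output.

```python
from pathlib import Path, PurePath


def is_bytecode_artifact(path: PurePath) -> bool:
    """Return True for compiled Python files or anything under __pycache__."""
    return path.suffix in {".pyc", ".pyo"} or "__pycache__" in path.parts


def find_bytecode_artifacts(root: str) -> list[Path]:
    """List every bytecode artifact below root (for example, a build context)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and is_bytecode_artifact(p)
    )


if __name__ == "__main__":
    # Report anything that should not travel into an image build.
    for artifact in find_bytecode_artifacts("."):
        print(artifact)
```

A build script could refuse to continue when the list is not empty, which pushes builds toward the clean checkouts Durbin recommends.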

JFrog congratulated the PyPI security team for responding to their report and revoking the token within an impressive 17 minutes. While having perfect security is never possible, having a clear point of contact for security issues and a fast response time is critical to limiting the impact of security incidents for any organization.

Advice for developers

Aside from scanning binary artifacts and configuration files for potential secrets, developers should use the fine-grained GitHub personal access tokens introduced in 2022 instead of the classic ones. Fine-grained tokens let users choose the privilege levels and the specific repositories they provide access to.

“Creating the ‘one ring to rule them all’ is always a bad idea,” the JFrog researchers wrote in their report. “We highly recommend using this feature, as we frequently encounter situations where a token providing ultimate access to the entire infrastructure gets leaked within a side project or temporary ‘hello-world’ application.”

In addition, since 2021 GitHub tokens have used a new format that includes a ghp_ prefix and a checksum, making them easier for automated tools to detect. Old-format GitHub tokens, which have not been deprecated and are still in use, are indistinguishable from SHA-1 hashes. Because such hashes are common in source code and pose no security risk, scanners may skip the old tokens. Developers are strongly advised to switch to the new token format.
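The difference between the two formats is easy to see in a simple scanner. The sketch below searches raw bytes, so it applies equally to source files, configuration files and binaries such as .pyc files. The patterns are assumptions based on the publicly described formats (a ghp_ prefix followed by 36 alphanumeric characters for new-format tokens, 40 hexadecimal characters for old ones), the function names are hypothetical, and the checksum is not validated.

```python
import re
from pathlib import Path

# Assumed shape of a new-format personal access token.
NEW_FORMAT = re.compile(rb"ghp_[A-Za-z0-9]{36}")

# Old-format tokens look like 40 hex characters, exactly like a SHA-1 hash,
# so a match here is ambiguous and cannot be flagged with confidence.
LEGACY_OR_SHA1 = re.compile(rb"\b[0-9a-f]{40}\b")


def scan_bytes(data: bytes) -> dict[str, int]:
    """Count candidate matches of each kind in a blob of bytes."""
    return {
        "new_format": len(NEW_FORMAT.findall(data)),
        "legacy_or_sha1": len(LEGACY_OR_SHA1.findall(data)),
    }


def scan_tree(root: str) -> dict[str, dict[str, int]]:
    """Scan every file under root, text or binary, and keep files with hits."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts = scan_bytes(path.read_bytes())
            if any(counts.values()):
                results[str(path)] = counts
    return results
```

A hit in the new_format column is almost certainly a credential, while a hit in the legacy_or_sha1 column is usually just a commit hash, which is why old-format tokens tend to slip through.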

Lucian Constantin writes about information security, privacy, and data protection for CSO.
