Python administrator moves to improve software security

News Analysis
23 Jan 2025 | 5 mins
Malware, Open Source, Software Development

The popular programming language has added a way to check for malware-laden packages.

The administrators of the Python Package Index (PyPI) have begun an effort to improve the security of the hundreds of thousands of software packages it lists. The effort, which began last year, aims to identify malware-laced packages and stop them from proliferating across the open-source community that contributes to and consumes Python software. As previously reported, hijacking Python programs has become widespread.

The effort, called Project Quarantine, is described in a blog post by Mike Fiedler, the sole administrator responsible for PyPI security. The project allows PyPI administrators and a select group of developers to mark a project as potentially harmful and prevent users from easily installing it, avoiding further harm.

What makes PyPI interesting for bad actors

Python is a victim of its own success. PyPI lists more than 625,000 packages, with new ones being added daily. This provides a tempting opportunity for malware authors to inject their code into a package. From there, the code is distributed across the internet by developers who are unaware that their own code has been polluted.

This package volume means the index is under constant threat from malicious actors. Attacks include publishing similarly named packages to typosquat legitimate ones and exploiting dependency confusion, as Tom Callaway wrote in a 2023 blog post. “Since Python is modular in nature, most Python applications rely heavily on PyPI to provide the necessary dependencies for core functions rather than reinventing them each time. PyPI is also the primary distribution point for Python applications and libraries.”
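To make the typosquatting risk concrete, here is a minimal sketch of how a team might flag a requested package name that closely resembles a well-known one. It is an illustration only, not how PyPI screens uploads: the `POPULAR_PACKAGES` list, the `possible_typosquat` helper, and the 0.8 similarity cutoff are all assumptions chosen for the example.

```python
import difflib
import re

# Hypothetical allowlist of well-known names. A real check would draw on a
# much larger list, such as an index's most-downloaded projects.
POPULAR_PACKAGES = ["requests", "numpy", "pandas", "urllib3", "setuptools"]


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def possible_typosquat(candidate: str, cutoff: float = 0.8) -> list[str]:
    """Return well-known names the candidate resembles without matching exactly."""
    cand = normalize(candidate)
    known = [normalize(p) for p in POPULAR_PACKAGES]
    if cand in known:
        return []  # exact match: this is the legitimate name itself
    # difflib scores string similarity from 0 to 1; the cutoff is an assumption.
    return difflib.get_close_matches(cand, known, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for name in ["requests", "reqeusts", "flask"]:
        print(name, "->", possible_typosquat(name))
```

A name-similarity check like this catches only the simplest lookalikes. It says nothing about what the package's code actually does.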

The language “is something new programmers are attracted to because it is easy to learn, and this means many developers aren’t necessarily thinking about security,” Ed Woodruff, an offensive security expert, told CSO. “Before the quarantine effort, there wasn’t much emphasis on security, and I am happy to see this project taking the lead.”

How other open-source projects fare against bad actors

Other open-source projects have lower new package volumes or have commercial organizations with funding and resources to act as hall monitors. Take NPM, the index of JavaScript software that is maintained by GitHub, as an example of the latter situation. “GitHub is great at screening for malware, and they have some of the best security researchers in the world,” Janet Worthington, a Forrester Research analyst, told CSO.

NPM has a lot more resources to prevent compromised packages, according to Shachar Menashe, VP of security research at JFrog. “PyPI only has four full-time administrators, with Fiedler being the sole security specialist.” Even with all those resources, NPM still sees about a dozen malicious JavaScript packages each day.

Other ways to secure open-source projects

Other methods can complement the quarantine mechanism to better secure these vast open-source code collections. For example, numerous vendors offer static and dynamic application scanners, and other tools compile software bills of materials (SBOMs) to provide some visibility into dependent projects. SBOMs received a big emphasis when the Apache Log4j exploits hit during 2021, and they are one of the focuses of a recent Biden Executive Order intended to motivate better software security. There are also commercial scanning tools, such as Sonatype’s Repository Firewall, that claim to be able to intercept malware early in the development life cycle.
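As a rough illustration of the visibility an SBOM is meant to provide, the sketch below lists the Python distributions installed in the current environment along with their declared dependencies, using the standard library's `importlib.metadata`. It is a simplified inventory, not a standards-compliant SBOM in a format such as CycloneDX or SPDX, and the `installed_inventory` helper name is an assumption made for this example.

```python
from importlib import metadata


def installed_inventory() -> list[dict]:
    """List installed distributions with versions and declared dependencies."""
    inventory = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue  # skip broken or partial installs that lack a name
        inventory.append({
            "name": name,
            "version": dist.metadata.get("Version"),
            # requires is None when a distribution declares no dependencies
            "requires": sorted(dist.requires or []),
        })
    return sorted(inventory, key=lambda item: item["name"].lower())


if __name__ == "__main__":
    for item in installed_inventory():
        print(f'{item["name"]}=={item["version"]} ({len(item["requires"])} declared deps)')
```

An inventory like this shows what is installed, not whether any of it is malicious. Its value is in letting a team quickly check exposure once a bad package is reported.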

Another path is through developer education, “so that they consider security earlier in their coding processes. And enterprises should pre-screen any package ahead of its use as a general best practice,” said Worthington. Sadly, this notion of security by design has been kicking around the development world for decades.

The quarantine feature was enabled last August and initially flagged about 140 projects, all but one of which contained malware and have since been removed from circulation. Before the feature was deployed, PyPI administrators had only a blunt tool for dealing with suspicious packages: permanent removal from their database. That was fine if the package did contain malware, but some of these designations were false positives, and the affected packages were not easily reinstated.

The quarantine process puts the package in limbo while it is being analyzed. It also helps shorten the time a malicious package is available for use, and “further reduces the incentive for malicious actors to use PyPI as their distribution method,” Fiedler wrote in his post. He mentions that the team is working on ways to use automation more effectively to screen Python packages and identify malicious behavior earlier in the evaluation process. “The quarantine process also helps to reduce false positives,” said Menashe. “You just can’t block something without looking carefully at the code, and that takes time.”
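The difference between permanent removal and quarantine can be pictured as a small state model. The sketch below is a toy illustration, not PyPI's implementation: the `Status` values, the `TRANSITIONS` table, and the `Project` class are all hypothetical names, and the allowed transitions are assumptions drawn only from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    QUARANTINED = "quarantined"
    REMOVED = "removed"


# Allowed transitions in this toy model (an assumption, not PyPI's actual rules).
TRANSITIONS = {
    Status.ACTIVE: {Status.QUARANTINED},
    # After review, a project is either cleared (false positive) or removed.
    Status.QUARANTINED: {Status.ACTIVE, Status.REMOVED},
    Status.REMOVED: set(),  # removal is treated as final here
}


@dataclass
class Project:
    name: str
    status: Status = Status.ACTIVE

    def move_to(self, new_status: Status) -> None:
        """Change status, rejecting transitions the model does not allow."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status

    @property
    def installable(self) -> bool:
        """Only active projects are offered to users in this model."""
        return self.status is Status.ACTIVE


if __name__ == "__main__":
    project = Project("example-package")  # hypothetical project name
    project.move_to(Status.QUARANTINED)   # flagged: held in limbo during review
    print(project.installable)
    project.move_to(Status.ACTIVE)        # false positive: restored
    print(project.installable)
```

The point of the intermediate state is that a false positive can be reversed cheaply, whereas in this model a removal cannot.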

“Quarantine sounds like a very practical approach to help with a complex problem,” Jeff Williams, CTO of Contrast Security, told CSO. “It’s certainly not a cure-all for malicious code, more like a tourniquet. But it is a great way to protect users temporarily until a more rigorous evaluation can be done. Malware is impossible to detect automatically, because attackers can make it look benign and there are many sources of code that get contributed to open-source projects. So, the traditional defenses — including our tools — don’t attempt to try to block malware. There has to be a manual process.”