JVM Advent

The JVM Programming Advent Calendar

Santa’s Python Pitfalls: A Java Developer’s Guide to Staying Safe This Christmas

Just like that, it happened. You, a disciplined Java developer, are now installing Python. Like everything in 2025, it just arrived. One day you were running a tidy mvn install; the next, you’re learning about virtual environments and fighting an unfriendly pip install that won’t explain what it just pulled from the internet.

Good news: staying safe in Python’s world isn’t complicated. It’s just not obvious. Think of this as a friendly reminder to check the extension cable before plugging in another set of lights.

 

Pip Install vs. Mvn Install: A World of Difference

Java’s packaging has its flaws, but predictability isn’t one of them. Maven Central is curated. Group IDs enforce a namespace. When you fetch com.fasterxml.jackson.core:jackson-databind, you get that specific artefact from that particular organisation.

Python is radically different:

  • A Flat Namespace: Package names are first-come, first-served on PyPI. Name collisions and typosquatting are primary attack vectors.
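  • A Trusting Resolver: pip does not rank indexes by trust. When more than one index is configured (for example via --extra-index-url), it treats them as equals and installs the best matching version it finds, which usually means the highest version number. That behaviour is the core of dependency confusion attacks.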

If you’re used to Java’s rules, Python’s model will trip you up. Please treat it with extreme care.

Be Suspicious of “pip install …”

Java developers might fall for an occasional fake install script; Python encourages them. The internet is full of “copy this pip install and trust me.” You shouldn’t accept that for a Maven dependency. Don’t accept it here.

If you see pip install my-cool-tool, ask the same questions you would for a new Maven dependency:

  1. Who owns this package? (Check PyPI, GitHub activity, and developer reputation).
  2. Does the name look close to something legitimate? (Guard against typosquatting: requests vs. requessts).
  3. Is this the actual source, or a mirror?
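The typosquatting question above can be partly automated. What follows is a minimal sketch, assuming you maintain a list of package names you already trust; the KNOWN_PACKAGES list, the looks_like_typosquat helper, and the 0.85 similarity cutoff are all illustrative choices, not an established tool or standard.

```python
import difflib

# Hypothetical allowlist: in practice this would be your organisation's
# approved package names, not this short illustrative sample.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "urllib3", "cryptography"]


def looks_like_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return the known package that `name` closely resembles, or None.

    An exact match is treated as fine; a near miss is treated as suspicious.
    """
    normalised = name.lower().replace("_", "-")
    if normalised in known:
        return None
    matches = difflib.get_close_matches(normalised, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(looks_like_typosquat("requessts"))  # near miss of a known name
print(looks_like_typosquat("requests"))   # exact match, nothing to flag
```

A check like this only catches near misses of names you already know about, so it complements the ownership and source questions rather than replacing them.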

Dependency confusion in Python is embarrassingly easy. Suppose you have a private internal package called company-utils. If your installer consults both your internal index and public PyPI, a malicious public package with the same name can hijack it simply by having a higher version number. Nothing prevents this by default; you have to mitigate it explicitly.
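To see why the higher version number wins, here is a toy model of the problem. It is not pip’s real resolver: the index names, the catalogue contents, and the naive_resolve and parse_version helpers are invented for illustration, and version parsing is reduced to dotted integers.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2). Simplified: real version parsing is richer."""
    return tuple(int(part) for part in text.split("."))


def naive_resolve(package, indexes):
    """Pick the highest version across all indexes, ignoring where it came from."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, catalogue in indexes.items()
        for version in catalogue.get(package, [])
    ]
    if not candidates:
        raise LookupError(f"{package} not found on any index")
    version, index_name = max(candidates)
    return ".".join(map(str, version)), index_name


# Hypothetical indexes: one internal, one public with an attacker's upload.
indexes = {
    "internal": {"company-utils": ["1.4.2", "1.5.0"]},
    "public": {"company-utils": ["99.99.99"]},
}

chosen = naive_resolve("company-utils", indexes)
print(chosen)
```

In this model the public 99.99.99 upload beats the internal 1.5.0, because nothing in the selection step records which index is the trusted one.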

Your Checklist:

  • Prefer explicit version pins. (e.g., requests==2.28.1).
  • Prefer known sources. Use a private package registry such as Artifactory or Nexus to proxy and cache PyPI, so your organisation can block known-bad packages and prioritise your internal ones. (It may sound ‘corporate’, but it’s essential.)
  • Prefer tooling that uses a lock file.
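The first checklist item is easy to enforce mechanically. The sketch below flags requirement lines that are not pinned with ==. It assumes a simple requirements.txt layout; the unpinned_requirements helper and the sample contents are hypothetical, and the parsing is deliberately simplified (for instance, it does not handle URLs containing a # fragment).

```python
import re

# Matches "name==version", with optional extras such as "name[extra]==version".
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^\s;]+")


def unpinned_requirements(text):
    """Return the requirement lines in `text` that are not pinned with '=='."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options like -r or --hash
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


# Hypothetical requirements file contents.
sample = """
requests==2.28.1
urllib3>=1.26   # loose constraint
flask
"""

print(unpinned_requirements(sample))
```

A check along these lines could run in CI as a cheap first gate, with a proper lock file doing the heavier lifting.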

 

Take Five Minutes to Learn Virtual Environments

You’ve probably ignored Python’s advice about virtual environments already. Don’t. This is the difference between a clean setup and a machine that slowly accumulates mysterious, conflicting modules.

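A virtual environment gives you the isolation you take for granted in Java’s classpath model. Without it, every dependency you install lands in a single site-packages directory shared by your whole system or user account, where projects can overwrite each other’s versions.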

Create one:

python3 -m venv .venv

source .venv/bin/activate

Now every pip install lands inside .venv rather than in your system-wide Python. This gives you reproducibility and makes scanning, policy checks, and SBOM generation cleaner. (In fact, once you start using Python in more depth, you’ll realise you can’t live without this approach, so do the right thing and your future 2026 self will thank you.)
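If you want a script to refuse to run outside a virtual environment, one common approach is to compare sys.prefix with sys.base_prefix. This is a minimal sketch; the in_virtual_environment helper name is made up here, and the check covers environments created with the standard venv module.

```python
import sys


def in_virtual_environment():
    """True when the interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if not in_virtual_environment():
    print("Warning: no virtual environment is active for this interpreter")
```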

Pin Your Dependencies. Really! Pin your dependencies.

In Java, you pin dependencies by default in your POM. Since transitive dependencies on Maven Central are immutable, you know the complete set upfront and permanently. In Python, if you don’t pin, you get drift. Packages can silently update, pulling in new transitive dependencies.

Most critically: Malicious actors publish higher-numbered versions to trick the resolver. If you use a loose constraint like package>=1.0.0, an attacker publishing package==99.99.99 can compromise your build.

  • Use pip-tools, Poetry, or uv.
  • Generate a lock file (your pom.xml equivalent).
  • Add the lock file to source control.
  • Use Hashes: Generate hashes for all dependencies. They are your final line of defence against tampered packages.
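To make the hash idea concrete, here is a rough sketch of the comparison that hash checking boils down to: compute the SHA-256 of a downloaded file and compare it with the value recorded in your lock file. In practice tools do this for you (pip has a --require-hashes mode, and pip-tools can generate the hashes); the file name, contents, and helper names below are invented for the demonstration.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, expected_hex):
    """Compare the file's digest with the pinned value."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())


# Demonstration with a stand-in file; a real artefact would be a wheel or sdist.
payload = b"pretend this is a package archive"
with tempfile.TemporaryDirectory() as tmp:
    artefact = os.path.join(tmp, "example-1.0.0.tar.gz")  # hypothetical name
    with open(artefact, "wb") as handle:
        handle.write(payload)
    pinned = hashlib.sha256(payload).hexdigest()
    ok = matches_pinned_hash(artefact, pinned)
    tampered = matches_pinned_hash(artefact, "0" * 64)

print(ok, tampered)
```

A version pin says which release you asked for; the hash says the bytes you received are the ones you originally locked.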

The Out-of-Support Trap: The Real Threat of Old Packages

Attackers don’t rely solely on brand-new exploits; they often compromise systems by manipulating developers (i.e., you) into installing old, vulnerable packages that are frequently out of support.

The workflow is:

  1. A bad actor publishes an old, vulnerable version (perhaps with malicious code added) or relies on an unmaintained package with a known CVE.
  2. The vulnerable version is pulled in due to a loose dependency constraint (e.g., a sub-dependency requires old-library<2.0.0).
  3. Your vulnerability scanner may catch the known CVE, but since the package is long out of support (its maintainers have moved on, or the version is too old), no patch exists.

The Cost:

  • Vulnerability: You have a known exploit in your stack.
  • Technical Debt: You now have to either fork and patch the unmaintained library yourself (probably a massive undertaking) or re-architect your application to use a different, supported library. You’ve essentially built a stack that’s immediately due for a costly rebuild.

Mitigation:

  • Automated Scanners are Non-Negotiable: Use tools like Syft, Safety, Sonatype or Snyk to generate an SBOM and scan for known CVEs on every build. (Do all security companies start with ‘S’?)
  • Policy Checks: Block builds that rely on packages with severe, unpatched CVEs, or that use versions marked as “end-of-life” by the maintainer. Check for third-party maintainers too. Not all open source is abandoned. Sometimes it gets picked up and brushed down.
  • Audit Your Transitives: The lock file is key. It lets you see the entire dependency graph, not just your direct requirements, and vet for ancient, risky components.
  • Look for commercial support. Strangely, there are companies who work to keep you safe. End-of-life support (not really what it sounds like) can often be found for those pesky out-of-support open source components you’ve just installed and now discovered need a fix.
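Before any scanner can help, you need to know what is actually installed. As a starting point, the standard library can list every distribution visible to the current interpreter. This is only a bare inventory, not an SBOM in any standard format, and the installed_inventory helper is an illustrative name.

```python
from importlib import metadata


def installed_inventory():
    """Return sorted (name, version) pairs for every visible distribution."""
    items = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken metadata
            items.add((name.lower(), dist.version or "unknown"))
    return sorted(items)


for name, version in installed_inventory():
    print(f"{name}=={version}")
```

Run inside a virtual environment, this shows direct and transitive dependencies side by side, which is often the first surprise.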

Know What “Local Install” Actually Means

Python encourages local installs with instructions like pip install -e . (the trailing dot means the current directory). This installs the package in “editable” mode from your working directory. While convenient for development, it means the installed code changes instantly whenever someone edits files on disk.

  • Avoid in Production: This mutability is the last thing you want in a production flow. 
  • Understand the Implications: Use editable installs only for local, feature-branch development, where mutability is a feature, not a bug.

If you don’t want to be on the leading edge, don’t do this!
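If you want to check whether an environment contains editable installs, one possible approach is to look at the direct_url.json file that installers following PEP 610 record next to a package’s metadata. The sketch below assumes your installer writes that file; older setups may not, so an empty result is not proof of absence. The helper names are made up for this example.

```python
import json
from importlib import metadata


def is_editable(direct_url_text):
    """Interpret the contents of a PEP 610 direct_url.json file."""
    if not direct_url_text:
        return False
    try:
        info = json.loads(direct_url_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(info, dict):
        return False
    dir_info = info.get("dir_info")
    return isinstance(dir_info, dict) and bool(dir_info.get("editable", False))


def editable_distributions():
    """Names of installed distributions that were recorded as editable."""
    names = []
    for dist in metadata.distributions():
        # read_text returns None when the file is not present.
        if is_editable(dist.read_text("direct_url.json")):
            names.append(dist.metadata.get("Name") or "unknown")
    return sorted(names)


print(editable_distributions())
```

A production image or deployment check could fail the build whenever this list is not empty.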

 

Keep Your LLM Tools Contained

Many developers install Python only for LLMs. That’s fine, but keep those tools contained. LLM agents and model servers often run their own complex Python interpreters and download additional files.

  • Isolation: Do not let your LLM workspace share a Python environment with your production scripts.
  • Sandboxing: Do not run tools that download models or plugins without sandboxing (e.g., in a container or a dedicated VM).
  • No curl | bash: Do not assume that “AI tool = harmless CLI.” Do not run shell-piped installers for model servers.  
  • No curl | bash: Hey, it’s the holidays, we can count things twice. Seriously, while pip install -r requirements.txt is a rich source of malware and compromised systems, so is curl | bash. Be very, very careful about using it to install software, even if you think you trust the website or organisation involved.

Keep on the ‘nice’ list. 

Python gives you rope. Your established Java discipline is your best defence:

  • Trust Nothing: Treat PyPI packages like third-party artefacts: verify their origin and security posture.
  • Pin Everything: Go beyond version pinning to use lock files and dependency hashes.
  • Automate Scans: Make SBOM generation and CVE scanning mandatory in your CI/CD pipeline.
  • Environments as Cattle: Use virtual environments or containers, and frequently rebuild them to ensure a clean, reproducible state.

If you stick to these Java security principles, Python becomes far less chaotic and much more manageable.

Have a great holiday season and keep your software safe.


© 2025 JVM Advent