Technology at Flyclops

Replacing "pip bundle"

posted in Code on 2013-09-20 04:31:38 UTC by Dave Martorana

2014-02-06: This solution was a band-aid house of cards, and we eventually ended up moving to Packer, which removes the need to bundle Python dependencies at all.

Our API application is written in Python, using a lightweight framework called Flask, sitting behind nginx proxying to uWSGI. As part of our code-commit process, we bundled all of our Python library dependencies and committed them to our code repo. This removes the dependency on PyPI being up and available (which it often isn’t).

But bundle is going away.

Pip’s bundle command was not very popular, and that (and a lack of ongoing development) has led to the pip maintainers deciding to deprecate the code. When I last bundled our dependencies, I received the following message in the terminal:

###############################################
##                                           ##
##  Due to lack of interest and maintenance, ##
##  'pip bundle' and support for installing  ##
##  from *.pybundle files is now deprecated, ##
##  and will be removed in pip v1.5.         ##
##                                           ##
###############################################

Well that sucks.

So, not really knowing the alternatives and not finding much with LMGTFY, I turned to Stack Overflow with this question: Is there a decent alternative to “pip bundle”?

TL;DR - check out pip wheel.

Basically, wheel creates (or installs) binary packages of dependencies. What we want to do is create a cache of our dependencies and store them in our source repo.

NOTE: For packages that require compiling (think MySQL-python, etc.), wheel creates pre-compiled, platform-specific builds. If you are developing on OS X and using x86_64 Linux in production, you’ll have to cache your production binaries from Linux, not OS X.

So… Here are the steps.

Continuing to rely on requirements.txt

We need to modify our file just a little bit. If you have a pure dependency list, you’re good to go. If you have any pointers to code repositories (think git) you need a minor change. Here’s the before and after:

riak==2.0.0
riak-pb==1.4.1.1
pytest==2.3.5

-e git+git@your.gitrepo.com:/repo0.git#egg=repo0
-e git+git@your.gitrepo.com:/repo1.git#egg=repo1

Wheel doesn’t respect the -e flag and has trouble with SSH-based git links, so go ahead and put in the https equivalents. There is also no need to name the “egg”, as wheel is basically a replacement for eggs.

riak==2.0.0  
riak-pb==1.4.1.1
pytest==2.3.5

git+https://your.gitrepo.com/repo0.git
git+https://your.gitrepo.com/repo1.git
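If you have a lot of git-based lines, the before-and-after change can be scripted. Here is a minimal Python sketch of that rewrite, assuming your SSH links look like the ones in the example above. The function name to_wheel_friendly and the regular expression are my own invention for illustration and are not part of pip or wheel.

```python
import re

# Matches SSH-style git requirements such as
#   git+git@your.gitrepo.com:/repo0.git
# and captures the host and the repository path.
# (Assumption: links follow the shape used in this post.)
_SSH_GIT = re.compile(r"^git\+git@(?P<host>[^:/]+):/?(?P<path>.+)$")


def to_wheel_friendly(line):
    """Rewrite one requirements.txt line (hypothetical helper)."""
    stripped = line.strip()
    # Drop the editable flag, which wheel does not respect.
    if stripped.startswith("-e "):
        stripped = stripped[3:].strip()
    # Drop the "#egg=..." name, which is not needed for wheels.
    stripped = stripped.split("#egg=", 1)[0]
    # Swap the SSH link for its https equivalent.
    match = _SSH_GIT.match(stripped)
    if match:
        stripped = "git+https://{host}/{path}".format(**match.groupdict())
    return stripped


if __name__ == "__main__":
    print(to_wheel_friendly("-e git+git@your.gitrepo.com:/repo0.git#egg=repo0"))
```

Plain pinned lines such as riak==2.0.0 pass through unchanged, so the function can be mapped over every line of the file.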

Cache the dependencies as wheel packages

Now you can call pip to download and cache all of your dependencies. I like to put them in local/wheel (they are .whl, or “wheel” bundles, basically glorified zip files). You will require the “wheel” package for this part, but not for installing the already bundled packages.

Because wheels are pre-compiled, package file names end with the platform they were compiled for. Pure-Python packages, which can be installed anywhere, end in -none-any.whl. For instance, the boto package for Amazon AWS:

boto-2.3.0-py27-none-any.whl

However, packages such as MySQL-python that require binary compilation produce platform-specific file names, which differ between OS X and Linux (in our case, Ubuntu 13.04).
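To make the naming scheme concrete, here is a small Python sketch that pulls the tags out of a wheel file name. It assumes the dash-separated layout from the wheel format (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper names wheel_tags and is_pure_python are made up for this post, and the second file name in the demo is a hypothetical example, not one of our real packages.

```python
def wheel_tags(filename):
    """Split a wheel file name into its tags (hypothetical helper).

    Assumed layout: name-version[-build]-python-abi-platform.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel file name: %r" % filename)
    # The last three fields are always python, abi and platform,
    # whether or not the optional build tag is present.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def is_pure_python(filename):
    """True for wheels ending in -none-any.whl."""
    tags = wheel_tags(filename)
    return tags["abi"] == "none" and tags["platform"] == "any"


if __name__ == "__main__":
    # The boto file name comes from this post; the second is hypothetical.
    for name in ("boto-2.3.0-py27-none-any.whl",
                 "example_pkg-1.0-cp27-none-linux_x86_64.whl"):
        print(name, wheel_tags(name)["platform"], is_pure_python(name))
```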

To cache the wheel packages, run the following line:

$ pip install wheel && pip wheel --wheel-dir=local/wheel -r requirements.txt

This isn’t nearly as convenient as, say, using pip bundle to create a single requirements.pybundle file, but it works just fine.

Add to git, or whatever you use

Commit the local/wheel directory to your repo, so the bundles are available for you to install at production-time.

Installing on production servers

This is where we ran into problems. Despite being cached, any git-based packages still go out to git when you run the following command on the production server:

$ pip install --use-wheel --no-index --find-links=local/wheel -r requirements.txt

This breaks our desire to not have to rely on any external service for installing requirements. What’s worse is that the package in question is in fact in the ./local/wheel directory. So a little bit of command-line magic, and installing the packages by name works just as well:

$ ls local/wheel/*.whl | while read x; do pip install --use-wheel --no-index --find-links=local/wheel "$x"; done

This basically lists the local/wheel directory and passes each result to pip install --use-wheel, along with the --find-links argument, which tells pip to look for any dependencies in the local/wheel folder as well. --no-index keeps pip from looking at PyPI.
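If you would rather drive this from a deploy script than a shell one-liner, the same loop can be sketched in Python. This version only builds the argument lists and leaves running them to you. The function name install_commands is hypothetical, and the flags are copied from the command in this post rather than checked against any particular pip version.

```python
import glob
import os


def install_commands(wheel_dir="local/wheel"):
    """Return one pip argument list per .whl file in wheel_dir.

    Hypothetical helper: nothing is executed here.
    """
    commands = []
    # sorted() keeps the order stable from run to run.
    for path in sorted(glob.glob(os.path.join(wheel_dir, "*.whl"))):
        commands.append([
            "pip", "install",
            "--use-wheel",                # flags mirror the post's command
            "--no-index",                 # never go out to PyPI
            "--find-links=" + wheel_dir,  # resolve dependencies locally
            path,
        ])
    return commands


if __name__ == "__main__":
    for argv in install_commands():
        print(" ".join(argv))
```

To actually install, each list could be handed to subprocess.check_call, which stops at the first failing package.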

NOTE: If you have multiple binary packages for different platforms, you’ll have to modify the command above to ignore binary packages that are not built for the specific platform you’re installing to.
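One way to do that filtering is to look at the platform tag at the end of each file name. The Python sketch below keeps pure-Python wheels plus the ones whose tag matches exactly. The function name wheels_for_platform and the file names other than boto are hypothetical, and the exact-match rule is a simplifying assumption that may be too strict for your setup.

```python
def wheels_for_platform(filenames, platform_tag):
    """Keep wheels that are pure Python or built for platform_tag.

    Hypothetical helper: compares the platform tag by exact string match.
    """
    keep = []
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        # The platform tag is the last dash-separated field before ".whl".
        tag = filename[:-len(".whl")].rsplit("-", 1)[-1]
        if tag in ("any", platform_tag):
            keep.append(filename)
    return keep


if __name__ == "__main__":
    # Only the boto name comes from this post; the others are made up.
    cached = [
        "boto-2.3.0-py27-none-any.whl",
        "example_pkg-1.0-cp27-none-linux_x86_64.whl",
        "example_pkg-1.0-cp27-none-macosx_10_8_x86_64.whl",
    ]
    for name in wheels_for_platform(cached, "linux_x86_64"):
        print(name)
```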

Final word

Those are the basics. This can be automated in all sorts of ways - even zipping up all the wheel files into a single file to get you pretty close to a .pybundle file. It’s up to you - but hopefully this will help as you are torn away from the arms of pip bundle.
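As a rough idea of the single-file approach, here is a Python sketch that zips every cached wheel into one archive using the standard library. The function name bundle_wheels and the archive name are placeholders I picked for illustration, and this is not a drop-in replacement for the .pybundle format.

```python
import glob
import os
import zipfile


def bundle_wheels(wheel_dir, bundle_path):
    """Write every .whl in wheel_dir into one zip archive.

    Hypothetical helper. Returns the file names that were added.
    """
    names = []
    # ZIP_STORED: wheels are already zip files, so recompressing gains little.
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_STORED) as bundle:
        for path in sorted(glob.glob(os.path.join(wheel_dir, "*.whl"))):
            name = os.path.basename(path)
            # Store each wheel at the top level of the archive.
            bundle.write(path, arcname=name)
            names.append(name)
    return names


if __name__ == "__main__":
    # "requirements.zip" is a placeholder name.
    print(bundle_wheels("local/wheel", "requirements.zip"))
```

On the production side, zipfile.ZipFile("requirements.zip").extractall("local/wheel") would unpack the wheels again before running the install step.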

2013-09-23: Edited to better represent the binary nature of pre-compiled packages and their platform-specificness.
