Objectives
By the end of this post, you will be able to…
- Describe the difference between a Python wheel and egg
- Explain why you may want to build Python wheel files within a Docker container
- Spin up a custom environment for building Python wheels using Docker
- Bundle and deploy a Python project to an environment without access to the Internet
- Explain how this deployment setup can be considered immutable
Scenario
This post grew out of a scenario where I had to distribute a legacy Python 2.7 Flask app to a Centos 5 box that, for security reasons, did not have access to the Internet.
Python wheels (rather than eggs) are the way to go here.
Python wheel files are similar to eggs in that they are both just zip archives used for distributing code. Wheels differ in that they are installable but not executable. They are also pre-compiled, which saves the user from having to build the packages themselves and thus speeds up the installation process. Think of them as lighter, pre-compiled versions of Python eggs. They’re particularly great for packages that need to be compiled, like lxml or NumPy.
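Because a wheel is just a zip archive, you can inspect one with nothing but the standard library. The sketch below reads the `WHEEL` metadata file that lives in the archive's `*.dist-info` directory; the `wheel_metadata` helper is illustrative and not part of the project repo.

```python
import zipfile


def wheel_metadata(path):
    """Return the key/value pairs stored in a wheel's *.dist-info/WHEEL file."""
    with zipfile.ZipFile(path) as zf:
        candidates = [n for n in zf.namelist() if n.endswith(".dist-info/WHEEL")]
        if not candidates:
            raise ValueError("no WHEEL metadata found in %r" % path)
        text = zf.read(candidates[0]).decode("utf-8")

    meta = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            # Keys such as "Tag" may appear more than once, so keep lists.
            meta.setdefault(key.strip(), []).append(value.strip())
    return meta
```

Calling `wheel_metadata()` on any `.whl` file shows, among other things, the compatibility tags the wheel was built for.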
For more on Python wheels, check out Python on Wheels and The Story of Wheel.
That said, wheels should be built in the same environment in which they will be run, so building them across many platforms with multiple versions of Python can be a huge pain.
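To see why the build environment matters, it helps to look at a wheel's filename, which (per PEP 427) encodes a Python tag, an ABI tag, and a platform tag. Here is a minimal sketch of pulling those apart, assuming the standard `name-version(-build)-python-abi-platform.whl` layout. The helper names and the example filenames are illustrative only.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    # Five parts normally; six when the optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: %r" % filename)
    python, abi, platform = parts[-3:]
    return WheelTags(parts[0], parts[1], python, abi, platform)


def is_pure_python(tags):
    """Pure-Python wheels carry the ABI tag 'none' and the platform tag 'any'."""
    return tags.abi == "none" and tags.platform == "any"


if __name__ == "__main__":
    for fn in ("Flask-0.12.2-py2.py3-none-any.whl",
               "lxml-4.1.1-cp27-cp27mu-linux_x86_64.whl"):
        kind = "pure" if is_pure_python(parse_wheel_filename(fn)) else "platform-specific"
        print("%s -> %s" % (fn, kind))
```

A pure-Python wheel can be built anywhere, while a platform-specific one only installs on an interpreter and OS that match its tags.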
This is where Docker comes into play.
Bundle
Before beginning, it’s important to note that we will be using Docker simply to spin up an environment for building the wheels. In other words, we’ll be using Docker as a build tool rather than as a deploy environment.
Also, keep in mind that this process is not just for legacy apps - it can be used for any Python application.
Stack:
- OS: Centos 5.11
- Python version: 2.7
- App: Flask
- WSGI: gunicorn
- Web server: Nginx
Want a challenge? Replace one of the pieces from the above stack. Use Python 3.6 or perhaps a different version of Centos, for example.
If you’d like to follow along, clone down the base repo:
$ git clone git@github.com:testdrivenio/python-docker-wheel.git
$ cd python-docker-wheel
Again, we need to bundle the application code along with the Python interpreter and dependency wheel files.
cd into the “deploy” directory and then run:
$ sh build_tarball.sh 20180119
Review the deploy/build_tarball.sh script, taking note of the code comments:
#!/bin/bash
USAGE_STRING="USAGE: build_tarball.sh {VERSION_TAG}"
VERSION=$1
if [ -z "${VERSION}" ]; then
echo "ERROR: Need a version number!" >&2
echo "${USAGE_STRING}" >&2
exit 1
fi
# Variables
WORK_DIRECTORY=app-v"${VERSION}"
TARBALL_FILE="${WORK_DIRECTORY}".tar.gz
# Create working directory
if [ -d "${WORK_DIRECTORY}" ]; then
rm -rf "${WORK_DIRECTORY}"/
fi
mkdir "${WORK_DIRECTORY}"
# Cleanup tarball file
if [ -f "${TARBALL_FILE}" ]; then
rm "${TARBALL_FILE}"
fi
# Cleanup wheels
if [ -d "wheels/wheels" ]; then
rm -rf "wheels/wheels"
fi
mkdir "wheels/wheels"
# Copy app files to the working directory
cp -a ../project/app.py ../project/requirements.txt ../project/run.sh ../project/test.py "${WORK_DIRECTORY}"/
# remove .DS_Store and .pyc files
find "${WORK_DIRECTORY}" -type f -name '*.pyc' -delete
find "${WORK_DIRECTORY}" -type f -name '*.DS_Store' -delete
# Add wheel files
cp ./"${WORK_DIRECTORY}"/requirements.txt ./wheels/requirements.txt
cd wheels
docker build -t docker-python-wheel .
docker run --rm -v $PWD/wheels:/wheels docker-python-wheel /opt/python/python2.7/bin/python -m pip wheel --wheel-dir=/wheels -r requirements.txt
mkdir ../"${WORK_DIRECTORY}"/wheels
cp -a ./wheels/. ../"${WORK_DIRECTORY}"/wheels/
cd ..
# Add python interpreter
cp ./Python-2.7.14.tar.xz ./"${WORK_DIRECTORY}"/
cp ./get-pip.py ./"${WORK_DIRECTORY}"/
# Make tarball
tar -cvzf "${TARBALL_FILE}" "${WORK_DIRECTORY}"/
# Cleanup working directory
rm -rf "${WORK_DIRECTORY}"/
Here, we:
- Created a temporary working directory
- Copied over the application files to that directory, removing any .pyc and .DS_Store files
- Built (using Docker) and copied over the wheel files
- Added the Python interpreter
- Created a tarball, ready for deployment
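Before shipping the tarball, it can be worth checking that it really contains everything the offline box will need. Here is a rough sketch, assuming the layout produced by build_tarball.sh (a top-level `app-v<VERSION>/` directory holding the app files, `wheels/`, `Python-2.7.14.tar.xz`, and `get-pip.py`). The `missing_from_bundle` helper is hypothetical and not part of the repo.

```python
import tarfile

# Files the setup step relies on, relative to the app-v<VERSION>/ directory.
REQUIRED = ("app.py", "requirements.txt", "run.sh",
            "Python-2.7.14.tar.xz", "get-pip.py")


def missing_from_bundle(tarball_path, version):
    """Return a list of expected entries that are absent from the tarball."""
    root = "app-v%s" % version
    with tarfile.open(tarball_path, "r:gz") as tar:
        names = set(name.rstrip("/") for name in tar.getnames())

    missing = [f for f in REQUIRED if "%s/%s" % (root, f) not in names]
    has_wheel = any(n.startswith(root + "/wheels/") and n.endswith(".whl")
                    for n in names)
    if not has_wheel:
        missing.append("wheels/*.whl")
    return missing
```

An empty list from `missing_from_bundle("app-v20180119.tar.gz", "20180119")` would mean the bundle has the expected pieces.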
Then, take note of the Dockerfile within the “wheels” directory:
# base image
FROM centos:5.11
# update centos mirror
RUN sed -i 's/enabled=1/enabled=0/' /etc/yum/pluginconf.d/fastestmirror.conf
RUN sed -i 's/mirrorlist/#mirrorlist/' /etc/yum.repos.d/*.repo
RUN sed -i 's/#\(baseurl.*\)mirror.centos.org\/centos\/$releasever/\1vault.centos.org\/5.11/' /etc/yum.repos.d/*.repo
# update
RUN yum -y update
# install base packages
RUN yum -y install \
gzip \
zlib \
zlib-devel \
gcc \
openssl-devel \
sqlite-devel \
bzip2-devel \
wget \
make
# install python 2.7.14
RUN mkdir -p /opt/python
WORKDIR /opt/python
RUN wget https://www.python.org/ftp/python/2.7.14/Python-2.7.14.tgz
RUN tar xvf Python-2.7.14.tgz
WORKDIR /opt/python/Python-2.7.14
RUN ./configure \
--prefix=/opt/python/python2.7 \
--with-zlib-dir=/opt/python/lib
RUN make
RUN make install
# install pip and virtualenv
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/python -m ensurepip
RUN /opt/python/python2.7/bin/python -m pip install virtualenv
# create and activate virtualenv
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/virtualenv venv
# note: activation only lasts for this RUN step; later steps call the interpreter by its full path
RUN source venv/bin/activate
# add wheel package
RUN /opt/python/python2.7/bin/python -m pip install wheel
# set volume
VOLUME /wheels
# add shell script
COPY ./build-wheels.sh ./build-wheels.sh
COPY ./requirements.txt ./requirements.txt
After extending from the base Centos 5.11 image, we configured a Python 2.7.14 environment, and then generated the wheel files based on the list of dependencies found in the requirements file.
With that, let’s configure a server for deployment.
Environment Setup
We will be downloading and installing dependencies through the network in this section. Assume that you normally will not need to set up the server itself; it should already be pre-configured.
Since the wheels were built on a Centos 5.11 environment, they should work on nearly any Linux environment. So, again, if you’d like to follow along, spin up a DigitalOcean droplet with the latest version of Centos.
Review PEP 513 for more information on building broadly compatible Linux wheels (manylinux1).
SSH into the box, as a root user, and add the dependencies necessary for installing Python before continuing with this tutorial:
$ yum -y install \
gzip \
zlib \
zlib-devel \
gcc \
openssl-devel \
sqlite-devel \
bzip2-devel
Next, install and then run Nginx:
$ yum -y install \
epel-release \
nginx
$ sudo /etc/init.d/nginx start
Navigate to the server’s IP address in your browser. You should see the default Nginx test page.
Next, update the Nginx config in /etc/nginx/conf.d/default.conf to proxy incoming traffic to the app, which will listen on port 1337:
server {
listen 80;
listen [::]:80;
location / {
proxy_pass http://127.0.0.1:1337;
}
}
Restart Nginx:
$ service nginx restart
You should now see a 502 error in the browser.
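The 502 is expected at this point: Nginx is up, but nothing is listening on port 1337 yet. If you want to confirm that from the box itself, here is a small standard-library sketch; `port_is_open` is an illustrative helper, not something from the repo.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


if __name__ == "__main__":
    # 80 is Nginx; 1337 is where gunicorn will bind later in the post.
    for port in (80, 1337):
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print("%d %s" % (port, state))
```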
Create a regular user on the box:
$ useradd <username>
$ passwd <username>
Exit the environment when done.
Deploy
To deploy, first manually secure copy the tarball, along with the setup script, setup.sh, over to the remote box:
$ scp app-v20180119.tar.gz <username>@<host-address>:/home/<username>
$ scp setup.sh <username>@<host-address>:/home/<username>
Take a quick look at the setup script:
#!/bin/bash
USAGE_STRING="USAGE: sh setup.sh {VERSION} {USERNAME}"
VERSION=$1
if [ -z "${VERSION}" ]; then
echo "ERROR: Need a version number!" >&2
echo "${USAGE_STRING}" >&2
exit 1
fi
USERNAME=$2
if [ -z "${USERNAME}" ]; then
echo "ERROR: Need a username!" >&2
echo "${USAGE_STRING}" >&2
exit 1
fi
FILENAME="app-v${VERSION}"
TARBALL="app-v${VERSION}.tar.gz"
# Untar the tarball
tar xvf "${TARBALL}"
cd $FILENAME
# Install python
tar xvf Python-2.7.14.tar.xz
cd Python-2.7.14
./configure \
--prefix=/home/$USERNAME/python2.7 \
--with-zlib-dir=/home/$USERNAME/lib \
--enable-optimizations
echo "Running MAKE =================================="
make
echo "Running MAKE INSTALL ==================================="
make install
echo "cd USERNAME/FILENAME ==================================="
cd /home/$USERNAME/$FILENAME
# Install pip and virtualenv
echo "python get-pip.py ==================================="
/home/$USERNAME/python2.7/bin/python get-pip.py
echo "python -m pip install virtualenv ==================================="
/home/$USERNAME/python2.7/bin/python -m pip install virtualenv
# Create and activate a new virtualenv
echo "virtualenv venv ==================================="
/home/$USERNAME/python2.7/bin/virtualenv venv
echo "source activate ==================================="
source venv/bin/activate
# Install python dependencies
echo "install wheels ==================================="
pip install wheels/*
This should be fairly straightforward: the script simply builds a new Python interpreter and installs the dependencies within a new virtual environment.
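`pip install wheels/*` works because every wheel file is passed to pip explicitly. A possible alternative, shown here only as a sketch, is to point pip at the wheel directory with `--find-links` and forbid index lookups with `--no-index`, so that pip installs exactly what requirements.txt lists and fails fast if a wheel is missing. This assumes it is run from the unpacked app directory using the virtualenv's Python; `offline_install_command` and `install_offline` are hypothetical helpers.

```python
import subprocess
import sys


def offline_install_command(wheel_dir="wheels", requirements="requirements.txt"):
    """Build a pip command that resolves packages only from wheel_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                # never contact a package index
        "--find-links", wheel_dir,   # look for distributions here instead
        "-r", requirements,
    ]


def install_offline(wheel_dir="wheels", requirements="requirements.txt"):
    """Run the install; raises CalledProcessError if pip fails."""
    subprocess.check_call(offline_install_command(wheel_dir, requirements))
```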
SSH into the box and run the setup script:
$ ssh <username>@<host-address>
$ sh setup.sh 20180119 <username>
This will take a few minutes. Once done, cd into the app directory and activate the virtual environment:
$ cd app-v20180119
$ source venv/bin/activate
Run the tests:
$ python test.py
Once complete, fire up gunicorn as a daemon:
$ gunicorn -D -b 0.0.0.0:1337 app:app
Feel free to use a process manager, like Supervisor, to manage gunicorn.
Conclusion
In this article we looked at how to package up a Python project with Docker and Python wheels for deployment on a machine cut off from the Internet.
With this setup, since we’re packaging up the code, dependencies, and interpreter together, our deployments can be considered immutable. For each new deploy, we’ll spin up a new environment and test to ensure it’s working before bringing down the old environment. This will eliminate any errors or issues that could arise from continuing to deploy on top of legacy code. Plus, if you uncover issues with the new deploy, you can easily roll back.
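The post does not prescribe a mechanism for switching between environments. One common approach (an assumption here, not something taken from the repo) is to keep each unpacked `app-v<VERSION>` directory side by side and point a `current` symlink at the live one. A minimal sketch, with `activate_release` as a hypothetical helper:

```python
import os


def activate_release(base_dir, version, link_name="current"):
    """Repoint base_dir/<link_name> at base_dir/app-v<version>.

    Returns the previous link target (or None) so it can be used to roll back.
    """
    target = "app-v%s" % version
    if not os.path.isdir(os.path.join(base_dir, target)):
        raise ValueError("release not found: %s" % target)

    link = os.path.join(base_dir, link_name)
    previous = os.readlink(link) if os.path.islink(link) else None

    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)  # relative link, resolved inside base_dir
    os.rename(tmp, link)     # on POSIX, rename() replaces the old link atomically
    return previous
```

Rolling back is then just a matter of calling `activate_release()` again with the older version, since the old directory is never modified.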
Looking for some challenges?
- At this point, the Dockerfile and each of the scripts are tied to a Python 2.7.14 environment on Centos 5.11. What if you also had to deploy a Python 3.6.1 version to a different version of Centos? Think about how you could automate this process given a configuration file. For example:
[
  { "os": "centos", "version": "5.11", "bit": "64", "python": ["2.7.14"] },
  { "os": "centos", "version": "7.40", "bit": "64", "python": ["2.7.14", "3.6.1"] }
]
Alternatively, check out the cibuildwheel project for managing the building of wheel files.
- You probably only need to bundle the Python interpreter for the first deploy. Update the build_tarball.sh script so that it asks the user whether Python is needed before bundling it.
- How about logs? Logging could be handled either locally or at the system-level. If locally, how would you handle log rotation? Configure this on your own.
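For the first challenge above, one possible starting point is to expand the configuration into a flat list of build targets, one per OS and Python version pair. This sketch assumes the JSON shape from that challenge, written here as strict JSON (which does not allow a trailing comma). The `tag` format is made up for illustration, and `base_image` simply joins the `os` and `version` fields, so it may not correspond to a real image tag.

```python
import json

CONFIG = """
[
  {"os": "centos", "version": "5.11", "bit": "64", "python": ["2.7.14"]},
  {"os": "centos", "version": "7.40", "bit": "64", "python": ["2.7.14", "3.6.1"]}
]
"""


def build_targets(config_text):
    """Expand the config into one build target per (OS, Python) pair."""
    targets = []
    for entry in json.loads(config_text):
        base_image = "%s:%s" % (entry["os"], entry["version"])
        for py in entry["python"]:
            targets.append({
                "base_image": base_image,
                "python": py,
                # Hypothetical naming scheme for the builder image.
                "tag": "wheel-builder-%s%s-py%s" % (entry["os"], entry["version"], py),
            })
    return targets


if __name__ == "__main__":
    for t in build_targets(CONFIG):
        print("%(tag)s  (FROM %(base_image)s, Python %(python)s)" % t)
```

Each target could then drive a templated Dockerfile and a `docker build` / `docker run` pair like the ones in build_tarball.sh.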
Grab the code from the repo