
Offline Python Deployments With Docker

Objectives

By the end of this post, you will be able to…
  1. Describe the difference between a Python wheel and an egg
  2. Explain why you may want to build Python wheel files within a Docker container
  3. Spin up a custom environment for building Python wheels using Docker
  4. Bundle and deploy a Python project to an environment without access to the Internet
  5. Explain how this deployment setup can be considered immutable

Scenario

The genesis for this post came from a scenario where I had to distribute a legacy Python 2.7 Flask app to a Centos 5 box that did not have access to the Internet due to security reasons.
Python wheels (rather than eggs) are the way to go here.
Python wheel files are similar to eggs in that they are both just zip archives used for distributing code. Wheels differ in that they are installable but not executable. They are also pre-compiled, which saves users from having to build the packages themselves and thus speeds up installation. Think of them as lighter, pre-compiled versions of Python eggs. They’re particularly great for packages that need to be compiled, like lxml or NumPy.
For more on Python wheels, check out Python on Wheels and The Story of Wheel.
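A wheel’s target platform is encoded directly in its filename via PEP 427 compatibility tags. As a quick illustration, here is a small, hypothetical helper (not part of this project) that splits a simple wheel filename into those tags:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 compatibility tags.

    Format: {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    (Optional build tags are ignored here for simplicity.)
    """
    stem = filename[:-len(".whl")]
    distribution, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "distribution": distribution,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }

info = parse_wheel_filename("lxml-4.2.1-cp27-cp27mu-manylinux1_x86_64.whl")
print(info["platform"])  # manylinux1_x86_64
```

A pure-Python wheel like `six-1.11.0-py2.py3-none-any.whl` carries the platform tag `any`, while a compiled wheel is pinned to a specific interpreter, ABI, and platform, which is exactly why the build environment matters.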
Keep in mind that wheels should be built in the same environment in which they will be run, so building them across many platforms with multiple versions of Python can be a huge pain.
This is where Docker comes into play.

Bundle

Before beginning, it’s important to note that we will be using Docker simply to spin up an environment for building the wheels. In other words, we’ll be using Docker as a build tool rather than as a deploy environment.
Also, keep in mind that this process is not just for legacy apps - it can be used for any Python application.
Stack:
  • OS: Centos 5.11
  • Python version: 2.7
  • App: Flask
  • WSGI: gunicorn
  • Web server: Nginx
Want a challenge? Replace one of the pieces from the above stack. Use Python 3.6 or perhaps a different version of Centos, for example.
If you’d like to follow along, clone down the base repo:
$ git clone git@github.com:testdrivenio/python-docker-wheel.git
$ cd python-docker-wheel
Again, we need to bundle the application code along with the Python interpreter and dependency wheel files. cd into the “deploy” directory and then run:
$ sh build_tarball.sh 20180119
Review the deploy/build_tarball.sh script, taking note of the code comments:
#!/bin/bash

USAGE_STRING="USAGE: build_tarball.sh {VERSION_TAG}"

VERSION=$1
if [ -z "${VERSION}" ]; then
    echo "ERROR: Need a version number!" >&2
    echo "${USAGE_STRING}" >&2
    exit 1
fi

# Variables
WORK_DIRECTORY=app-v"${VERSION}"
TARBALL_FILE="${WORK_DIRECTORY}".tar.gz

# Create working directory
if [ -d "${WORK_DIRECTORY}" ]; then
    rm -rf "${WORK_DIRECTORY}"/
fi
mkdir "${WORK_DIRECTORY}"

# Cleanup old tarball
if [ -f "${TARBALL_FILE}" ]; then
    rm "${TARBALL_FILE}"
fi

# Cleanup old wheels
if [ -d "wheels/wheels" ]; then
    rm -rf "wheels/wheels"
fi
mkdir "wheels/wheels"

# Copy app files to the working directory
cp -a ../project/app.py ../project/requirements.txt ../project/run.sh ../project/test.py "${WORK_DIRECTORY}"/

# remove .DS_Store and .pyc files
find "${WORK_DIRECTORY}" -type f -name '*.pyc' -delete
find "${WORK_DIRECTORY}" -type f -name '*.DS_Store' -delete

# Add wheel files
cp ./"${WORK_DIRECTORY}"/requirements.txt ./wheels/requirements.txt
cd wheels
docker build -t docker-python-wheel .
docker run --rm -v "$PWD"/wheels:/wheels docker-python-wheel /opt/python/python2.7/bin/python -m pip wheel --wheel-dir=/wheels -r requirements.txt
mkdir ../"${WORK_DIRECTORY}"/wheels
cp -a ./wheels/. ../"${WORK_DIRECTORY}"/wheels/
cd ..

# Add python interpreter
cp ./Python-2.7.14.tar.xz ./"${WORK_DIRECTORY}"/
cp ./get-pip.py ./"${WORK_DIRECTORY}"/

# Make tarball
tar -cvzf "${TARBALL_FILE}" "${WORK_DIRECTORY}"/

# Cleanup working directory
rm -rf "${WORK_DIRECTORY}"/
Here, we:
  1. Created a temporary working directory
  2. Copied over the application files to that directory, removing any .pyc and .DS_Store files
  3. Built (using Docker) and copied over the wheel files
  4. Added the Python interpreter
  5. Created a tarball, ready for deployment
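Since a missing file in the bundle only surfaces at deploy time, it can be worth sanity-checking the tarball before shipping it. Here is a small, hypothetical Python sketch (not part of the repo) that verifies the bundle contains the expected top-level entries:

```python
import tarfile

# Illustrative subset of what build_tarball.sh packs into the bundle
EXPECTED = {"app.py", "requirements.txt", "run.sh", "test.py", "wheels"}


def bundle_toplevel(tarball_path, version):
    """Return the top-level entries inside app-v{version}/ in the tarball."""
    prefix = "app-v{}/".format(version)
    entries = set()
    with tarfile.open(tarball_path, "r:gz") as tar:
        for name in tar.getnames():
            if not name.startswith(prefix):
                continue
            rest = name[len(prefix):]
            if rest:
                entries.add(rest.split("/")[0])
    return entries


def bundle_is_complete(tarball_path, version):
    return EXPECTED <= bundle_toplevel(tarball_path, version)
```

For example, `bundle_is_complete("app-v20180119.tar.gz", "20180119")` should return `True` for a correctly built bundle.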
Then, take note of the Dockerfile within the “wheels” directory:
# base image
FROM centos:5.11

# update centos mirror
RUN sed -i 's/enabled=1/enabled=0/' /etc/yum/pluginconf.d/fastestmirror.conf
RUN sed -i 's/mirrorlist/#mirrorlist/' /etc/yum.repos.d/*.repo
RUN sed -i 's/#\(baseurl.*\)mirror.centos.org\/centos\/$releasever/\1vault.centos.org\/5.11/' /etc/yum.repos.d/*.repo

# update
RUN yum -y update

# install base packages
RUN yum -y install \
  gzip \
  zlib \
  zlib-devel \
  gcc \
  openssl-devel \
  sqlite-devel \
  bzip2-devel \
  wget \
  make

# install python 2.7.14
RUN mkdir -p /opt/python
WORKDIR /opt/python
RUN wget https://www.python.org/ftp/python/2.7.14/Python-2.7.14.tgz
RUN tar xvf Python-2.7.14.tgz
WORKDIR /opt/python/Python-2.7.14
RUN ./configure \
    --prefix=/opt/python/python2.7 \
    --with-zlib-dir=/opt/python/lib
RUN make
RUN make install

# install pip and virtualenv
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/python -m ensurepip
RUN /opt/python/python2.7/bin/python -m pip install virtualenv

# create virtualenv
# (a "RUN source venv/bin/activate" would not persist to later layers,
# so subsequent commands use absolute paths instead)
WORKDIR /opt/python
RUN /opt/python/python2.7/bin/virtualenv venv

# add wheel package
RUN /opt/python/python2.7/bin/python -m pip install wheel

# set volume
VOLUME /wheels

# add shell script
COPY ./build-wheels.sh ./build-wheels.sh
COPY ./requirements.txt ./requirements.txt
After extending from the base Centos 5.11 image, we configured a Python 2.7.14 environment, and then generated the wheel files based on the list of dependencies found in the requirements file.
With that, let’s configure a server for deployment.

Environment Setup

We will be downloading and installing dependencies over the network in this section. In practice, you normally won’t need to set up the server itself, since it should already be pre-configured.
Since the wheels were built on a Centos 5.11 environment, they should work on nearly any Linux environment. So, again, if you’d like to follow along, spin up a Digital Ocean droplet with the latest version of Centos.
Review PEP 513 for more information on building broadly compatible Linux wheels (manylinux1).
SSH into the box, as a root user, and add the dependencies necessary for installing Python before continuing with this tutorial:
$ yum -y install \
  gzip \
  zlib \
  zlib-devel \
  gcc \
  openssl-devel \
  sqlite-devel \
  bzip2-devel
Next, install and then run Nginx:
$ yum -y install \
    epel-release \
    nginx
$ sudo /etc/init.d/nginx start
Navigate to the server’s IP address in your browser. You should see the default Nginx test page.
Next, update the Nginx config in /etc/nginx/conf.d/default.conf to redirect traffic:
server {  
    listen 80;
    listen [::]:80;
    location / {
        proxy_pass http://127.0.0.1:1337;     
    }
}
Restart Nginx:
$ service nginx restart
You should now see a 502 error in the browser.
Create a regular user on the box:
$ useradd <username>
$ passwd <username>
Exit the environment when done.

Deploy

To deploy, first manually secure copy the tarball, along with the setup script, setup.sh, to the remote box:
$ scp app-v20180119.tar.gz <username>@<host-address>:/home/<username>
$ scp setup.sh <username>@<host-address>:/home/<username>
Take a quick look at the setup script:
#!/bin/bash

USAGE_STRING="USAGE: sh setup.sh {VERSION} {USERNAME}"

VERSION=$1
if [ -z "${VERSION}" ]; then
    echo "ERROR: Need a version number!" >&2
    echo "${USAGE_STRING}" >&2
    exit 1
fi

USERNAME=$2
if [ -z "${USERNAME}" ]; then
  echo "ERROR: Need a username!" >&2
  echo "${USAGE_STRING}" >&2
  exit 1
fi

FILENAME="app-v${VERSION}"
TARBALL="app-v${VERSION}.tar.gz"

# Untar the tarball
tar xvzf "${TARBALL}"
cd "${FILENAME}"

# Install python
tar xvJf Python-2.7.14.tar.xz
cd Python-2.7.14
./configure \
    --prefix=/home/$USERNAME/python2.7 \
    --with-zlib-dir=/home/$USERNAME/lib \
    --enable-optimizations
echo "Running MAKE =================================="
make
echo "Running MAKE INSTALL ==================================="
make install
echo "cd USERNAME/FILENAME ==================================="
cd /home/$USERNAME/$FILENAME

# Install pip and virtualenv
echo "python get-pip.py  ==================================="
/home/$USERNAME/python2.7/bin/python get-pip.py
echo "python -m pip install virtualenv  ==================================="
/home/$USERNAME/python2.7/bin/python -m pip install virtualenv

# Create and activate a new virtualenv
echo "virtualenv venv  ==================================="
/home/$USERNAME/python2.7/bin/virtualenv venv
echo "source activate  ==================================="
source venv/bin/activate

# Install python dependencies
echo "install wheels  ==================================="
pip install wheels/*
This should be fairly straightforward: the script sets up a new Python environment and then installs the dependencies within a new virtual environment.
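As an extra sanity check after the script finishes, you can verify that every pinned requirement actually resolved from the local wheels. Here is a hypothetical sketch (not part of the repo) using pkg_resources, which ships with setuptools:

```python
import pkg_resources


def missing_requirements(requirements_path):
    """Return the requirement lines not satisfied in the current environment."""
    missing = []
    with open(requirements_path) as f:
        for line in f:
            line = line.strip()
            # skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            try:
                pkg_resources.require(line)
            except Exception:
                missing.append(line)
    return missing
```

Run it inside the activated virtualenv against requirements.txt; an empty list means every dependency installed cleanly from the bundled wheels.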
SSH into the box and run the setup script:
$ ssh <username>@<host-address>
$ sh setup.sh 20180119 <username>
This will take a few minutes. Once done, cd into the app directory and activate the virtual environment:
$ cd app-v20180119
$ source venv/bin/activate
Run the tests:
$ python test.py
Once complete, fire up gunicorn as a daemon:
$ gunicorn -D -b 0.0.0.0:1337 app:app
Feel free to use a process manager, like Supervisor, to manage gunicorn.
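If you do reach for Supervisor, a minimal program stanza might look something like this (the paths and program name are assumptions; adjust them to your box):

```ini
[program:flask-app]
command=/home/<username>/app-v20180119/venv/bin/gunicorn -b 0.0.0.0:1337 app:app
directory=/home/<username>/app-v20180119
autostart=true
autorestart=true
stdout_logfile=/var/log/flask-app.out.log
stderr_logfile=/var/log/flask-app.err.log
```

With a process manager in charge, you would drop the -D flag so gunicorn runs in the foreground and Supervisor can track it.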

Conclusion

In this article we looked at how to package up a Python project with Docker and Python wheels for deployment on a machine cut off from the Internet.
With this setup, since we’re packaging the code, dependencies, and interpreter up, our deployments are considered immutable. For each new deploy, we’ll spin up a new environment and test to ensure it’s working before bringing down the old environment. This will eliminate any errors or issues that could arise from continuing to deploy on top of legacy code. Plus, if you uncover issues with the new deploy you can easily rollback.
Looking for some challenges?
  1. At this point, the Dockerfile and each of the scripts are tied to a Python 2.7.14 environment on Centos 5.11. What if you also had to deploy a Python 3.6.1 version to a different version of Centos? Think about how you could automate this process given a configuration file.
    For example:
    [
      {
        "os": "centos",
        "version": "5.11",
        "bit": "64",
        "python": ["2.7.14"]
      },
      {
        "os": "centos",
        "version": "7.4",
        "bit": "64",
        "python": ["2.7.14", "3.6.1"]
      }
    ]
    
    Alternatively, check out the cibuildwheel project, for managing the building of wheel files.
  2. You probably only need to bundle the Python interpreter for the first deploy. Update the build_tarball.sh script so that it asks the user whether Python is needed before bundling it.
  3. How about logs? Logging could be handled either locally or at the system level. If locally, how would you handle log rotation? Configure this on your own.
Grab the code from the repo
