A typical Python-based data science or machine learning project can require a large number of libraries. Keeping track of these libraries and their versions is key to keeping your project portable and easy to collaborate on.
Even a bare-bones Python analysis project requires libraries such as numpy and pandas for data transformation, or TensorFlow and scikit-learn for machine learning. Each of those libraries may in turn depend on specific versions of other libraries.
Managing these dependencies well makes developing your project much easier. Luckily, Python includes options for handling such needs. You are strongly encouraged to get into the habit of creating a virtual environment for each of your projects in order to avoid dependency collisions.
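The venv utility is an easy-to-use module that provides an isolated layer for your Python application.
venv is part of the Python 3 standard library, but some Linux distributions, such as Debian and Ubuntu, package it separately, so you may need to install it with the system package manager. Just be sure that the version you are installing matches the Python version installed on your machine. For example, for Python 3.6.x you would run the following command: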
$ sudo apt install python3.6-venv
Upon creation, a virtual environment will contain the following key components:
- python binary
- site-packages directory
The site-packages directory is where the main magic happens. When the virtual environment is active, pip will install packages to this directory rather than modifying your base Python installation.
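To see this isolation for yourself, the following minimal sketch reports whether the running interpreter belongs to a virtual environment and where packages would be installed. The helper names in_virtual_environment and site_packages_dir are made up for this illustration; the check relies on sys.prefix differing from sys.base_prefix inside an environment, which is also true when you run the environment's python binary directly without activating it.

```python
import sys
import sysconfig


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def site_packages_dir():
    """Return the directory where pure-Python packages are installed."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("Virtual environment in use:", in_virtual_environment())
    print("Interpreter prefix:", sys.prefix)
    print("Packages install to:", site_packages_dir())
```

Running this once with the base interpreter and once inside an environment should show two different install locations.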
Creating a Virtual Environment
To demonstrate venv, let's create an example script that calls a REST API.
First, create a top-level directory for your project and navigate to it.
$ mkdir yourproject
$ cd yourproject
Next, create the Python script: make a file called main.py and fill it with the following code.
import requests

# Call the API
response = requests.get('https://api.github.com')

# Print the response code
print(response.status_code)
While still inside the “yourproject” directory, run the following shell command to create a virtual environment.
$ python -m venv example-env
Your project directory should now look like the following:
yourproject/
├── example-env
│   ├── bin
│   ├── include
│   ├── lib
│   ├── lib64
│   └── pyvenv.cfg
└── main.py
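The same kind of environment can also be created from Python itself through the standard library's venv module, which can be handy in setup scripts. The sketch below is only an illustration: create_env is a hypothetical helper, the environment is built in a temporary directory rather than in your project, and with_pip=False is chosen to keep the example quick, so unlike the command above it does not install pip into the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create a virtual environment at env_dir and return its path."""
    env_path = Path(env_dir)
    # venv.create builds the directory layout and writes pyvenv.cfg.
    venv.create(env_path, with_pip=with_pip)
    return env_path


if __name__ == "__main__":
    # Use a throwaway directory so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as project_dir:
        env_path = create_env(Path(project_dir) / "example-env")
        # List the top-level entries of the new environment.
        print(sorted(entry.name for entry in env_path.iterdir()))
```

Pass with_pip=True if you want the environment to include pip, at the cost of a slower setup.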
Next, activate your environment using the source command:
$ source example-env/bin/activate
The environment should now be activated within your terminal session. You can tell which environment is active by the name in parentheses preceding the command prompt, like so:
(example-env) $
Now let's install the requests package that we are importing in our main.py script. Remember that this package will be installed to the site-packages directory of your active environment and will leave your base Python installation unchanged.
$ pip install requests
Run your main application to ensure that everything was successful.
$ python main.py
You should see the status code “200” printed to the terminal.
We can now deactivate the environment by simply using the deactivate command.
$ deactivate
The (example-env) prefix should now be absent from your terminal prompt.
Creating and Using a Requirements File
For the simple example above we only had to install a single Python package, but imagine you are working on a much larger project with multiple dependencies. It would be tedious for someone to install every required library by hand in order to run your application. Thankfully, we can create a snapshot of our virtual environment's dependencies and store it in a simple text file.
While your virtual environment is active, run the following command:
$ pip freeze > requirements.txt
The requirements.txt file should now contain a list of every package dependency and sub-dependency for your project, along with their respective versions.
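To get a feel for what ends up in that file, the sketch below builds a similar name==version list from Python using the standard library's importlib.metadata (available in Python 3.8 and later). It is only an approximation: frozen_requirements is a hypothetical helper, and pip freeze applies extra rules of its own (for example for editable installs) that are ignored here.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted 'name==version' lines for the installed distributions."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions whose metadata is missing a name.
        if name:
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in frozen_requirements():
        print(line)
```

Run inside the example environment, the output should include a pinned entry for requests and for each package it depends on.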
Another developer can now execute your main.py script by creating their own virtual environment and installing all the dependencies with the following command:
$ pip install -r requirements.txt
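After installing, it can be useful to confirm that the environment really matches the file. The following is a minimal sketch, assuming the file contains only simple name==version pins as produced by the freeze step; check_requirements is a hypothetical helper, and lines in any other format (URLs, options, version ranges) are silently skipped.

```python
from importlib import metadata
from pathlib import Path


def check_requirements(path):
    """Return a list of messages for pins that are missing or mismatched."""
    problems = []
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        # Ignore blanks, comments, and anything that is not a simple pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, wanted = line.partition("==")
        name, wanted = name.strip(), wanted.strip()
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (wanted {wanted})")
            continue
        if installed != wanted:
            problems.append(f"{name}: installed {installed}, wanted {wanted}")
    return problems


if __name__ == "__main__":
    requirements = Path("requirements.txt")
    if requirements.exists():
        for problem in check_requirements(requirements):
            print(problem)
```

An empty result means every pinned package was found at the expected version.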