Using Conda

Conda is the package and environment manager we provide and recommend for use on our research clusters. It lets cluster users install software safely and reliably by creating environments that isolate installed software from system-level dependencies, which are subject to change.

Loading the Miniconda3 Module

Most of the infrastructure we provide should have Miniconda3 (the conda distribution we support) available by default. However, before you can use the conda command that this distribution provides, you need to load Miniconda3 into your environment by running module load Miniconda3 in a terminal.

If you are unsure whether Miniconda3 is properly loaded, you can verify its presence (or absence) by inspecting the output of module list.

Note

If you wish to load a different version of Miniconda3, use module avail Miniconda3 to see a list of every version at your disposal. To load a specific version, use module load Miniconda3/[version].

Once loaded, the command conda will become available, and you should see (base) at the beginning of your terminal prompt.
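The loading and verification steps above can be sketched as a short shell snippet. It assumes the module command is defined in your shell, as it usually is on cluster login nodes. The conda_status variable is a hypothetical name used only for illustration.

```shell
# Load Miniconda3 only if the `module` command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load Miniconda3
    # `module list` often writes to stderr, so merge the streams before filtering.
    module list 2>&1 | grep -i "miniconda3" || echo "Miniconda3 is not in the loaded module list"
fi

# Record whether the `conda` command is now reachable.
if command -v conda >/dev/null 2>&1; then
    conda_status="available"
    conda --version
else
    conda_status="missing"
fi
echo "conda is ${conda_status}"
```

If the last line reports that conda is missing, the module was most likely not loaded in the current shell session.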

Creating and Using Environments

A conda environment can be created by using the conda create -n [env-name] command.

conda create -n test-env

These environments are saved in the /home/DAVIDSON/[username]/.conda/envs directory and are accessible only to you. Choose descriptive names and keep your naming scheme consistent across environments.

Once an environment is created, you can activate it by running conda activate [env-name].

conda activate test-env

Note

You can use conda deactivate to stop using the current environment at any point.

With an environment created, the next order of business is installing and managing the packages you need. When installing packages, it's extremely important to specify the version you want installed; otherwise, you will eventually run into issues. Thankfully, conda provides the search sub-command, which outputs a list of the versions of a package that are available to install.

conda search pytorch

Output

# Name                       Version           Build  Channel 
pytorch                        2.4.0 cuda118_py310h6f85f1b_300  conda-forge         
pytorch                        2.4.0 cuda118_py310h954aa82_301  conda-forge         
pytorch                        2.4.0 cuda118_py311h4ee7bbc_301  conda-forge         
pytorch                        2.4.0 cuda118_py311h6c9cb27_300  conda-forge         
pytorch                        2.4.0 cpu_generic_py310ha4c588e_0  conda-forge         
pytorch                        2.4.0 cpu_generic_py311h7a8ff39_1  conda-forge         
pytorch                        2.4.0 cpu_generic_py311h8ca351a_0  conda-forge        
pytorch                        2.4.0 cpu_mkl_py310h75865b9_100  conda-forge         
pytorch                        2.4.0 cpu_mkl_py311h02aef37_101  conda-forge         
pytorch                        2.4.0 cpu_mkl_py311hcb16b95_100  conda-forge

As evidenced by the output above, conda packages are not uniquely determined by their version alone. However, combining the version with a build-tag (the Build column above) will uniquely specify the package to install.

In addition, the build-tag can provide important information about the specific package variants which are available; in the case of pytorch above, there are distinct types of build-tag, and each provides information about the package.

  • Build-tags starting with cuda118 specify a variant of pytorch that expects CUDA version 11.8 to be available and is configured to use any available NVIDIA GPUs.
  • Build-tags starting with cpu_generic specify variants configured to use only a generic CPU, with no additional optimization.
  • Build-tags starting with cpu_mkl specify variants configured to use only a CPU, which can use the Intel Math Kernel Library (mkl) to accelerate computations.

Note

The different suffixes starting with py indicate which version of python the package intends to use: py310 for Python 3.10, py311 for Python 3.11, ...
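As a rough illustration of how to read a build-tag, the sketch below decodes one tag from the search output using only shell built-ins and sed. It assumes the pattern seen in the listing above, where the pyXYZ marker is immediately followed by a hash beginning with h. The variant labels (gpu, cpu-mkl, cpu-generic) are made-up names for this example, not conda terminology.

```shell
# One build tag copied from the `conda search pytorch` output above.
build="cuda118_py310h6f85f1b_300"

# Classify the hardware variant from the tag's prefix.
case "$build" in
    cuda*)        variant="gpu" ;;
    cpu_mkl*)     variant="cpu-mkl" ;;
    cpu_generic*) variant="cpu-generic" ;;
    *)            variant="unknown" ;;
esac

# Pull out the Python marker (for example py310) that precedes the build hash.
py_tag=$(printf '%s\n' "$build" | sed -n 's/.*_\(py[0-9][0-9]*\)h.*/\1/p')

# Turn py310 into 3.10: the first digit is the major version, the rest is the minor.
digits=${py_tag#py}
minor=${digits#?}
major=${digits%"$minor"}
python_version="${major}.${minor}"

echo "variant=${variant} python=${python_version}"
```

For the tag shown, this prints variant=gpu python=3.10. Other packages may lay out their build-tags differently, so treat the parsing rules as specific to this listing.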

When you've selected the version and build of the package you want to install, use conda install with the following slightly awkward syntax: conda install -n [env-name] [package][version=VER,build=TAG]. Quote the package specification so that your shell does not treat the square brackets or the * as glob patterns.

For example, to install a GPU-enabled version of pytorch that uses CUDA 11.8 and Python 3.10, run

conda install -n test-env "pytorch[version=2.4.0,build=cuda118_py310*]"

Tip

In most cases, you won't need to worry about the build-tag when installing a package. In those cases, we can use conda install -n [env-name] [package]=VER instead.
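The two specification forms can be compared side by side. The sketch below only builds the strings and prints the resulting commands, so it is safe to run anywhere. The variable names are illustrative and the values are taken from the pytorch example.

```shell
env_name="test-env"
package="pytorch"
version="2.4.0"
build="cuda118_py310*"

# Full form: pins both the version and the build-tag.
full_spec="${package}[version=${version},build=${build}]"
# Short form: pins the version only and leaves the build to conda.
short_spec="${package}=${version}"

# Print the commands. The quotes around the full form keep the shell from
# interpreting [ ] and * as glob patterns when the command is actually run.
printf 'conda install -n %s "%s"\n' "$env_name" "$full_spec"
printf 'conda install -n %s %s\n' "$env_name" "$short_spec"
```

The short form contains no shell metacharacters, so it does not need quoting.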

Before moving on, here is some additional information which might be helpful.

  1. Multiple packages can be installed with a single command.
  2. If you do not specify a package's version, conda retrieves the latest available version that is compatible with the packages already installed, and it chooses the build-tag under the same compatibility constraint.

Once you've installed any packages you require, you can verify everything is installed as expected using the conda list command, which provides a relatively detailed list of packages that are installed.

conda list --name test-env

Output

# packages in environment at /home/DAVIDSON/xmiblackmon/.conda/envs/test-env:                                     
#                                                                                                             
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                  2_kmp_llvm    conda-forge
_sysroot_linux-64_curr_repodata_hack 3                   haa98f57_10  
blas                      2.128                  openblas    conda-forge
blas-devel                3.9.0           28_h1ea3ea9_openblas    conda-forge
bzip2                     1.0.8                h5eee18b_6  
ca-certificates           2024.12.31           h06a4308_0  
cuda-version              11.8                 hcce14f8_3  
cudatoolkit               11.8.0               h6a678d5_0  
cudnn                     8.9.7.29             hbc23b4c_3    conda-forge
filelock                  3.13.1          py310h06a4308_0  
fsspec                    2024.12.0       py310h06a4308_0  
gmp                       6.3.0                h6a678d5_0  
gmpy2                     2.2.1           py310h5eee18b_0  
jinja2                    3.1.5           py310h06a4308_0  
kernel-headers_linux-64   3.10.0              h57e8cba_10  
ld_impl_linux-64          2.40                 h12ee557_0  
libabseil                 20240116.2      cxx17_h6a678d5_0  
libblas                   3.9.0           28_h59b9bed_openblas    conda-forge
libcblas                  3.9.0           28_he106b2a_openblas    conda-forge
libffi                    3.4.4                h6a678d5_1  
libgcc                    14.2.0               h77fa898_1    conda-forge
...
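When the listing is long, it can help to filter it by column. The sketch below keeps only the rows whose Channel column is conda-forge. It reads three rows copied from the output above instead of calling conda, and from_conda_forge is a hypothetical helper name.

```shell
# Print name=version for every package whose Channel column is conda-forge.
from_conda_forge() {
    awk '$4 == "conda-forge" { print $1 "=" $2 }'
}

# Three rows copied from the listing above, standing in for real output.
sample_rows='cudnn                     8.9.7.29             hbc23b4c_3    conda-forge
filelock                  3.13.1          py310h06a4308_0
libgcc                    14.2.0               h77fa898_1    conda-forge'

forge_pkgs=$(printf '%s\n' "$sample_rows" | from_conda_forge)
echo "$forge_pkgs"

# On the cluster, the same filter would read the live listing:
#   conda list --name test-env | from_conda_forge
```

Rows with an empty Channel column, such as filelock here, are skipped because they have only three fields.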

Importing and Sharing Environments

To share an environment with others, export it with the conda env export --no-builds --name [env-name] | grep -v prefix > [env-name-export].yaml command so that they can recreate it. This creates a YAML file in your current directory that lists every package installed in the environment.

Example

conda env export --no-builds --name test-env | grep -v prefix > test-env-export.yaml

Example

The contents of test-env-export.yaml may look like this:

name: test-env  
channels:  
- defaults  
dependencies:  
- sqlite=3.33.0

Using the newly created YAML file, you can create an environment with the conda env create --file [path/to/env-name-export].yaml command.
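To see why the export command filters out the prefix line, the sketch below simulates an export without calling conda. The fake_export function and the path in its prefix line are made up for illustration. A real export ends with a prefix line pointing at the environment's location in your home directory, which is meaningless on someone else's account.

```shell
# Stand-in for `conda env export --no-builds --name test-env`.
# The prefix path below is a made-up example.
fake_export() {
    cat <<'EOF'
name: test-env
channels:
- defaults
dependencies:
- sqlite=3.33.0
prefix: /home/DAVIDSON/someuser/.conda/envs/test-env
EOF
}

export_file=$(mktemp)
# Anchoring on '^prefix:' drops only the prefix line, not packages
# whose names happen to contain the word "prefix".
fake_export | grep -v '^prefix:' > "$export_file"
cat "$export_file"

# On the cluster, recreating the environment (optionally under a new name)
# would look like:
#   conda env create --file test-env-export.yaml --name test-env-copy
```

The anchored pattern is a slightly stricter variant of the grep -v prefix filter used above. Either works for typical environments.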

Installing a Custom Jupyter Kernel

Creating a Python notebook kernel for your environment starts the same way as installing any other package: run the conda install ipykernel command with the environment active.
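Installing ipykernel does not always make the environment appear in Jupyter's kernel list. A common follow-up, sketched below, is to register the kernel for your user with python -m ipykernel install. Whether this step is needed depends on how Jupyter is set up on the cluster, so treat it as an assumption. The environment name and display name are illustrative.

```shell
env_name="test-env"
display_name="Python (${env_name})"

if command -v conda >/dev/null 2>&1; then
    # Install ipykernel into the environment, then register it as a
    # per-user Jupyter kernel under the chosen display name.
    conda install --yes -n "$env_name" ipykernel &&
        conda run -n "$env_name" python -m ipykernel install \
            --user --name "$env_name" --display-name "$display_name"
else
    echo "conda not found; load Miniconda3 first"
fi
```

After registration, the kernel should be selectable in Jupyter under the display name, here Python (test-env).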

Best practices

  • Load the conda module each time you ssh into a research cluster node with module load Miniconda3.
  • Do not forget to activate your environment before running any jobs.
  • For troubleshooting, the Conda Docs are a great place to start.