spaCy

spaCy is a Python package that provides industrial-strength natural language processing.

Installation

Latest available wheels

To see the latest version of spaCy that we have built:

avail_wheels spacy thinc thinc_gpu_ops

For more information on listing wheels, see listing available wheels.

Pre-built wheel

The preferred option is to install spaCy using the Python wheel that we compile, as follows:

  1. Load the python/3.6 module.
  2. Create and activate a virtual environment.
  3. Install spaCy in the virtual environment using pip install. For both GPU and CPU support:

    (venv) [name@server ~]$ pip install spacy[cuda] --no-index
    

    If you only need CPU support:

    (venv) [name@server ~]$ pip install spacy --no-index
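
Once the installation steps above have completed, a quick check from inside the activated virtual environment can confirm that the packages are importable. The following is a minimal sketch, not part of the official procedure; the helper name module_version is hypothetical, and it assumes each package exposes a __version__ attribute (spaCy and Thinc do in the releases we are aware of).

```python
import importlib
import importlib.util


def module_version(name):
    """Return the module's __version__, "unknown" if it has none,
    or None if the module is not installed.

    Hypothetical helper for illustration only.
    """
    # find_spec returns None for a top-level module that cannot be found.
    if importlib.util.find_spec(name) is None:
        return None
    module = importlib.import_module(name)
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("spacy", "thinc"):
        version = module_version(name)
        print(name, version if version is not None else "not installed")
```

Run it with the virtual environment's python; a "not installed" line means the pip install step did not succeed in that environment.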
    

GPU Version

Currently, to use the GPU version you must add the CUDA libraries to LD_LIBRARY_PATH:

(venv) [name@server ~]$ module load gcc/5.4.0 cuda/9
(venv) [name@server ~]$ export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
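
To confirm that the export above took effect in the current session, the environment can be inspected from Python. This is a sketch under the assumption that the cuda module sets CUDA_HOME (as the export line implies); the function name cuda_lib_dir_on_path is hypothetical. It only checks the search path and does not verify that the GPU itself is usable.

```python
import os


def cuda_lib_dir_on_path(env=None):
    """Return True if $CUDA_HOME/lib64 appears in LD_LIBRARY_PATH.

    Hypothetical helper; `env` defaults to the current process environment.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        # CUDA_HOME is unset, so the cuda module is probably not loaded.
        return False
    wanted = os.path.normpath(os.path.join(cuda_home, "lib64"))
    entries = env.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    # Normalize each entry so trailing slashes do not cause a mismatch.
    return any(os.path.normpath(entry) == wanted for entry in entries if entry)


if __name__ == "__main__":
    if cuda_lib_dir_on_path():
        print("CUDA libraries are on LD_LIBRARY_PATH")
    else:
        print("CUDA libraries are NOT on LD_LIBRARY_PATH")
```

If the check fails, reload the cuda module and repeat the export before starting Python.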

To use the PyTorch wrapper with Thinc, you also need to install the torch_cpu or torch_gpu wheel.