Step 1: Checking if NLTK is already installed
Open a macOS (OS X) terminal and type python to launch a Python interpreter. You should get something like this:
Load NLTK by typing the following at the >>> prompt and pressing ‘enter’:
>>> import nltk
If this step fails, you will get an error, and you should follow the next step, Step 2: Install NLTK. If NLTK is already installed, nothing will happen and you’ll see the >>> prompt again in the window. In that case, skip to Step 3, Install NLTK Data.
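If you would rather check without triggering an error, the same test can be sketched with the standard library. This is a minimal sketch and not part of the original steps; the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    # find_spec returns None when no such top-level module is on the import path.
    return importlib.util.find_spec(module_name) is not None


if is_installed("nltk"):
    print("NLTK is installed: skip to Step 3.")
else:
    print("NLTK is not installed: continue with Step 2.")
```

Run it with the same python command you use for the lessons, since a different Python installation on the same machine may give a different answer.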
Step 2: Install NLTK (MAC)
Exit the Python interpreter (control-d or quit()). Once you’re back in your terminal, type:
$ conda install nltk -y
When you press enter, the terminal should look something like the following:
When it’s finished, go back into the Python interpreter and import NLTK by typing import nltk (all lowercase, since Python names are case-sensitive) after the >>> prompt. If it installed correctly, nothing will happen and you’ll see the >>> prompt again in the window. In that case, continue to Step 3.
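Because a successful import prints nothing, you may want a more visible confirmation. The sketch below reports the installed version using the standard library. It assumes Python 3.8 or later and that the install recorded package metadata; installed_version is a hypothetical helper name, not an NLTK function.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("nltk")
print("nltk version:", version if version else "not found")
```

If this prints "not found" right after a successful conda install, the python you launched is probably not the one conda installed into.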
Step 3: Install NLTK Data with the GUI (MAC)
You then need to install the data that NLTK relies on to function. This may take several minutes, depending on your internet connection. Some packages may fail to install because they are outdated; this is fine and will not be a problem for our lessons. If you get an error about a package failing, close the downloader and skip to Step 4, Install NLTK Data with the Command Line. In your Python environment, run the following command after import nltk:
>>> nltk.download()
For example, the interpreter above would now look like:
The Python environment that the GUI was launched from should now have a message that looks something like this:
showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
Now look for the NLTK Downloader window. It opens automatically, but it may be hidden behind your browser window or behind the window where you are working in Python.
Click on the first tab (Collections), and then on the first record on that tab: all. Then click the “Download” button on the left-hand side of that window. This may take several minutes, depending on your internet connection. If the install stalls, press the Refresh button, and ignore errors about individual packages. If something goes wrong, proceed to Step 4, Install NLTK Data with the Command Line. If the download finishes without problems, skip to Step 5, Test Installation.
Step 4: Install NLTK Data with the Command Line (MAC)
NLTK also provides a text-based download tool, which you can run from the command line. In your interactive Python environment, type the following command after importing nltk:
nltk.download('all', halt_on_error=False)
The interpreter above should now look something like:
If the command is successful, the terminal will print out something like:
The download will take a few minutes. At the end, your terminal should look like this, bringing you back to the Python interpreter prompt:
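If downloading everything keeps failing or is too slow, one option is to fetch only the pieces that Step 5 exercises. The following is a sketch under assumptions: the identifiers brown, book, and tagsets are the names I would expect in the downloader index for these checks (they may differ between NLTK releases, so consult the list in the downloader if one is rejected), and download_each is a hypothetical helper, not an NLTK function.

```python
def download_each(resource_ids, download):
    """Call download(resource_id) for every id and record whether it succeeded."""
    results = {}
    for resource_id in resource_ids:
        # Assumes the downloader returns a truthy value on success,
        # as nltk.download does when called with a single identifier.
        results[resource_id] = bool(download(resource_id))
    return results


# Identifiers are assumptions; see the paragraph above.
WANTED = ["brown", "book", "tagsets"]

# Uncomment in an environment with NLTK and an internet connection:
# import nltk
# print(download_each(WANTED, nltk.download))
```

Passing the download function in as an argument keeps the helper easy to try out without a network connection; in normal use you would hand it nltk.download.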
Step 5: Test Installation (MAC)
When the installation is complete, close the NLTK Downloader and check your installation. You need to be in a Python environment such as an interpreter or Jupyter notebook.
Brown
In your Python environment, run the following code:
from nltk.corpus import brown
If your code runs and nothing happens (no error message and nothing printed to the screen), congratulations!
Book
Check that the book corpora installed properly by typing:
from nltk.book import *
If installed successfully, you should see the following:
Penn Parts of Speech
Check that the Penn Treebank part-of-speech tagset is installed correctly by typing the following after import nltk:
nltk.help.upenn_tagset('NN')
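The checks in this step can also be bundled into one script. As far as I understand, NLTK loads corpora lazily, so an import such as from nltk.corpus import brown may succeed even when the data itself is missing; looking the data up forces the check. This is a minimal sketch, assuming nltk.data.find raises LookupError for a missing resource. The resource paths and the helper name missing_resources are assumptions for illustration.

```python
def missing_resources(resource_paths, find):
    """Return the paths for which find(path) raises LookupError."""
    missing = []
    for path in resource_paths:
        try:
            find(path)
        except LookupError:
            missing.append(path)
    return missing


# Paths are assumptions about where the Step 5 data lives inside nltk_data.
PATHS = ["corpora/brown", "corpora/gutenberg", "help/tagsets"]

try:
    import nltk
except ImportError:
    print("NLTK is not installed; go back to Step 2.")
else:
    gaps = missing_resources(PATHS, nltk.data.find)
    print("Missing:", gaps if gaps else "nothing, the data was found")
```

Anything listed as missing can be fetched again with the downloader from Step 3 or Step 4.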