How to find alternate naming in text (Python tutorial)

Find Misspellings and Alternate Naming in Large Text Datasets (Tutorial)

Below is the Python fastText script I use in the tutorial on finding alternate names and misspellings in a large text corpus.

#install pandas and fasttext if you haven't already
#pip install pandas
#pip install fasttext

#load the data and view info about it
import pandas as pd
df = pd.read_csv(r"C:\folder\file.csv")
df.info()  #column names, dtypes and non-null counts
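
fastText reads its training input as plain text, so if the CSV carries many non-text columns it may help to write just the text out to its own file first. Below is a minimal sketch of that step, not part of the original script: write_corpus is a hypothetical helper, and the "text" column name and corpus.txt path in the usage comment are assumptions you would swap for your own.

```python
import re


def write_corpus(texts, out_path):
    """Write one cleaned document per line and return how many lines were written.

    `texts` can be any iterable of values, e.g. a pandas column or a plain list.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as f:
        for text in texts:
            if not isinstance(text, str):
                continue  # skip missing values such as NaN
            # collapse runs of whitespace (including newlines) and lowercase
            line = re.sub(r"\s+", " ", text).strip().lower()
            if line:
                f.write(line + "\n")
                written += 1
    return written


# Hypothetical usage with the DataFrame loaded above (column name is an assumption):
# write_corpus(df["text"], r"C:\folder\corpus.txt")
```

The resulting file could then be passed to train_unsupervised in place of the CSV path. Lowercasing is a judgment call: it merges case variants of the same word, which is usually what you want when hunting for misspellings.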

#start fasttext magic ...
import fasttext

#train and save model - view parameters/options at
model = fasttext.train_unsupervised(r"C:\folder\file.csv", model='skipgram', epoch=2)
model.save_model(r"C:\folder\file.bin")  #write the trained model to disk so it can be loaded below


#load model and see an overview of words
model = fasttext.load_model(r"C:\folder\file.bin")
print(len(model.words))   #vocabulary size
print(model.words[:20])   #first 20 vocabulary entries


#view words related to 'paracetamol'