Writing a Multiple OLS in Python

John Vandivier

For quick instruction, I recommend the following tutorial: https://datatofish.com/multiple-linear-regression-python/

However, I had some additional questions that I needed to answer before I could get that tutorial's code working. This article collects those questions and their answers, and I include my simple script for reference. Here's the code:

# ref: https://datatofish.com/multiple-linear-regression-python/

import pandas as pd
import statsmodels.api as sm

# could instead have stored the data in a CSV file and loaded it with pd.read_csv
oStudentData = {
    'TotalScore': [1623, 1471, 1597, 1657, 1521],
    'Rank': [5, 1, 2, 4, 3],
}

arrsStudentDataColumns = [*oStudentData.keys()]
arrsIndependentVariableNames = [*filter(lambda sKey: sKey != 'TotalScore', arrsStudentDataColumns)]

dataFrame = pd.DataFrame(oStudentData, columns=arrsStudentDataColumns)
dictIndependent = dataFrame[arrsIndependentVariableNames]
dictDependent = dataFrame['TotalScore']

dictIndependentWithConstants = sm.add_constant(dictIndependent)
model = sm.OLS(dictDependent, dictIndependentWithConstants).fit()
print_model = model.summary()
print(print_model)
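
Because the sample data has only one independent variable (Rank), the fit can be cross-checked by hand with the closed-form formulas for a one-variable regression. The following is a minimal sketch using only the standard library; the function name simple_ols and the variable names are my own for illustration and are not part of the tutorial. Assuming the script above runs as intended, the const and Rank coefficients in the summary should agree with these values up to rounding.

```python
# Closed-form OLS for one regressor, checked against the article's sample data.
from statistics import mean

rank = [5, 1, 2, 4, 3]
total_score = [1623, 1471, 1597, 1657, 1521]


def simple_ols(x, y):
    """Return (intercept, slope) that minimize squared error for y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = simple_ols(rank, total_score)
print(f"const: {intercept:.1f}, Rank: {slope:.1f}")
```

With more than one independent variable these formulas no longer apply, which is where sm.OLS earns its keep.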

Q&A:

  1. Recommended pylint and python repo setup?
    1. For this simple use case, a script is suitable and a whole repo setup with linting and a requirements file isn't needed.
  2. I got ImportError: No module named pandas and I see a couple of ways to fix this. Any recommendations?
    1. Install Python 3 and Anaconda, then run the script with python3 myscript.py.
  3. Some dude said I should use pipenv instead of pip. Thoughts?
    1. That's not needed in this case, and I'm not aware of any case where it's needed. If you install Python 3 you will have pip3, and you should install packages with pip3 install to prevent packages from mixing across Python versions.
  4. I installed the wrong version of Python. How can I fix this?
    1. Through Python 3.7, and perhaps later versions although I can't guarantee it, you can uninstall by following this article.
    2. When uninstalling, running which python3 first may give you more confidence that you are deleting the correct things.
    3. Using brew will likely make installing and uninstalling much easier.
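
The import error and version-mixing questions above mostly come down to which interpreter is running the script and whether that interpreter can see the packages. As a rough sketch, the check below prints the active interpreter and reports any missing modules. The helper name missing_modules is hypothetical, and the suggested install command assumes pip is available for that interpreter.

```python
# Report which interpreter is running and whether it can import the
# packages the OLS script needs.
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


required = ["pandas", "statsmodels"]
missing = missing_modules(required)

print("Interpreter:", sys.executable)
if missing:
    # Running pip via "-m pip" installs into this same interpreter,
    # which avoids mixing packages across Python versions.
    print("Missing:", ", ".join(missing))
    print("Try:", sys.executable, "-m pip install", " ".join(missing))
else:
    print("All required modules are importable.")
```

Running this with the same command used for the main script (for example python3) shows whether that particular interpreter is the one with pandas installed.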

For more in-depth learning, multiple Python developers and data analysts recommended this Codecademy course.