Session 4 - Common Modules
IMPRS Be Smart Summer School
2023-08-07
One great advantage of Python is that it has a vast ecosystem of packages.
Some packages are built in, but they still need to be imported.
Python uses the syntax `import packagename` to import a package.
The functions, classes, etc. come as attributes of the package and are reached with a dot, for example `packagename.functionname`.
'bob'
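
As a minimal sketch of the import-and-dot pattern, the following uses the built-in `math` module. The choice of `math` and of the names accessed is illustrative and not taken from the original example.

```python
import math  # built in, but still has to be imported before use

# Names inside the module are reached with a dot: module.name
root = math.sqrt(16)      # a function in the module
circle = 2 * math.pi      # a constant in the module

print(root)    # 4.0
print(circle)  # roughly 6.28
```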
You can find the documentation of the module here

## Modules
Python has a lot of built-in modules that you can use without installing anything.
You can find the list of built-in modules here: docs.python.org/3/library/
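
To illustrate that standard-library modules need no installation, the sketch below imports `math`, `random`, `os`, `sys` and `datetime` and uses one thing from each. The specific calls, the seed and the date are chosen for illustration only.

```python
import datetime
import math
import os
import random
import sys

random.seed(0)  # fix the seed so the "random" draw is repeatable

fact = math.factorial(5)         # mathematical functions
roll = random.randint(1, 6)      # random number generation (a die roll)
cwd = os.getcwd()                # operating system: current working directory
major = sys.version_info.major   # system: major version of the running Python
day = datetime.date(2023, 8, 7)  # date handling

print(fact)                      # 120
print(roll)
print(cwd)
print(major)
print(day.isoformat(), day.weekday())  # weekday() is 0 for Monday
```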
Some common built-in modules are:

- `math` for mathematical functions
- `random` for random number generation
- `os` for operating system related functions
- `sys` for system related functions
- `datetime` for date and time related functions

You can install modules using the `pip` command.
`pip` is a package manager for Python.
You can install a package using:

`pip install packagename`
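
After installing, it can be handy to check from within Python whether a package can be found. A minimal sketch using the standard-library function `importlib.util.find_spec`; the helper name `is_installed` and the made-up package name are hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can find a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


print(is_installed("math"))                 # True: part of the standard library
print(is_installed("numpy"))                # True only if numpy has been installed
print(is_installed("no_such_package_xyz"))  # False
```

Note that the import name can differ from the name given to `pip`: `scikit-learn`, for example, is imported as `sklearn`.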
- `numpy` for numerical computing
- `pandas` for data analysis
- `scikit-learn` for machine learning
- `otree` for package!

NumPy, “Numerical Python”, is a library for scientific computing
- It brings the NumPy array data type, which is similar to a vector
- Install it with `pip install numpy`
- Import it, by convention under the alias `np`, with `import numpy as np`
- An array assumes that all elements are of the same type
- An array has its own methods
array([3, 4])
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
array([4, 5, 6])
array([[5, 6],
       [8, 9]])
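
The code that produced the outputs above is not shown. As a hedged reconstruction, the sketch below builds a 4-by-3 array and indexes it in a way that would give the last two results. The variable names and the use of `np.arange` with `reshape` are assumptions. The sketch also illustrates the same-type rule and a few array methods.

```python
import numpy as np

# Mixing ints and a float: NumPy converts everything to one common type
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# A 4x3 array holding the numbers 1..12
m = np.arange(1, 13).reshape(4, 3)
print(m)

# Arrays have their own attributes and methods
print(m.shape)            # (4, 3)
print(m.sum())            # 78
print(m.mean(axis=0))     # column means: 5.5, 6.5, 7.5

# Indexing starts at 0; slices exclude the end index
print(m[1])               # second row: 4, 5, 6
print(m[1:3, 1:])         # rows 1-2, columns 1-2: [[5, 6], [8, 9]]
```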
Refer to the documentation for more details
`DataFrame` data type

|   | Name    | Age | City        |
|---|---------|-----|-------------|
| 0 | Alice   | 25  | New York    |
| 1 | Bob     | 30  | Chicago     |
| 2 | Charlie | 22  | Chicago     |
| 3 | David   | 28  | Los Angeles |
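
A table like the one above could be created as follows. This is a reconstruction from the printed output, not the original code; building the frame from a dictionary of columns is one of several options, and the variable name `df` is an assumption.

```python
import pandas as pd

# Each dictionary key becomes a column; the lists hold the column values
df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie", "David"],
        "Age": [25, 30, 22, 28],
        "City": ["New York", "Chicago", "Chicago", "Los Angeles"],
    }
)
print(df)

# Columns are selected by name, rows by a condition
mean_age = df["Age"].mean()
chicago = df[df["City"] == "Chicago"]

print(mean_age)   # 26.25
print(chicago)    # the rows for Bob and Charlie
```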
import numpy as np
import statsmodels.api as sm
# Generate some data
x = np.random.normal(size=100)
y = 2 * x + np.random.normal(size=100)
# Fit and summarize OLS model
model = sm.OLS(y, sm.add_constant(x))
results = model.fit()
print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.734
Model:                            OLS   Adj. R-squared:                  0.731
Method:                 Least Squares   F-statistic:                     269.9
Date:                Sun, 06 Aug 2023   Prob (F-statistic):           6.58e-30
Time:                        22:07:19   Log-Likelihood:                -148.48
No. Observations:                 100   AIC:                             301.0
Df Residuals:                      98   BIC:                             306.2
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.1549      0.108     -1.436      0.154      -0.369       0.059
x1             1.9397      0.118     16.430      0.000       1.705       2.174
==============================================================================
Omnibus:                        5.195   Durbin-Watson:                   1.787
Prob(Omnibus):                  0.074   Jarque-Bera (JB):                4.557
Skew:                          -0.441   Prob(JB):                        0.102
Kurtosis:                       3.563   Cond. No.                         1.09
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
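
In the summary, `const` (about -0.15) and `x1` (about 1.94) are the estimated intercept and slope, close to the values 0 and 2 used to generate the data. As a cross-check, the same kind of least-squares line can be fitted with NumPy alone. This is a swapped-in technique, not the statsmodels code above, and the seeded generator means the numbers will differ from the printed summary.

```python
import numpy as np

rng = np.random.default_rng(0)           # seeded so the run is repeatable
x = rng.normal(size=100)
y = 2 * x + rng.normal(size=100)         # true slope 2, true intercept 0

# Least-squares line y = intercept + slope * x
# (polyfit returns the highest power first)
slope, intercept = np.polyfit(x, y, deg=1)

# Same fit written with a design matrix that has a constant column,
# mirroring what sm.add_constant(x) builds
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print(intercept, slope)   # should be close to 0 and 2
print(coef)               # intercept and slope again
```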
Install Python 3.8 or higher https://www.python.org/downloads/
Install Jupyter Lab https://jupyter.org/install
Install Visual Studio Code https://code.visualstudio.com/ (or any other text editor)
Get a GitHub account