
statsmodels – statistical modeling and econometrics in Python

statsmodels is an open source Python package that provides a complement to SciPy for statistical computations including descriptive statistics and estimation and inference for statistical models.

statsmodels is built on top of the numerical libraries NumPy and SciPy, integrates with Pandas for data handling and uses patsy for an R-like formula interface. Graphical functions are based on the matplotlib library.
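
As a rough illustration of the pandas integration and the R-like formula interface described above, the following sketch fits an ordinary least squares model on synthetic data. The column names (`x1`, `x2`, `y`), the coefficients and the random seed are assumptions made up for this example; only `statsmodels.formula.api.ols` and the fitted results' `params` attribute come from the library itself.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: the true coefficients (1.0, 2.0, -0.5) are illustrative assumptions.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 2.0 * df["x1"] - 0.5 * df["x2"] + rng.normal(scale=0.1, size=n)

# The formula string is parsed into design matrices; an intercept is added by default.
results = smf.ols("y ~ x1 + x2", data=df).fit()

# params is a pandas Series indexed by term name ("Intercept", "x1", "x2").
print(results.params)
```

Calling `results.summary()` on the fitted object prints a regression table with standard errors and test statistics.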

statsmodels provides the statistical backend for other Python libraries.

Key Features

  • Linear regression models:
    • Ordinary least squares.
    • Generalized least squares.
    • Weighted least squares.
    • Least squares with autoregressive errors.
    • Quantile regression.
    • Recursive least squares.
  • Mixed Linear Model with mixed effects and variance components.
  • GLM: Generalized linear models with support for all of the one-parameter exponential family distributions.
  • Bayesian Mixed GLM for Binomial and Poisson.
  • GEE: Generalized Estimating Equations for one-way clustered or longitudinal data.
  • Discrete models:
    • Logit and Probit.
    • Multinomial logit (MNLogit).
    • Poisson and Generalized Poisson regression.
    • Negative Binomial regression.
    • Zero-Inflated Count models.
  • RLM: Robust linear models with support for several M-estimators.
  • Time Series Analysis: models for time series analysis.
    • Complete StateSpace modeling framework:
      • Seasonal ARIMA and ARIMAX models.
      • VARMA and VARMAX models.
      • Dynamic Factor models.
      • Unobserved Component models.
    • Markov switching models (MSAR), also known as Hidden Markov Models (HMM).
    • Univariate time series analysis: AR, ARIMA.
    • Vector autoregressive models, VAR and structural VAR.
    • Vector error correction model, VECM.
      • Parameter estimation for cointegrated VAR.
      • Forecasting.
      • Testing for Granger-causality and instantaneous causality.
      • Testing for cointegrating rank.
      • Lag order selection.
    • Exponential smoothing, Holt-Winters.
    • Hypothesis tests for time series: unit root, cointegration and others.
    • Descriptive statistics and process models for time series analysis.
  • Survival analysis:
    • Proportional hazards regression (Cox models).
    • Survivor function estimation (Kaplan-Meier).
    • Cumulative incidence function estimation.
  • Multivariate:
    • Principal Component Analysis with missing data.
    • Factor Analysis with rotation.
    • MANOVA.
    • Canonical Correlation.
  • Nonparametric statistics: Univariate and multivariate kernel density estimators.
  • Datasets: Datasets used for examples and in testing.
  • Statistics: a wide range of statistical tests:
    • Diagnostics and specification tests.
    • Goodness-of-fit and normality tests.
    • Functions for multiple testing.
    • Various additional statistical tests.
  • Imputation with MICE, regression on order statistics, and Gaussian imputation.
  • Mediation analysis.
  • Graphics: plot functions for visual analysis of data and model results.
  • I/O:
    • Tools for reading Stata .dta files (pandas provides a more up-to-date reader).
    • Table output to ASCII, LaTeX, and HTML.
  • Miscellaneous models.
  • Sandbox: statsmodels contains a sandbox folder with code in various stages of development and testing that is not considered “production ready”. This covers, among others:
    • Generalized method of moments (GMM) estimators.
    • Kernel regression.
    • Various extensions to scipy.stats.distributions.
    • Panel data models.
    • Information theoretic measures.

Website: www.statsmodels.org
Support: GitHub Code Repository, Mailing List
Developer: Statsmodels Developers
License: Modified BSD (3-clause) license

statsmodels is written in Python.


Related Software

Python Data Analysis
  • pandas: High-level building block for doing practical, real-world data analysis
  • NumPy: Core package for scientific computing with Python
  • SciPy: Ecosystem for mathematics, science, and engineering
  • Polars: DataFrame interface on top of an OLAP query engine
  • statsmodels: Statistical modeling and econometrics in Python
  • Dask: Advanced parallelism for analytics
  • Orange: Component-based framework for machine learning and data mining
  • Modin: Drop-in replacement for pandas
  • Vaex: Fast visualization of big data
  • AWS DW: Extends the power of the pandas library
  • yt: Multi-code toolkit for analyzing and visualizing volumetric data
  • HoloViews: Make data analysis and visualization seamless
  • datatable: Manipulate 2-dimensional tabular data structures
  • xarray: Work with labelled multi-dimensional arrays and datasets
  • pyjanitor: Extend pandas with readable data-cleaning functions
  • Optimus: Agile data preparation workflows


Python Mathematics Tools
  • scikit-learn: Machine learning library for Python
  • NumPy: Core package for scientific computing with Python
  • SciPy: Ecosystem for mathematics, science, and engineering
  • statsmodels: Statistical modeling and econometrics
  • JAX: Python library for high-performance numerical computing
  • SageMath: Computer algebra system
  • SymPy: Library for symbolic mathematics
  • PyMC: Bayesian statistical modeling and probabilistic programming
  • Pyomo: Object-oriented algebraic modeling language
  • patsy: Package for describing statistical models and building design matrices
  • mpmath: Library for arbitrary-precision floating-point arithmetic
  • SfePy: Finite element software package


