As part of our NSF-funded initiative, we are working with two pilot projects that highlight different aspects of the statistical Python ecosystem:
- YAGLM: A Python package for generalized linear models (GLMs), offering modern statistical methodology with a range of loss functions, regularizers, and solvers. GLMs are a flexible and powerful generalization of ordinary linear regression and cover many statistical models widely used in applications. Beyond the basic LASSO and ridge penalties, the package supports structured penalties such as the nuclear norm; the group, exclusive, fused, and generalized lasso; and non-convex penalties. It also provides a variety of tuning parameter selection methods, including cross-validation, information criteria with favorable model selection properties, and degrees of freedom estimators (see the first sketch after this list).
- ISLP: A companion Python package for the textbook “An Introduction to Statistical Learning with Applications in Python,” closely following the examples in its hugely successful R version. Beyond the labs themselves, it provides Pythonic design matrix building, a simple implementation of Bayesian Additive Regression Trees, and object-oriented stepwise model selection (see the second sketch after this list).
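
To make the loss / regularizer / solver / tuning decomposition concrete, here is a minimal, self-contained sketch in plain NumPy (not the YAGLM API) of a lasso-penalized squared-error GLM fit by proximal gradient descent, with cross-validation used to choose the penalty value. The function names and toy data are illustrative only.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of the lasso penalty t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_lasso_glm(X, y, pen_val, n_steps=500):
    """Squared-error GLM with a lasso penalty, solved by proximal gradient descent.

    Illustrative sketch only -- not the YAGLM implementation.
    """
    n, p = X.shape
    step = n / np.linalg.norm(X, ord=2) ** 2   # 1 / Lipschitz constant of the loss gradient
    coef = np.zeros(p)
    for _ in range(n_steps):
        grad = X.T @ (X @ coef - y) / n                            # gradient of the smooth loss
        coef = soft_threshold(coef - step * grad, step * pen_val)  # proximal step on the penalty
    return coef

def cv_select(X, y, pen_vals, n_folds=5, seed=0):
    """Pick the penalty value with the lowest average held-out squared error."""
    rng = np.random.default_rng(seed)
    folds = rng.integers(0, n_folds, size=len(y))
    cv_errors = []
    for pen_val in pen_vals:
        fold_errors = []
        for k in range(n_folds):
            train, test = folds != k, folds == k
            coef = fit_lasso_glm(X[train], y[train], pen_val)
            fold_errors.append(np.mean((y[test] - X[test] @ coef) ** 2))
        cv_errors.append(np.mean(fold_errors))
    return pen_vals[int(np.argmin(cv_errors))]

# Toy example: sparse linear model with three active coefficients.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
beta = np.zeros(10); beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + 0.5 * rng.standard_normal(200)

best_pen = cv_select(X, y, pen_vals=np.logspace(-3, 0, 20))
print("selected penalty:", best_pen)
print("estimated coefficients:", np.round(fit_lasso_glm(X, y, best_pen), 2))
```

Swapping in a different loss, a structured penalty, or another tuning criterion changes only one of these pieces at a time, which is the modularity the package is built around.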
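The second sketch shows the kind of lab workflow ISLP supports: loading a bundled data set, building a design matrix from a model specification, and fitting with statsmodels. The names load_data, ModelSpec, and summarize follow the package's labs; treat the exact call signatures as assumptions rather than a definitive reference.

```python
import statsmodels.api as sm
from ISLP import load_data                     # assumed lab interface: bundled textbook data sets
from ISLP.models import ModelSpec, summarize   # assumed: design matrix builder and summary helper

# Load one of the textbook data sets and specify a model with two predictors.
Boston = load_data('Boston')
design = ModelSpec(['lstat', 'age']).fit_transform(Boston)  # design matrix with an intercept column

# Fit ordinary least squares with statsmodels and print a compact coefficient table.
results = sm.OLS(Boston['medv'], design).fit()
print(summarize(results))
```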