In this webinar, you will learn how to use Statistics Toolbox to generate accurate predictive models from data sets that contain large numbers of correlated variables. By the end of the webinar, you will understand:
• Problems that can occur when linear regression is used to model such data sets
• How to use sequential feature selection and cross-validation to address these problems
• Alternative techniques based on regularization and shrinkage including lasso, ridge regression, and elastic net
• The characteristics of data sets that suggest regularization and shrinkage methods versus sequential feature selection
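As a rough illustration of the techniques listed above, the following MATLAB sketch fits the same synthetic data set with ordinary least squares, lasso, elastic net, ridge regression, and sequential feature selection. It is not the webinar's demo code: the data, the coefficient vector, the noise level, the ridge parameter, and all variable names are assumptions chosen for illustration. It assumes that Statistics Toolbox functions such as mvnrnd, lasso, ridge, and sequentialfs are available.

```matlab
% Illustrative sketch only; all data and parameter values are assumptions.
rng(0);                                   % make the random numbers reproducible
n = 100;                                  % number of observations
p = 8;                                    % number of predictors
Sigma = toeplitz(0.5 .^ (0:p-1));         % correlation decays with index distance
X = mvnrnd(zeros(1, p), Sigma, n);        % correlated predictors
betaTrue = [3; 1.5; 0; 0; 2; 0; 0; 0];    % sparse "true" coefficients (assumed)
y = X * betaTrue + 3 * randn(n, 1);       % noisy response

% Ordinary least squares with an intercept, as a baseline.
bOLS = [ones(n, 1) X] \ y;

% Lasso with 10-fold cross-validation; pick the sparsest model whose
% cross-validated error is within one standard error of the minimum.
[BLasso, fitLasso] = lasso(X, y, 'CV', 10);
bLasso = BLasso(:, fitLasso.Index1SE);

% Elastic net: the same function with Alpha strictly between 0 and 1.
[BEnet, fitEnet] = lasso(X, y, 'Alpha', 0.5, 'CV', 10);
bEnet = BEnet(:, fitEnet.Index1SE);

% Ridge regression with an arbitrary ridge parameter k = 1.
% The final argument 0 returns coefficients on the original data scale,
% with the intercept as the first element.
bRidge = ridge(y, X, 1, 0);

% Sequential feature selection with 10-fold cross-validation.
% The criterion returns the sum of squared prediction errors on the test fold.
addOnes = @(A) [ones(size(A, 1), 1) A];
sse = @(Xtr, ytr, Xte, yte) sum((yte - addOnes(Xte) * (addOnes(Xtr) \ ytr)).^2);
selected = sequentialfs(sse, X, y, 'cv', 10);   % logical mask of chosen columns
```

Comparing bOLS, bLasso, bEnet, and bRidge against betaTrue, and checking which columns selected marks as true, gives a feel for how each approach handles correlated predictors. The results depend on the random seed and the assumed parameter values.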
About the Presenter: Richard Willey is a product marketing manager focused on MATLAB and add-on products for data analysis, statistics, and curve fitting. Prior to joining MathWorks in 2007, Richard worked at Wind River Systems and Symantec. Richard has dual master’s degrees in engineering and management from the Massachusetts Institute of Technology and a master’s degree in economics from Indiana University.
The demo and MATLAB code shown in this webinar were motivated by the following paper by Robert Tibshirani:
(See, in particular, the "Simulations" section beginning on page 279.)