Hi Super John

“ridge regression assumes the predictors are standardized and the response is centered”

From https://www.datacamp.com/community/tutorials/tutorial-ridge-lasso-elastic-net, which also gives formulae for the bias and variance terms in OLS, so you can see what they look like for your favorite OLS dataset. (Post coming soon.)

I think the above requirement should hold for Lasso as well, and certainly for Elastic Net. In addition, in both Ridge and Lasso the extra error term is a penalty on the coefficients (their L2 or L1 norm, respectively), so the fit is pushed toward smaller coefficient values. However, if the orthogonal features ‘z’ are linear combinations of the original features ‘x’, then the two sets of coefficients are related by the inverse transform. So what regularization should really be doing to lower the variance is driving the coefficients of the z’s to 0, and those are linear combinations of the original coefficients.
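A quick numpy sketch of that coefficient relation (the data, the mixing weights, and the noise level here are all made up for illustration): if z = XW with W the orthogonal eigenvector matrix of the covariance, then the OLS coefficients in the two bases satisfy beta_x = W beta_z.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# make one feature a near-linear combination of the others (correlated design)
X[:, 2] = 0.7 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=200)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=200)

# orthogonalize w.r.t. the covariance matrix: its eigenvectors W give z = X W
_, W = np.linalg.eigh(np.cov(X, rowvar=False))
Z = X @ W

beta_x, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_z, *_ = np.linalg.lstsq(Z, y, rcond=None)

# the coefficients are related by the transform: beta_x = W beta_z
print(np.allclose(beta_x, W @ beta_z))  # True
```

Shrinking a single beta_z toward 0 therefore moves several of the original x-coefficients at once, which is the point of the argument above.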

Hence: before the predictors can be standardized (which I do as the second step in the code), you have to orthogonalize them with respect to the covariance matrix. This is what leads to WarmFuzzy encoding.
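As a pipeline this is: orthogonalize first, standardize second, then fit the penalized regression. A minimal sklearn sketch of that ordering, using PCA for the orthogonalization step (this is only my reading of the recipe, not the WarmFuzzy code itself; the data and alpha are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
X[:, 3] = X[:, 0] + 0.2 * rng.normal(size=300)  # correlated feature
y = X @ np.array([2.0, 0.0, -1.0, 1.0]) + 0.1 * rng.normal(size=300)
y = y - y.mean()  # center the response, per the quoted assumption

# step 1: orthogonalize w.r.t. the covariance matrix (PCA rotation)
# step 2: standardize the now-orthogonal predictors
# step 3: ridge, so the penalty acts on the coefficients of the z's
model = make_pipeline(PCA(), StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)
print(round(model.score(X, y), 3))
```

With this ordering the ridge penalty shrinks the z-coefficients directly, which is what the variance-reduction argument above asks for.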

LMK what you think!
