Implementation Steps Explained
Dataset Loading and Splitting:
- The California Housing dataset is loaded using fetch_california_housing().
- The dataset is split into training and testing sets (70% training, 30% testing) to evaluate model performance on unseen data, as sketched below.
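A minimal sketch of this step with scikit-learn; the random_state value is an assumption added only for reproducibility:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the California Housing dataset: feature matrix X and target y.
X, y = fetch_california_housing(return_X_y=True)

# 70% training / 30% testing split; random_state is an assumed value,
# fixed here only so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
```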
Ordinary Linear Regression:
- A basic LinearRegression model is trained on the training data.
- Predictions are made on the test set, and performance is evaluated using Mean Squared Error (MSE) and R² Score, as in the sketch below.
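A sketch of the baseline fit, continuing from the train/test split above:

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Fit ordinary least-squares regression on the training data.
lin_reg = LinearRegression()
lin_reg.fit(X_train, y_train)

# Evaluate on the held-out test set.
y_pred = lin_reg.predict(X_test)
print(f"Linear Regression MSE: {mean_squared_error(y_test, y_pred):.2f}")
print(f"Linear Regression R²: {r2_score(y_test, y_pred):.2f}")
```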
Defining a Range of Regularization Parameters (α):
- A set of alpha values is defined using np.logspace to span from very low (0.001) to high (1000) values; see the sketch below.
- These values control the strength of the regularization.
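One way to build this grid; the number of points (50) is an assumption, since the original only states the range:

```python
import numpy as np

# 50 alphas spaced logarithmically from 10^-3 to 10^3
# (the count is an assumption; the post only specifies the range).
alphas = np.logspace(-3, 3, 50)
```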
Training Ridge and Lasso Regression Models:
- For each alpha value, both a Ridge and a Lasso regression model are trained on the training data.
- Their performance on the test set is evaluated using MSE and R² Score, and the results are stored (see the sketch after this list).
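A sketch of the training loop, continuing from the snippets above; raising max_iter for Lasso is an assumption, added to help convergence on this unscaled dataset:

```python
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import mean_squared_error, r2_score

ridge_mse, lasso_mse = [], []
ridge_r2, lasso_r2 = [], []
for alpha in alphas:
    # Fit Ridge (L2 penalty) and Lasso (L1 penalty) at this alpha.
    ridge = Ridge(alpha=alpha).fit(X_train, y_train)
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X_train, y_train)

    # Record test-set MSE and R² for each model.
    ridge_pred, lasso_pred = ridge.predict(X_test), lasso.predict(X_test)
    ridge_mse.append(mean_squared_error(y_test, ridge_pred))
    lasso_mse.append(mean_squared_error(y_test, lasso_pred))
    ridge_r2.append(r2_score(y_test, ridge_pred))
    lasso_r2.append(r2_score(y_test, lasso_pred))
```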
Identifying the Optimal α:
- The alpha value that results in the lowest MSE is selected as the optimal parameter for each model (Ridge and Lasso), as sketched below.
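Continuing from the stored results, the optimal α can be picked with np.argmin:

```python
import numpy as np

# Select the alpha with the lowest test-set MSE for each model.
best_ridge_alpha = alphas[np.argmin(ridge_mse)]
best_lasso_alpha = alphas[np.argmin(lasso_mse)]
print(f"Optimal Ridge alpha: {best_ridge_alpha:.4f}")
print(f"Optimal Lasso alpha: {best_lasso_alpha:.4f}")
```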
Visualization:
- MSE versus alpha is plotted on a logarithmic scale for both Ridge and Lasso regression.
- Vertical lines indicate the optimal alpha values, illustrating how regularization impacts model performance (see the sketch below).
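A matplotlib sketch of the plot described above; the colors and figure size are assumptions:

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 5))
plt.semilogx(alphas, ridge_mse, label="Ridge")
plt.semilogx(alphas, lasso_mse, label="Lasso")

# Dashed vertical lines mark the optimal alpha for each model.
plt.axvline(best_ridge_alpha, linestyle="--", color="tab:blue")
plt.axvline(best_lasso_alpha, linestyle="--", color="tab:orange")

plt.xlabel("alpha (log scale)")
plt.ylabel("Test MSE")
plt.title("MSE vs. alpha for Ridge and Lasso")
plt.legend()
plt.show()
```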
Comparing Models:
- The best Ridge and Lasso models (using the optimal α values) are retrained and compared against the ordinary Linear Regression model.
- The comparison is done using MSE and R² Score to assess how regularization improves or affects performance; a sketch follows.
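A sketch of the final comparison, reusing lin_reg and the optimal alphas from the snippets above:

```python
# Retrain Ridge and Lasso at their optimal alphas and compare with OLS.
best_ridge = Ridge(alpha=best_ridge_alpha).fit(X_train, y_train)
best_lasso = Lasso(alpha=best_lasso_alpha, max_iter=10000).fit(X_train, y_train)

print("Comparison of Models on Test Data:")
for name, model in [("Linear Regression", lin_reg),
                    ("Ridge Regression", best_ridge),
                    ("Lasso Regression", best_lasso)]:
    pred = model.predict(X_test)
    print(f"{name} - MSE: {mean_squared_error(y_test, pred):.2f}, "
          f"R²: {r2_score(y_test, pred):.2f}")
```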
Output:
Linear Regression MSE: 0.53
Linear Regression R²: 0.60
Optimal Ridge alpha: 323.7458
Optimal Lasso alpha: 0.0095

Comparison of Models on Test Data:
Linear Regression - MSE: 0.53, R²: 0.60
Ridge Regression - MSE: 0.52, R²: 0.60
Lasso Regression - MSE: 0.53, R²: 0.60
Analysis of Regularization Impact
Low α Values:
When α is very low, the regularization effect is minimal, and the models behave similarly to ordinary linear regression. This can lead to complex models that may overfit the training data.
High α Values:
As α increases, the regularization effect becomes stronger. This forces the model coefficients to shrink toward zero, potentially reducing overfitting but also risking underfitting if α is too high.
Optimal α:
The optimal value of α balances the trade-off between bias and variance, leading to a model that generalizes well to unseen data. The plots help visualize how MSE changes with α, providing insight into the best regularization strength for the given dataset.
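As an illustration of the shrinkage described above (an added sketch, not part of the original walkthrough), one can count how many Lasso coefficients are driven exactly to zero as α grows:

```python
import numpy as np

# Added illustration (an assumption, not from the original post): count
# how many Lasso coefficients are forced exactly to zero as alpha grows.
for alpha in [0.001, 1.0, 1000.0]:
    lasso = Lasso(alpha=alpha, max_iter=10000).fit(X_train, y_train)
    n_zero = int(np.sum(lasso.coef_ == 0))
    print(f"alpha={alpha:g}: {n_zero} of {lasso.coef_.size} coefficients are zero")
```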