1. Data Generation
- Objective: Create a dataset that simulates a non-linear relationship, such as advertising expenditure vs. sales.
- Process:
- Use numpy.linspace to generate 100 evenly spaced values between 0 and 10 for the independent variable X.
- Simulate the dependent variable y with a quadratic function (e.g., y = 2 + 3x - 0.5x^2) and add Gaussian noise with np.random.randn to mimic real-world variability (a short note on array shapes follows this list).
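A quick aside on the reshape(-1, 1) call used in the full script below: scikit-learn estimators expect the feature matrix as a 2-D array of shape (n_samples, n_features), so the 1-D vector returned by numpy.linspace is reshaped into a single column first:

import numpy as np

x = np.linspace(0, 10, 100)   # shape (100,): a 1-D vector
X = x.reshape(-1, 1)          # shape (100, 1): one sample per row, one feature column
print(x.shape, X.shape)       # prints (100,) (100, 1)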
2. Simple Linear Regression
- Objective: Fit a basic linear model to the data.
- Process:
- Create an instance of LinearRegression from scikit-learn.
- Fit the model using the original X and y values.
- Use the model to predict values (y_lin_pred) on the same data.
- Note: This model assumes a straight-line relationship and may not capture the non-linear patterns in the data.
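As an aside, the straight line a fitted LinearRegression learns can be read back from its intercept_ and coef_ attributes. A tiny self-contained illustration (the demo points are made up and exactly linear, so the fit is perfect):

import numpy as np
from sklearn.linear_model import LinearRegression

X_demo = np.array([[0.0], [1.0], [2.0], [3.0]])  # one feature column
y_demo = 1 + 2 * X_demo.ravel()                  # exactly y = 1 + 2x

lin_demo = LinearRegression().fit(X_demo, y_demo)
print(lin_demo.intercept_)  # ~1.0: the fitted intercept
print(lin_demo.coef_)       # [~2.0]: one slope per feature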
3. Polynomial Regression
- Objective: Capture the non-linear relationship by introducing polynomial terms.
- Process:
- Transforming Features:
- Use PolynomialFeatures to transform X into polynomial features. For example, setting the degree to 2 generates three features: the constant term (1), x, and x^2 (see the concrete example after this list).
- Fitting the Model:
- Use another LinearRegression instance and fit it on these transformed features.
- Predict the target variable (y_poly_pred) using the polynomial model.
- Benefit: This approach allows the regression model to fit curves rather than straight lines, improving the fit for non-linear data.
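To make the feature expansion concrete, here is a small self-contained check of what PolynomialFeatures(degree=2) produces (the two sample values are made up for illustration):

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_small = np.array([[2.0], [3.0]])  # two sample x values
poly = PolynomialFeatures(degree=2)
print(poly.fit_transform(X_small))
# [[1. 2. 4.]
#  [1. 3. 9.]]  -> columns are the constant term 1, x, and x^2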
4. Model Performance Comparison
- Objective: Evaluate and compare the performance of the simple linear model against the polynomial model.
- Process:
- Compute the Mean Squared Error (MSE) for both models using mean_squared_error.
- Calculate the R^2 score using r2_score to assess how well each model explains the variance in the data.
- Interpretation:
- Lower MSE and higher R^2 indicate a better fit. The polynomial model is expected to outperform the linear model when the true relationship is non-linear (a hand computation of both metrics is sketched below).
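Both metrics are simple to compute by hand; this self-contained sketch (with made-up numbers) matches what mean_squared_error and r2_score return:

import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.0, 9.0])  # made-up targets
y_pred = np.array([2.5, 5.0, 7.5, 9.0])  # made-up predictions

mse_manual = np.mean((y_true - y_pred) ** 2)  # average squared residual
r2_manual = 1 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)

assert np.isclose(mse_manual, mean_squared_error(y_true, y_pred))
assert np.isclose(r2_manual, r2_score(y_true, y_pred))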
5. Visualization
- Objective: Visualize how well each model fits the data.
- Process:
- Use Matplotlib to create a scatter plot of the original data points.
- Plot the predictions from the simple linear regression (a straight line) and the polynomial regression (a curve).
- Label axes, add a title, and include a legend to differentiate between the models.
- Benefit: Visual inspection provides an intuitive understanding of how the models compare, highlighting the polynomial model’s improved ability to capture the non-linear trend.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Generate synthetic data: advertising vs. sales with a non-linear relationship.
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
# Simulate a quadratic relationship with noise:
y = 2 + 3 * X - 0.5 * X**2 + np.random.randn(100, 1) * 2
# --------------------------
# Simple Linear Regression
# --------------------------
lin_reg = LinearRegression()
lin_reg.fit(X, y)
y_lin_pred = lin_reg.predict(X)
# --------------------------
# Polynomial Regression (Degree = 2)
# --------------------------
poly_degree = 2
poly_features = PolynomialFeatures(degree=poly_degree)
X_poly = poly_features.fit_transform(X)
poly_reg = LinearRegression()
poly_reg.fit(X_poly, y)
y_poly_pred = poly_reg.predict(X_poly)
# --------------------------
# Performance Comparison
# --------------------------
mse_lin = mean_squared_error(y, y_lin_pred)
mse_poly = mean_squared_error(y, y_poly_pred)
r2_lin = r2_score(y, y_lin_pred)
r2_poly = r2_score(y, y_poly_pred)
print("Linear Regression MSE:", mse_lin)
print("Polynomial Regression MSE:", mse_poly)
print("Linear Regression R2:", r2_lin)
print("Polynomial Regression R2:", r2_poly)
# --------------------------
# Plotting the results
# --------------------------
plt.figure(figsize=(10, 6))
plt.scatter(X, y, label="Data", color="black")
plt.plot(X, y_lin_pred, label="Linear Regression", color="blue", linewidth=2)
plt.plot(X, y_poly_pred, label=f"Polynomial Regression (Degree {poly_degree})", color="red", linewidth=2)
plt.xlabel("Advertising Expenditure")
plt.ylabel("Sales")
plt.title("Linear vs Polynomial Regression")
plt.legend()
plt.show()
Output:
Linear Regression MSE: 16.86354989653036
Polynomial Regression MSE: 3.2471947501886613
Linear Regression R2: 0.66231449906867
Polynomial Regression R2: 0.9349762895376701
Interpretation:
The script produces a scatter plot of the original data together with the predictions from both models. This visualization makes it easy to judge how well each model captures the relationship between advertising expenditure and sales: the polynomial regression (here, a quadratic curve) fits the non-linear data much better than the simple linear regression, consistent with its lower MSE and higher R^2.
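One practical caveat: to score the polynomial model on new inputs, those inputs must pass through the same PolynomialFeatures transformation first (poly_features.transform, not a fresh fit_transform). An sklearn Pipeline bundles the two steps so this cannot be forgotten; a minimal self-contained sketch using the same synthetic data:

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 + 3 * X - 0.5 * X**2 + np.random.randn(100, 1) * 2

# The pipeline applies PolynomialFeatures before LinearRegression on both fit and predict,
# so new inputs are transformed automatically.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict(np.array([[4.0], [7.5]])))  # predictions for unseen expenditure values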