Explanation of Implementation Steps
Data Loading and Splitting:
- The California Housing dataset is loaded using fetch_california_housing(), providing features and target values (median house prices). A DataFrame variant of the loader is sketched below.
- The dataset is split into a training set (70%) and a testing set (30%) for evaluation.
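If you prefer labeled columns while exploring the data, the loader can also return pandas objects. A minimal optional sketch (assumes pandas is installed; not part of the main script):

from sklearn.datasets import fetch_california_housing

# as_frame=True returns the features as a pandas DataFrame and the
# target (MedHouseVal, median house value in units of $100,000) as a Series.
data = fetch_california_housing(as_frame=True)
print(data.frame.head())  # features and target together, with column names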
Experiment 1: Gradient Boosting:
- A GradientBoostingRegressor is instantiated and trained on the training data. A non-default configuration is sketched below.
- The model's performance is evaluated on the test set using Mean Squared Error (MSE) and the R² score.
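The script below uses the default hyperparameters; in practice, n_estimators, learning_rate, and max_depth are usually the first knobs to tune. A sketch with illustrative (untuned) values:

from sklearn.ensemble import GradientBoostingRegressor

# Illustrative values only: more trees with a smaller learning rate
# often generalize better than the defaults, at the cost of training time.
gbr_tuned = GradientBoostingRegressor(
    n_estimators=500,    # default: 100
    learning_rate=0.05,  # default: 0.1
    max_depth=3,         # default: 3 (shallow trees suit boosting)
    random_state=42,
)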
Experiment 2: XGBoost:
- An XGBoost model is created with XGBRegressor. The objective is set to 'reg:squarederror' for regression.
- After training, its performance is measured using the same metrics (MSE and R²) on the test data.
- XGBoost is known for its speed and accuracy, especially on large datasets; a per-round monitoring trick is sketched below.
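One convenience worth knowing: XGBRegressor.fit() accepts an eval_set, which records the evaluation metric after every boosting round (typically RMSE for 'reg:squarederror'). A minimal sketch, reusing the X_train/X_test split from the script below:

import xgboost as xgb

xgb_reg = xgb.XGBRegressor(objective='reg:squarederror', random_state=42)
# Record the test-set metric after each boosting round.
xgb_reg.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)
history = xgb_reg.evals_result()            # e.g. {'validation_0': {'rmse': [...]}}
print(history['validation_0']['rmse'][-1])  # RMSE after the final round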
Experiment 3: Model Comparison:
- A RandomForestRegressor is trained on the same dataset.
- All three models (Gradient Boosting, XGBoost, and Random Forest) are compared by printing their MSE and R² values; a small reporting helper is sketched below.
- A scatter plot visualizes predicted versus actual house prices for a direct comparison of model performance.
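As an aside, a small helper like the hypothetical report() below keeps the per-model printing tidy and adds RMSE, which is in the target's own units (here, $100,000s of median house value) and therefore easier to interpret than MSE:

import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

def report(name, y_true, y_pred):
    # RMSE = sqrt(MSE); same units as the target, unlike squared error.
    mse = mean_squared_error(y_true, y_pred)
    print("{}: MSE={:.4f}, RMSE={:.4f}, R²={:.4f}".format(
        name, mse, np.sqrt(mse), r2_score(y_true, y_pred)))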
This workflow shows how scikit-learn's ensemble methods, Gradient Boosting and Random Forest, compare with XGBoost on a house price prediction task, both in numerical accuracy (MSE and R²) and in prediction quality on the scatter plot.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
import xgboost as xgb
# ------------------------------
# Data Loading and Splitting
# ------------------------------
# Load the California Housing dataset
data = fetch_california_housing()
X, y = data.data, data.target
# Split data into training and testing sets (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=42)
# ------------------------------
# Experiment 1: Gradient Boosting
# ------------------------------
gbr = GradientBoostingRegressor(random_state=42)
gbr.fit(X_train, y_train)
y_pred_gbr = gbr.predict(X_test)
mse_gbr = mean_squared_error(y_test, y_pred_gbr)
r2_gbr = r2_score(y_test, y_pred_gbr)
print("Gradient Boosting Regressor:")
print("MSE: {:.4f}".format(mse_gbr))
print("R²: {:.4f}".format(r2_gbr))
# ------------------------------
# Experiment 2: XGBoost
# ------------------------------
# Note: Set the objective to 'reg:squarederror' for regression tasks.
xgb_reg = xgb.XGBRegressor(random_state=42, objective='reg:squarederror')
xgb_reg.fit(X_train, y_train)
y_pred_xgb = xgb_reg.predict(X_test)
mse_xgb = mean_squared_error(y_test, y_pred_xgb)
r2_xgb = r2_score(y_test, y_pred_xgb)
print("\nXGBoost Regressor:")
print("MSE: {:.4f}".format(mse_xgb))
print("R²: {:.4f}".format(r2_xgb))
# ------------------------------
# Experiment 3: Compare Models
# ------------------------------
# Train a Random Forest Regressor
rf = RandomForestRegressor(random_state=42)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)
mse_rf = mean_squared_error(y_test, y_pred_rf)
r2_rf = r2_score(y_test, y_pred_rf)
print("\nComparison of Models:")
print("Model\t\t\tMSE\t\tR²")
print("Gradient Boosting\t{:.4f}\t{:.4f}".format(mse_gbr, r2_gbr))
print("XGBoost\t\t\t{:.4f}\t{:.4f}".format(mse_xgb, r2_xgb))
print("Random Forest\t\t{:.4f}\t{:.4f}".format(mse_rf, r2_rf))
# Optional: Visual Comparison of Predictions vs. Actual Values
plt.figure(figsize=(12, 6))
plt.scatter(y_test, y_pred_gbr, alpha=0.5, label='Gradient Boosting')
plt.scatter(y_test, y_pred_xgb, alpha=0.5, label='XGBoost')
plt.scatter(y_test, y_pred_rf, alpha=0.5, label='Random Forest')
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
plt.xlabel('Actual House Prices')
plt.ylabel('Predicted House Prices')
plt.title('Model Comparison: Actual vs Predicted House Prices')
plt.legend()
plt.show()
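To dig deeper than aggregate scores, all three fitted models expose feature_importances_. A sketch that reuses gbr, xgb_reg, rf, and data from the script above:

# Rank features by importance for each fitted model.
for name, model in [("Gradient Boosting", gbr), ("XGBoost", xgb_reg),
                    ("Random Forest", rf)]:
    print("\n{} feature importances:".format(name))
    ranked = sorted(zip(data.feature_names, model.feature_importances_),
                    key=lambda pair: -pair[1])
    for feat, imp in ranked:
        print("  {:<12s} {:.3f}".format(feat, imp))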