10 Ridge Regression

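Ridge regression is a refinement of linear regression. Sometimes we genuinely cannot drop any parameters, and in that situation the model may overfit. Since reducing the number of parameters is not an option, we instead append a penalty term to the linear regression loss. The penalty term, also called a regularization term, mainly controls the smoothness of the model: the more parameters the model has and the larger their weights, the more complex the model and the larger the penalty.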
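As a minimal sketch of this idea, the snippet below computes a squared-error loss plus an L2 penalty for two weight vectors. The function name `penalized_loss`, the strength parameter `alpha`, and the toy data are all assumptions made for illustration; none of them come from a library.

```python
import numpy as np


def penalized_loss(X, y, w, alpha=1.0):
    """Sum of squared residuals plus an L2 penalty on the weights (illustrative)."""
    residual = X @ w - y
    data_term = np.sum(residual ** 2)   # ordinary least-squares loss
    penalty = alpha * np.sum(w ** 2)    # grows with the size of the weights
    return data_term + penalty


# Toy data, assumed for illustration only
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])

small_w = np.array([0.0, 0.5])     # fits y exactly, with small weights
large_w = np.array([-10.0, 8.0])   # hypothetical weights with large magnitude

print(penalized_loss(X, y, small_w))
print(penalized_loss(X, y, large_w))
```

The second weight vector is penalized far more heavily, which is what discourages the model from relying on large coefficients.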

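The penalty can be either the L1 norm or the L2 norm. Regression with an L1 penalty is usually called Lasso regression, while regression with an L2 penalty is usually called ridge regression. This installment focuses on ridge regression.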
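Written out, the two penalized objectives differ only in the norm applied to the weights. The notation here is an assumption for illustration: $X$ is the feature matrix, $y$ the target vector, $w$ the weight vector, and $\alpha \ge 0$ the penalty strength. Libraries may scale the data term by a constant factor.

```latex
\begin{aligned}
\text{Lasso (L1):}\quad & \min_{w}\ \lVert Xw - y\rVert_2^2 + \alpha \lVert w\rVert_1 \\
\text{Ridge (L2):}\quad & \min_{w}\ \lVert Xw - y\rVert_2^2 + \alpha \lVert w\rVert_2^2
\end{aligned}
```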

10.1 The Ridge Regression Interface

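Ridge regression addresses some of the problems that ordinary least squares causes in a plain linear model by imposing a penalty on the size of the coefficients.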
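In scikit-learn this model is available as `sklearn.linear_model.Ridge`, whose `alpha` parameter sets the strength of the penalty (default 1.0). The sketch below uses synthetic data, assumed purely for illustration. It fits the model with several values of `alpha` and records the norm of the fitted coefficients, which shrinks as the penalty grows.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data, assumed for illustration: y depends linearly on 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.1, size=50)

coef_norms = []
for alpha in (0.01, 1.0, 100.0):
    model = Ridge(alpha=alpha)   # larger alpha means a stronger L2 penalty
    model.fit(X, y)
    coef_norms.append(np.linalg.norm(model.coef_))
    print(alpha, model.coef_, model.intercept_)
```

After fitting, `coef_` holds the weights and `intercept_` holds the bias, the same attributes used in the house price example.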

10.2 Predicting House Prices with Ridge Regression

Let's use ridge regression to predict Boston house prices.

from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error


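def load_data():
    """Load the dataset."""
    # Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version.
    boston_data = load_boston()
    x_train, x_test, y_train, y_test = train_test_split(boston_data.data, boston_data.target, random_state=22)
    return x_train, x_test, y_train, y_test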


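def ridge_linear_model():
    """Make predictions with ridge regression."""
    x_train, x_test, y_train, y_test = load_data()

    # Estimator
    # Note: the normalize parameter was removed in scikit-learn 1.2, so this call needs an older version.
    estimator = Ridge(normalize=True)
    estimator.fit(x_train, y_train)

    # Inspect the fitted model
    print("Weight coefficients:\n", estimator.coef_)
    print("Intercept:\n", estimator.intercept_)

    # Evaluate the model
    y_predict = estimator.predict(x_test)
    print("Predicted house prices:\n", y_predict)
    error = mean_squared_error(y_test, y_predict)
    print("Ridge regression mean squared error:\n", error)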


ridge_linear_model()
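Note that `load_boston` and the `normalize` parameter of `Ridge` were both removed in scikit-learn 1.2, so the listing above runs only on older versions. As a hedged sketch for newer versions, the variant below swaps in a synthetic dataset from `make_regression` (an assumption, not the Boston data) and standardizes the features with `StandardScaler` in a pipeline. This is a common substitute for `normalize=True`, but it does not reproduce it exactly, and the resulting numbers are not comparable to the Boston results.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing data (assumption: 506 samples, 13 features)
X, y = make_regression(n_samples=506, n_features=13, noise=10.0, random_state=22)
x_train, x_test, y_train, y_test = train_test_split(X, y, random_state=22)

# Scale the features, then fit ridge regression with the default alpha=1.0
estimator = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
estimator.fit(x_train, y_train)

# make_pipeline names each step after its lowercased class name
ridge = estimator.named_steps["ridge"]
print("Weight coefficients:\n", ridge.coef_)
print("Intercept:\n", ridge.intercept_)

# Evaluate on the held-out split
y_predict = estimator.predict(x_test)
error = mean_squared_error(y_test, y_predict)
print("Ridge regression mean squared error:\n", error)
```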

