July 13

imooc Machine Learning Notes (5)

Tech

Outline

Simple linear regression
- Basic diagram
- Goal
- Approach
- Formulas
- Code
- Execution

Course sections: 5.1 - 5.2 *

Simple linear regression applies when each sample has only one feature.

Basic diagram

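Goal

Loss function: measures how poorly the model fits the data; it is minimized.
Utility function: measures how well the model fits the data; it is maximized.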
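To make the loss function concrete, here is a minimal sketch that assumes the squared-error loss, which is the loss the closed-form formulas in the code section minimize. The function name `squared_error_loss` and the two candidate lines are chosen for illustration; the data is the five-point set used in the execution section.

```python
import numpy as np


def squared_error_loss(a, b, x, y):
    """Sum of squared residuals for the line y_hat = a * x + b."""
    y_hat = a * x + b
    return float(np.sum((y - y_hat) ** 2))


x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

# Two candidate lines: the least-squares line and an arbitrary guess.
loss_good = squared_error_loss(0.8, 0.4, x, y)
loss_bad = squared_error_loss(1.0, 0.0, x, y)
print(loss_good, loss_bad)
```

The first line has the smaller loss, which is what "fitting" means here: choosing a and b so that this number is as small as possible.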

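Approach

Analyze the problem to determine its loss function or utility function, then optimize that function to obtain the machine learning model.

Nearly all parametric learning algorithms follow this pattern: linear regression, SVM, polynomial regression, neural networks, and logistic regression. The theory behind it is optimization, and convex optimization in particular.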
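The "define a loss, then optimize it" pattern can be sketched numerically. The example below uses plain gradient descent on the mean squared error. Gradient descent is swapped in here purely for illustration and is not the method used in this post, which relies on a closed-form solution. The learning rate and step count are assumptions tuned for this tiny data set.

```python
import numpy as np


def gradient_descent(x, y, lr=0.05, n_steps=5000):
    """Minimize the mean squared error of y_hat = a * x + b by gradient descent."""
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        residual = y - (a * x + b)
        # Partial derivatives of mean((y - a*x - b)^2) with respect to a and b.
        grad_a = -2.0 * np.mean(residual * x)
        grad_b = -2.0 * np.mean(residual)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

a, b = gradient_descent(x, y)
print(a, b)
```

On this data the iteration settles close to a = 0.8 and b = 0.4, the same least-squares line that the closed-form formulas give.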

Formulas

Derivation

See the figure below.
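As a text rendering of the result, the derivation can be sketched as follows, assuming the squared-error loss that the `fit` method in the code section minimizes. The symbols $m$ (number of samples) and $\bar{x}$, $\bar{y}$ (sample means) are notation chosen here, not taken from the original figure.

```latex
J(a, b) = \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right)^2

\frac{\partial J}{\partial b}
  = -2 \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right) = 0
  \;\Longrightarrow\; b = \bar{y} - a \bar{x}

\frac{\partial J}{\partial a}
  = -2 \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right) x^{(i)} = 0
  \;\Longrightarrow\;
  a = \frac{\sum_{i=1}^{m} (x^{(i)} - \bar{x})(y^{(i)} - \bar{y})}
           {\sum_{i=1}^{m} (x^{(i)} - \bar{x})^2}
```

The numerator and denominator of $a$ correspond to the variables `num` and `d` in the code, and $b$ is computed from $a$ and the two means.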

Code

import numpy as np

class SimpleLinearRegression1:
    def __init__(self):
        """Initialize the Simple Linear Regression model"""
        self.a_ = None
        self.b_ = None

    def fit(self, x_train: np.ndarray, y_train: np.ndarray) -> 'SimpleLinearRegression1':
        """Train the Simple Linear Regression model on the training data x_train, y_train"""
        assert x_train.ndim == 1, \
            "Simple Linear Regression can only solve single feature training data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to the size of y_train"

        # Means of x and y
        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)

        # Numerator and denominator of the slope a
        num = 0.0
        d = 0.0
        for x, y in zip(x_train, y_train):
            num += (x - x_mean) * (y - y_mean)
            d += (x - x_mean) ** 2

        self.a_ = num / d
        # Store the intercept in b_ (not b), which predict() checks and uses
        self.b_ = y_mean - self.a_ * x_mean

        return self

    def predict(self, x_predict: np.ndarray) -> np.ndarray:
        """Given the data set x_predict, return the vector of predicted values for it"""
        assert x_predict.ndim == 1, \
            "Simple Linear Regressor can only solve single feature training data"
        assert self.a_ is not None and self.b_ is not None, \
            "must fit before predict!"

        return np.array([self._predict(x) for x in x_predict])

    def _predict(self, x_single: float) -> float:
        """Given a single input value x_single, return its predicted value"""
        return self.a_ * x_single + self.b_

    def __repr__(self):
        return "SimpleLinearRegression1()"
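The explicit loop in `fit` can also be written with NumPy vector operations. The sketch below is an alternative formulation rather than part of the class above, and the function name `fit_vectorized` is hypothetical. It computes the same numerator and denominator as dot products.

```python
import numpy as np


def fit_vectorized(x_train, y_train):
    """Return (a, b) for the least-squares line, using dot products instead of a loop."""
    x_mean = np.mean(x_train)
    y_mean = np.mean(y_train)
    x_centered = x_train - x_mean
    # Numerator: sum of (x - x_mean) * (y - y_mean); denominator: sum of (x - x_mean)^2.
    a = x_centered.dot(y_train - y_mean) / x_centered.dot(x_centered)
    b = y_mean - a * x_mean
    return a, b


x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])
a, b = fit_vectorized(x, y)
print(a, b)
```

For large arrays this form avoids the per-element Python loop, which is usually much faster, while giving the same a and b up to floating-point rounding.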

Execution

import numpy as np
import matplotlib.pyplot as plt

# Create the model
simpleLinearRegression1 = SimpleLinearRegression1()

# Training data
x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

# Train the model
simpleLinearRegression1.fit(x, y)

# Inputs to predict
predicts = np.array([5.2, 4.3])

# Run the prediction
predict_results = simpleLinearRegression1.predict(predicts)

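# Plot the training points, the fitted line, and the predictions
plt.scatter(x, y)
plt.plot(x, simpleLinearRegression1.a_ * x + simpleLinearRegression1.b_, color='r')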
plt.scatter(predicts, predict_results, color='black')
plt.axis([0, 6, 0, 6])
plt.show()
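As a sanity check on the fitted parameters, the same line can be obtained from NumPy's `np.polyfit` with degree 1, which returns the coefficients from highest power to lowest. This is an independent cross-check added here, not part of the course code. The values of `a_` and `b_` from the model above should agree with it up to rounding.

```python
import numpy as np

x = np.array([1., 2., 3., 4., 5.])
y = np.array([1., 3., 2., 3., 5.])

# Degree-1 least-squares fit: returns [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# Predictions for the same inputs used in the run above.
predicts = np.array([5.2, 4.3])
check_results = slope * predicts + intercept
print(slope, intercept, check_results)
```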

Schwarzeni
