[Machine Learning] Univariate Linear Regression

Date: 2023-12-02


ML introduction

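Machine learning: algorithms that learn from data instead of relying on explicitly programmed rules.

Goal: $\min_{w,b} J(w, b)$, where the cost function $J$ provides a way to measure how well a particular set of parameters fits the training data.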

Supervised Learning

The training data contains the right answers: given inputs x together with their labels y, learn the mapping x -> y.

Unsupervised Learning

No labels are given; the goal is to find structure or patterns in the data.

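Linear Regression with One Variable

A problem of predicting a number.

This part covers the model representation of univariate linear regression, the cost function, gradient descent, and using gradient descent to find the minimum of the cost function.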

Linear regression model

Mathematical expression

\[f_{w,b}(x^{(i)}) = wx^{(i)}+b \]

Code

ndarray: n-dimensional array object

scalar: a single number

import numpy as np

# Iterative version
def compute_model_output(x, w, b):
    """
    Computes the prediction of a linear model
    Args:
      x (ndarray (m,)): Data, m examples 
      w,b (scalar)    : model parameters  
    Returns:
      f_wb (ndarray (m,)): model predictions
    """
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):
        f_wb[i] = w * x[i] + b

    return f_wb



# Vectorized version
# Note: this reuses the name compute_model_output, but predicts a single
# example with n features via a dot product.
def compute_model_output(x, w, b): 
    """
    Single prediction using linear regression
    Args:
      x (ndarray): Shape (n,) example with multiple features
      w (ndarray): Shape (n,) model parameters   
      b (scalar):             model parameter 

    Returns:
      yhat (scalar): prediction
    """
    yhat = np.dot(x, w) + b     
    return yhat
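
For the single-feature case, NumPy broadcasting produces the same predictions as the loop without iterating explicitly. The following is a minimal sketch; the helper name `predict_univariate` and the sample values are illustrative assumptions, not part of the original post.

```python
import numpy as np

def predict_univariate(x, w, b):
    """Vectorized f_wb = w * x + b for a 1-D array x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return w * x + b  # broadcasting applies the scalars w and b to every example

x_train = np.array([1.0, 2.0, 3.0])  # illustrative inputs
print(predict_univariate(x_train, w=2.0, b=1.0))  # [3. 5. 7.]
```

The result has the same shape (m,) as the loop-based version, so the two should be interchangeable for 1-D data.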

Cost Function

Mathematical expression

\[J(w,b) = \frac{1}{2m}\sum_{i=0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \]

\[f_{w,b}(x^{(i)}) = wx^{(i)} + b \]

Notation:

m: number of training examples
y: true value (target)
error: $f_{w, b}(x^{(i)}) - y^{(i)}$

Code

# Iterative version
def compute_cost(x, y, w, b): 
    """
    Computes the cost function for linear regression.

    Args:
      x (ndarray (m,)): Data, m examples 
      y (ndarray (m,)): target values
      w,b (scalar)    : model parameters  

    Returns
        total_cost (float): The cost of using w,b as the parameters for linear regression
               to fit the data points in x and y
    """
    # number of training examples
    m = x.shape[0] 

    cost_sum = 0 
    for i in range(m): 
        f_wb = w * x[i] + b   
        cost = (f_wb - y[i]) ** 2  
        cost_sum = cost_sum + cost  
    total_cost = (1 / (2 * m)) * cost_sum  

    return total_cost



# Vectorized version (inputs are np.matrix objects, so * is a matrix product)
def compute_cost(X, y, theta):
    """
    Computes the cost function for linear regression.
    Args:
       X (np.matrix (m, n))    : Data, m examples (with a column of ones if theta includes b)
       y (np.matrix (m, 1))    : target values
       theta (np.matrix (1, n)): model parameters (b and w packed into one row)

    Returns:
        total_cost (float): The cost of using theta as the parameters for linear regression
               to fit the data points in X and y
    """
    error = (X * theta.T) - y  # prediction minus target for every example
    inner = np.power(error, 2)
    total_cost = np.sum(inner) / (2 * len(X))
    return total_cost
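
As a small worked example of the cost formula, the sketch below evaluates J for two parameter choices on a two-point data set. The helper name `cost_univariate` and the data values are assumptions chosen for illustration.

```python
import numpy as np

def cost_univariate(x, y, w, b):
    """J(w, b) = 1/(2m) * sum((w*x + b - y)^2) for 1-D arrays (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residual = w * x + b - y              # prediction error per example
    return np.sum(residual ** 2) / (2 * x.shape[0])

x_train = np.array([1.0, 2.0])            # illustrative data
y_train = np.array([300.0, 500.0])

print(cost_univariate(x_train, y_train, w=200.0, b=100.0))  # 0.0 (perfect fit)
print(cost_univariate(x_train, y_train, w=100.0, b=100.0))  # 12500.0
```

In the second call the residuals are -100 and -200, so J = (10000 + 40000) / (2 * 2) = 12500.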

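Mathematical principle

Differentiation: each value of w gives a different J. Evaluating J at many values of w traces out a curve, and differentiating that curve lets us find the w with the smallest J.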

[Figures: plots of J as a function of w]
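
The idea of J as a function of w can be sketched by fixing b, sweeping w over a grid, and picking the grid point with the lowest cost. The data, the grid, and the name `cost_at` are illustrative assumptions.

```python
import numpy as np

x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])    # generated by y = 2x, so the best w is 2
b_fixed = 0.0                           # keep b fixed so J depends on w only

def cost_at(w):
    """Cost J(w, b_fixed) on the toy data above (hypothetical helper)."""
    residual = w * x_train + b_fixed - y_train
    return np.sum(residual ** 2) / (2 * x_train.shape[0])

w_grid = np.linspace(0.0, 4.0, 41)      # candidate values of w, step 0.1
costs = np.array([cost_at(w) for w in w_grid])
best_w = w_grid[np.argmin(costs)]       # grid point with the smallest J

print(best_w, costs.min())
```

The curve is a parabola in w, so its derivative is zero exactly at the minimum; gradient descent exploits this instead of scanning a grid.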

Gradient Descent

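(Iterate) => converge to an extremum, here a minimum of J.

Large datasets: each gradient update uses only a sampled subset of the training examples (commonly called mini-batch gradient descent).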
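
The subset-sampling idea can be sketched as mini-batch gradient descent. This is an illustration under stated assumptions rather than the post's own code: the function name, `batch_size`, `num_epochs`, `seed`, and the synthetic noise-free data are all hypothetical.

```python
import numpy as np

def minibatch_gradient_descent(x, y, w, b, alpha, num_epochs, batch_size, seed=0):
    """Fit w, b using gradients computed on random subsets of the data."""
    rng = np.random.default_rng(seed)
    m = x.shape[0]
    for _ in range(num_epochs):
        order = rng.permutation(m)                 # reshuffle once per epoch
        for start in range(0, m, batch_size):
            idx = order[start:start + batch_size]  # indices of the current batch
            err = w * x[idx] + b - y[idx]
            dj_dw = np.mean(err * x[idx])          # gradient estimate from the batch
            dj_db = np.mean(err)
            w, b = w - alpha * dj_dw, b - alpha * dj_db  # simultaneous update
    return w, b

data_rng = np.random.default_rng(42)
x_demo = data_rng.uniform(0.0, 1.0, size=200)
y_demo = 3.0 * x_demo + 1.0                         # noise-free line: w = 3, b = 1

w_fit, b_fit = minibatch_gradient_descent(
    x_demo, y_demo, w=0.0, b=0.0, alpha=0.5, num_epochs=200, batch_size=20)
print(w_fit, b_fit)
```

Each update touches only 20 of the 200 examples, which is the point of the technique when the full data set is too large to process on every step.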

Mathematical expression

\[\begin{align} \text{repeat}&\text{ until convergence:} \; \lbrace \newline \; w &= w - \alpha \frac{\partial J(w,b)}{\partial w} \; \newline b &= b - \alpha \frac{\partial J(w,b)}{\partial b} \newline \rbrace \end{align} \]

\[\begin{align} \frac{\partial J(w,b)}{\partial w} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \\ \frac{\partial J(w,b)}{\partial b} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \end{align} \]
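
As a sketch of where these two expressions come from, apply the chain rule to the cost function with $f_{w,b}(x^{(i)}) = wx^{(i)} + b$. The factor 2 produced by the square cancels the $\frac{1}{2}$ in $J$:

\[\begin{align} \frac{\partial J(w,b)}{\partial w} &= \frac{\partial}{\partial w} \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (wx^{(i)} + b - y^{(i)})^2 \\ &= \frac{1}{2m} \sum\limits_{i = 0}^{m-1} 2\,(wx^{(i)} + b - y^{(i)})\,x^{(i)} \\ &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})\,x^{(i)} \end{align} \]

The derivative with respect to $b$ follows the same steps, except that the inner derivative $\frac{\partial}{\partial b}(wx^{(i)} + b - y^{(i)})$ is 1 rather than $x^{(i)}$, which is why the factor $x^{(i)}$ is absent.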

$\alpha$: the learning rate, which controls the step size taken when updating the model parameters w and b
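
The effect of the learning rate can be illustrated on a tiny data set: a moderate value converges, while a value that is too large makes the cost grow. The helper `run_gradient_descent`, the data, and the two values of alpha are assumptions chosen for this sketch.

```python
import numpy as np

def run_gradient_descent(x, y, alpha, num_iters):
    """Plain batch gradient descent from w = b = 0; returns w, b and the final cost."""
    w, b = 0.0, 0.0
    m = x.shape[0]
    for _ in range(num_iters):
        err = w * x + b - y
        dj_dw = np.sum(err * x) / m
        dj_db = np.sum(err) / m
        w, b = w - alpha * dj_dw, b - alpha * dj_db   # simultaneous update
    final_cost = np.sum((w * x + b - y) ** 2) / (2 * m)
    return w, b, final_cost

x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])     # y = 2x

w_ok, b_ok, cost_small = run_gradient_descent(x_train, y_train, alpha=0.1, num_iters=1000)
_, _, cost_large = run_gradient_descent(x_train, y_train, alpha=0.5, num_iters=20)

print(f"alpha=0.1: w={w_ok:.4f}, b={b_ok:.4f}, cost={cost_small:.2e}")
print(f"alpha=0.5: cost={cost_large:.2e}")
```

With alpha = 0.1 the cost shrinks toward zero; with alpha = 0.5 each step overshoots the minimum and the cost ends up larger than where it started.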

Code

import copy
import math

# Iterative version
def compute_gradient(x, y, w, b): 
    """
    Computes the gradient for linear regression 
    Args:
      x (ndarray (m,)): Data, m examples 
      y (ndarray (m,)): target values
      w,b (scalar)    : model parameters  
    Returns
      dj_dw (scalar): The gradient of the cost w.r.t. the parameters w
      dj_db (scalar): The gradient of the cost w.r.t. the parameter b     
     """

    # Number of training examples
    m = x.shape[0]    
    dj_dw = 0
    dj_db = 0

    for i in range(m):  
        f_wb = w * x[i] + b 
        dj_dw_i = (f_wb - y[i]) * x[i] 
        dj_db_i = f_wb - y[i] 
        dj_db += dj_db_i
        dj_dw += dj_dw_i 
    dj_dw = dj_dw / m 
    dj_db = dj_db / m 

    return dj_dw, dj_db

def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function): 
    """
    Performs gradient descent to fit w,b. Updates w,b by taking 
    num_iters gradient steps with learning rate alpha

    Args:
      x (ndarray (m,))  : Data, m examples 
      y (ndarray (m,))  : target values
      w_in,b_in (scalar): initial values of model parameters  
      alpha (float):     Learning rate
      num_iters (int):   number of iterations to run gradient descent
      cost_function:     function to call to produce cost
      gradient_function: function to call to produce gradient

    Returns:
      w (scalar): Updated value of parameter after running gradient descent
      b (scalar): Updated value of parameter after running gradient descent
      J_history (List): History of cost values
      p_history (list): History of parameters [w,b] 
      """

    # Lists to store the cost J and the parameters [w, b] at each iteration, primarily for graphing later
    J_history = []
    p_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w_in
    b = b_in

    for i in range(num_iters):
        # Calculate the gradient and update the parameters using gradient_function
        dj_dw, dj_db = gradient_function(x, y, w , b)     

        # Update parameters using the gradient descent update rule above
        b = b - alpha * dj_db                            
        w = w - alpha * dj_dw                            

        # Save cost J at each iteration
        if i<100000:      # prevent resource exhaustion 
            J_history.append( cost_function(x, y, w , b))
            p_history.append([w,b])
        # Print the cost at 10 evenly spaced intervals, or at every iteration if num_iters < 10
        if i% math.ceil(num_iters/10) == 0:
            print(f"Iteration {i:4}: Cost {J_history[-1]:0.2e} ",
                  f"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e}  ",
                  f"w: {w: 0.3e}, b:{b: 0.5e}")

    return w, b, J_history, p_history  # final w, b plus the cost and [w, b] histories for graphing
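
One way to sanity-check a gradient routine such as compute_gradient is to compare it with central finite differences of the cost. The sketch below is self-contained; the helper names, the step `eps`, and the data are illustrative assumptions.

```python
import numpy as np

def cost(x, y, w, b):
    """J(w, b) = 1/(2m) * sum((w*x + b - y)^2)."""
    return np.sum((w * x + b - y) ** 2) / (2 * x.shape[0])

def analytic_gradient(x, y, w, b):
    """Closed-form partial derivatives of J with respect to w and b."""
    err = w * x + b - y
    return np.mean(err * x), np.mean(err)

def numerical_gradient(x, y, w, b, eps=1e-6):
    """Central-difference approximation of the same two derivatives."""
    dj_dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    dj_db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

print(analytic_gradient(x_train, y_train, 0.0, 0.0))   # approximately (-650.0, -400.0)
print(numerical_gradient(x_train, y_train, 0.0, 0.0))  # should agree closely
```

If the two results disagree by more than a small tolerance, the analytic gradient likely contains a bug.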



# Vectorized version (inputs are np.matrix objects, so * is a matrix product)
def gradient_descent(X, y, theta, alpha, iters):
    """
    Performs gradient descent to fit theta. Updates theta by taking 
    iters gradient steps with learning rate alpha

    Args:
      X (np.matrix (m, n))    : Data, m examples 
      y (np.matrix (m, 1))    : target values
      theta (np.matrix (1, n)): initial values of model parameters  
      alpha (float)           : Learning rate
      iters (int)             : number of iterations 

    Returns:
      theta (np.matrix (1, n)): Updated parameters after running gradient descent
      cost (ndarray (iters,)) : The cost recorded after each iteration
    """
    tmp = np.matrix(np.zeros(theta.shape))  # zero matrix that holds the updated parameters
    parameters = int(theta.ravel().shape[1])  # number of columns of theta = number of parameters
    cost = np.zeros(iters)  # one cost value per iteration
    # Iterate
    for i in range(iters):
        error = (X * theta.T) - y  # computed once, so every parameter uses the old theta
        for j in range(parameters):
            term = np.multiply(error, X[:, j])  # element-wise product (np.multiply), not an inner product
            tmp[0, j] = theta[0, j] - ((alpha / len(X)) * np.sum(term))

        theta = tmp
        cost[i] = compute_cost(X, y, theta)  # the vectorized compute_cost(X, y, theta)

    return theta, cost
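
The same update can be written without the inner loop and with plain ndarrays instead of np.matrix, which NumPy's documentation no longer recommends. A minimal sketch, assuming a design matrix whose first column is all ones so that theta = [b, w]; the function name and data are hypothetical.

```python
import numpy as np

def gradient_descent_vectorized(X, y, theta, alpha, iters):
    """Batch gradient descent with X of shape (m, n), y of shape (m,), theta of shape (n,)."""
    m = X.shape[0]
    theta = theta.astype(float).copy()       # do not modify the caller's array
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y                # shape (m,)
        theta -= (alpha / m) * (X.T @ error) # all parameters updated at once
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost

x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])                         # y = 2x
X_design = np.column_stack([np.ones_like(x_train), x_train])  # columns: [1, x]

theta_fit, cost_hist = gradient_descent_vectorized(
    X_design, y_train, np.zeros(2), alpha=0.1, iters=2000)
print(theta_fit)   # close to [0, 2], i.e. b = 0 and w = 2
```

Because the whole gradient `X.T @ error` is computed from the old theta before the subtraction, the update is simultaneous by construction.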

Mathematical principle

(w and b must be updated simultaneously.)
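
To make the simultaneous-update requirement concrete, the sketch below contrasts one correct step with a sequential variant in which b's gradient already sees the new w. All function names and values are illustrative assumptions.

```python
def gradients(x, y, w, b):
    """Partial derivatives of J with respect to w and b for plain Python lists."""
    m = len(x)
    dj_dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
    dj_db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
    return dj_dw, dj_db

def step_simultaneous(x, y, w, b, alpha):
    """Correct: both gradients are evaluated at the old (w, b)."""
    dj_dw, dj_db = gradients(x, y, w, b)
    return w - alpha * dj_dw, b - alpha * dj_db

def step_sequential(x, y, w, b, alpha):
    """Shown for contrast only: b is updated using the already-updated w."""
    dj_dw, _ = gradients(x, y, w, b)
    w = w - alpha * dj_dw
    _, dj_db = gradients(x, y, w, b)   # evaluated at the new w, not the old one
    b = b - alpha * dj_db
    return w, b

x_train = [1.0, 2.0]
y_train = [300.0, 500.0]

print(step_simultaneous(x_train, y_train, 0.0, 0.0, 0.01))  # approximately (6.5, 4.0)
print(step_sequential(x_train, y_train, 0.0, 0.0, 0.01))    # approximately (6.5, 3.9025)
```

The two variants already differ in b after a single step, so the sequential version is no longer following the gradient of J at the current point.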

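Least squares method

Form: $$\text{objective function} = \sum(\text{observed value} - \text{theoretical value})^2$$

Solution methods: https://www.cnblogs.com/pinard/p/5976811.html

Algebraic method: set the partial derivatives to zero to find the minimum
Matrix method: normal equation (has limitations)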
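
Both solution methods can be sketched for the univariate case. The algebraic method uses the closed form obtained by setting the two partial derivatives to zero; the matrix method solves the normal equation $X^T X \theta = X^T y$. The data and variable names are illustrative assumptions.

```python
import numpy as np

x_train = np.array([1.0, 2.0, 3.0, 4.0])   # illustrative data
y_train = np.array([3.1, 4.9, 7.2, 8.8])

# Algebraic method: closed-form solution of dJ/dw = 0 and dJ/db = 0
x_mean, y_mean = x_train.mean(), y_train.mean()
w_alg = np.sum((x_train - x_mean) * (y_train - y_mean)) / np.sum((x_train - x_mean) ** 2)
b_alg = y_mean - w_alg * x_mean

# Matrix method: solve the normal equation X^T X theta = X^T y, with theta = [b, w]
X_design = np.column_stack([np.ones_like(x_train), x_train])
theta = np.linalg.solve(X_design.T @ X_design, X_design.T @ y_train)

print(w_alg, b_alg)   # approximately 1.94 and 1.15
print(theta)          # approximately [1.15, 1.94]
```

One commonly cited limitation of the normal equation is that it requires $X^T X$ to be invertible, and solving the system becomes expensive as the number of features grows, which is where gradient descent tends to be preferred.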
