
[Dr. Jian's Statistical Learning Methods] Chapter 2: 3. The Perceptron, Dual Form of the Learning Algorithm: Algorithm Walkthrough


3.4 The Dual Form

  • In the primal form, when $(x_i, y_i)$ is a misclassified point, the parameters are updated by
    $$w \leftarrow w+\eta y_{i} x_{i}; \quad b \leftarrow b+\eta y_{i}$$
  • Suppose the initial values are $w_0=\boldsymbol{0}$ and $b_0=0$. If the misclassified point $(x_i,y_i)$ triggers the update above $n_i$ times, the accumulated increments to $w$ and $b$ are $\alpha_i y_i x_i$ and $\alpha_i y_i$ respectively, where $\alpha_i = n_i \eta$.
  • The parameters finally learned are therefore
    $$w=\sum_{i=1}^{N} \alpha_{i} y_{i} x_{i}; \quad b=\sum_{i=1}^{N} \alpha_{i} y_{i}$$
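
    Substituting this expansion of $w$ into the primal model $f(x)=\operatorname{sign}(w \cdot x+b)$ is the one step that turns the primal model into the dual one used below:
    $$f(x)=\operatorname{sign}(w \cdot x+b)=\operatorname{sign}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x+b\right)$$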

    The concrete steps of the dual-form learning algorithm:
  • Input: the training set
    $$T=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \cdots,\left(x_{N}, y_{N}\right)\right\}$$
    where $x_i\in\mathcal{X}\subseteq\mathbb{R}^n$, $y_i\in\mathcal{Y}=\{+1,-1\}$; and the step size $\eta$ $(0<\eta\leqslant 1)$.
  • Output: $\alpha, b$; the perceptron model $f(x)=\operatorname{sign}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x+b\right)$, where $\alpha=\left(\alpha_{1}, \alpha_{2}, \cdots, \alpha_{N}\right)^{T}$.
    (1) Choose the initial values $\alpha^{\langle 0 \rangle}=(0,0,\cdots,0)^T$, $b^{\langle 0 \rangle}=0$;
    (2) Select a data point $(x_i,y_i)$ from the training set at random;
    (3) If $y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) \leqslant 0$, update
    $$\alpha_{i} \leftarrow \alpha_{i}+\eta; \quad b \leftarrow b+\eta y_{i}$$
    (4) Return to (2) until no point in the training set is misclassified (a runnable sketch follows the Gram-matrix discussion below).
  • Gram matrix: the misclassification test in step (3) and its update,
    $$y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) \leqslant 0, \qquad \alpha_{i} \leftarrow \alpha_{i}+\eta; \quad b \leftarrow b+\eta y_{i},$$
    touch the inputs only through inner products. Expanding the test condition makes this explicit:
    $$\begin{aligned} y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) &= y_{i}\left[\left(\alpha_{1} y_{1} x_{1}+\alpha_{2} y_{2} x_{2}+\cdots+\alpha_{N} y_{N} x_{N}\right) \cdot x_{i}+b\right] \\ &= y_{i}\left(\alpha_{1} y_{1} x_{1} \cdot x_{i}+\alpha_{2} y_{2} x_{2} \cdot x_{i}+\cdots+\alpha_{N} y_{N} x_{N} \cdot x_{i}+b\right) \\ &\leqslant 0 \end{aligned}$$
    Every inner product $x_j \cdot x_i$ appearing here can be precomputed once and stored in the Gram matrix:
    $$G=\left[x_{i} \cdot x_{j}\right]_{N \times N}=\left[\begin{array}{cccc} x_{1} \cdot x_{1} & x_{1} \cdot x_{2} & \cdots & x_{1} \cdot x_{N} \\ x_{2} \cdot x_{1} & x_{2} \cdot x_{2} & \cdots & x_{2} \cdot x_{N} \\ \vdots & \vdots & & \vdots \\ x_{N} \cdot x_{1} & x_{N} \cdot x_{2} & \cdots & x_{N} \cdot x_{N} \end{array}\right]$$
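
    As a minimal sketch of steps (1) through (4), the NumPy implementation below precomputes the Gram matrix once and reuses its rows inside the training loop. The function name `train_dual_perceptron`, the `max_epochs` safety cap, and the cyclic (rather than random) scan over the data are choices made here for determinism, not part of the algorithm statement above; the three test points are taken from Example 2.2 of the book, which this section does not restate.

```python
import numpy as np

def train_dual_perceptron(X, y, eta=1.0, max_epochs=1000):
    """Dual-form perceptron, steps (1)-(4) above, with a precomputed Gram matrix."""
    N = X.shape[0]
    alpha = np.zeros(N)      # step (1): alpha = (0, ..., 0)^T
    b = 0.0                  #           b = 0
    G = X @ X.T              # Gram matrix: G[i, j] = x_i . x_j, computed once

    for _ in range(max_epochs):
        mistakes = 0
        for i in range(N):   # step (2): scan the data cyclically instead of randomly
            # Step (3): misclassification test; only row i of G is needed
            if y[i] * (np.sum(alpha * y * G[i]) + b) <= 0:
                alpha[i] += eta       # alpha_i <- alpha_i + eta
                b += eta * y[i]       # b       <- b + eta * y_i
                mistakes += 1
        if mistakes == 0:    # step (4): stop once nothing is misclassified
            return alpha, b
    raise RuntimeError("no convergence; the data may not be linearly separable")

# Example 2.2 data: two positive points and one negative point
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
alpha, b = train_dual_perceptron(X, y)
w = (alpha * y) @ X          # recover w = sum_i alpha_i y_i x_i
print(alpha, w, b)           # with eta=1 and this scan order: [2. 0. 5.] [1. 1.] -3.0
```

    Recovering $w=(1,1)^T$ and $b=-3$ from $\alpha$ gives the separating hyperplane $x^{(1)}+x^{(2)}-3=0$, matching the result of Example 2.2 in the book. That is the point of the dual bookkeeping: train on $\alpha$ and $G$ alone, then read off $w$ and $b$ at the end.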

Original article: https://blog.csdn.net/qq_30204431/article/details/145121771
