[Dr. Jian's Statistical Learning Methods] Chapter 2, Section 3. The Perceptron: Dual Form of the Learning Algorithm, Explained
3.4 The Dual Form
- In the primal form, if $(x_i, y_i)$ is a misclassified point, the parameters are updated as
$$w \leftarrow w+\eta y_{i} x_{i}; \quad b \leftarrow b+\eta y_{i}$$
- Suppose the initial values are $w_0=\boldsymbol{0}$, $b_0=0$. If the point $(x_i,y_i)$ triggers this update $n_i$ times in total, then the resulting increments to $w$ and $b$ are $\alpha_i y_i x_i$ and $\alpha_i y_i$ respectively, where $\alpha_i = n_i \eta$.
- The parameters finally learned are therefore
$$w=\sum_{i=1}^{N} \alpha_{i} y_{i} x_{i}; \quad b=\sum_{i=1}^{N} \alpha_{i} y_{i}$$
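As a sanity check on the derivation above, the sketch below runs the primal update from $w=\boldsymbol{0}$, $b=0$ while counting $n_i$, and confirms that the learned parameters equal $\sum_i \alpha_i y_i x_i$ and $\sum_i \alpha_i y_i$. The toy data set is the standard three-point example (an assumption for illustration, not mandated by this section):

```python
import numpy as np

# Toy linearly separable data (assumed for illustration)
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
eta = 1.0

w, b = np.zeros(2), 0.0
n = np.zeros(len(X))                     # n_i: how many updates point i triggered

changed = True
while changed:
    changed = False
    for i in range(len(X)):
        if y[i] * (w @ X[i] + b) <= 0:   # misclassification condition
            w += eta * y[i] * X[i]       # primal update of w
            b += eta * y[i]              # primal update of b
            n[i] += 1
            changed = True

alpha = n * eta                          # alpha_i = n_i * eta
# The accumulated increments reproduce the primal parameters exactly:
assert np.allclose(w, np.sum(alpha[:, None] * y[:, None] * X, axis=0))
assert np.isclose(b, np.sum(alpha * y))
```

On this data, scanning the points in order, training converges to $w=(1,1)^T$, $b=-3$ with $\alpha=(2,0,5)^T$.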
Concrete steps of the dual-form learning algorithm:

- Input: training set
$$T=\left\{\left(x_{1}, y_{1}\right),\left(x_{2}, y_{2}\right), \cdots,\left(x_{N}, y_{N}\right)\right\}$$
where $x_i\in\mathcal{X}\subseteq\mathbb{R}^n$, $y_i\in\mathcal{Y}=\{+1,-1\}$; learning rate $\eta$ $(0<\eta\leqslant 1)$.
- Output: $\alpha, b$; the perceptron model
$$f(x)=\operatorname{sign}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x+b\right)$$
where $\alpha=\left(\alpha_{1}, \alpha_{2}, \cdots, \alpha_{N}\right)^{T}$.
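The model output can be evaluated directly from $\alpha$, $b$, and the training set; a minimal sketch (function and variable names are my own, and the "trained" values are those one obtains on the usual three-point toy set):

```python
import numpy as np

def predict(x, X_train, y_train, alpha, b):
    """f(x) = sign(sum_j alpha_j * y_j * (x_j . x) + b), taking sign(0) = +1."""
    s = float(np.sum(alpha * y_train * (X_train @ x)) + b)
    return 1 if s >= 0 else -1

# Assumed toy training set with its trained dual parameters (for illustration)
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
alpha, b = np.array([2.0, 0.0, 5.0]), -3.0
```

Note that the query point $x$ enters only through inner products $x_j \cdot x$, which is what makes the dual form convenient to kernelize later.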
(1) Choose initial values $\alpha^{<0>}=(0,0,\cdots,0)^T$, $b^{<0>}=0$;
(2) Randomly select a data point $(x_i,y_i)$ from the training set;
(3) If $y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) \leqslant 0$, update
$$\alpha_{i} \leftarrow \alpha_{i}+\eta; \quad b \leftarrow b+\eta y_{i}$$
(4) Go back to (2), and repeat until no point in the training set is misclassified.

- Gram matrix: whenever
$$y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) \leqslant 0,$$
the update $\alpha_{i} \leftarrow \alpha_{i}+\eta$, $b \leftarrow b+\eta y_{i}$ is applied.
Expanding the iteration condition:
$$\begin{aligned} y_{i}\left(\sum_{j=1}^{N} \alpha_{j} y_{j} x_{j} \cdot x_{i}+b\right) &= y_{i}\left[\left(\alpha_{1} y_{1} x_{1}+\alpha_{2} y_{2} x_{2}+\cdots+\alpha_{N} y_{N} x_{N}\right) \cdot x_{i}+b\right] \\ &= y_{i}\left(\alpha_{1} y_{1} x_{1} \cdot x_{i}+\alpha_{2} y_{2} x_{2} \cdot x_{i}+\cdots+\alpha_{N} y_{N} x_{N} \cdot x_{i}+b\right) \\ &\leqslant 0 \end{aligned}$$
The training instances appear only through the inner products $x_j \cdot x_i$, so all of them can be computed once in advance and stored as a matrix.
The Gram matrix:
$$G=\left[x_{i} \cdot x_{j}\right]_{N \times N}=\left[\begin{array}{cccc} x_{1} \cdot x_{1} & x_{1} \cdot x_{2} & \cdots & x_{1} \cdot x_{N} \\ x_{2} \cdot x_{1} & x_{2} \cdot x_{2} & \cdots & x_{2} \cdot x_{N} \\ \vdots & \vdots & & \vdots \\ x_{N} \cdot x_{1} & x_{N} \cdot x_{2} & \cdots & x_{N} \cdot x_{N} \end{array}\right]$$
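Putting steps (1)–(4) together with a precomputed Gram matrix, the dual algorithm might look like the sketch below (the `dual_perceptron` name, the epoch cap, and the toy data set are my own assumptions; scanning points in order stands in for the random selection of step (2)):

```python
import numpy as np

def dual_perceptron(X, y, eta=1.0, max_epochs=1000):
    N = len(X)
    G = X @ X.T                    # Gram matrix: G[i, j] = x_i . x_j, computed once
    alpha, b = np.zeros(N), 0.0
    for _ in range(max_epochs):
        changed = False
        for i in range(N):
            # Misclassification check uses only G, never the raw x_i
            if y[i] * (np.sum(alpha * y * G[:, i]) + b) <= 0:
                alpha[i] += eta    # dual update of alpha_i
                b += eta * y[i]    # dual update of b
                changed = True
        if not changed:            # a full pass with no update: done
            break
    return alpha, b

X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
alpha, b = dual_perceptron(X, y)
# Recover the primal parameters: w = sum_i alpha_i * y_i * x_i
w = np.sum(alpha[:, None] * y[:, None] * X, axis=0)
```

On this data the algorithm returns $\alpha=(2,0,5)^T$, $b=-3$, recovering $w=(1,1)^T$, identical to what the primal form produces with the same scan order.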
Original article: https://blog.csdn.net/qq_30204431/article/details/145121771