Merge pull request #82 from jiangzhonglian/master
Update complete: all SVM test code and examples
jiangzhonglian authored Apr 18, 2017
2 parents 5e237ac + 9a61112 commit c25cad8
Showing 598 changed files with 20,196 additions and 570 deletions.
54 changes: 54 additions & 0 deletions docs/6.支持向量机.md
Original file line number Diff line number Diff line change
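@@ -115,3 +115,57 @@ General SVM workflow
Test the algorithm: a very simple computation is all it takes.
Use the algorithm: SVM can be applied to almost any classification problem. Note that SVM itself is a binary classifier, so applying it to multi-class problems requires some changes to the code.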
```
* So far we have covered some theory; now let's implement the algorithm in `Code`.

## Efficient Optimization with the SMO Algorithm

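> Sequential Minimal Optimization (SMO)
* Author: John Platt
* Year: 1998 (Platt's Microsoft Research technical report; some textbooks cite 1996)
* Purpose: training SVMs
* Goal: find a set of alphas and b. Once the alphas are known, the weight vector w is easy to compute, which gives the separating hyperplane.
* Idea: break one large optimization problem into many small optimization problems and solve those.
* How it works: each iteration picks two alphas to optimize. Once a suitable pair is found, one alpha is increased and the other is decreased.
* "Suitable" here means the pair must meet certain conditions:
* 1. Both alphas must lie outside their margin boundary.
* 2. Neither alpha has already been clamped or sits at a bound.
* Why two alphas must change together: the constraint `Σ a[i]*label(i)=0` has to hold, and changing only one alpha would very likely violate it.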
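
The constraint argument can be checked numerically. The following is a minimal sketch, not code from the repository: the helper names `constraint` and `update_pair` and the sample values are made up for illustration, and the box limits `0 <= alpha <= C` are left out to keep the focus on the equality constraint.

```python
def constraint(alphas, labels):
    """Return sum(alpha_i * label_i); the SVM dual requires this to be 0."""
    return sum(a * y for a, y in zip(alphas, labels))


def update_pair(alphas, labels, i, j, new_alpha_j):
    """Set alphas[j] to new_alpha_j and shift alphas[i] to compensate.

    alpha_i*y_i + alpha_j*y_j must stay constant, so
    delta_i = -y_i * y_j * delta_j (using y_i * y_i == 1).
    """
    updated = list(alphas)
    delta_j = new_alpha_j - alphas[j]
    updated[j] = new_alpha_j
    updated[i] = alphas[i] - labels[i] * labels[j] * delta_j
    return updated


# Hypothetical values chosen so the constraint holds initially.
labels = [1.0, -1.0, 1.0, -1.0]
alphas = [0.2, 0.3, 0.4, 0.3]

# Changing a single alpha breaks the constraint.
single = list(alphas)
single[1] = 0.5
print("one alpha changed:", constraint(single, labels))

# Changing a pair together keeps it (up to floating-point rounding).
paired = update_pair(alphas, labels, i=0, j=1, new_alpha_j=0.5)
print("pair changed:     ", constraint(paired, labels))
```

With equal labels the compensation moves the two alphas in opposite directions, which matches the "increase one, decrease the other" description; with opposite labels, as in the pair above, both move in the same direction.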

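```
SMO pseudocode, roughly:
Create an alpha vector and initialize it to all zeros
While the iteration count is below the maximum (outer loop):
    For every data vector in the dataset (inner loop):
        If the data vector can be optimized:
            Randomly select another data vector
            Optimize the two vectors together
            If the two vectors cannot be optimized, exit the inner loop
    If no vectors were optimized, increment the iteration count and continue with the next loop
```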
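
The pseudocode can be turned into a small runnable sketch. This is an illustration under stated assumptions, not the contents of svm-simple.py: it uses a linear kernel and plain Python lists to stay dependency-free, the names `smo_simple`, `weights`, `C`, `tol` and `seed` are chosen here, and when a pair cannot be optimized it simply skips to the next sample.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def smo_simple(data, labels, C, tol, max_iter, seed=0):
    """Simplified SMO with a linear kernel; returns (alphas, b)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    m = len(data)
    alphas, b = [0.0] * m, 0.0

    def f(k):
        # Decision value of sample k under the current alphas and b.
        return sum(alphas[t] * labels[t] * dot(data[t], data[k])
                   for t in range(m)) + b

    passes = 0  # consecutive sweeps in which no alpha pair changed
    while passes < max_iter:
        changed = 0
        for i in range(m):
            Ei = f(i) - labels[i]
            # Sample i "can be optimized" if it violates KKT by more than tol.
            if not ((labels[i] * Ei < -tol and alphas[i] < C) or
                    (labels[i] * Ei > tol and alphas[i] > 0)):
                continue
            j = rng.choice([k for k in range(m) if k != i])
            Ej = f(j) - labels[j]
            ai_old, aj_old = alphas[i], alphas[j]
            # Limits that keep both alphas inside [0, C].
            if labels[i] != labels[j]:
                low, high = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
            else:
                low, high = max(0.0, aj_old + ai_old - C), min(C, aj_old + ai_old)
            if low == high:
                continue
            kii, kjj, kij = dot(data[i], data[i]), dot(data[j], data[j]), dot(data[i], data[j])
            eta = 2.0 * kij - kii - kjj
            if eta >= 0:
                continue
            aj_new = min(high, max(low, aj_old - labels[j] * (Ei - Ej) / eta))
            if abs(aj_new - aj_old) < 1e-5:
                continue  # step too small to count as progress
            ai_new = ai_old + labels[i] * labels[j] * (aj_old - aj_new)
            alphas[i], alphas[j] = ai_new, aj_new
            b1 = b - Ei - labels[i] * (ai_new - ai_old) * kii - labels[j] * (aj_new - aj_old) * kij
            b2 = b - Ej - labels[i] * (ai_new - ai_old) * kij - labels[j] * (aj_new - aj_old) * kjj
            if 0 < ai_new < C:
                b = b1
            elif 0 < aj_new < C:
                b = b2
            else:
                b = (b1 + b2) / 2.0
            changed += 1
        # Count only sweeps without any update; reset after progress.
        passes = passes + 1 if changed == 0 else 0
    return alphas, b


def weights(data, labels, alphas):
    """w = sum_i alpha_i * label_i * x_i (linear kernel only)."""
    return [sum(a * y * x[d] for a, y, x in zip(alphas, labels, data))
            for d in range(len(data[0]))]


# Tiny made-up, linearly separable dataset for demonstration.
data = [[2.0, 2.0], [3.0, 3.0], [3.0, 2.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
labels = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
alphas, b = smo_simple(data, labels, C=10.0, tol=0.001, max_iter=40)
w = weights(data, labels, alphas)
print("w =", w, "b =", b)
```

Because the second sample is picked at random, this version is only practical for small datasets.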

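> Simplified SVM: applying the simplified SMO algorithm to a small dataset
For the code, see svm-simple.py

> Full SVM: speeding up optimization with the complete Platt SMO algorithm
For the code, see svm-complete_Non-Kernel.py
* What is optimized: the alphas are selected in a different way.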
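
One commonly described way to choose the second alpha in the full Platt SMO is to pick the sample whose cached error differs most from the first sample's error, since that tends to give the largest step. The sketch below illustrates that heuristic only; whether svm-complete_Non-Kernel.py does exactly this is not shown here, and the names `select_second_alpha` and `error_cache` are hypothetical.

```python
import random


def select_second_alpha(i, Ei, error_cache, m, rng=random):
    """Pick index j != i for the second alpha.

    error_cache maps sample index -> last computed error E_k.
    The candidate with the largest |Ei - Ek| is chosen; if the cache
    holds no other sample, fall back to a random index in range(m).
    """
    best_j, best_gap = None, -1.0
    for k, Ek in error_cache.items():
        if k == i:
            continue
        gap = abs(Ei - Ek)
        if gap > best_gap:
            best_j, best_gap = k, gap
    if best_j is None:
        best_j = rng.choice([k for k in range(m) if k != i])
    return best_j


# Made-up cached errors for a 5-sample problem.
cache = {0: -1.0, 2: 0.5, 3: 2.0}
print(select_second_alpha(0, cache[0], cache, m=5))
```

Compared with a purely random choice, this costs one pass over the cache per selection but usually needs far fewer iterations to converge.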

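## Using Kernel Functions on Complex Data

* On linearly separable data, the SVM works well.
* It can work just as well on nonlinear data, but that requires a tool called a `kernel function (kernel)` to transform the data into a form the classifier can handle easily.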

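> Mapping data to a higher-dimensional space with a kernel
* A kernel maps data from one feature space to another. (Usually the mapping goes from a low-dimensional feature space to a higher-dimensional one.)
* If "feature space" sounds pretentious and hard to grasp:
* think of the kernel as a wrapper or an interface that converts data from a form that is hard to work with into one that is easier to work with.
* After the transformation, a nonlinear problem in the low-dimensional space becomes a linear problem in the high-dimensional space.
* A particularly convenient property of the SVM optimization is that every operation can be written as an inner product (the product of two vectors that yields a single scalar value). Replacing the inner product with a kernel function is called the `kernel trick` or `kernel substitution`.
* Kernels are not used only in support vector machines; many other machine learning algorithms use them too. The most popular kernel is the radial basis function.
* The Gaussian version of the radial basis function is given by:
* ![Gaussian version of the radial basis function](/images/6.SVM/SVM_6_radial-basis-function.jpg)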
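
The image is not reproduced in this view. Assuming it shows the usual Gaussian form `k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))`, the kernel can be sketched as below. The second half illustrates the kernel trick with a simple quadratic kernel, where the explicit feature map is small enough to write out. All function names and the default `sigma` are assumptions made for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def quadratic_kernel(x, y):
    """k(x, y) = (x . y)^2, computed in the original 2-D space."""
    return dot(x, y) ** 2


def quadratic_feature_map(x):
    """Explicit 3-D map whose inner product equals quadratic_kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


x, y = [1.0, 2.0], [3.0, 4.0]

# Identical points give similarity 1; it decays with distance.
print(rbf_kernel(x, x), rbf_kernel(x, y))

# Kernel trick: same number, with or without visiting the mapped space.
print(quadratic_kernel(x, y))
print(dot(quadratic_feature_map(x), quadratic_feature_map(y)))
```

For the RBF kernel the corresponding feature space is infinite-dimensional, so only the kernel form is practical; `sigma` controls how quickly similarity falls off with distance.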
Binary file added images/6.SVM/SVM_6_radial-basis-function.jpg
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_0.txt
@@ -0,0 +1,32 @@
00000000000001000000000000000000
00000000000111111000000000000000
00000000000111111111100000000000
00000000000111111111100000000000
00000000000011111111110000000000
00000000000011111111110000000000
00000000011111111111100000000000
00000000001111111111110000000000
00000000011111111111000000000000
00000000011111111111000000000000
00000000001111111111000000000000
00000000001111111111000000000000
00000000011111111111100000000000
00000000111111111110000000000000
00000000111111111110000000000000
00000001111111111111000000000000
00000011111111111110000000000000
00000011111111111110000000000000
00000001111111111111000000000000
00000001111111111111000000000000
00000000111111111100000000000000
00000000011111111110000000000000
00000000011111111100000000000000
00000000001111111110000000000000
00000000001111111110000000000000
00000000001111111110000000000000
00000000001111111111000000000000
00000000000111111111000000000000
00000000001111111111100000000000
00000000000111111111111100000000
00000000000011111111111100000000
00000000000001111111111000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_1.txt
@@ -0,0 +1,32 @@
00000000000000001111100000000000
00000000000000001111110000000000
00000000000000001111111000000000
00000000000000011111111000000000
00000000000000111111111000000000
00000000000000011111111000000000
00000000000000011111111000000000
00000000000000111111110000000000
00000000000000111111100000000000
00000000000001111111100000000000
00000000000001111111110000000000
00000000000001111111110000000000
00000000000001111111100000000000
00000000000011111110000000000000
00000000011111111110000000000000
00000001111111111111000000000000
00000011111111111111000000000000
00000011111111111111000000000000
00000011111111111110000000000000
00000000001111111111000000000000
00000000000000111111000000000000
00000000000001111111000000000000
00000000000111111110000000000000
00000000000011111111000000000000
00000000000011111111000000000000
00000000000011111111100000000000
00000000000011111111100000000000
00000000000000111111110000000000
00000000000000001111111111000000
00000000000000001111111111000000
00000000000000000111111111000000
00000000000000000001111000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_10.txt
@@ -0,0 +1,32 @@
00000000000000000001111100000000
00000000000000000111111100000000
00000000000000001111111110000000
00000000000000001111111110000000
00000000000000001111111111000000
00000000000001111111111110000000
00000000000111111111111110000000
00000000001111111111111110000000
00000000011111111111111100000000
00000000111111111111111100000000
00000001111111111111111000000000
00000011111111111111111000000000
00000011111111111111111100000000
00000001111111111111111100000000
00000001111000111111111000000000
00000000100000111111111000000000
00000000000000011111111000000000
00000000000000011111111000000000
00000000000000011111111000000000
00000000000000111111110000000000
00000000000000111111110000000000
00000000000000111111111000000000
00000000000000011111111000000000
00000000000000111111111000000000
00000000000000111111111000000000
00000000000000011111111000000000
00000000000000011111111000000000
00000000000000001111111100000000
00000000000000011111111100000000
00000000000000001111111100000000
00000000000000000111111000000000
00000000000000000011110000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_11.txt
@@ -0,0 +1,32 @@
00000000000000000000110000000000
00000000000000000011111100000000
00000000000000000011111110000000
00000000000000000011111110000000
00000000000000000011111110000000
00000000000000000011111110000000
00000000000000000111111110000000
00000000000000000111111110000000
00000000000000001111111110000000
00000000000000011111111110000000
00000000000000111111111100000000
00000000000001111111111100000000
00000000000001111111111100000000
00000000001111111111111100000000
00000000011111111111111100000000
00000000011111111111111000000000
00000000111111111111111100000000
00000001111111100111111100000000
00000001111111000111111100000000
00000011111111000011111100000000
00000001111110000011111100000000
00000000000000000011111100000000
00000000000000000011111100000000
00000000000000000011111000000000
00000000000000000011111000000000
00000000000000000011111000000000
00000000000000000011111000000000
00000000000000000011111000000000
00000000000000000011111100000000
00000000000000000011111110000000
00000000000000000011111110000000
00000000000000000000111100000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_12.txt
@@ -0,0 +1,32 @@
00000000000011111000000000000000
00000000000001111100000000000000
00000000000111111111000000000000
00000000000011111111100000000000
00000000001111111111100000000000
00000000000111111111110000000000
00000000000111111111110000000000
00000000000111111111110000000000
00000000000011111111111000000000
00000000000011111111111000000000
00000000000001111111111100000000
00000000000001111111111100000000
00000000000001111111111100000000
00000000000011111111111110000000
00000000000011111111111110000000
00000000000001111111111100000000
00000000000111111111111100000000
00000000000111111111111100000000
00000000000011111111111110000000
00000000000011111111111110000000
00000000000001111111111100000000
00000000000011111111111100000000
00000000000011111111111000000000
00000000000001111111111110000000
00000000000111111111111100000000
00000000000011111111111110000000
00000000000011111111111110000000
00000000000011111111111110000000
00000000000111111111111111000000
00000000000111111111111111000000
00000000000000111111111110000000
00000000000000111111111000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_13.txt
@@ -0,0 +1,32 @@
00000000000000000001110000000000
00000000000000000011111000000000
00000000000000011111111110000000
00000000000000011111111110000000
00000000000000011111111110000000
00000000000011111111111110000000
00000000000111111111111100000000
00000000000111111111111100000000
00000000011111111111111100000000
00000001111111110111111100000000
00000001111111100111111100000000
00000001111111100111111100000000
00000000111100000111111000000000
00000000110000001111110000000000
00000000000000001111110000000000
00000000000000001111110000000000
00000000000000001111110000000000
00000000000000001111110000000000
00000000000000001111100000000000
00000000000000001111100000000000
00000000000000011111000000000000
00000000000000011111000000000000
00000000000000011111000000000000
00000000000000011111000000000000
00000000000000011111100000000000
00000000000000011111100000000000
00000000000000011111100000000000
00000000000000111111000000000000
00000000000000011111000000000000
00000000000000011111100000000000
00000000000000011111110000000000
00000000000000001111111000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_14.txt
@@ -0,0 +1,32 @@
00000000000000011111000000000000
00000000000000111111100000000000
00000000000000111111100000000000
00000000000000111111111000000000
00000000000001111111111100000000
00000000000011111111111000000000
00000000000111111111110000000000
00000000001111111111110000000000
00000000011111111111100000000000
00000000111111111111000000000000
00000001111111111111000000000000
00000001111111111111000000000000
00000111111111111110000000000000
00001111111111111110000000000000
00001111111111111110000000000000
00000011111111111110000000000000
00000001111111111110000000000000
00000000111111111100000000000000
00000000000111111110000000000000
00000000001111111110000000000000
00000000001111111100000000000000
00000000001111111100000000000000
00000000000111111111000000000000
00000000000111111110000000000000
00000000000011111110000000000000
00000000000011111111000000000000
00000000000001111111100000000000
00000000000001111111110000000000
00000000000000111111110000000000
00000000000000111111111000000000
00000000000000001111111100000000
00000000000000000111111000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_15.txt
@@ -0,0 +1,32 @@
00000000000000000000000000000000
00000000000000011000000000000000
00000001111111111100000000000000
00000001111111111110000000000000
00000000001111111111111000000000
00000000000111111111111000000000
00000000001111111111111100000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111110000000
00000000000111111111111100000000
00000000000111111111111000000000
00000000000111111111111000000000
00000000001111111111111000000000
00000000011111111111111000000000
00000000011111111111111000000000
00000000011111111111111000000000
00000000111111111111111000000000
00000001111111111111110000000000
00000001111111111111100000000000
00000001111111111000000000000000
00000000001110000000000000000000
00000000000000000000000000000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_16.txt
@@ -0,0 +1,32 @@
00000000000000000000111110000000
00000000000000000000111110000000
00000000000000000001111111000000
00000000000000000011111111000000
00000000000000000111111111000000
00000000000000001111111111100000
00000000000000111111111111000000
00000000000001111111111111000000
00000000000111111111111111000000
00000000001111111111111111000000
00000000011111111111111111000000
00000001111111111110111111000000
00000011111111111000111111000000
00000111111111110000111110000000
00000111111111100000111110000000
00000011111110000001111110000000
00000001111100000001111110000000
00000000110000000001111110000000
00000000000000000001111110000000
00000000000000000001111110000000
00000000000000000001111110000000
00000000000000000001111100000000
00000000000000000001111100000000
00000000000000000011111100000000
00000000000000000011111100000000
00000000000000000011111100000000
00000000000000000001111110000000
00000000000000000001111111000000
00000000000000000001111111100000
00000000000000000001111111100000
00000000000000000001111111000000
00000000000000000000111110000000
32 changes: 32 additions & 0 deletions input/6.SVM/testDigits/1_17.txt
@@ -0,0 +1,32 @@
00000000000001110000000000000000
00000000000011111100000000000000
00000000000111111100000000000000
00000000000011111111100000000000
00000000001111111111100000000000
00000000000111111111000000000000
00000000000111111111000000000000
00000000000111111111000000000000
00000000000111111111100000000000
00000000000111111111000000000000
00000000000011111111100000000000
00000000000011111111100000000000
00000000001111111111100000000000
00000000000111111111110000000000
00000000000111111111000000000000
00000000000111111111000000000000
00000000000111111111100000000000
00000000000111111111100000000000
00000000000011111111100000000000
00000000001111111111100000000000
00000000000111111111110000000000
00000000000111111111110000000000
00000000000111111111000000000000
00000000000011111111100000000000
00000000000011111111100000000000
00000000000001111111111100000000
00000000000011111111110000000000
00000000001111111111110000000000
00000000000111111111111000000000
00000000000111111111111000000000
00000000000000111111111100000000
00000000000000011111111000000000