An Introduction to Dropout in Deep Learning and Its Implementation

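Each time an input sample is used to update the weights, every hidden-layer node is present only with a certain probability, so there is no guarantee that any two hidden nodes appear together every time. Weight updates therefore no longer depend on the joint action of hidden nodes that have a fixed relationship, which prevents the situation where some features are only effective in the presence of other specific features.

This eliminates or weakens the co-adaptation between neurons and improves generalization.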

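Dropout can also be viewed as a form of model averaging.

Model averaging means combining the estimates or predictions of different models by averaging them with certain weights; some literature also calls this model combination.

For each sample fed into the network, the randomness of the hidden-layer nodes means that the corresponding network structure is different, yet all of these different network structures share the same weights between the hidden layers.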
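The weight-sharing view of model averaging can be checked on a very small case. The sketch below is illustrative only: it assumes a single linear unit whose inputs are each kept independently with probability `keep_prob`, and the function names are hypothetical. It enumerates all 2^n thinned versions of the unit, averages their outputs weighted by the probability of each mask, and compares the result with one pass through the full unit whose shared weights are scaled by `keep_prob`. For a linear unit the two values agree exactly; for deep nonlinear networks the scaled network is only an approximation of the average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: probability-weighted average of the outputs of all 2^n
// "thinned" versions of a single linear unit y = sum_i w[i] * h[i].
// Each input is kept independently with probability keep_prob.
double averaged_thinned_output(const std::vector<double>& w,
                               const std::vector<double>& h,
                               double keep_prob)
{
    assert(w.size() == h.size());
    const std::size_t n = w.size();
    double average = 0.0;
    for (std::size_t mask = 0; mask < (std::size_t(1) << n); ++mask) {
        double prob = 1.0;   // probability of drawing this particular mask
        double output = 0.0; // output of the thinned unit under this mask
        for (std::size_t i = 0; i < n; ++i) {
            if (mask & (std::size_t(1) << i)) {
                prob *= keep_prob;
                output += w[i] * h[i];
            } else {
                prob *= 1.0 - keep_prob;
            }
        }
        average += prob * output;
    }
    return average;
}

// Output of the full unit with every shared weight scaled by keep_prob.
double scaled_full_output(const std::vector<double>& w,
                          const std::vector<double>& h,
                          double keep_prob)
{
    assert(w.size() == h.size());
    double output = 0.0;
    for (std::size_t i = 0; i < w.size(); ++i) {
        output += keep_prob * w[i] * h[i];
    }
    return output;
}
```

Calling both functions with the same `w`, `h`, and `keep_prob` should return the same value up to floating-point rounding, which is why a single scaled network can stand in for the average over all thinned networks in this simple case.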

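The structure of a Dropout neural network is shown in the figure below:

In a multi-layer neural network that uses Dropout, the forward computation during the training phase is shown below:

A comparison of the basic operations of a standard network and a Dropout network is shown in the figure below:

Dropout in the training phase and in the test phase is described in the figure below: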
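For reference, the training-phase forward computation is commonly written as follows. The notation is an assumption borrowed from the original Dropout paper (Srivastava et al., 2014), not taken from this article: $\mathbf{y}^{(l)}$ is the output vector of layer $l$, $\mathbf{w}_i^{(l+1)}$ and $b_i^{(l+1)}$ are the weights and bias of unit $i$ in layer $l+1$, $f$ is the activation function, and $p$ is the probability of keeping a unit.

```latex
\begin{aligned}
r_j^{(l)} &\sim \mathrm{Bernoulli}(p), \\
\tilde{\mathbf{y}}^{(l)} &= \mathbf{r}^{(l)} \odot \mathbf{y}^{(l)}, \\
z_i^{(l+1)} &= \mathbf{w}_i^{(l+1)} \tilde{\mathbf{y}}^{(l)} + b_i^{(l+1)}, \\
y_i^{(l+1)} &= f\bigl(z_i^{(l+1)}\bigr).
\end{aligned}
```

In that formulation no units are dropped in the test phase and the weights are scaled instead, $W_{\text{test}}^{(l)} = p\,W^{(l)}$. The test code later in this article uses the equivalent "inverted" variant, with `dropout_ratio` playing the role of $1 - p$: the kept activations are scaled by $1/(1 - \text{dropout\_ratio})$ during training, so no scaling is needed at test time.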
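Dropout is well suited to large networks trained on large amounts of data.

Dropout is applied during the training phase to reduce overfitting.

Dropout can be used with most types of neural network layers, such as fully connected, convolutional, and recurrent layers.

Dropout can be applied to any hidden layer as well as to the visible (input) layer, but it should not be applied to the output layer.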

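The general Dropout procedure:
(1). First, randomly (and temporarily) delete some of the hidden neurons in the network, that is, force their outputs to 0, while the input and output neurons remain unchanged.
(2). Then propagate the input forward through the modified network, and propagate the resulting loss backward through the same modified network.

After a mini-batch of training samples has gone through this process, update the parameters (w, b) of the neurons that were not deleted using stochastic gradient descent.

(3). Then repeat the process: restore the deleted neurons (at this point the parameters of the deleted neurons are unchanged, while those of the neurons that were not deleted have been updated); again randomly select some hidden neurons and delete them temporarily (backing up the parameters of the deleted neurons); for a mini-batch of training samples, run forward propagation, backpropagate the loss, and update the parameters (w, b) with stochastic gradient descent (only the parameters of the neurons that were not deleted are updated; the deleted neurons keep the values they had before deletion).

Keep repeating this process.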
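The steps above can be sketched on a toy network. Everything in this sketch is an assumption made for illustration: the `TinyNet` struct, the function names, the ReLU hidden layer, the squared-error loss, and the use of a single sample instead of a mini-batch. Scaling the kept activations by 1 / (1 - dropout_ratio), as the test code in this article does, is left out to keep the focus on which parameters get updated.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical toy network: x -> hidden (ReLU) -> scalar output, squared-error loss.
struct TinyNet {
    std::vector<std::vector<double>> w1; // w1[j][i]: input i -> hidden unit j
    std::vector<double> b1;              // hidden biases
    std::vector<double> w2;              // hidden unit j -> output
    double b2 = 0.0;                     // output bias
};

// Step (1): sample a 0/1 mask over hidden units; 1 = kept, 0 = temporarily dropped.
std::vector<int> sample_mask(std::size_t hidden, double dropout_ratio, std::mt19937& gen)
{
    std::bernoulli_distribution keep(1.0 - dropout_ratio);
    std::vector<int> mask(hidden);
    for (auto& m : mask) m = keep(gen) ? 1 : 0;
    return mask;
}

// Steps (2)-(3): forward and backward pass through the thinned network, then one
// SGD update. Returns the loss computed before the update.
double sgd_step(TinyNet& net, const std::vector<double>& x, double target,
                const std::vector<int>& mask, double lr)
{
    const std::size_t hidden = net.w2.size();
    assert(mask.size() == hidden && net.w1.size() == hidden && net.b1.size() == hidden);
    std::vector<double> z(hidden), h(hidden);
    double y = net.b2;
    for (std::size_t j = 0; j < hidden; ++j) {
        z[j] = net.b1[j];
        for (std::size_t i = 0; i < x.size(); ++i) z[j] += net.w1[j][i] * x[i];
        h[j] = mask[j] * (z[j] > 0.0 ? z[j] : 0.0); // dropped units output 0
        y += net.w2[j] * h[j];
    }
    const double dy = y - target; // d(loss)/dy for loss = 0.5 * (y - target)^2
    for (std::size_t j = 0; j < hidden; ++j) {
        // Gradient w.r.t. the pre-activation; zero whenever the unit was dropped.
        const double dz = (mask[j] && z[j] > 0.0) ? dy * net.w2[j] : 0.0;
        net.w2[j] -= lr * dy * h[j]; // h[j] == 0 for dropped units, so no change
        net.b1[j] -= lr * dz;
        for (std::size_t i = 0; i < x.size(); ++i) net.w1[j][i] -= lr * dz * x[i];
    }
    net.b2 -= lr * dy;
    return 0.5 * dy * dy;
}
```

A training loop would call `sample_mask` before every step and pass the fresh mask to `sgd_step`. Because all gradients that flow through a dropped unit are zero, its parameters come out of the step unchanged without any explicit backup and restore.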

The test code is as follows:
#include <cstdio>
#include <memory>
#include <random>

namespace fbc {

// ============================ Dropout ================================
template<class T>
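// Applies inverted dropout to a width x height buffer: each element of bottom is kept
// with probability (1 - dropout_ratio) and scaled by 1 / (1 - dropout_ratio);
// every other element is set to 0. Returns 0 on success and -1 on invalid input.
int dropout(const T* bottom, int width, int height, T* top, float dropout_ratio = 0.5f)
{
    if (dropout_ratio <= 0.f || dropout_ratio >= 1.f) {
        fprintf(stderr, "Error: dropout_ratio's value should be: (0., 1.): %f\n", dropout_ratio);
        return -1;
    }

    std::random_device rd;
    std::mt19937 gen(rd());
    // d(gen) returns true (keep the element) with probability 1 - dropout_ratio
    std::bernoulli_distribution d(1. - dropout_ratio);

    int size = height * width;
    std::unique_ptr<int[]> mask(new int[size]);
    for (int i = 0; i < size; ++i) {
        mask[i] = (int)d(gen);
    }

    // Scale the kept elements so that the expected value of top[i] equals bottom[i]
    float scale = 1.f / (1.f - dropout_ratio);
    for (int i = 0; i < size; ++i) {
        top[i] = bottom[i] * mask[i] * scale;
    }

    return 0;
}

} // namespace fbc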
// Fills a 4 x 8 buffer with random values, applies dropout, and prints input and output.
int test_dropout()
{
    std::random_device rd;
    std::mt19937 gen(rd());

    int height = 4, width = 8, size = height * width;
    std::unique_ptr<float[]> bottom(new float[size]), top(new float[size]);

    std::uniform_real_distribution<float> distribution(-10.f, 10.f);
    for (int i = 0; i < size; ++i) {
        bottom[i] = distribution(gen);
    }

    float dropout_ratio = 0.8f;
    if (fbc::dropout(bottom.get(), width, height, top.get(), dropout_ratio) != 0) {
        fprintf(stderr, "Error: fail to dropout\n");
        return -1;
    }

    fprintf(stdout, "bottom data:\n");
    for (int h = 0; h < height; ++h) {
        for (int w = 0; w < width; ++w) {
            fprintf(stdout, " %f ", bottom[h * width + w]);
        }
        fprintf(stdout, "\n");
    }

    fprintf(stdout, "top data:\n");
    for (int h = 0; h < height; ++h) {
        for (int w = 0; w < width; ++w) {
            fprintf(stdout, " %f ", top[h * width + w]);
        }
        fprintf(stdout, "\n");
    }

    return 0;
}
The execution result is shown in the figure below:
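Because the mask is drawn at random, the printed values differ from run to run. One property that can still be checked is that scaling by 1 / (1 - dropout_ratio) keeps the expected value of each output equal to its input. The sketch below is a hypothetical helper, not part of the original test code: it applies the same mask-and-scale rule as `fbc::dropout` to a single value many times, using a fixed seed (an assumption made for reproducibility), and returns the average.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: applies the same mask-and-scale rule as fbc::dropout to a
// single value `trials` times and returns the average of the results.
double mean_after_dropout(double value, double dropout_ratio, int trials, unsigned seed)
{
    assert(dropout_ratio > 0.0 && dropout_ratio < 1.0 && trials > 0);
    std::mt19937 gen(seed); // fixed seed so the estimate is reproducible
    std::bernoulli_distribution keep(1.0 - dropout_ratio);
    const double scale = 1.0 / (1.0 - dropout_ratio);
    double sum = 0.0;
    for (int t = 0; t < trials; ++t) {
        // Kept with probability 1 - dropout_ratio and scaled, otherwise zeroed.
        sum += keep(gen) ? value * scale : 0.0;
    }
    return sum / trials;
}
```

With `dropout_ratio = 0.8`, as in `test_dropout`, each output is either 0 or five times the input, and the average over many trials should come out close to the input itself.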