Batch Normalization:
Accelerating Deep Network Training by Reducing Internal Covariate Shift
Background
[Figure: the red points are the data set; the dashed line is the initial random fit. Annotations: wasted training time, overfitting.]
Experiments
Figure 2: Single-crop validation accuracy of Inception and its batch-normalized variants.
Experiments
Figure 3: Batch-Normalized Inception compared with the previous state of the art on the provided validation set of 50,000 images; it exceeds the estimated accuracy of human raters. *The BN-Inception ensemble reached 4.82% top-5 error on the 100,000 images of the ImageNet test set, as reported by the test server.
Problem2:
How do we apply Batch Normalization to a CNN?
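The paper's answer: for convolutional layers, normalize jointly over all spatial locations of a feature map, so statistics are computed over the batch and spatial axes and one γ, β pair is learned per channel. A minimal NumPy sketch (the function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def batchnorm_conv(x, gamma, beta, eps=1e-5):
    """BN for conv feature maps: x has shape (N, C, H, W).

    Statistics are shared across all spatial locations, giving one
    (gamma, beta) pair per channel, as the paper prescribes.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # per-channel mean
    var = x.var(axis=(0, 2, 3), keepdims=True)     # per-channel variance
    x_hat = (x - mean) / np.sqrt(var + eps)        # normalize each channel
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)
```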
Experiments
Figure 1: Predicting the digit class on the MNIST dataset. (a) Test accuracy of the MNIST network trained with and without BN. (b, c) The evolution of input distributions to a typical sigmoid over the course of training, shown as the {15, 50, 85}th percentiles.
Online learning: batch size = 1
(Mini-)batch learning: 1 < batch size < data size
- Memory is used efficiently
- Fewer iterations per epoch
- Quick convergence
What is batch normalization?
x(1) = {x1(1), x2(1), x3(1), x4(1), x5(1), x6(1), x7(1)} — the first dimension of each of the 7 examples in a batch.
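A toy worked example (the values are invented) of normalizing one dimension across a batch of 7 examples:

```python
import numpy as np

# Hypothetical values for the first dimension of a batch of 7 examples
x1 = np.array([1.0, 2.0, 4.0, 2.0, 3.0, 5.0, 4.0])

mean = x1.mean()                             # 3.0
var = x1.var()                               # population variance over the batch
x1_hat = (x1 - mean) / np.sqrt(var + 1e-5)   # normalized: mean ~0, variance ~1
print(x1_hat.mean(), x1_hat.var())
```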
Features of BN:
- Enables a higher learning rate
- Requires less careful initialization
- Eliminates the need for Dropout
Thank you!
Problem1:
We can obtain E[x(k)] during training because the data arrive in mini-batches.
How can we obtain E[x(k)] at test time, when x(k) is a single value?
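The paper resolves this by using population statistics at inference, estimated from the training mini-batches. A minimal sketch; the exponential moving average below is the common practical variant, an assumption rather than the paper's exact averaging procedure:

```python
import numpy as np

class BNInference:
    """Track running statistics during training; use them at test time."""

    def __init__(self, dim, momentum=0.9, eps=1e-5):
        self.running_mean = np.zeros(dim)
        self.running_var = np.ones(dim)
        self.momentum, self.eps = momentum, eps

    def update(self, batch):
        # Called once per training mini-batch of shape (m, dim).
        m = self.momentum
        self.running_mean = m * self.running_mean + (1 - m) * batch.mean(axis=0)
        self.running_var = m * self.running_var + (1 - m) * batch.var(axis=0)

    def normalize(self, x):
        # Works even for a single example, since no batch statistics are needed.
        return (x - self.running_mean) / np.sqrt(self.running_var + self.eps)
```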
Apply BN to DNN
Background
Whitening: mean = 0, identity covariance (decorrelated inputs).
Computationally expensive.
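To see why full whitening is costly, here is a sketch (my illustration, not from the paper): it needs the full d x d covariance matrix and its inverse square root, an O(d^3) eigendecomposition, which is too expensive to repeat at every layer and every training step.

```python
import numpy as np

def whiten(X, eps=1e-5):
    """ZCA-style whitening: zero mean, approximately identity covariance."""
    Xc = X - X.mean(axis=0)                    # center each dimension
    cov = Xc.T @ Xc / len(X)                   # d x d covariance matrix
    w, V = np.linalg.eigh(cov)                 # O(d^3) eigendecomposition
    W = V @ np.diag(1.0 / np.sqrt(w + eps)) @ V.T
    return Xc @ W                              # decorrelated, unit-variance data
```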
Background
Full batch learning: batch size = data size.
Attempt to achieve mean = 0 and unit variance.
Online learning: batch size = 1.
x(k) is the k-th dimension of the input x. A mini-batch contains x1(k), x2(k), …, xm(k). BN computes the mini-batch mean μ(k) = (1/m) Σi xi(k) and variance σ²(k) = (1/m) Σi (xi(k) − μ(k))², normalizes x̂(k) = (x(k) − μ(k)) / √(σ²(k) + ε), and then scales and shifts: y(k) = γ(k) x̂(k) + β(k), where γ(k) and β(k) are learned through training iterations.
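The same transform as a minimal NumPy sketch for a fully connected layer (argument names are illustrative):

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """One BN training-time forward pass.

    x: (m, d) mini-batch; gamma, beta: (d,) learned scale and shift.
    """
    mu = x.mean(axis=0)                    # per-dimension batch mean
    var = x.var(axis=0)                    # per-dimension batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize: mean 0, variance 1
    return gamma * x_hat + beta            # restore representational power
```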
Apply BN to DNN