Tricks for Training Neural Networks

How a neural network should be trained depends on the problem and the data. Designing and training a network using backprop requires making many seemingly arbitrary choices, such as the number and types of nodes, layers, learning rates, training and test sets, and so forth. These choices can be critical, yet there is no foolproof recipe for deciding them because they are largely problem and data dependent.
When fitting a neural network model, the generalization error can be decomposed into two terms, bias and variance, which can be defined as:
- Bias: A measure of how the network output averaged across all datasets differs from the desired function.
- Variance: A measure of how much the network output varies across datasets.
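
The two definitions above can be written in symbols. The notation here is an assumption introduced only for illustration: $f(x; D)$ is the output of a network trained on dataset $D$, $g(x)$ is the desired function, and $E_D$ denotes the average over datasets.

```latex
\begin{align}
\text{bias}^2(x) &= \bigl( E_D[f(x; D)] - g(x) \bigr)^2 \\
\text{variance}(x) &= E_D\Bigl[ \bigl( f(x; D) - E_D[f(x; D)] \bigr)^2 \Bigr] \\
E_D\Bigl[ \bigl( f(x; D) - g(x) \bigr)^2 \Bigr] &= \text{bias}^2(x) + \text{variance}(x)
\end{align}
```

The last line holds because the cross term vanishes when averaged over datasets.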
Early in training, the bias is large because the network output is far from the desired function. The variance is very small because the data has had little influence yet. Late in training, the bias is small because the network has learned the underlying function.
Overtraining leads to large variance. If trained too long, however, the network will also have learned the noise specific to that dataset. This is referred to as overtraining. In such a case, the variance will be large because the noise varies between datasets.
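
As a rough illustration of bias and variance measured across datasets, the sketch below draws many noisy datasets from one target function, fits a model to each, and estimates both quantities on a fixed test grid. Everything in it is an assumption made for the example: the target `sin(2πx)`, the noise level, the function name `bias_variance`, and above all the use of a polynomial fit as a cheap stand-in for a network, with polynomial degree playing the role of model flexibility rather than training time.

```python
import numpy as np


def bias_variance(degree, n_datasets=200, n_points=30, noise_std=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial fit across datasets.

    The polynomial is a stand-in for a network (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.05, 0.95, 50)      # fixed evaluation grid
    target = np.sin(2 * np.pi * x_test)       # the "desired function"

    preds = np.empty((n_datasets, x_test.size))
    for d in range(n_datasets):
        # Each dataset shares the target function but has its own noise.
        x = rng.uniform(0.0, 1.0, n_points)
        y = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, n_points)
        coeffs = np.polyfit(x, y, degree)
        preds[d] = np.polyval(coeffs, x_test)

    mean_pred = preds.mean(axis=0)            # output averaged over datasets
    bias_sq = np.mean((mean_pred - target) ** 2)
    variance = np.mean(preds.var(axis=0))     # spread of outputs across datasets
    return bias_sq, variance


if __name__ == "__main__":
    for degree in (1, 5):
        b, v = bias_variance(degree)
        print(f"degree={degree}: bias^2={b:.4f}, variance={v:.4f}")
```

In this setup the rigid degree-1 model should show the larger bias and the flexible degree-5 model the larger variance, which mirrors the trade-off described above: a model that follows each dataset more closely also picks up more of that dataset's noise.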

Given the problems with training neural networks described above, eight Practical Tricks for Backpropagation are proposed:

- 1.4.1: Stochastic Versus Batch Learning
- 1.4.2: Shuffling the Examples
- 1.4.3: Normalizing the Inputs
- 1.4.4: The Sigmoid
- 1.4.5: Choosing Target Values
- 1.4.6: Initializing the Weights
- 1.4.7: Choosing Learning Rates
- 1.4.8: Radial Basis Function vs Sigmoid

Tip #1: Stochastic Versus Batch Learning