When facing difficulties during training (NaNs, loss that does not converge, etc.) it is sometimes useful to look at a more verbose training log by setting debug_info: true in the 'solver.prototxt' file.
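For reference, a minimal sketch of a solver.prototxt with the flag enabled; only debug_info is the setting in question here, the other fields (net path, learning rate, solver mode) are illustrative placeholders:

    # solver.prototxt -- minimal illustrative sketch
    net: "train_val.prototxt"   # hypothetical path to the net definition
    base_lr: 0.01               # placeholder learning rate
    solver_mode: GPU            # placeholder; CPU also works
    debug_info: true            # print per-blob data/diff statistics every iteration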
The training log then looks something like:
    I1109 ...] [Forward] Layer data, top blob data data: 0.343971
    I1109 ...] [Forward] Layer conv1, top blob conv1 data: 0.0645037
    I1109 ...] [Forward] Layer conv1, param blob 0 data: 0.00899114
    I1109 ...] [Forward] Layer conv1, param blob 1 data: 0
    I1109 ...] [Forward] Layer relu1, top blob conv1 data: 0.0337982
    I1109 ...] [Forward] Layer conv2, top blob conv2 data: 0.0249297
    I1109 ...] [Forward] Layer conv2, param blob 0 data: 0.00875855
    I1109 ...] [Forward] Layer conv2, param blob 1 data: 0
    I1109 ...] [Forward] Layer relu2, top blob conv2 data: 0.0128249
    . . .
    I1109 ...] [Forward] Layer fc1, top blob fc1 data: 0.00728743
    I1109 ...] [Forward] Layer fc1, param blob 0 data: 0.00876866
    I1109 ...] [Forward] Layer fc1, param blob 1 data: 0
    I1109 ...] [Forward] Layer loss, top blob loss data: 2031.85
    I1109 ...] [Backward] Layer loss, bottom blob fc1 diff: 0.124506
    I1109 ...] [Backward] Layer fc1, bottom blob conv6 diff: 0.00107067
    I1109 ...] [Backward] Layer fc1, param blob 0 diff: 0.483772
    I1109 ...] [Backward] Layer fc1, param blob 1 diff: 4079.72
    . . .
    I1109 ...] [Backward] Layer conv2, bottom blob conv1 diff: 5.99449e-06
    I1109 ...] [Backward] Layer conv2, param blob 0 diff: 0.00661093
    I1109 ...] [Backward] Layer conv2, param blob 1 diff: 0.10995
    I1109 ...] [Backward] Layer relu1, bottom blob conv1 diff: 2.87345e-06
    I1109 ...] [Backward] Layer conv1, param blob 0 diff: 0.0220984
    I1109 ...] [Backward] Layer conv1, param blob 1 diff: 0.0429201
    E1109 ...] [Backward] All net params (data, diff): L1 norm = (2711.42, 7086.66); L2 norm = (6.11659, 4085.07)
What does it mean?