training_results_4.txt
batch_size 100, epochs 20, learning rate 5e-05, coeff kld 1.0, coeff kd 0.1
train epoch 0, iteration 0, loss -1.4628989696502686
validation epoch 0, iteration 0, loss 2.3304070419311524
train epoch 0, iteration 50, loss -2.185535430908203
train epoch 0, iteration 100, loss -2.6167900562286377
validation epoch 0, iteration 100, loss 1.4933358322143555
train epoch 0, iteration 150, loss -3.0943899154663086
train epoch 0, iteration 200, loss -3.261019706726074
validation epoch 0, iteration 200, loss 1.027989549255371
train epoch 0, iteration 250, loss -3.6068296432495117
train epoch 0, iteration 300, loss -3.7937307357788086
validation epoch 0, iteration 300, loss 0.774350813293457
train epoch 0, iteration 350, loss -3.977644205093384
train epoch 0, iteration 400, loss -4.083950996398926
validation epoch 0, iteration 400, loss 0.6374060615539551
train epoch 0, iteration 450, loss -4.28731107711792
train epoch 0, iteration 500, loss -4.291780948638916
validation epoch 0, iteration 500, loss 0.5501952907562256
train epoch 0, iteration 550, loss -4.385025978088379
train epoch 1, iteration 0, loss -4.511828899383545
validation epoch 1, iteration 0, loss 0.49093224296569826
train epoch 1, iteration 50, loss -4.607491970062256
train epoch 1, iteration 100, loss -4.703230381011963
validation epoch 1, iteration 100, loss 0.4541106460571289
train epoch 1, iteration 150, loss -4.554227828979492
train epoch 1, iteration 200, loss -4.723016738891602
validation epoch 1, iteration 200, loss 0.4274974321365356
train epoch 1, iteration 250, loss -4.887784004211426
train epoch 1, iteration 300, loss -4.629098892211914
validation epoch 1, iteration 300, loss 0.4075149789810181
train epoch 1, iteration 350, loss -4.892836093902588
train epoch 1, iteration 400, loss -4.86621618270874
validation epoch 1, iteration 400, loss 0.39343089904785156
train epoch 1, iteration 450, loss -4.902673721313477
train epoch 1, iteration 500, loss -4.968939304351807
validation epoch 1, iteration 500, loss 0.38044742546081545
train epoch 1, iteration 550, loss -4.958748817443848
train epoch 2, iteration 0, loss -5.069705009460449
validation epoch 2, iteration 0, loss 0.36816547298431396
train epoch 2, iteration 50, loss -5.118947505950928
train epoch 2, iteration 100, loss -5.242410182952881
validation epoch 2, iteration 100, loss 0.35954637985229493
train epoch 2, iteration 150, loss -5.252483367919922
train epoch 2, iteration 200, loss -5.301260948181152
validation epoch 2, iteration 200, loss 0.3528047788619995
train epoch 2, iteration 250, loss -5.195025444030762
train epoch 2, iteration 300, loss -5.392702102661133
validation epoch 2, iteration 300, loss 0.34469941635131834
train epoch 2, iteration 350, loss -5.377352714538574
train epoch 2, iteration 400, loss -5.399631500244141
validation epoch 2, iteration 400, loss 0.33700328464508056
train epoch 2, iteration 450, loss -5.43740701675415
train epoch 2, iteration 500, loss -5.266648769378662
validation epoch 2, iteration 500, loss 0.3326132411956787
train epoch 2, iteration 550, loss -5.472346305847168
train epoch 3, iteration 0, loss -5.518528461456299
validation epoch 3, iteration 0, loss 0.3265012643814087
train epoch 3, iteration 50, loss -5.458791732788086
train epoch 3, iteration 100, loss -5.389316082000732
validation epoch 3, iteration 100, loss 0.32089921073913574
train epoch 3, iteration 150, loss -5.487395763397217
train epoch 3, iteration 200, loss -5.518479824066162
validation epoch 3, iteration 200, loss 0.3174689866065979
train epoch 3, iteration 250, loss -5.595519065856934
train epoch 3, iteration 300, loss -5.612948894500732
validation epoch 3, iteration 300, loss 0.30986913356781004
train epoch 3, iteration 350, loss -5.517746448516846
train epoch 3, iteration 400, loss -5.64829158782959
validation epoch 3, iteration 400, loss 0.30473430995941164
train epoch 3, iteration 450, loss -5.7534708976745605
train epoch 3, iteration 500, loss -5.777453422546387
validation epoch 3, iteration 500, loss 0.2975935022354126
train epoch 3, iteration 550, loss -5.564565658569336
train epoch 4, iteration 0, loss -5.735930919647217
validation epoch 4, iteration 0, loss 0.2950155598640442
train epoch 4, iteration 50, loss -5.808049201965332
train epoch 4, iteration 100, loss -5.845757007598877
validation epoch 4, iteration 100, loss 0.28988148651123047
train epoch 4, iteration 150, loss -5.85590934753418
train epoch 4, iteration 200, loss -5.819192886352539
validation epoch 4, iteration 200, loss 0.285243027305603
train epoch 4, iteration 250, loss -5.861352443695068
train epoch 4, iteration 300, loss -5.885369300842285
validation epoch 4, iteration 300, loss 0.2802545739173889
train epoch 4, iteration 350, loss -5.920321941375732
train epoch 4, iteration 400, loss -5.911571025848389
validation epoch 4, iteration 400, loss 0.275831907081604
train epoch 4, iteration 450, loss -5.975128650665283
train epoch 4, iteration 500, loss -6.05881404876709
validation epoch 4, iteration 500, loss 0.2747173014640808
train epoch 4, iteration 550, loss -6.069207191467285
train epoch 5, iteration 0, loss -5.926552772521973
validation epoch 5, iteration 0, loss 0.26957765493392943
train epoch 5, iteration 50, loss -6.030980110168457
train epoch 5, iteration 100, loss -6.045607089996338
validation epoch 5, iteration 100, loss 0.26427816572189333
train epoch 5, iteration 150, loss -6.116491317749023
train epoch 5, iteration 200, loss -6.128604412078857
validation epoch 5, iteration 200, loss 0.26392447767257693
train epoch 5, iteration 250, loss -6.11085319519043
train epoch 5, iteration 300, loss -6.1300835609436035
validation epoch 5, iteration 300, loss 0.2589313265800476
train epoch 5, iteration 350, loss -6.1209235191345215
train epoch 5, iteration 400, loss -6.114326000213623
validation epoch 5, iteration 400, loss 0.2567297103881836
train epoch 5, iteration 450, loss -6.281443119049072
train epoch 5, iteration 500, loss -6.215927600860596
validation epoch 5, iteration 500, loss 0.25167129287719725
train epoch 5, iteration 550, loss -6.1410627365112305
train epoch 6, iteration 0, loss -6.174037933349609
validation epoch 6, iteration 0, loss 0.2471798593521118
train epoch 6, iteration 50, loss -6.227602481842041
train epoch 6, iteration 100, loss -6.2930588722229
validation epoch 6, iteration 100, loss 0.24652800607681274
train epoch 6, iteration 150, loss -6.409376621246338
train epoch 6, iteration 200, loss -6.200093746185303
validation epoch 6, iteration 200, loss 0.24163800859451293
train epoch 6, iteration 250, loss -6.215313911437988
train epoch 6, iteration 300, loss -6.374815464019775
validation epoch 6, iteration 300, loss 0.24103642110824586
train epoch 6, iteration 350, loss -6.38955545425415
train epoch 6, iteration 400, loss -6.378422737121582
validation epoch 6, iteration 400, loss 0.2366472677230835
train epoch 6, iteration 450, loss -6.371236801147461
train epoch 6, iteration 500, loss -6.456579208374023
validation epoch 6, iteration 500, loss 0.23368361415863037
train epoch 6, iteration 550, loss -6.468822956085205
train epoch 7, iteration 0, loss -6.461745738983154
validation epoch 7, iteration 0, loss 0.2308976806640625
train epoch 7, iteration 50, loss -6.448482990264893
train epoch 7, iteration 100, loss -6.386987686157227
validation epoch 7, iteration 100, loss 0.22925016527175904
train epoch 7, iteration 150, loss -6.506495475769043
train epoch 7, iteration 200, loss -6.449230194091797
validation epoch 7, iteration 200, loss 0.2259226375579834
train epoch 7, iteration 250, loss -6.452174663543701
train epoch 7, iteration 300, loss -6.554157257080078
validation epoch 7, iteration 300, loss 0.22254526166915894
train epoch 7, iteration 350, loss -6.556105613708496
train epoch 7, iteration 400, loss -6.521121025085449
validation epoch 7, iteration 400, loss 0.21991798782348632
train epoch 7, iteration 450, loss -6.630686283111572
train epoch 7, iteration 500, loss -6.631582260131836
validation epoch 7, iteration 500, loss 0.2204140588760376
train epoch 7, iteration 550, loss -6.451493263244629
train epoch 8, iteration 0, loss -6.6395416259765625
validation epoch 8, iteration 0, loss 0.21804924354553223
train epoch 8, iteration 50, loss -6.575711250305176
train epoch 8, iteration 100, loss -6.618554592132568
validation epoch 8, iteration 100, loss 0.21633989362716674
train epoch 8, iteration 150, loss -6.605379104614258
train epoch 8, iteration 200, loss -6.623338222503662
validation epoch 8, iteration 200, loss 0.21322539672851562
train epoch 8, iteration 250, loss -6.744997501373291
train epoch 8, iteration 300, loss -6.698851585388184
validation epoch 8, iteration 300, loss 0.20895151910781862
train epoch 8, iteration 350, loss -6.745121479034424
train epoch 8, iteration 400, loss -6.741675853729248
validation epoch 8, iteration 400, loss 0.20788978595733643
train epoch 8, iteration 450, loss -6.731369495391846
train epoch 8, iteration 500, loss -6.669680118560791
validation epoch 8, iteration 500, loss 0.2048502188682556
train epoch 8, iteration 550, loss -6.7602105140686035
train epoch 9, iteration 0, loss -6.736696243286133
validation epoch 9, iteration 0, loss 0.20423992557525636
train epoch 9, iteration 50, loss -6.765185832977295
train epoch 9, iteration 100, loss -6.869357109069824
validation epoch 9, iteration 100, loss 0.20340191402435304
train epoch 9, iteration 150, loss -6.78419828414917
train epoch 9, iteration 200, loss -6.771018981933594
validation epoch 9, iteration 200, loss 0.20027134799957275
train epoch 9, iteration 250, loss -6.831057071685791
train epoch 9, iteration 300, loss -6.885773658752441
validation epoch 9, iteration 300, loss 0.19789945707321166
train epoch 9, iteration 350, loss -6.847071170806885
train epoch 9, iteration 400, loss -6.871266841888428
validation epoch 9, iteration 400, loss 0.19616088275909424
train epoch 9, iteration 450, loss -6.8716959953308105
train epoch 9, iteration 500, loss -6.866950035095215
validation epoch 9, iteration 500, loss 0.19525909719467163
train epoch 9, iteration 550, loss -6.952884197235107
train epoch 10, iteration 0, loss -6.955913543701172
validation epoch 10, iteration 0, loss 0.19435686321258544
train epoch 10, iteration 50, loss -6.916347026824951
train epoch 10, iteration 100, loss -6.935521602630615
validation epoch 10, iteration 100, loss 0.19077639150619508
train epoch 10, iteration 150, loss -7.005374431610107
train epoch 10, iteration 200, loss -6.975936412811279
validation epoch 10, iteration 200, loss 0.19014845299720765
train epoch 10, iteration 250, loss -7.002256870269775
train epoch 10, iteration 300, loss -7.032391548156738
validation epoch 10, iteration 300, loss 0.19078654518127441
train epoch 10, iteration 350, loss -7.073795318603516
train epoch 10, iteration 400, loss -7.011958122253418
validation epoch 10, iteration 400, loss 0.18807009902000427
train epoch 10, iteration 450, loss -7.061474323272705
train epoch 10, iteration 500, loss -7.016116142272949
validation epoch 10, iteration 500, loss 0.18692666606903077
train epoch 10, iteration 550, loss -7.070005416870117
train epoch 11, iteration 0, loss -7.084863662719727
validation epoch 11, iteration 0, loss 0.18554010248184205
train epoch 11, iteration 50, loss -6.971772193908691
train epoch 11, iteration 100, loss -7.0668768882751465
validation epoch 11, iteration 100, loss 0.18512255239486694
train epoch 11, iteration 150, loss -7.079649925231934
train epoch 11, iteration 200, loss -7.121644496917725
validation epoch 11, iteration 200, loss 0.1825265112400055
train epoch 11, iteration 250, loss -7.017441749572754
train epoch 11, iteration 300, loss -7.146275043487549
validation epoch 11, iteration 300, loss 0.18154267549514772
train epoch 11, iteration 350, loss -7.144844055175781
train epoch 11, iteration 400, loss -7.140236854553223
validation epoch 11, iteration 400, loss 0.1809679277420044
train epoch 11, iteration 450, loss -7.194940090179443
train epoch 11, iteration 500, loss -7.0985918045043945
validation epoch 11, iteration 500, loss 0.18183787603378296
train epoch 11, iteration 550, loss -7.166686058044434
train epoch 12, iteration 0, loss -7.180776596069336
validation epoch 12, iteration 0, loss 0.1786402940750122
train epoch 12, iteration 50, loss -7.146398067474365
train epoch 12, iteration 100, loss -7.248493671417236
validation epoch 12, iteration 100, loss 0.17881529531478882
train epoch 12, iteration 150, loss -7.271069526672363
train epoch 12, iteration 200, loss -7.184361934661865
validation epoch 12, iteration 200, loss 0.17721774463653564
train epoch 12, iteration 250, loss -7.193194389343262
train epoch 12, iteration 300, loss -7.280644416809082
validation epoch 12, iteration 300, loss 0.17407514362335205
train epoch 12, iteration 350, loss -7.142693996429443
train epoch 12, iteration 400, loss -7.295604705810547
validation epoch 12, iteration 400, loss 0.1723844946861267
train epoch 12, iteration 450, loss -7.266312599182129
train epoch 12, iteration 500, loss -7.295431613922119
validation epoch 12, iteration 500, loss 0.17162316970825195
train epoch 12, iteration 550, loss -7.282637119293213
train epoch 13, iteration 0, loss -7.311875343322754
validation epoch 13, iteration 0, loss 0.17237155122756959
train epoch 13, iteration 50, loss -7.321512222290039
train epoch 13, iteration 100, loss -7.298182487487793
validation epoch 13, iteration 100, loss 0.1707719307899475
train epoch 13, iteration 150, loss -7.378393650054932
train epoch 13, iteration 200, loss -7.36244535446167
validation epoch 13, iteration 200, loss 0.16882521572113038
train epoch 13, iteration 250, loss -7.257751941680908
train epoch 13, iteration 300, loss -7.4017486572265625
validation epoch 13, iteration 300, loss 0.169834792470932
train epoch 13, iteration 350, loss -7.259993553161621
train epoch 13, iteration 400, loss -7.399552822113037
validation epoch 13, iteration 400, loss 0.16875281229019165
train epoch 13, iteration 450, loss -7.380742073059082
train epoch 13, iteration 500, loss -7.297906875610352
validation epoch 13, iteration 500, loss 0.16676820216178895
train epoch 13, iteration 550, loss -7.362351894378662
train epoch 14, iteration 0, loss -7.4414567947387695
validation epoch 14, iteration 0, loss 0.16557851581573485
train epoch 14, iteration 50, loss -7.3770904541015625
train epoch 14, iteration 100, loss -7.3416361808776855
validation epoch 14, iteration 100, loss 0.16745097122192382
train epoch 14, iteration 150, loss -7.4948039054870605
train epoch 14, iteration 200, loss -7.501307010650635
validation epoch 14, iteration 200, loss 0.16770916986465453
train epoch 14, iteration 250, loss -7.465132713317871
train epoch 14, iteration 300, loss -7.462818145751953
validation epoch 14, iteration 300, loss 0.1633938277244568
train epoch 14, iteration 350, loss -7.524635314941406
train epoch 14, iteration 400, loss -7.5000200271606445
validation epoch 14, iteration 400, loss 0.16550106725692748
train epoch 14, iteration 450, loss -7.556066036224365
train epoch 14, iteration 500, loss -7.525626182556152
validation epoch 14, iteration 500, loss 0.1603415198802948
train epoch 14, iteration 550, loss -7.481273651123047
train epoch 15, iteration 0, loss -7.516364574432373
validation epoch 15, iteration 0, loss 0.16151165351867675
train epoch 15, iteration 50, loss -7.484051704406738
train epoch 15, iteration 100, loss -7.514545917510986
validation epoch 15, iteration 100, loss 0.16046455926895142
train epoch 15, iteration 150, loss -7.500998020172119
train epoch 15, iteration 200, loss -7.56790828704834
validation epoch 15, iteration 200, loss 0.1606443977832794
train epoch 15, iteration 250, loss -7.501270771026611
train epoch 15, iteration 300, loss -7.516714096069336
validation epoch 15, iteration 300, loss 0.16112163019180298
train epoch 15, iteration 350, loss -7.461143970489502
train epoch 15, iteration 400, loss -7.5480170249938965
validation epoch 15, iteration 400, loss 0.15835418677330018
train epoch 15, iteration 450, loss -7.567783355712891
train epoch 15, iteration 500, loss -7.626298427581787
validation epoch 15, iteration 500, loss 0.15679810819625856
train epoch 15, iteration 550, loss -7.525123596191406
train epoch 16, iteration 0, loss -7.534162521362305
validation epoch 16, iteration 0, loss 0.15725885829925537
train epoch 16, iteration 50, loss -7.630160331726074
train epoch 16, iteration 100, loss -7.628406524658203
validation epoch 16, iteration 100, loss 0.15726545495986938
train epoch 16, iteration 150, loss -7.600445747375488
train epoch 16, iteration 200, loss -7.564030647277832
validation epoch 16, iteration 200, loss 0.15785807633399965
train epoch 16, iteration 250, loss -7.679696559906006
train epoch 16, iteration 300, loss -7.656528472900391
validation epoch 16, iteration 300, loss 0.1561799503326416
train epoch 16, iteration 350, loss -7.683084011077881
train epoch 16, iteration 400, loss -7.659872055053711
validation epoch 16, iteration 400, loss 0.15399082884788512
train epoch 16, iteration 450, loss -7.623642444610596
train epoch 16, iteration 500, loss -7.712148189544678
validation epoch 16, iteration 500, loss 0.15644470858573914
train epoch 16, iteration 550, loss -7.69417667388916
train epoch 17, iteration 0, loss -7.6654133796691895
validation epoch 17, iteration 0, loss 0.1542289405822754
train epoch 17, iteration 50, loss -7.620699882507324
train epoch 17, iteration 100, loss -7.642889499664307
validation epoch 17, iteration 100, loss 0.15420993223190307
train epoch 17, iteration 150, loss -7.626218795776367
train epoch 17, iteration 200, loss -7.7242631912231445
validation epoch 17, iteration 200, loss 0.15530907669067384
train epoch 17, iteration 250, loss -7.720007419586182
train epoch 17, iteration 300, loss -7.729181289672852
validation epoch 17, iteration 300, loss 0.15202906770706176
train epoch 17, iteration 350, loss -7.717782974243164
train epoch 17, iteration 400, loss -7.771949768066406
validation epoch 17, iteration 400, loss 0.15263185439109803
train epoch 17, iteration 450, loss -7.733894348144531
train epoch 17, iteration 500, loss -7.743658065795898
validation epoch 17, iteration 500, loss 0.1522386113166809
train epoch 17, iteration 550, loss -7.744875431060791
train epoch 18, iteration 0, loss -7.741959571838379
validation epoch 18, iteration 0, loss 0.15172442870140077
train epoch 18, iteration 50, loss -7.805287837982178
train epoch 18, iteration 100, loss -7.773956775665283
validation epoch 18, iteration 100, loss 0.15352025990486146
train epoch 18, iteration 150, loss -7.773783206939697
train epoch 18, iteration 200, loss -7.718197345733643
validation epoch 18, iteration 200, loss 0.1488477972984314
train epoch 18, iteration 250, loss -7.68657922744751
train epoch 18, iteration 300, loss -7.750718593597412
validation epoch 18, iteration 300, loss 0.14923653268814088
train epoch 18, iteration 350, loss -7.7528157234191895
train epoch 18, iteration 400, loss -7.855215549468994
validation epoch 18, iteration 400, loss 0.15116800961494445
train epoch 18, iteration 450, loss -7.7992143630981445
train epoch 18, iteration 500, loss -7.808616638183594
validation epoch 18, iteration 500, loss 0.1502827313899994
train epoch 18, iteration 550, loss -7.780680179595947
train epoch 19, iteration 0, loss -7.838714122772217
validation epoch 19, iteration 0, loss 0.14897474331855773
train epoch 19, iteration 50, loss -7.758431434631348
train epoch 19, iteration 100, loss -7.761359214782715
validation epoch 19, iteration 100, loss 0.14696234188079835
train epoch 19, iteration 150, loss -7.7923583984375
train epoch 19, iteration 200, loss -7.855258941650391
validation epoch 19, iteration 200, loss 0.14869386143684388
train epoch 19, iteration 250, loss -7.823734760284424
train epoch 19, iteration 300, loss -7.863986492156982
validation epoch 19, iteration 300, loss 0.1462991591453552
train epoch 19, iteration 350, loss -7.912228107452393
train epoch 19, iteration 400, loss -7.868411064147949
validation epoch 19, iteration 400, loss 0.14624630584716797
train epoch 19, iteration 450, loss -7.816517353057861
train epoch 19, iteration 500, loss -7.857224941253662
validation epoch 19, iteration 500, loss 0.14803501100540162
train epoch 19, iteration 550, loss -7.8919525146484375