Text classification on the same dataset with three models: CNN, RNN, and RNN + Attention.

Environment (Miniconda, the minimal Anaconda distribution, is recommended for environment management):
- python 3.6.5
- tensorflow==1.8.0
- `data_utils.py`: data preprocessing. Supports a `Document*Sentence*Word` mode and a `Document*Word` mode. During preprocessing, a fixed amount of text is taken from the head and the tail of each document as the training sample, on the idea that the opening and the closing of a document usually summarize it (a sketch of this truncation is given after this list).
- `cnn_model.py`: CNN model with a fixed sentence length; uses the `Document*Word` mode (sketch below).
- `rnn_model.py`: RNN (BiLSTM) model; the sentence length is not fixed across batches; uses the `Document*Word` mode (sketch below).
- `rnn_attention.py`: RNN (GRU) + Attention model; the sentence length is not fixed; uses the `Document*Sentence*Word` mode (sketch below).
- `train_model.py`: model loading and training script; its parameters are split into shared hyperparameters and model-specific parameters.
- `predict.py`: prediction script.
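A minimal sketch of the head-and-tail truncation used during preprocessing (the function name and the 100/100 token budget are illustrative assumptions, not the actual `data_utils.py` API):

```python
def head_tail_truncate(tokens, head_len=100, tail_len=100):
    """Keep the first `head_len` and the last `tail_len` tokens of a document.

    Long documents tend to carry their key information in the opening and the
    closing, so both ends are kept as the training sample.
    """
    if len(tokens) <= head_len + tail_len:
        return tokens
    return tokens[:head_len] + tokens[-tail_len:]


# A 500-token document is reduced to a 200-token training sample.
doc = ["w%d" % i for i in range(500)]
assert len(head_tail_truncate(doc)) == 200
```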
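A rough sketch of a fixed-length text CNN in the `Document*Word` mode, written against the TensorFlow 1.x API this project targets (all hyperparameters and variable names are assumptions, not taken from `cnn_model.py`):

```python
import tensorflow as tf

vocab_size, embed_dim, seq_len, num_classes = 5000, 128, 600, 10

# Document*Word mode: each document is one padded/truncated sequence of token ids.
input_x = tf.placeholder(tf.int32, [None, seq_len], name="input_x")
input_y = tf.placeholder(tf.int64, [None], name="input_y")

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
embedded = tf.nn.embedding_lookup(embedding, input_x)         # [B, seq_len, embed_dim]

conv = tf.layers.conv1d(embedded, filters=256, kernel_size=5, activation=tf.nn.relu)
pooled = tf.reduce_max(conv, axis=1)                          # global max pooling over time

hidden = tf.layers.dense(pooled, 128, activation=tf.nn.relu)
logits = tf.layers.dense(hidden, num_classes)

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=input_y, logits=logits))
accuracy = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(logits, axis=1), input_y), tf.float32))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```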
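A sketch of a BiLSTM classifier that accepts a different sequence length in every batch, again with assumed names and sizes rather than the real `rnn_model.py` code:

```python
import tensorflow as tf

vocab_size, embed_dim, hidden_dim, num_classes = 5000, 128, 128, 10

# Document*Word mode with variable length: the time dimension is None and the
# true length of every document in the batch is fed separately.
input_x = tf.placeholder(tf.int32, [None, None], name="input_x")
doc_len = tf.placeholder(tf.int32, [None], name="doc_len")

embedding = tf.get_variable("embedding", [vocab_size, embed_dim])
embedded = tf.nn.embedding_lookup(embedding, input_x)

fw_cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
bw_cell = tf.nn.rnn_cell.LSTMCell(hidden_dim)
(out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
    fw_cell, bw_cell, embedded, sequence_length=doc_len, dtype=tf.float32)

# Concatenate both directions and mean-pool over the valid time steps only.
outputs = tf.concat([out_fw, out_bw], axis=-1)                # [B, T, 2*hidden_dim]
mask = tf.sequence_mask(doc_len, tf.shape(input_x)[1], dtype=tf.float32)
pooled = (tf.reduce_sum(outputs * tf.expand_dims(mask, -1), axis=1)
          / tf.expand_dims(tf.cast(doc_len, tf.float32), -1))

logits = tf.layers.dense(pooled, num_classes)
```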
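For the GRU + Attention model, a hedged sketch of one attention-pooled bidirectional GRU encoder. In the `Document*Sentence*Word` mode the same pattern can be applied twice, first over the words of each sentence and then over the resulting sentence vectors (the hierarchical-attention idea from the paper listed in the references). Names and dimensions are assumptions, not the `rnn_attention.py` internals:

```python
import tensorflow as tf

def gru_attention_encoder(inputs, lengths, hidden_dim, attn_dim, scope):
    """Encode a sequence with a bidirectional GRU and an attention-weighted sum."""
    with tf.variable_scope(scope):
        fw = tf.nn.rnn_cell.GRUCell(hidden_dim)
        bw = tf.nn.rnn_cell.GRUCell(hidden_dim)
        (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
            fw, bw, inputs, sequence_length=lengths, dtype=tf.float32)
        outputs = tf.concat([out_fw, out_bw], axis=-1)        # [B, T, 2*hidden_dim]

        # Additive attention: score each time step against a learned context vector.
        u = tf.layers.dense(outputs, attn_dim, activation=tf.tanh)
        context = tf.get_variable("context", [attn_dim])
        scores = tf.tensordot(u, context, axes=1)             # [B, T]

        # Mask padded positions before the softmax so they get ~zero weight.
        mask = tf.sequence_mask(lengths, tf.shape(inputs)[1], dtype=tf.float32)
        scores += (1.0 - mask) * -1e9
        alpha = tf.nn.softmax(scores)                         # attention weights

        # Weighted sum of the GRU outputs -> one vector per sequence.
        return tf.reduce_sum(outputs * tf.expand_dims(alpha, -1), axis=1)
```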
Sample training and test logs:

```
<Train> Epoch: [3] Iter: [2334] Step: [149279] Loss: [0.044] Acc: [1.000]
<Train> Epoch: [3] Iter: [2335] Step: [149343] Loss: [0.163] Acc: [0.953]
<Train> Epoch: [3] Iter: [2336] Step: [149407] Loss: [0.029] Acc: [1.000]
<Train> Epoch: [3] Iter: [2337] Step: [149471] Loss: [0.032] Acc: [1.000]
<Train> Epoch: [3] Iter: [2338] Step: [149535] Loss: [0.140] Acc: [0.938]
<Train> Epoch: [3] Iter: [2339] Step: [149599] Loss: [0.134] Acc: [0.953]
<Train> Epoch: [3] Iter: [2340] Step: [149663] Loss: [0.120] Acc: [0.953]
<Train> Epoch: [3] Iter: [2341] Step: [149727] Loss: [0.112] Acc: [0.953]
<Train> Epoch: [3] Iter: [2342] Step: [149791] Loss: [0.048] Acc: [0.969]
<Train> Epoch: [3] Iter: [2343] Step: [149855] Loss: [0.045] Acc: [0.984]
<Train> Epoch: [3] Iter: [2344] Step: [149919] Loss: [0.350] Acc: [0.969]
<Train> Epoch: [3] Iter: [2345] Step: [149983] Loss: [0.103] Acc: [0.984]
<Train> Epoch: [3] Iter: [2346] Step: [150000] Loss: [0.004] Acc: [1.000]
<Test> Iter: [1] Loss: [0.030] Acc: [1.000]
<Test> Iter: [2] Loss: [0.152] Acc: [0.938]
<Test> Iter: [3] Loss: [0.060] Acc: [0.984]
<Test> Iter: [4] Loss: [0.101] Acc: [0.953]
<Test> Iter: [5] Loss: [0.017] Acc: [1.000]
<Test> Iter: [6] Loss: [0.088] Acc: [0.969]
<Test> Iter: [7] Loss: [0.018] Acc: [1.000]
<Test> Iter: [8] Loss: [0.071] Acc: [0.969]
<Test> Iter: [9] Loss: [0.190] Acc: [0.953]
<Test> Iter: [10] Loss: [0.062] Acc: [0.969]
```
```
<Train> Epoch: [3] Iter: [2334] Step: [149279] Loss: [0.079] Acc: [0.953]
<Train> Epoch: [3] Iter: [2335] Step: [149343] Loss: [0.203] Acc: [0.938]
<Train> Epoch: [3] Iter: [2336] Step: [149407] Loss: [0.214] Acc: [0.922]
<Train> Epoch: [3] Iter: [2337] Step: [149471] Loss: [0.100] Acc: [0.984]
<Train> Epoch: [3] Iter: [2338] Step: [149535] Loss: [0.068] Acc: [0.984]
<Train> Epoch: [3] Iter: [2339] Step: [149599] Loss: [0.054] Acc: [0.984]
<Train> Epoch: [3] Iter: [2340] Step: [149663] Loss: [0.059] Acc: [0.969]
<Train> Epoch: [3] Iter: [2341] Step: [149727] Loss: [0.065] Acc: [0.984]
<Train> Epoch: [3] Iter: [2342] Step: [149791] Loss: [0.209] Acc: [0.969]
<Train> Epoch: [3] Iter: [2343] Step: [149855] Loss: [0.092] Acc: [0.953]
<Train> Epoch: [3] Iter: [2344] Step: [149919] Loss: [0.113] Acc: [0.953]
<Train> Epoch: [3] Iter: [2345] Step: [149983] Loss: [0.059] Acc: [0.984]
<Train> Epoch: [3] Iter: [2346] Step: [150000] Loss: [0.062] Acc: [1.000]
<Test> Iter: [1] Loss: [0.068] Acc: [0.969]
<Test> Iter: [2] Loss: [0.173] Acc: [0.953]
<Test> Iter: [3] Loss: [0.174] Acc: [0.969]
<Test> Iter: [4] Loss: [0.210] Acc: [0.922]
<Test> Iter: [5] Loss: [0.214] Acc: [0.953]
<Test> Iter: [6] Loss: [0.221] Acc: [0.922]
<Test> Iter: [7] Loss: [0.162] Acc: [0.969]
<Test> Iter: [8] Loss: [0.013] Acc: [1.000]
<Test> Iter: [9] Loss: [0.145] Acc: [0.953]
<Test> Iter: [10] Loss: [0.060] Acc: [0.984]
```
```
<Train> Epoch: [3] Iter: [2334] Step: [149279] Loss: [0.567] Acc: [0.781]
<Train> Epoch: [3] Iter: [2335] Step: [149343] Loss: [0.869] Acc: [0.750]
<Train> Epoch: [3] Iter: [2336] Step: [149407] Loss: [0.661] Acc: [0.812]
<Train> Epoch: [3] Iter: [2337] Step: [149471] Loss: [0.689] Acc: [0.766]
<Train> Epoch: [3] Iter: [2338] Step: [149535] Loss: [0.556] Acc: [0.828]
<Train> Epoch: [3] Iter: [2339] Step: [149599] Loss: [0.600] Acc: [0.766]
<Train> Epoch: [3] Iter: [2340] Step: [149663] Loss: [0.932] Acc: [0.719]
<Train> Epoch: [3] Iter: [2341] Step: [149727] Loss: [0.604] Acc: [0.812]
<Train> Epoch: [3] Iter: [2342] Step: [149791] Loss: [0.535] Acc: [0.844]
<Train> Epoch: [3] Iter: [2343] Step: [149855] Loss: [0.680] Acc: [0.812]
<Train> Epoch: [3] Iter: [2344] Step: [149919] Loss: [0.506] Acc: [0.828]
<Train> Epoch: [3] Iter: [2345] Step: [149983] Loss: [0.635] Acc: [0.828]
<Train> Epoch: [3] Iter: [2346] Step: [150000] Loss: [0.332] Acc: [1.000]
<Test> Iter: [1] Loss: [0.695] Acc: [0.797]
<Test> Iter: [2] Loss: [0.783] Acc: [0.734]
<Test> Iter: [3] Loss: [0.768] Acc: [0.781]
<Test> Iter: [4] Loss: [0.981] Acc: [0.703]
<Test> Iter: [5] Loss: [0.745] Acc: [0.797]
<Test> Iter: [6] Loss: [0.606] Acc: [0.828]
<Test> Iter: [7] Loss: [0.564] Acc: [0.828]
<Test> Iter: [8] Loss: [0.519] Acc: [0.891]
<Test> Iter: [9] Loss: [0.687] Acc: [0.859]
<Test> Iter: [10] Loss: [0.896] Acc: [0.781]
```
References:
- Dataset and CNN-related documentation mainly follow: CNN Chinese text classification demo
- Attention applied separately at the word level and the sentence level: Attention mechanism paper
- Seq2Seq, with illustrations: Attention mechanism blog post
- A plain-language introduction to Attention: Attention mechanism article on CSDN
- An excellent blog on neural networks: Colah's blog