Code-commenting

  • A design project on automatic code comment generation

Dataset

  • We train on the dataset from DeepCom. Training was done mostly on Google Colaboratory; we recommend trying the university's HPC cluster (if it is not too busy) or Google Cloud free-trial credits (if one of your cards can get through the sign-up without being charged).
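For reference, preprocessing for a seq2seq setup like this typically tokenizes each code/comment pair and builds integer vocabularies. A minimal sketch in plain Python (the toy pair and all function names here are illustrative, not the project's actual pipeline):

```python
from collections import Counter

def build_vocab(token_seqs, min_freq=2, specials=("<pad>", "<s>", "</s>", "<unk>")):
    """Map tokens to integer ids, keeping only tokens seen at least min_freq times."""
    counts = Counter(tok for seq in token_seqs for tok in seq)
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok, freq in counts.most_common():
        if freq >= min_freq and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(tokens, vocab):
    """Convert a token sequence to ids, wrapping it in start/end markers."""
    unk = vocab["<unk>"]
    return [vocab["<s>"]] + [vocab.get(t, unk) for t in tokens] + [vocab["</s>"]]

# Toy code/comment pair in the DeepCom style (real pairs come from parsed Java methods).
pairs = [
    ("public int add ( int a , int b ) { return a + b ; }".split(),
     "adds two integers".split()),
    ("public int sub ( int a , int b ) { return a - b ; }".split(),
     "subtracts two integers".split()),
]
code_vocab = build_vocab([c for c, _ in pairs], min_freq=1)
ids = encode(pairs[0][0], code_vocab)
```

Rare tokens below `min_freq` collapse to `<unk>`, which is one reason rare identifiers are hard for the model (see the examples below).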

Results

  • Our model did not reach SOTA performance, which we expected. It nevertheless produced semantically correct comments, occasionally more informative than the original human-written comments.
  • Many of the comments generated in the first epoch were repetitive, but the number of meaningful comments increased significantly as training progressed.

NOTE: In each example below, the first line is the comment generated by the model and the second is the ground-truth human comment:
(image)

The model needs more training to handle rarer tokens:
(image)

Here the model fails to produce a grammatically correct word, but it still captures the underlying semantics of the code:
(image)

Teacher forcing confused the model here, yet it still tried to produce a meaningful comment even though the human-written comment was poor:
(image)
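Teacher forcing means the decoder is sometimes fed the ground-truth token, rather than its own previous prediction, as the next input during training; at inference time the ground truth is unavailable, which can throw the model off as in the example above. A minimal sketch, with `decode_step` as a toy stand-in for a real decoder cell:

```python
import random

def decode_step(prev_token, state):
    """Stand-in for one decoder step: returns (predicted_token, new_state).
    A real model would run an RNN/Transformer cell and a softmax here."""
    # Toy rule: always predict the token 'the' regardless of input.
    return "the", state

def generate(target_tokens, teacher_forcing_ratio=0.5, seed=0):
    """Decode a sequence, feeding back either the ground-truth token
    (teacher forcing) or the model's own prediction at each step."""
    rng = random.Random(seed)
    state = None
    inp = "<s>"
    outputs = []
    for truth in target_tokens:
        pred, state = decode_step(inp, state)
        outputs.append(pred)
        # With probability teacher_forcing_ratio, the next input is the
        # ground-truth token; otherwise the model consumes its own output.
        inp = truth if rng.random() < teacher_forcing_ratio else pred
    return outputs

out = generate("returns the sum".split(), teacher_forcing_ratio=1.0)
```

Lowering `teacher_forcing_ratio` over training is a common way to wean the model off the ground truth before inference.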

The model can also substitute words with similar meanings:
(image)

(image)

BLEU Scoring and Loss plots

(plots: METEOR score, BLEU score, training loss)
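BLEU scores a generated comment by its clipped n-gram overlap with the reference, scaled by a brevity penalty. A small self-contained sketch of smoothed sentence-level BLEU (the add-one smoothing is an illustrative choice, not necessarily the exact variant used for the plots above):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Smoothed sentence-level BLEU: geometric mean of clipped n-gram
    precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp = Counter(ngrams(hypothesis, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref[g]) for g, c in hyp.items())
        total = max(sum(hyp.values()), 1)
        # Add-one smoothing keeps the score nonzero for short sentences.
        precisions.append((overlap + 1) / (total + 1))
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Penalize hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / max(len(hypothesis), 1)))
    return bp * math.exp(log_avg)

score = sentence_bleu("returns the sum of two numbers".split(),
                      "returns the sum of two values".split())
```

METEOR additionally credits stem and synonym matches, which is why it tends to be kinder than BLEU to the synonym substitutions shown above.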

Box Plots

(box plots: BLEU, METEOR)

Contributing