Nopol in the RepairThemAll experiment fails to repair several bugs that Nopol could repair before #4

Open
DehengYang opened this issue Jul 25, 2019 · 4 comments

Comments

@DehengYang

I compared the results of Nopol in two experiments:

  1. https://github.com/Spirals-Team/defects4j-repair/tree/master/results/2017-march
  2. https://github.com/program-repair/RepairThemAll_experiment/tree/master/results/Defects4J

The first experiment used the Nopol-2017 version built under JDK 1.7, while the second used the Nopol-2018 version built under JDK 1.8. Oddly, the later Nopol version cannot repair several bugs that the earlier version could, e.g., Time 16, 18, 19 and Mockito 29, 38.

So I would like to ask why this difference in performance occurs, and which Nopol version is better/more powerful?

Thanks,
Dale

@tdurieux
Collaborator

tdurieux commented Jul 30, 2019

Hi @DehengYang, sorry for the late answer.

> So I would like to ask why this difference in performance occurs?

There is no easy answer to this question; several factors contribute to the difference.

  1. Nopol (like all APR tools) is extremely dependent on the environment. Any change can affect the generated patch, or even whether a patch is generated at all: for example JDK 1.7 versus JDK 1.8, Oracle JDK versus OpenJDK, the language of the machine, the operating system, ...

  2. We also used a different seed in the 2017 experiment than in the 2019 one, which can affect which patches are generated.

  3. We changed the way we compute the classpath for each bug. In 2019, we mostly use the classpath provided by Defects4J (which we clean because it is sometimes incorrect). In 2017, we used all the JAR files provided by Defects4J for a specific project (for example commons-math). This also has an impact.

  4. The implementation of Nopol changed over those two years.
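
One way to make such differences visible is to record the relevant settings for each run and compare them. The following is a minimal sketch, not part of RepairThemAll or Nopol: the function names, the dictionary keys, and the example values (seeds, JAR names) are hypothetical and only illustrate the factors listed above.

```python
import os
import platform
import time


def environment_fingerprint(jdk_version, seed, classpath):
    """Collect settings that may influence a repair run.

    jdk_version, seed and classpath are supplied by the caller; the
    remaining entries are read from the machine running this script.
    """
    return {
        "jdk_version": jdk_version,
        "seed": seed,
        "classpath": tuple(classpath),
        "os": platform.system(),
        "lang": os.environ.get("LANG", ""),
        "timezone": time.tzname[0],
    }


def diff_fingerprints(old, new):
    """Return {key: (old_value, new_value)} for every setting that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical values, for illustration only.
run_2017 = environment_fingerprint("1.7", seed=1, classpath=["all-project-jars.jar"])
run_2019 = environment_fingerprint("1.8", seed=2, classpath=["cleaned-d4j-classpath.jar"])
for key, (before, after) in diff_fingerprints(run_2017, run_2019).items():
    print(f"{key}: {before!r} -> {after!r}")
```

Storing such a record next to each result file would at least show which of the factors changed between two experiments, even if it cannot say which one caused a missing patch.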

> Which Nopol version is better/more powerful?

I honestly don't know. The 2017 version works better on some bugs, the 2019 version on others. It is really difficult to tell.

Sorry for my vague answer. I would love to give a clear and precise reason, but I think there is none.

@DehengYang
Author

DehengYang commented Jul 30, 2019

Dear @tdurieux ,

Thank you so much for your detailed explanation!

The possible reasons mentioned above really help me gain a deeper understanding of Nopol (e.g., the seed and the classpath). Regarding one of the factors you mentioned, the language of the machine, I once ran into a problem caused by a non-English system language: the Defects4J benchmark must be configured and run in an English environment, otherwise some bugs show extra failing tests.

Thank you again for your answer. Would you mind answering one more question? Have you dealt with the problem that some Defects4J bugs may yield unexpected failing tests when built under JDK 1.8? The same question is raised in program-repair/RepairThemAll#19.

Thank you again for your great help.

Best,
Dale

@tdurieux
Collaborator

> Regarding one of the factors you mentioned, the language of the machine, I once ran into a problem caused by a non-English system language.

We are well aware of this, and the timezone of the machine also has an impact on the Time project.
This was correctly configured for this experiment.
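
For readers who want to reproduce this kind of setup, a minimal sketch of pinning the language and timezone for a child process could look like the following. The helper name and the default values (`en_US.UTF-8`, `America/Los_Angeles`) are assumptions for illustration; the exact values used in this experiment are not stated here.

```python
import os


def pinned_environment(base=None, lang="en_US.UTF-8", tz="America/Los_Angeles"):
    """Return a copy of the environment with language and timezone fixed.

    The defaults are assumptions; adjust them to whatever the benchmark
    documentation requires.
    """
    env = dict(os.environ if base is None else base)
    env["LANG"] = lang
    env["LC_ALL"] = lang
    env["TZ"] = tz
    return env


env = pinned_environment()
print(env["LANG"], env["TZ"])
```

The resulting dictionary can be passed as the `env` argument of `subprocess.run` when launching the benchmark or the repair tool, so that every run sees the same settings regardless of the host configuration.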

> Have you dealt with the problem that some Defects4J bugs may yield unexpected failing tests when built under JDK 1.8?

Unfortunately not. Nopol needs to run on JDK >= 1.8, and since Nopol is mostly a dynamic analyzer plus synthesizer, the buggy application needs to be executed in the same JVM instance as Nopol; thus the bug needs to be executed on JDK >= 1.8.
Technically, it should be possible to remove this 'limitation' from Nopol, but it would require completely rewriting the tool. Unfortunately, I have neither the time nor the students to do it.

I expect that the impact of this can be:

  • no test is failing with JDK 1.8: Nopol will crash and no patch will be generated
  • new tests are failing with JDK 1.8: Nopol will most likely not be able to generate a patch, because it would require Nopol to fix "bugs" that have different root causes with a single change.

And the trickier cases:

  • new tests are failing and the original buggy test case does not fail with JDK 1.8: in this case, it is possible that Nopol will generate a patch for a 'different' bug.
  • the tests are flaky with JDK 1.8: flaky tests are the worst for APR, since they allow a completely random patch to be generated at a completely random location.
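
The four cases above can be told apart mechanically by running the test suite several times and comparing the failing tests with those the benchmark expects. This is a minimal sketch under stated assumptions: the function name, the category labels, and the test names are hypothetical, and neither Nopol nor Defects4J provides this classification.

```python
def classify_outcome(expected_failing, runs):
    """Classify a bug from the failing tests observed over repeated runs.

    expected_failing: tests the benchmark says should fail on the buggy version.
    runs: one collection of failing test names per execution of the suite.
    """
    if not runs:
        raise ValueError("at least one run is required")
    expected = set(expected_failing)
    observed = [set(r) for r in runs]
    # Different results across identical runs indicate flaky tests.
    if any(r != observed[0] for r in observed[1:]):
        return "flaky"
    failing = observed[0]
    if not failing:
        return "no failing test"
    if failing == expected:
        return "as expected"
    extra = failing - expected
    if extra and not (failing & expected):
        # Only unexpected tests fail: a patch would target a 'different' bug.
        return "different bug"
    if extra:
        return "extra failing tests"
    return "fewer failing tests"


# Hypothetical test names, for illustration only.
print(classify_outcome({"testA"}, [{"testA", "testB"}, {"testA", "testB"}]))
```

Counting the bugs per category over the whole benchmark would give the proportions of each case.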

Unfortunately, I don't have an estimate of the proportion of each case. The last two cases seem less likely, but I don't have scientific evidence to show that.

There is currently no paper that tries to understand what happens in Defects4J with JDK 1.8. Such work would be really interesting for understanding the difference in behavior between the two JDKs. I hope that someone will study this.

@DehengYang
Author

DehengYang commented Jul 30, 2019

Thank you for your great help! The difference in behaviour between the two JDK versions remains an open research question. It is very meaningful and pertinent to fundamental mechanisms in the research field.

Maybe I will try to figure it out through further study in the future. Thank you again for your time and consideration.
