Sending empty parameter hashmap in case of retried execution #627

Open
wants to merge 5 commits into
base: tuning_20190221

@dushyantk1509 dushyantk1509 commented Sep 5, 2019

When a job is retried, it makes a dummy getCurrentRunParameter call to Dr. Elephant. This tells TuneIn that the earlier parameters led to a failure, so it can penalize those parameters accordingly. In the current implementation, the best parameters are returned from this dummy call even though they are never applied, and a new entry is made in tuningJobExecutionParamSet, which is not correct.

With this change, the dummy getCurrentRunParameter call made for a retry returns an empty parameter hash map, and an entry is made in tuningJobExecutionParamSet only for non-retried executions.
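The behaviour described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's actual code: the class `RetryParameterSketch`, its fields, and the method signature are hypothetical stand-ins, and the in-memory lists only imitate the penalty step and the tuningJobExecutionParamSet table.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the tuning service; all names are illustrative.
class RetryParameterSketch {
    // Execution IDs whose earlier parameters were penalized.
    final List<String> penalized = new ArrayList<>();
    // Imitates tuningJobExecutionParamSet: one entry per recorded execution.
    final List<String> recordedExecutions = new ArrayList<>();
    // Parameters that would be suggested for a normal (non-retried) run.
    private final Map<String, Double> suggested;

    RetryParameterSketch(Map<String, Double> suggested) {
        this.suggested = suggested;
    }

    Map<String, Double> getCurrentRunParameters(String jobExecId, boolean retry) {
        if (retry) {
            // Dummy call: penalize the failed parameters, record nothing,
            // and return an empty map so the job falls back to its defaults.
            penalized.add(jobExecId);
            return Collections.emptyMap();
        }
        // Normal call: record the execution and return the suggestion.
        recordedExecutions.add(jobExecId);
        return suggested;
    }

    public static void main(String[] args) {
        RetryParameterSketch service = new RetryParameterSketch(
            Collections.singletonMap("mapreduce.map.memory.mb", 2048.0));
        System.out.println(service.getCurrentRunParameters("exec-1", false));
        System.out.println(service.getCurrentRunParameters("exec-1", true));
        System.out.println(service.recordedExecutions);
    }
}
```

The point of the sketch is that the retry branch returns before anything is recorded, so a dummy call can never create an entry against parameters that were not applied.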

@@ -142,37 +142,40 @@ insert into yarn_app_heuristic_result_details (yarn_app_heuristic_result_id,name
(137594640,'Number of tasks','20','NULL');

INSERT INTO flow_definition(id, flow_def_id, flow_def_url) VALUES
-(10003,'https://ltx1-holdemaz01.grid.linkedin.com:8443/manager?project=AzkabanHelloPigTest&flow=countByCountryFlow','https://ltx1-holdemaz01.grid.linkedin.com:8443/manager?project=AzkabanHelloPigTest&flow=countByCountryFlow');
+(10003,'https://ltx1-holdemaz01.grid.linkedin.com:8443/manager?project=AzkabanHelloPigTest&flow=countByCountryFlow','https://ltx1-holdemaz01.grid.linkedin.com:8443/manager?project=AzkabanHelloPigTest&flow=countByCountryFlow'),
Contributor

Kindly remove the internal URL links.

Author

These lines were already there. I have not modified them, except for changing ';' to ',' because I wanted to add another line.

Contributor

Okay, but you can help by correcting this, and if needed you can ask the author of that part for help.

applyPenalty(tuningInput.getJobExecId());
jobSuggestedParamSet = getBestParamSet(tuningInput.getJobDefId());
Contributor

Will there be no parameters suggested in case of retry?

Author

No. It sends an empty hash map so that the default parameters are applied for the retried execution.

Contributor

So fetching the best parameters and sending them was a bug that you fixed in this PR?

Author

@dushyantk1509 dushyantk1509 Sep 9, 2019

Yes, because after getting these parameters Dr. Elephant adds the retried job execution to TuningJobExecutionParamSet against them, which is not correct since this is a dummy call. A retried execution always runs with default parameters; it ignores the parameters sent by Dr. Elephant.

Therefore this step was unnecessary, and the next step was incorrect.
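To illustrate why an empty map is a safe response for a retry, here is a small sketch of how a caller might combine its defaults with whatever the tuning call returns. The class `EffectiveParameterSketch` and the method `effectiveParameters` are hypothetical and are not part of Dr. Elephant or its client; the merge rule (suggestions override defaults) is an assumption made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical caller-side helper; not taken from the PR.
class EffectiveParameterSketch {
    // Start from the job's defaults and overlay any suggested values.
    // With an empty suggestion map, the result equals the defaults.
    static Map<String, String> effectiveParameters(
            Map<String, String> defaults, Map<String, String> suggested) {
        Map<String, String> effective = new HashMap<>(defaults);
        effective.putAll(suggested);
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("mapreduce.map.memory.mb", "1024");

        Map<String, String> suggested = new HashMap<>();
        suggested.put("mapreduce.map.memory.mb", "2048");

        // Normal run: the suggestion overrides the default.
        System.out.println(effectiveParameters(defaults, suggested));
        // Retried run: an empty map leaves the defaults untouched.
        System.out.println(effectiveParameters(defaults, new HashMap<>()));
    }
}
```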

if (tuningInput.getRetry()) {
logger.info(" Retry ");
logger.info(" Retried job execution " + tuningInput.getJobExecId());
Contributor

Is this method called when the execution has been retried, or when it is about to be retried?

Author

@dushyantk1509 dushyantk1509 Sep 5, 2019

About to retry. Will correct in next commit.

Contributor

Thanks.

@ShubhamGupta29
Contributor

@dushyantk1509 Can you state the problem in the Description section of the PR? It will give more context about the PR. Also, the doc link you mentioned above is not accessible to me with my external email ID; can you take a look so that it is available to everyone?

@dushyantk1509
Author

> @dushyantk1509 Can you state the problem in the Description section of the PR? It will give more context about the PR. Also, the doc link you mentioned above is not accessible to me with my external email ID; can you take a look so that it is available to everyone?

Now you won't need docs.

dushyantk1509 and others added 4 commits September 12, 2019 12:47
* Integrating Spark and MR exception fingerprinting so results for both will be visible on Dr.Elephant's UI

* Changing OracleJDK to OpenJDK as travisCI is not supporting OracleJDK8

* Adding IDENTITY generation strategy for ID

* Added unit tests for Exception Fingerprinting

* Adding test files

* Addressing the review comments
2 participants