Don't fail in Jenkins if puppet is already running on lightning. #26

Open
cg505 opened this issue Aug 29, 2019 · 5 comments

Comments

@cg505
Member

cg505 commented Aug 29, 2019

puppetTrigger('puppet') will fail if puppet is already running on the puppetmaster.

Also, why do we run puppet specifically on puppet.obe when many different hosts use the etc repo?

@jvperrin
Member

Yeah, that's an actual issue. It should probably wait and run puppet again after the current run is done (or we don't trigger puppet from Jenkins and do something else, idk).

We run it only on lightning because this repo is distributed via a puppet share (also see here where the latest changes to this repo are pulled into that share), so only lightning needs to be updated and then all hosts will get changes later as they run puppet. It does mean changes can be delayed for up to 30 minutes though and would be staggered across multiple hosts on update, which is unfortunate.

@cg505
Member Author

cg505 commented Aug 29, 2019

Ah, forgot that it got changed to a puppet share. Thought it was still vcsrepo on all hosts. That makes sense.

I think it would make sense to just wait until pgrep puppet fails, with a 30-minute timeout.
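
A minimal sketch of that wait loop, in Python rather than the Jenkinsfile's Groovy. The names `puppet_is_running` and `wait_until_idle`, the 10-second poll interval, and the injectable `sleep`/`clock` parameters are all assumptions made for illustration, not anything that exists in the repo. Note that `pgrep puppet` matches any process whose name contains "puppet", so this assumes such a process only exists while a run is in progress; if the agent runs as a daemon on the host, a different check would be needed.

```python
import subprocess
import time


def puppet_is_running():
    """Return True if `pgrep puppet` finds a matching process (exit status 0)."""
    result = subprocess.run(
        ["pgrep", "puppet"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_until_idle(is_running=puppet_is_running, timeout=30 * 60,
                    poll_interval=10, sleep=time.sleep, clock=time.monotonic):
    """Poll until is_running() returns False or the timeout elapses.

    Returns True once nothing is running, False if the timeout was hit.
    The poll interval is a hypothetical default, not a value from the thread.
    """
    deadline = clock() + timeout
    while is_running():
        if clock() >= deadline:
            return False
        sleep(poll_interval)
    return True
```

The caller would trigger puppet only when `wait_until_idle()` returns True, and could decide separately whether hitting the timeout should fail the build or just be logged.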

@dkess
Member

dkess commented Aug 29, 2019

Note that this pattern is also used in the dns repo: https://github.com/ocf/dns/blob/master/Jenkinsfile#L33 . This does cause failures, but they're very rare and aren't actually problematic.

All of the configs in etc should be eventually consistent, so it should be acceptable if the hosts aren't all updated right away or if changes reach each host at a different time. If we want a slightly stronger guarantee (which I guess is reasonable for certain information we show in ocfweb), there are a variety of schemes we could implement for this, though I haven't thought of any that I'm particularly happy with.

@cg505
Member Author

cg505 commented Aug 30, 2019

Fair. I mostly just think that this should not cause Jenkins to fail.

@dkess
Member

dkess commented Aug 30, 2019

Sure, I'm not gonna defend this scheme too much because I'm not a huge fan of it myself. I agree it's not ideal. See recent discussion in rebuild for thoughts on how to implement it better. Instead of trying to work around this Jenkins/Puppet race condition, I'd rather redesign the syncing from scratch to not use Puppet.
