#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |      Owner:
     Type:  defect      |     Status:  new
 Priority:  major       |  Milestone:  Hot issues
Component:  production  |   Keywords:
------------------------+---------------------------------------------------
Occasionally (more often recently) a test ends with "CRASHED: CmdError ... returned non-zero exit status", and the cause is the Yum exception "No more mirrors to try". I usually see this error when my network connection is down. The questions are:
 1. whether this is the case here
 2. if so, how we received the log results back
 3. why it is happening
Search for CmdError here for examples:
https://fedorahosted.org/pipermail/autoqa-results/2011-March/thread.html
And a concrete example:
https://fedorahosted.org/pipermail/autoqa-results/2011-March/087278.html [[BR]] https://fedorahosted.org/pipermail/autoqa-results/2011-March/096910.html
#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |       Owner:  tflink
     Type:  defect      |      Status:  assigned
 Priority:  major       |   Milestone:  Hot issues
Component:  production  |  Resolution:
 Keywords:              |
------------------------+---------------------------------------------------
Changes (by tflink):
 * owner:  => tflink
 * status:  new => assigned
Comment:
I can't seem to reproduce this on my dev machine. I'm planning to look at the actual hosts tomorrow.
#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |       Owner:  tflink
     Type:  defect      |      Status:  assigned
 Priority:  major       |   Milestone:  Hot issues
Component:  production  |  Resolution:
 Keywords:              |
------------------------+---------------------------------------------------
Comment (by tflink):
I was finally able to reproduce this in my local setup and I think that I found the root cause. Waiting on feedback from wwoods before proposing the fix.
#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |       Owner:  tflink
     Type:  defect      |      Status:  assigned
 Priority:  major       |   Milestone:  Hot issues
Component:  production  |  Resolution:
 Keywords:              |
------------------------+---------------------------------------------------
Comment (by tflink):
After talking to wwoods on IRC earlier today, I realized that I was on the right track but pointing the finger at the wrong cause. Hopefully I didn't confuse him too much :)
We always see this error when builds are passed in as accepted. The code from depcheck_lib.py that handles accepted builds is:
{{{
#!python
if accepted:
    # mash the accepted packages into a proto-updates repo
    accdir = tempfile.mkdtemp(prefix='depcheck-accepted.')
    for p in accepted:
        os.symlink(os.path.realpath(p), os.path.join(accdir, os.path.basename(p)))
    do_mash(accdir, mash_arches)
    accrepo = yum_repos.add_enable_repo('prev_accepted',['file://%s' % accdir])
    yum_repos.pkgSack # initializes package sacks
    prev_accepted = list(accrepo.sack)
    os.system('/bin/rm -rf %s' % accdir)
}}}
The idea is that we are creating a separate repo to hold the accepted builds so that they aren't run through depcheck again. By initializing the pkgSack of the yum object used for depcheck, all of the accepted builds are added as packages and the hope is that the extra repo can be deleted.
I did some hacking to output the accepted repo information right before we attempt to resolve deps:
{{{
baseurl = file:///tmp/depcheck-accepted.ph9HZi
...
name = prev_accepted
pkgdir = /var/tmp/yum-root-PB_6nt/prev_accepted/packages
}}}
The prev_accepted repo still has a baseurl of the temp directory that we're deleting.
The problem comes in when yum wants to resolve deps. As part of this process, it re-populates the repository, re-reading all of the repo dbs as it goes along. Since we delete the repo db associated with the accepted package repository, yum can't read the file and throws an exception saying that there are no more mirrors left.
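The failure mode described above can be reproduced in miniature without yum. The following is only a sketch: `LazyRepo` and `demo` are hypothetical names invented for illustration and are not part of yum or depcheck. The class stores just a `file://` baseurl and re-reads the directory on every `populate()` call, which is the behaviour assumed here for the dependency-resolution step.

```python
import os
import shutil
import tempfile


class LazyRepo(object):
    """Hypothetical stand-in for a repo object; this is NOT yum's API."""

    def __init__(self, name, baseurl):
        self.name = name
        self.baseurl = baseurl  # only the location is remembered
        self.packages = []

    def populate(self):
        # Re-read the repo contents from baseurl on every call.
        path = self.baseurl[len('file://'):]
        if not os.path.isdir(path):
            raise IOError('%s: no more mirrors to try (%s is gone)'
                          % (self.name, path))
        self.packages = sorted(os.listdir(path))
        return self.packages


def demo():
    accdir = tempfile.mkdtemp(prefix='depcheck-accepted.')
    open(os.path.join(accdir, 'foo-1.0-1.noarch.rpm'), 'w').close()
    repo = LazyRepo('prev_accepted', 'file://%s' % accdir)

    first = repo.populate()   # works: the directory still exists
    shutil.rmtree(accdir)     # premature cleanup, as in the depcheck_lib.py excerpt

    try:
        repo.populate()       # second read: the baseurl points at nothing
        failed = False
    except IOError:
        failed = True
    return first, failed
```

Calling `demo()` shows the first read succeeding and the second one failing, because the object still holds a baseurl for a directory that has already been removed.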
As far as a fix is concerned, the easy way would be to just not delete the accepted repo. I've tried this on a simple example job, and the "no more mirrors error" went away; the depcheck test passed as expected. The hard way would be to try making yum not look for the files when it repopulates the repo for resolving dependencies.
Personally, I'm all for the easy way on this since it is a one line patch and just means that we have to clean out the /tmp dir on our test hosts more often.
#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |       Owner:  tflink
     Type:  defect      |      Status:  assigned
 Priority:  major       |   Milestone:  Hot issues
Component:  production  |  Resolution:
 Keywords:              |
------------------------+---------------------------------------------------
Comment (by tflink):
Patch submitted for review:
https://fedorahosted.org/reviewboard/r/132/
#293: CmdError: No more mirrors to try
------------------------+---------------------------------------------------
 Reporter:  kparal      |       Owner:  tflink
     Type:  defect      |      Status:  assigned
 Priority:  major       |   Milestone:  Hot issues
Component:  production  |  Resolution:
 Keywords:              |
------------------------+---------------------------------------------------
Comment (by kparal):
Replying to [comment:3 tflink]:
> As far as a fix is concerned, the easy way would be to just not delete the accepted repo. I've tried this on a simple example job, and the "no more mirrors error" went away; the depcheck test passed as expected. The hard way would be to try making yum not look for the files when it repopulates the repo for resolving dependencies.
> Personally, I'm all for the easy way on this since it is a one line patch and just means that we have to clean out the /tmp dir on our test hosts more often.
Can't we just use the autotest temp directory (self.tmpdir), which is automatically deleted after the test finishes, instead of /tmp?
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Changes (by kparal):
 * component:  production => tests
 * milestone:  Hot issues => 0.4.6
Comment:
Putting into 0.4.6.
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Comment (by tflink):
Updated the patch in reviewboard to create the temp dirs before depcheck and delete them after depcheck.
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Comment (by kparal):
Reviewed.
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Comment (by tflink):
Changed the cleanup code to use a finally block and pushed to master as d946172e1ecb2f7d1f0d03a66442406ef63c178f.
I'm having network issues today (lots of timeouts from bodhi and koji), which is making exhaustive testing difficult. I will push to stable if I can get enough testing done today.
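For readers following along, the create-before / delete-after pattern with a finally block can be sketched roughly as below. This is not the committed patch: `run_with_tempdir` and `fake_depcheck` are hypothetical names used only for illustration, and `shutil.rmtree` is an assumed replacement for the `/bin/rm -rf` call in the original excerpt.

```python
import os
import shutil
import tempfile


def run_with_tempdir(work, prefix='depcheck-accepted.'):
    """Create a temp dir, run work(tmpdir), and always remove the dir."""
    tmpdir = tempfile.mkdtemp(prefix=prefix)
    try:
        # The directory stays in place for the whole run, so anything
        # that re-reads it during dependency resolution still finds it.
        return work(tmpdir)
    finally:
        # Runs on success and on exceptions alike.
        shutil.rmtree(tmpdir, ignore_errors=True)


def fake_depcheck(tmpdir):
    # Placeholder for the real check; it only confirms the dir is usable.
    return os.path.isdir(tmpdir)
```

`run_with_tempdir(fake_depcheck)` returns the result of the callback and leaves no directory behind, even when the callback raises, so nothing accumulates in /tmp on the test hosts.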
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Comment (by tflink):
I've been testing this today and I'm pretty confident that everything is working. I've seen a couple of unexpected errors but AFAICT, they were due to network glitches on my end.
If someone has the time, would you mind running a few tests on this code (having accepted builds would be best)? I'll do some more testing tomorrow (hopefully with fewer timeouts) and if I don't see any problems or objections, I'll push this commit to stable.
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  assigned
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:
 Keywords:          |
--------------------+-------------------------------------------------------
Comment (by kparal):
I tested the changes and found no problems, but I wasn't able to get any accepted updates while testing f14-updates(-testing) repos.
#293: CmdError: No more mirrors to try
--------------------+-------------------------------------------------------
 Reporter:  kparal  |       Owner:  tflink
     Type:  defect  |      Status:  closed
 Priority:  major   |   Milestone:  0.4.6
Component:  tests   |  Resolution:  fixed
 Keywords:          |
--------------------+-------------------------------------------------------
Changes (by tflink):
 * status:  assigned => closed
 * resolution:  => fixed
Comment:
Replying to [comment:11 kparal]:
> I tested the changes and found no problems, but I wasn't able to get any accepted updates while testing f14-updates(-testing) repos.
Yeah, we would have to add some comments to the f14 updates in order to get accepted builds. I was testing this by manually inserting accepted builds into the depcheck command.
I've tested the fix with the rest of the stable branch and a RHEL5 autotest server and it works.
Pushed to stable: 0d4a19be20a108525ce0e86b12ecc7be8f4d7803