Hi everyone,
We currently have MirrorManager2 running on staging. It's apparently not 100% set up, since we get emails once in a while saying that one of the crons failed (iirc, among other things, we need to finish configuring fedmsg).
Other than this, MirrorManager2 is currently in decent shape, I think. However, we really need to make sure nothing broke in the re-write, and we want to make sure we won't break it in the future. To try to ensure that last part, I have tried to write some tests for the UI but also for the backend part (all the different scripts). The pull-request is open for review: https://github.com/fedora-infra/mirrormanager2/pull/14
I have also been trying to capitalize on the knowledge we acquired during the FAD by starting to write down how mirrormanager works in the documentation: https://github.com/fedora-infra/mirrormanager2/compare/tests...doc (pull-request to be opened once the tests branch is merged). I would appreciate it if those who were at the FAD could go through this branch/changes and adjust as needed. I have been thinking about asking Matt to do the review so that we can adjust and improve the documentation.
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
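A rough sketch of how the prod and stg pickles could be compared. It assumes, without having checked the real layout, that the top-level object in the pickle is a dict; `load_pickle`, `summarize` and `diff_summaries` are hypothetical helper names, not part of MirrorManager. Since `pickle.load` can execute arbitrary code, this should only ever be run on pickles we generated ourselves.

```python
import pickle


def load_pickle(path):
    """Load a pickle from disk (trusted files only)."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def summarize(obj):
    """Map each top-level key to (type name, number of entries or None)."""
    summary = {}
    for key, value in obj.items():
        size = len(value) if hasattr(value, "__len__") else None
        summary[key] = (type(value).__name__, size)
    return summary


def diff_summaries(prod, stg):
    """Report keys present on one side only, and keys whose type/size differ."""
    prod_s, stg_s = summarize(prod), summarize(stg)
    common = set(prod_s) & set(stg_s)
    return {
        "only_in_prod": sorted(set(prod_s) - set(stg_s)),
        "only_in_stg": sorted(set(stg_s) - set(prod_s)),
        "changed": {k: (prod_s[k], stg_s[k]) for k in common if prod_s[k] != stg_s[k]},
    }
```

Something like `diff_summaries(load_pickle(prod_path), load_pickle(stg_path))` would only give a coarse, structural view; a size difference is not necessarily a bug, since prod and stg may not hold the same data.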
Finally, at DevConf we spoke quite a bit with Dennis about updates and MirrorManager, and here are some of the ideas we discussed:
- Be able to run the UMDL script on only a part of the tree (ie: be able to say, we updated f21-updates and we only update this part)
- Crawl the mirrors for only a part of the tree (This goes together with updating only part of the tree via UMDL)
- Consider if we should/could drop the content of the host_category_dir table before running the crawler
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
This list of ideas is more a long term todo list, not something we would want to have working for pushing MirrorManager2 to prod.
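To make the mirror-versioning idea above a bit more concrete, here is a toy, in-memory sketch. Everything in it is hypothetical (the `VersionedTree` class, `crawl_mirror`, and the `has_dir` callback standing in for an actual crawl); it only illustrates the bookkeeping, not how UMDL or the crawler work today.

```python
class VersionedTree:
    """Toy model of the master mirror: one set of changed directories per version."""

    def __init__(self):
        self.version = 0
        self.changes = {}  # version number -> set of directories changed in it

    def record_changes(self, changed_dirs):
        """Called after a UMDL-like run; bump the version only if something changed."""
        changed = set(changed_dirs)
        if not changed:
            return self.version
        self.version += 1
        self.changes[self.version] = changed
        return self.version

    def diff(self, old, new):
        """Directories changed after version `old`, up to and including `new`."""
        dirs = set()
        for v in range(old + 1, new + 1):
            dirs |= self.changes.get(v, set())
        return dirs


def crawl_mirror(master, mirror_version, has_dir):
    """Check only what changed since the mirror's version; return its new version."""
    to_check = master.diff(mirror_version, master.version)
    if all(has_dir(d) for d in to_check):
        return master.version  # mirror caught up with the master
    return mirror_version      # still behind; retry on the next crawl
```

A periodic full crawl would still be needed on top of this, because the incremental check never notices content that disappears from a mirror after it was validated.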
Thoughts? Agreements? Disagreements?
Thanks, Pierre
On 11 February 2015 at 06:10, Pierre-Yves Chibon pingou@pingoured.fr wrote:
Hi everyone,
We currently have MirrorManager2 running on staging. It's apparently not 100% set-up since we get emails once in a while that one of the crons failed (iirc among other we need to finish configuring fedmsg).
Other that this, MirrorManager2 is currently in a decent shape I think. However, we really need to make sure nothing broke in the re-write and we want to make sure we won't break it in the future. To try to ensure that last part, I have try to write some tests for the UI but also for the backend part (all the different scripts). The pull-request is opened for review: https://github.com/fedora-infra/mirrormanager2/pull/14
I have also been trying to capitalize on the knowledge we acquired during the FAD by starting to write down how mirrormanager works in the documentation: https://github.com/fedora-infra/mirrormanager2/compare/tests...doc (pull-request to be opened once the tests branch is merged) I would appreciate if those that were at the FAD could go through this branch/changes and adjust as needed. I have been thinking about asking Matt to do the review so that we can adjust and improve the documentation.
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
Finally, at DevConf we have been speaking quite a bit with Dennis around updates and MirrorManager and here is some of the ideas we spoke about:
- Be able to run the UMDL script on only a part of the tree (ie: be able to say, we updated f21-updates and we only update this part)
- Crawl the mirrors for only a part of the tree (This goes together with updating only part of the tree via UMDL)
For a dumb optimization (which may be there already), I would only crawl the trees which we know change "hourly" (updates/X/, development/X/) and only scan releases etc. daily or weekly, because they should not change that much.
- Consider if we should/could drop the content of the host_category_dir table before running the crawler
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
Those last couple of steps would be very useful in cutting down load from mirrors running rsync against enchillada but only needing a couple of packages in updates since the last time they did it.
This list of ideas is more a long term todo list, not something we would want to have working for pushing MirrorManager2 to prod.
Thoughts? Agreements? Disagreements?
Thanks, Pierre
infrastructure mailing list infrastructure@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/infrastructure
On 02/12/2015 02:43 AM, Stephen John Smoogen wrote:
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
Those last couple of steps would be very useful in cutting down load from mirrors running rsync against enchillada but only needing a couple of packages in updates since the last time they did it.
In that general vein, an interesting project Robert Collins introduced me to last year is his lmirror work: https://pypi.python.org/pypi/lmirror
It's designed to take advantage of various HTTP/2 features and the knowledge that it's specifically being used to create read-only mirrors to make the network operations more efficient than is feasible with the more general purpose rsync.
Design details: https://bazaar.launchpad.net/~lmirror/lmirror/master/view/head:/doc/DESIGN.t...
Regards, Nick.
"NC" == Nick Coghlan ncoghlan@redhat.com writes:
NC> In that general vein, an interesting project Robert Collins
NC> introduced me to last year is his lmirror work:
NC> https://pypi.python.org/pypi/lmirror
That is quite interesting. It's obvious that rsync is kind of breaking down, given that with my current mirroring setup (just running a private internal mirror) I can rarely get a complete copy across the wire due to timeouts while receiving the file set.
Not a lot of commits happening upstream, though. Is this code actually being used anywhere?
I guess the first step if we want to try to play with this is to get it packaged.
- J<
On Wed, 11 Feb 2015 14:10:01 +0100 Pierre-Yves Chibon pingou@pingoured.fr wrote:
Hi everyone,
We currently have MirrorManager2 running on staging. It's apparently not 100% set-up since we get emails once in a while that one of the crons failed (iirc among other we need to finish configuring fedmsg).
Yeah, there's two crons that are failing due to lack of fedmsg. :(
Other that this, MirrorManager2 is currently in a decent shape I think. However, we really need to make sure nothing broke in the re-write and we want to make sure we won't break it in the future. To try to ensure that last part, I have try to write some tests for the UI but also for the backend part (all the different scripts). The pull-request is opened for review: https://github.com/fedora-infra/mirrormanager2/pull/14
Cool.
I have also been trying to capitalize on the knowledge we acquired during the FAD by starting to write down how mirrormanager works in the documentation: https://github.com/fedora-infra/mirrormanager2/compare/tests...doc (pull-request to be opened once the tests branch is merged) I would appreciate if those that were at the FAD could go through this branch/changes and adjust as needed. I have been thinking about asking Matt to do the review so that we can adjust and improve the documentation.
That is an excellent idea. ;) The docs all look good/accurate to me. ;)
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
If it would help, we have tracebacks from the mm2 mirrorlist for when a bad pkl is loaded (the mm1 ones just fail, but the mm2 one does provide a traceback):
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: Traceback (most recent call last):
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: File "/usr/share/mirrormanager2/mirrorlist_server.py", line 877, in handle
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: r = do_mirrorlist(d)
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: File "/usr/share/mirrormanager2/mirrorlist_server.py", line 718, in do_mirrorlist
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: allhosts, cache, file, pathIsDirectory=pathIsDirectory)
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: File "/usr/share/mirrormanager2/mirrorlist_server.py", line 423, in append_path
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: s = hcurl_cache[hcurl_id]
Feb 9 10:11:42 mirrorlist-host1plus python2[25890]: KeyError: 7716
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
Good idea.
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
Finally, at DevConf we have been speaking quite a bit with Dennis around updates and MirrorManager and here is some of the ideas we spoke about:
- Be able to run the UMDL script on only a part of the tree (ie: be able to say, we updated f21-updates and we only update this part)
- Crawl the mirrors for only a part of the tree (This goes together with updating only part of the tree via UMDL)
- Consider if we should/could drop the content of the host_category_dir table before running the crawler
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
This list of ideas is more a long term todo list, not something we would want to have working for pushing MirrorManager2 to prod.
Thoughts? Agreements? Disagreements?
Sounds good.
There are some minor changes to make in the mirrorlist rpms, but once that's done we can look at replacing the other mirrorlists anytime.
For the other parts, I think we will need to just create them all in production and run them in parallel to the mm1 setup for a short time until we are ready to cut over.
kevin
On 02/14/2015 01:27 AM, Jason L Tibbitts III wrote:
"NC" == Nick Coghlan ncoghlan@redhat.com writes:
NC> In that general vein, an interesting project Robert Collins
NC> introduced me to last year is his lmirror work:
NC> https://pypi.python.org/pypi/lmirror
That is quite interesting. It's obvious that rsync is kind of breaking down, given that with my current mirroring setup (just running a private internal mirror) I can rarely get a complete copy across the wire due to timeouts while receiving the file set.
Not a lot of commits happening upstream, though. Is this code actually being used anywhere?
When we were talking about the project at PyCon NZ, Robert said he planned to use it for something in relation to OpenStack's infrastructure, but I don't recall the details. If you wanted to know more, it may be worth getting in touch with him directly.
Cheers, Nick.
----- Original Message -----
From: "Pierre-Yves Chibon" pingou@pingoured.fr
To: "Fedora Infrastructure" infrastructure@lists.fedoraproject.org
Sent: Wednesday, February 11, 2015 5:10:01 AM
Subject: About MirrorManager2
[snip]
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
Finally, at DevConf we have been speaking quite a bit with Dennis around updates and MirrorManager and here is some of the ideas we spoke about:
- Be able to run the UMDL script on only a part of the tree (ie: be able to say, we updated f21-updates and we only update this part)
- Crawl the mirrors for only a part of the tree (This goes together with updating only part of the tree via UMDL)
- Consider if we should/could drop the content of the host_category_dir table before running the crawler
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
This list of ideas is more a long term todo list, not something we would want to have working for pushing MirrorManager2 to prod.
Thoughts? Agreements? Disagreements?
Hi!
I'd like to help move this along to production. I have very limited knowledge of the MM codebase, but I did attend the MM FAD, so I have at least a vague understanding of how it works. Are there any changes to these lists since last week? What might be productive to work on? Like I said, my knowledge is limited, but pickle validation sounds like a pretty general Python task. Maybe I could look into that?
Any direction and info on some task(s) I could try my hand at is appreciated. :)
-- David
On Mon, Feb 23, 2015 at 04:42:46PM -0500, David Gay wrote:
----- Original Message -----
From: "Pierre-Yves Chibon" pingou@pingoured.fr
To: "Fedora Infrastructure" infrastructure@lists.fedoraproject.org
Sent: Wednesday, February 11, 2015 5:10:01 AM
Subject: About MirrorManager2
[snip]
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
Finally, at DevConf we have been speaking quite a bit with Dennis around updates and MirrorManager and here is some of the ideas we spoke about:
- Be able to run the UMDL script on only a part of the tree (ie: be able to say, we updated f21-updates and we only update this part)
- Crawl the mirrors for only a part of the tree (This goes together with updating only part of the tree via UMDL)
- Consider if we should/could drop the content of the host_category_dir table before running the crawler
- Mirror versioning:
  - run UMDL, detect changes, increase master mirror's version by 1
  - run the crawler, check for the changes, align that mirror's version with the master mirror one
  - be able to see the difference between two versions
  - be able to crawl a mirror only for the difference between the version it is at and the version the master mirror is at
  note: we might still want to run a full crawl once in a while (daily? bi-daily?)
This list of ideas is more a long term todo list, not something we would want to have working for pushing MirrorManager2 to prod.
I'd like to help move this along to production. I have very limited knowledge of the MM codebase, but I did attend the MM FAD, so I have at least a vague understanding of how it works. Are there any changes to these lists since last week? What might be productive to work on? Like I said, my knowledge is limited, but pickle validation sounds like a pretty general Python task. Maybe I could look into that?
None of these things have progressed over the last week. Clearly the three points at the top have the most priority for MM2 (imo) and the pickle validation might be tricky (how to test it) but is something I really think I want to have before we switch.
The points listed below are things we may want to work on in time, but clearly not something we want to wait for to release MM2 in prod.
Any direction and info on some task(s) I could try my hand at is appreciated. :)
I think the starting point for this is to take a couple of pickle files generated by MM1 currently. Then we need to check their structure and eventually their content. I have never looked at what the pickle file has, so I can't really say what to expect.
Ideally, we want to develop the tests against the pickle generated by MM1, then run them against the pickle generated by MM2. Eventually, I think it would be a good idea to integrate the tests in the publication workflow so that we only push to the mirrorlist servers the pickle that we validated.
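A minimal sketch of what "validate, then publish" could look like. The only name taken from the real code is `hcurl_cache`, which shows up in the mirrorlist traceback earlier in the thread; the rest is assumption: that the pickle is a dict, that `hcurl_cache` is one of its keys, and that the caller can supply a `referenced_ids` function listing every hcurl id the rest of the pickle points at (the real layout still has to be inspected). `validate_pickle` and `publish_if_valid` are hypothetical names.

```python
import pickle
import shutil

# Hypothetical: extend this once the structure of an MM1 pickle has been inspected.
REQUIRED_KEYS = ("hcurl_cache",)


def validate_pickle(path, referenced_ids):
    """Return a list of problems found in the pickle; an empty list means it passed."""
    try:
        with open(path, "rb") as fh:
            data = pickle.load(fh)  # trusted, locally generated files only
    except Exception as exc:
        return ["cannot load pickle: %r" % (exc,)]
    if not isinstance(data, dict):
        return ["top-level object is %s, expected dict" % type(data).__name__]

    problems = ["missing key: %s" % key for key in REQUIRED_KEYS if key not in data]

    # Catch the kind of dangling reference that would surface as a KeyError
    # on the mirrorlist side when looking an id up in hcurl_cache.
    hcurl_cache = data.get("hcurl_cache", {})
    for hcurl_id in referenced_ids(data):
        if hcurl_id not in hcurl_cache:
            problems.append("dangling hcurl id: %r" % (hcurl_id,))
    return problems


def publish_if_valid(src, dest, referenced_ids):
    """Copy the pickle to its publication path only if validation found no problem."""
    problems = validate_pickle(src, referenced_ids)
    if problems:
        return problems
    shutil.copy2(src, dest)
    return []
```

The same checks could first be run against a pickle generated by MM1 to confirm they hold there, and only then against the MM2 one.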
Pierre
On Tue, Feb 24, 2015 at 04:20:19PM +0100, Pierre-Yves Chibon wrote:
On Mon, Feb 23, 2015 at 04:42:46PM -0500, David Gay wrote:
----- Original Message -----
From: "Pierre-Yves Chibon" pingou@pingoured.fr
[snip]
Before we move MirrorManager2 to prod, here is what I think would be nice to do/have:
- Pickle validation
  - Figure a way to validate a pickle after its creation and before moving it to the mirrorlist boxes
- Find out if we can improve our tests some more (to improve our confidence that we're ready)
- Engage the mirror mailing list and try to get them to react on the coming changes
The pickle validation might also be an interesting idea to check if there is a difference between the pickle generated by prod and the one generated in stg.
...
I'd like to help move this along to production. I have very limited knowledge of the MM codebase, but I did attend the MM FAD, so I have at least a vague understanding of how it works. Are there any changes to these lists since last week? What might be productive to work on? Like I said, my knowledge is limited, but pickle validation sounds like a pretty general Python task. Maybe I could look into that?
None of these things have progressed over the last week. Clearly the three points at the top have the most priority for MM2 (imo) and the pickle validation might be tricky (how to test it) but is something I really think I want to have before we switch.
I found out this morning that I have a couple of files around mirrormanager's pickle, might be useful, maybe not but here they are :)
Apparently I have been comparing MM1 and MM2 pickles at one point :)
Pierre