On Tue, Aug 16, 2016 at 12:46 PM, Randy Barlow bowlofeggs@fedoraproject.org wrote:
On Tue, 2016-08-16 at 11:24 -0500, Jason L Tibbitts III wrote:
It would also help to have the following information. The mirrors will need to have this information in order to make informed decisions. (I will also have to make changes to quick-fedora-mirror to accommodate.)
- How much content will the mirrors need to store? How will this amount change over time?
Hello Jason! I confess that I don't have good answers to your questions, and I'm not sure who would. Many of these questions depend on how popular Docker images become with Fedora packagers.
How many bytes of content we will be creating does depend on how many applications get packaged as Docker images. I would guess the base image to be a few hundred megabytes, but we can probably use some fancy hardlinking to help reduce disk/network usage so that the base image is only stored once. The rest of the storage is going to be the diffs applied as layers on top of the base image that add whatever each individual image needs. The sizes of these layers will vary greatly by application, so this is also difficult to guess.
It's difficult to make informed guesses about this since I don't know how many Docker images the Fedora packagers will create (or at what rate they will create them over time).
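To make the storage reasoning above a little more concrete, here is a rough Python sketch of how the total could be estimated once real numbers exist. It assumes each image is a list of layers identified by a digest, and that a layer shared by several images (such as the base image) is stored only once. The function name, the digests, and all sizes below are hypothetical placeholders, not measurements.

```python
def estimate_storage(images):
    """Estimate bytes needed to store a set of layered images.

    `images` maps an image name to a list of (layer_digest, size_bytes)
    tuples. Returns (deduplicated_bytes, naive_bytes), where the
    deduplicated figure counts each distinct layer digest once.
    """
    unique_layers = {}
    naive_bytes = 0
    for layers in images.values():
        for digest, size in layers:
            naive_bytes += size            # every image carries its own copy
            unique_layers[digest] = size   # shared layers are counted once
    return sum(unique_layers.values()), naive_bytes


MB = 1024 * 1024

# Hypothetical sizes, for illustration only.
BASE = ("sha256:base", 200 * MB)
example_images = {
    "httpd": [BASE, ("sha256:httpd", 50 * MB)],
    "postgresql": [BASE, ("sha256:postgresql", 120 * MB)],
}

deduplicated, naive = estimate_storage(example_images)
print(deduplicated // MB, naive // MB)
```

The gap between the two figures grows with every additional image built on the same base, which is why storing the base image once matters for the mirrors.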
- Do you have a plan for placing an upper bound on the total amount of data? (In Fedora things are moved to archive, though that has its own problems and of course doesn't really place an upper bound on anything.)
I don't have such a plan at this time. If anyone has suggestions about this, that would be helpful. It's unclear whether Docker images would live inside or outside of the traditional Fedora cycle (i.e., F24/F25/F26). It may have its own separate cycle, or we may just go with the current Fedora cycle.
I think we can choose a reasonable archival time. Sorting out the implementation of that with the tool that does the layer data extraction might be a challenge, but I imagine it's one we can collectively sort out if necessary.
- How much change do you expect per day? Churn is really important, and even now we can come close to the point where the master mirrors simply can't feed new content to the tier 1 mirrors fast enough for them to keep ahead of the changes we're making.
This again depends on how popular the Docker image offering becomes with our packagers, so it is difficult for me to make an educated guess. Popularity is difficult to predict.
The current plan is that we will release Docker Layered Images on a two-week cadence, potentially in line with the Atomic Host Two-Week deliverable. This might change in the future, but that is the current plan.
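For the churn question, a two-week cadence at least gives a simple way to turn a per-release figure into a per-day one. The Python sketch below assumes all changed content lands on release day and averages it over the cadence. The function name and the 14 GB figure are hypothetical; the real per-release volume is exactly what we cannot predict yet.

```python
def average_daily_churn(changed_bytes_per_release, cadence_days=14):
    """Average bytes of new content per day for a fixed release cadence.

    The peak is the full `changed_bytes_per_release`, since under this
    assumption everything is published at once on release day.
    """
    if cadence_days <= 0:
        raise ValueError("cadence_days must be positive")
    return changed_bytes_per_release / cadence_days


GB = 10**9

# Hypothetical: 14 GB of new or rebuilt layers per two-week release.
print(average_daily_churn(14 * GB) / GB)
```

The average understates the load on the master mirrors, because the tier 1 mirrors would need to pull the whole release in one burst rather than a little each day.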
- How will this be organized on the master mirrors? It really should be in a separate rsync module, and the archive (if that happens) should also be in a separate rsync module.
In my proposal e-mail I mentioned that it was important for MirrorManager to allow mirror admins to opt in to hosting Docker content. Since we don't know the answer to so many of these questions, I suggest we opt mirrors out by default and let admins opt themselves in as they please. Our proposal didn't have an exact path for storing Docker images, but it was planned to be separated from the RPM and ISO content at a fairly high level in the tree.
+1
-AdamM
I apologize for having so few answers. If anyone can shed more light on Jason's questions, please reply.