My disk filled up due to new copies of epel appearing under archive. Can someone please hardlink this content from epel to archive/epel?
Thanks.
On Sat, 27 Jul 2019 at 01:00, Anderson, Charles R cra@wpi.edu wrote:
My disk filled up due to new copies of epel appearing under archive. Can someone please hardlink this content from epel to archive/epel?
OK, my sincere apologies. I broke a lot of promises in doing this.
I did not announce this beforehand, and I did not announce it afterwards. I did not put a ticket in our work queues to cover this.
I did a bunch of copying of packages over to archives at the end of May. I do all the copies using cp -l to preserve links. I then ran update-archives for the full file list but did not notice that it had not worked. I did not announce this.
This week we got several requests from mirrors to get f28 out of normal space. I confirmed that f28 was copied over and then I synced over f29 and f30 so I would not be behind in a couple of months. All of these were also done with cp -l, and I checked that the hardlink numbers had increased on files. I gave the info to Adrian and he let me know that the update-archives was not updated. I then fixed that and again didn't send any email about this.
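For anyone who wants to reproduce the "copy with links, then check the link counts" step on a test tree, here is a minimal Python sketch. It is only an illustration of the idea, not the procedure used on the master mirror: the function names `link_tree` and `link_count` are made up for this example, and it ignores symlinks, ownership, and permissions.

```python
import os


def link_tree(src, dst):
    """Recreate the directory layout of src under dst and hardlink every
    file into it (roughly what a recursive cp -l does; hypothetical helper)."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            # os.link creates a second name for the same inode; no data is copied.
            os.link(os.path.join(root, name), os.path.join(target_dir, name))


def link_count(path):
    """Number of names pointing at the file's inode (the 'hardlink number')."""
    return os.stat(path).st_nlink
```

After `link_tree("epel", "archive/epel")` on a scratch copy, each file that previously had a link count of 1 should report 2.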
So at this point I have majorly screwed you guys over 4 times.
Thanks.
_______________________________________________
Mirror-admin mailing list -- mirror-admin@lists.fedoraproject.org
To unsubscribe send an email to mirror-admin-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/mirror-admin@lists.fedoraproje...
On Sat, Jul 27, 2019 at 10:04:06AM -0400, Stephen John Smoogen wrote:
I did a bunch of copying of packages over to archives at the end of May. I do all the copies using cp -l to preserve links. I then ran update-archives for the full file list but did not notice that it had not worked.
This week we got several requests from mirrors to get f28 out of normal space. I confirmed that f28 was copied over and then I synced over f29 and f30 so I would not be behind in a couple of months. All of these were also done with cp -l, and I checked that the hardlink numbers had increased on files. I gave the info to Adrian and he let me know that the update-archives was not updated. I then fixed that and again didn't send any email about this.
I'm not sure whether q-f-m didn't work quite right, whether there was a server-side issue (update-archives?), or whether this was just all due to timing of cp -l vs. update-archives vs. my q-f-m runs, but somehow I received over 1 TB of downloads that were not hardlinked. The transfer did not finish (I ran out of disk space), so I cleaned that up yesterday, moved the files from .~tmp/* to their final locations, ran a local hardlink.py, and re-ran q-f-m. The hardlink run restored the 1+ TB of free space. The files are all staying hardlinked, but because q-f-m optimizes the rsync runs, that doesn't provide any assurance that the master mirror is as hardlinked as it could be.
To focus on the good parts:
My local hardlink.py run saved at least 600 GB of additional space compared to before Friday. I think some of the additional savings come from internal hardlinking between e.g. Workstation, Everything, Spins, noarch across different arch directories, etc. that one might expect to be already hardlinked, but weren't for some reason.
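One way to put a number on how much a tree already benefits from hardlinks is to compare the apparent size (every name counted) with the size when each inode is counted once. The following is a rough Python sketch under that assumption; `tree_usage` is a name invented here and it only looks at regular files.

```python
import os
import stat


def tree_usage(top):
    """Return (apparent_bytes, unique_bytes) for regular files under top.

    apparent_bytes counts every directory entry; unique_bytes counts each
    (device, inode) pair once, so the difference is space shared by hardlinks.
    """
    seen = set()
    apparent = 0
    unique = 0
    for root, _dirs, files in os.walk(top):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and special files
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique
```

Running it before and after a hardlink pass gives a quick before/after comparison without relying on free-space figures that other activity on the disk can change.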
I'm going to look into running quick-fedora-hardlink locally regularly.
On Sun, 28 Jul 2019 at 13:17, Anderson, Charles R cra@wpi.edu wrote:
I'm going to look into running quick-fedora-hardlink locally regularly.
So I am going to look at using the hardlink command with a 'do not do this, just tell me what you would do' option first. I will report to the list what I find. This would be with the hardlink we normally ship, but I was wondering if anyone has any new versions they recommend.
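To make the "just tell me what you would do" idea concrete, here is a small Python sketch of a dry run. It is not the hardlink tool that Fedora ships: `plan_hardlinks` and `file_digest` are hypothetical names, and the sketch only compares size and content, whereas a real tool would likely also consider ownership, permissions, and timestamps.

```python
import hashlib
import os
import stat
from collections import defaultdict


def file_digest(path, chunk=1 << 20):
    """SHA-256 of a file's content, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()


def plan_hardlinks(top):
    """Dry run: return (keep, duplicate, size) tuples; nothing is modified."""
    # (device, size) -> {inode: one path for that inode}
    by_size = defaultdict(dict)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_size > 0:
                by_size[(st.st_dev, st.st_size)].setdefault(st.st_ino, path)

    plan = []
    for (_dev, size), inodes in by_size.items():
        if len(inodes) < 2:
            continue  # only one inode of this size: nothing to link
        by_digest = {}
        for path in inodes.values():
            digest = file_digest(path)
            if digest in by_digest:
                plan.append((by_digest[digest], path, size))
            else:
                by_digest[digest] = path
    return plan
```

Summing the third element of each tuple gives an estimate of the space a real run would reclaim; files that are already linked to each other share an inode and are never reported.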
On Sun, Jul 28, 2019 at 01:21:16PM -0400, Stephen John Smoogen wrote:
So I am going to look at using the hardlink command with a 'do not do this, just tell me what you would do' option first. I will report to the list what I find. This would be with the hardlink we normally ship, but I was wondering if anyone has any new versions they recommend.
I know of at least 5 implementations. Despite hardlink.py claiming to be faster than the original C version, it still took over 12 hours when I ran it yesterday.
- hardlink (the C version we ship in Fedora).
- hardlink in util-linux (maybe the same as above).
- hardlink from Debian that may end up being adopted into util-linux (see https://github.com/karelzak/util-linux/issues/808).
- hardlink.py by John Villalovos that I used (https://code.google.com/archive/p/hardlinkpy/, there appear to be forks on github).

    # John Villalovos
    # email: john@sodarock.com
    # http://www.sodarock.com/
    #
    # Inspiration for this program came from the hardlink.c code. I liked what it
    # did but did not like the code itself, to me it was very unmaintainable. So I
    # rewrote in C++ and then I rewrote it in python. In reality this code is
    # nothing like the original hardlink.c, since I do things quite differently.
    # Even though this code is written in python the performance of the python
    # version is much faster than the hardlink.c code, in my limited testing. This
    # is mainly due to use of different algorithms.
    #
    # Original inspirational hardlink.c code was written by: Jakub Jelinek
    # jakub@redhat.com

- quick-fedora-hardlink (https://docs.pagure.org/quick-fedora-mirror/quick-fedora-hardlink.rst) that can run much quicker using the filelists in the repos.
"SJS" == Stephen John Smoogen smooge@gmail.com writes:
SJS> So I am going to look at using the hardlink command with a 'do not
SJS> do this, just tell me what you would do' option first. I will
SJS> report to the list what I find. This would be with the hardlink we
SJS> normally ship, but I was wondering if anyone has any new versions
SJS> they recommend.
Well, do note that the hardlinker in quick-fedora-mirror tries to use the file lists to make things faster. There is an 'ignore timestamps' mode intended for use on the mirror server. Then the primary optimization is the assumption that linked files will have identical filenames, which I don't think a general hardlink program will do.
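As an illustration of why the identical-filename assumption helps, here is a short Python sketch. It is not the quick-fedora-hardlink code: the function names are invented, the input is a plain list of paths rather than the real file lists, and each group is only compared against its first member to keep the example short.

```python
import filecmp
import os
from collections import defaultdict


def candidates_by_name(paths):
    """Group paths by (basename, size); only groups with more than one
    member can contain files worth comparing."""
    groups = defaultdict(list)
    for p in paths:
        groups[(os.path.basename(p), os.path.getsize(p))].append(p)
    return [g for g in groups.values() if len(g) > 1]


def linkable_pairs(paths):
    """Return (keep, duplicate) pairs that have the same name, size, and
    content but are not yet the same inode."""
    pairs = []
    for group in candidates_by_name(paths):
        first = group[0]
        for other in group[1:]:
            if os.path.samefile(first, other):
                continue  # already hardlinked
            # Name and size can match while content differs, so compare bytes.
            if filecmp.cmp(first, other, shallow=False):
                pairs.append((first, other))
    return pairs
```

Grouping on the name first means the expensive byte-by-byte comparison only happens inside small groups, instead of across every pair of same-sized files in the tree.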
- J<
On 7/29/2019 11:53 AM, Jason L Tibbitts III wrote:
"SJS" == Stephen John Smoogen smooge@gmail.com writes:
SJS> So I am going to look at using the hardlink command with a 'do not
SJS> do this, just tell me what you would do' option first. I will
SJS> report to the list what I find. This would be with the hardlink we
SJS> normally ship, but I was wondering if anyone has any new versions
SJS> they recommend.
Well, do note that the hardlinker in quick-fedora-mirror tries to use the file lists to make things faster. There is an 'ignore timestamps' mode intended for use on the mirror server. Then the primary optimization is the assumption that linked files will have identical filenames, which I don't think a general hardlink program will do.
- J<
Would it be possible to convert some of the hardlinks to softlinks at the generating source? That seems like it would help make this process much easier for rsync users overall.
This would be especially helpful for the few entire directories... Not sure if it's still the case, but IIRC rawhide/Everything/source/tree/ and rawhide/Everything/SRPMS/ at one point seemed functionally identical.
-jc
"JC" == Japheth Cleaver cleaver@terabithia.org writes:
JC> Would it be possible to convert some of the hardlinks to softlinks
JC> at the generating source? That seems like it would help make this
JC> process much easier for rsync users overall.
I think in general that would cause more issues than it would solve. For example, which of fedora and fedora-secondary is the "master" tree that gets the noarch RPMS? If I want to mirror just one, I'd have to figure out how to convert those symlinks back.
That said, if entire directory trees are expected to have the same content in the long term, then it does make much more sense for one to be a simple symlink. But that wouldn't cover the archive, so I think its utility would be limited.
- J<
"ACR" == Anderson, Charles R cra@wpi.edu writes:
ACR> I'm not sure whether q-f-m didn't work quite right, whether there
ACR> was a server-side issue (update-archives?), or whether this was
ACR> just all due to timing of cp -l vs. update-archives vs. my q-f-m
ACR> runs, but somehow I received over 1 TB of downloads that were not
ACR> hardlinked.
Here's the issue, put simply:
Currently quick-fedora-mirror will copy hardlinks between rsync modules as hardlinks if, and only if, it sees the changes in the file lists for both modules in the same run.
If you hardlink and then don't update the file lists of both modules simultaneously, q-f-m will see a bunch of timestamp updates in one module on one run, and then in a subsequent run it will see a bunch of new files in a different module (or the same thing in the reverse order).
The file lists include a timestamp, which is the newer of mtime and ctime. Hardlinking will update the timestamp on both files, which is how q-f-m detects that hardlinks were made at all. It then tells rsync to transfer both linked files and (due to rsync's '-H' flag) they get transferred as links.
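The timestamp behaviour is easy to see on a scratch file. The Python sketch below is only an illustration of "newer of mtime and ctime"; `filelist_timestamp` is a hypothetical helper, not the code that generates the real file lists.

```python
import os


def filelist_timestamp(path):
    """Newer of mtime and ctime, as whole seconds (hypothetical helper)."""
    st = os.lstat(path)
    return int(max(st.st_mtime, st.st_ctime))


def link_and_report(existing, new_name):
    """Hardlink existing to new_name and return the timestamp before and
    after. Creating the link leaves mtime alone but updates ctime on the
    shared inode, so both names report the same, newer value."""
    before = filelist_timestamp(existing)
    os.link(existing, new_name)
    return before, filelist_timestamp(existing), filelist_timestamp(new_name)
```

Because the two names share one inode, an old package that is linked into a second module shows up with a fresh timestamp in both places, even though its content and mtime are unchanged.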
I have a strategy for working around this, but I haven't implemented it yet.
ACR> I'm going to look into running quick-fedora-hardlink locally
ACR> regularly.
It should always be safe to run. Because links can always be created or broken on the server in ways that might not be picked up by q-f-m, it's not a bad idea to run it occasionally. It tries to be "faster" by using the information in the file lists, but there's still so much stuff in there that it takes a while. (For more fun, there are even files with identical names, sizes, and timestamps but not identical content.)
- J<