On Sat, 11 Jul 2009 11:22:51 -0700
Konstantin Svist <fry.kun(a)gmail.com> wrote:
> Why isn't this the default?
Probably because it doesn't work too well.
The way it works is by pinging the IPs of all mirrors to see which one
has the smallest latency. In a perfect world, that would be the source
you want to use, but in reality all it tells you is which mirror's
firewall/etc. responds to ping fastest.
What happens is that the chosen server is either already lagging or
quickly becomes overloaded (possibly because many users are trying to
update from it), so the speed goes down the drain.
End result: I have 3 Mbit/s download on my DSL line, but using the
"fastest mirror" often gives me 50K (or even lower) download speed. Sometimes I
end up editing the timedhosts.txt file and faking a long timeout value
for the slow host.
I turned it off for exactly the reason you cite above. This isn't a
measure of the fastest mirror. I think it would be really great if yum
would keep track of the mirrors used and the actual download rate
obtained over time with each mirror. It would have to be a weighted
running average: the past total downloaded at the past average speed,
combined with the current download's size at its current speed. Then,
instead of always using the fastest mirror, take the top N (20?
user-selectable?) fastest mirrors and share the load among them. Set a
size threshold so that only files above a certain size (2 MB? 1 MB?
500 KB?) are taken from the fastest mirrors, and let any other files
come from anywhere. If a file is 20 KB in size, the speed of
the download isn't really relevant. This shares the load and ensures
that certain servers don't get hit by everyone, thus degrading their
performance and stressing them unfairly.
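
The weighted update and the top-N selection described above could be
sketched roughly as follows. This is only an illustration of the idea,
not a yum plugin: MirrorStats, pick_mirror, and the default values are
hypothetical names picked for this sketch, and nothing here touches
yum's plugin API. Rates are assumed to be in bytes per second.

```python
import random
from dataclasses import dataclass


@dataclass
class MirrorStats:
    """Hypothetical per-mirror record; not part of yum."""
    total_bytes: int = 0    # bytes downloaded from this mirror so far
    avg_rate: float = 0.0   # byte-weighted average rate, bytes/second

    def update(self, nbytes: int, rate: float) -> None:
        """Fold one finished download into the running average.

        The past total counts at the past average speed and the new
        download counts at its own speed, weighted by size.
        """
        new_total = self.total_bytes + nbytes
        if new_total > 0:
            self.avg_rate = (
                self.total_bytes * self.avg_rate + nbytes * rate
            ) / new_total
        self.total_bytes = new_total


def pick_mirror(stats, file_size, top_n=20, size_threshold=1_000_000,
                rng=random):
    """Choose a mirror URL for a file of file_size bytes.

    Small files may come from any mirror; large files are spread
    at random over the top_n mirrors with the best average rate.
    """
    mirrors = list(stats)
    if file_size < size_threshold:
        return rng.choice(mirrors)
    ranked = sorted(mirrors, key=lambda m: stats[m].avg_rate, reverse=True)
    return rng.choice(ranked[:top_n])
```

Mirrors with no history keep an average of 0.0 and so rank last for
large files; they still collect measurements through the small-file
path, which picks from every mirror.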
I've thought about writing a plugin to do this, but haven't gotten to it
yet. The fastest mirror plugin would be a good template.
Mostly, I have just decided to run yum manually at slack times when it
doesn't matter how fast it is, and it just does its thing.
On a related note, it would be nice if at some level of logging, yum
printed the URL of the mirror used for each download along with the download
rate. Then I could create my own crude version of the above algorithm
just by parsing the log output and creating my own database or file
with the information.