On Sat, Jan 29, 2005 at 02:36:09AM -0500, seth vidal wrote:
> The exercise is to attempt a method in which you save the computation
> of md5 or sha1, as this is one of the time-consuming steps of
> createrepo.
> The saving in a 100k-package repository would be (100,000 - N) *
> Time(sum_calc), where N is the number of packages you *need* to
> generate sums for. A parameterized list of package names passed into
> createrepo would be sufficient to figure out what composes the N list.
> An external process, such as a Manifest list, would then be used to
> mitigate a set of packages through the entire build process. Apt uses
> an md5sum cache, but having fine-tuned control of the process would
> be more stable and directed. This is how much saving you'd get for #2.
Let me know when you've figured it out but as it stands I don't think
incrementally updating the metadata is very feasible.
How about having multiple repodatas: a base one plus small
incremental ones, with the incremental ones also containing package
cancellations? As a side effect this would reduce download
bandwidth and thus make clients/users happy too (not only repo
maintainers).
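
To make the idea concrete, here is a minimal sketch of how a client
could rebuild the current package set from a base repodata plus
increments. Everything in it is an assumption for illustration: the
dict/set layout and the name apply_increments are hypothetical and are
not createrepo's or yum's real format or API.

```python
# Hypothetical model: a repodata is a dict mapping package name to its
# metadata (here just a checksum string). An increment is a pair
# (added, cancelled): new or updated entries, and names to drop.

def apply_increments(base, increments):
    """Return the current package set; increments are applied oldest first."""
    merged = dict(base)  # copy so the base repodata is left untouched
    for added, cancelled in increments:
        for name in cancelled:
            merged.pop(name, None)  # a cancellation removes the package
        merged.update(added)        # additions and updates win over older data
    return merged


base = {"foo-1.0-1": "sum-a", "bar-2.0-1": "sum-b"}
increments = [
    ({"foo-1.1-1": "sum-c"}, {"foo-1.0-1"}),  # foo upgraded, old build cancelled
    ({"baz-0.9-1": "sum-d"}, set()),          # baz added, nothing cancelled
]
current = apply_increments(base, increments)
```

A client that already has the base only needs to download the small
increments, and checksums are only computed for the packages listed in
them.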
The base repodata and the incremental ones would be merged from time
to time, ideally following a binary scheme like the one used when
summing large data sets in statistics (for 100K packages you would
need at most 17 files, since 2^17 > 100,000).
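
One possible reading of the binary merge scheme, sketched under my own
assumptions: keep one slot per power of two, and whenever a new
increment lands in an occupied slot, merge the two and carry the result
into the next slot, like adding 1 to a binary counter. The name
add_increment and the list-of-lists layout are hypothetical, and
cancellations are left out to keep the sketch short.

```python
# Hypothetical sketch: levels[k] is either None or one merged repodata
# file holding 2**k single-package increments (modelled as a list).

def add_increment(levels, entries):
    """Insert one increment, merging equal-sized files like a binary carry."""
    carry = list(entries)
    k = 0
    while True:
        if k == len(levels):
            levels.append(None)      # open a new, larger slot
        if levels[k] is None:
            levels[k] = carry        # free slot: store the file here
            return levels
        carry = levels[k] + carry    # merge older file with the newer one
        levels[k] = None             # slot is emptied by the merge
        k += 1


levels = []
for n in range(5):                   # five single-package updates
    add_increment(levels, [f"pkg-{n}"])

sizes = [len(f) if f is not None else 0 for f in levels]

# 100,000 needs 17 binary digits, hence at most 17 slots.
slots_for_100k = (100_000).bit_length()
```

Each package is re-merged at most once per slot, so the total merge
work stays around N * log2(N) entries rather than rewriting the full
metadata on every update.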
--
Axel.Thimm at ATrpms.net