Hi,
I stumbled over the CVS branch creation commit mails for a package called "stardict-dic" in my inbox and thought "hey, nice, someone packaged dictionaries for stardict". But then I took a closer look at the package and the review bug https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=231267 and got a bit worried.
The packager's plan, afaics, is to get all current and future dictionaries for other languages into this one just-reviewed and approved package stardict-dic (SRPM currently 84 MByte in size), from which multiple sub-packages get built (currently: stardict-dic-{en,ja,ru,zh_CN,zh_TW}).
The "one SRPM for all dicts" approach has major disadvantages IMHO. The SRPM will become really big once dictionaries for other languages become part of it. That might be acceptable, but for each added dict and each bug fixed in one of the dictionaries the whole SRPM gets rebuilt, and thus all the stardict-dic-{en,ja,ru,zh_CN,zh_TW} packages get recreated as well, which means a lot of load on mirrors and a lot for users to download and install (¹).
This seems very wrong to me; or am I overreacting here?
I'd say we should work towards a scheme similar to hunspell; e.g. one source package per language. Other opinions?
CU thl
(¹) -- Example:
- user foo installs stardict-dic-en-2.4.2-2.fc7.noarch.rpm
- maintainer adds a German dictionary to stardict-dic, increases the release by one and rebuilds
- user foo updates to stardict-dic-en-2.4.2-3.fc7.noarch.rpm
- maintainer adds a Spanish dictionary to stardict-dic, increases the release by one and rebuilds
- user foo updates to stardict-dic-en-2.4.2-4.fc7.noarch.rpm
- and so on
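To put rough numbers on the example above, here is a small sketch in Python. It is a toy model under my own assumptions, not real package data: the function name, the list of maintainer changes and the language codes are all made up for illustration. It only counts how many updates a user has to download under each scheme.

```python
# Hypothetical model: count the package updates one user has to download.
# None of the names or numbers below come from the real stardict-dic package.

def updates_for_user(installed, events, monolithic):
    """Return how many package updates the user must download.

    installed  -- set of language codes the user has installed
    events     -- list of language codes, one per maintainer change
    monolithic -- True if all dictionaries are built from one shared SRPM
    """
    count = 0
    for changed in events:
        if monolithic:
            # One shared SRPM: any change bumps the release of every
            # sub-package, so each installed sub-package gets an update.
            count += len(installed)
        elif changed in installed:
            # One SRPM per language: only the changed package is rebuilt.
            count += 1
    return count


# Assumed history: German, Spanish and French get added, then English is fixed.
events = ["de", "es", "fr", "en"]
installed = {"en"}

print(updates_for_user(installed, events, monolithic=True))   # 4
print(updates_for_user(installed, events, monolithic=False))  # 1
```

With one SRPM per language, the user of the English dictionary only downloads an update when the English dictionary itself changes.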
On Tue, 2007-06-26 at 09:33 +0200, Thorsten Leemhuis wrote:
This seems very wrong to me; or am I overreacting here?
No, I am inclined to agree with you.
I'd say we should work towards a scheme similar to hunspell; e.g. one source package per language. Other opinions?
One "srpm" would be appropriate if upstream ships one monolithic tarball, or always generates all subpackages' tarballs from the same master source tree (CVS, SVN, etc.) at the same time. GCC, for example, does this: they ship alternative "monolithic" and "split" tarballs.
Also consider: the moment the dictionaries' versions diverge, this monolithic srpm runs into problems with EVRs.
Ralf
On Tue, 26 June 2007 09:33, Thorsten Leemhuis wrote:
This seems very wrong to me; or am I overreacting here?
This is very wrong unless they are managed by a single project and released in a single archive. As a rule, packages with multiple source archives should be the exception, and multiple source archives released by different upstreams totally forbidden (IMHO). They just make solid upstream tracking a PITA, and confuse users.
Hi,
On 6/26/07, Thorsten Leemhuis fedora@leemhuis.info wrote:
This seems very wrong to me; or am I overreacting here?
You are right. Sorry for my misunderstanding. I accepted the submitter's request to have all language dictionaries in a single SPEC, but forgot what would happen when upstream releases a new version of any one dictionary. I have pinged the submitter on IRC and asked him to submit new package requests, one per language dictionary. Thanks for pointing out this issue.
Regards,
Parag.
On Tue, Jun 26, 2007 at 09:33:00AM +0200, Thorsten Leemhuis wrote:
The packager's plan, afaics, is to get all current and future dictionaries for other languages into this one just-reviewed and approved package stardict-dic (SRPM currently 84 MByte in size), from which multiple sub-packages get built (currently: stardict-dic-{en,ja,ru,zh_CN,zh_TW}).
One valid worry for a submitter in such a case might be having to redo all the review mechanics for every such small subpackage addition.
Maybe we should allow template reviews: e.g. the submitter lists 5 packages that are almost identical except for the explicit locale definition and gets a blanket-like approval for any further packages he copies off this template. When new languages make it to the stack he simply adds a CVS request for another package pointing to the template approval.
The same model could apply to font collections, themes etc.
On 26.06.2007 12:53, Axel Thimm wrote:
On Tue, Jun 26, 2007 at 09:33:00AM +0200, Thorsten Leemhuis wrote:
[...]
Maybe we should allow template reviews, e.g. the submitter lists 5 packages that are almost identical except for the explicit locale definition and gets a blanket-like approval for any further packages he copies off this template.
[...]
I don't think a "blanket-like approval" makes sense, as at least the md5sums and the license actually still should be checked by a reviewer.
Further: the reviewer of course can (and IMHO should) just do those two checks, run a "diff -u" of the approved spec file against the unapproved one, take a quick look at the differences and then approve the second package if everything looks sane. That can likely be done in less than 5 minutes and is not that much work. Sure, there is a small risk of missing a bug or problem, but that's life -- no reason to unlock the orbital laser or to wake up Spot's ninjas ( http://fedoraproject.org/wiki/TomCallaway/Ninjas ) :-)
BTW, that's afaics how some of the reviews were done already (hunspell for example). Maybe it's worth writing that down somewhere in the wiki -- but there is IMHO no need to add it to the guidelines directly.
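The quick review described above (check the md5sum, then diff the new spec against the approved one) could be sketched roughly like this in Python. This is only an illustration under assumptions: the two spec fragments, the package name stardict-dic-de and the helper names are made up, and a real review would read the actual spec files and tarballs from disk and still include the license check.

```python
import difflib
import hashlib


def spec_diff(approved_spec, new_spec):
    """Return a unified diff (like `diff -u`) of two spec file texts."""
    return list(difflib.unified_diff(
        approved_spec.splitlines(),
        new_spec.splitlines(),
        fromfile="approved.spec",
        tofile="new.spec",
        lineterm="",
    ))


def md5_matches(data, expected_hex):
    """Compare the md5sum of source archive bytes with the expected value."""
    return hashlib.md5(data).hexdigest() == expected_hex


# Made-up spec fragments, for illustration only.
approved = "Name: stardict-dic-en\nVersion: 2.4.2\nLicense: GPL\n"
candidate = "Name: stardict-dic-de\nVersion: 2.4.2\nLicense: GPL\n"

# Print only what changed between the approved and the new spec.
for line in spec_diff(approved, candidate):
    print(line)
```

If the diff shows nothing beyond the expected name, locale and source changes, and the md5sum matches what upstream publishes, the second package can be approved quickly.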
CU thl
On Tue, Jun 26, 2007 at 01:19:19PM +0200, Thorsten Leemhuis wrote:
I don't think a "blanket-like approval" makes sense, as at least the md5sums and the license actually still should be checked by a reviewer.
Further: the reviewer of course can (and IMHO should) just do those two checks, run a "diff -u" of the approved spec file against the unapproved one, take a quick look at the differences and then approve the second package if everything looks sane. That can likely be done in less than 5 minutes and is not that much work.
But you assume a zero-second lead time for finding a reviewer in the first place. If there is a new language package in two weeks or two months, do you really think a reviewer will immediately pay attention to this new package?
Also we don't recheck md5sums and license changes on each package update, because we have come to trust the maintainer. Why should we impose that burden here, when we are semantically only splitting the srpms? These are the reasons people prefer to keep stuff like that conglomerated in one big chunk instead of jumping through hoops for every new subpackage.
Anyway, this was just a suggestion for making maintainers' lives *easy* when we request fine-grained srpms, w/o compromising anything (you didn't have license checks and md5sum checks on "growing" srpms before either). The less bureaucratic/painful you make it, the more people will agree to it.
BTW, this is not an FPC issue to decide anyway: the guidelines would not change. It would be FESCo that would consider adding template reviews or not.
On 26.06.2007 17:42, Bill Nottingham wrote:
Thorsten Leemhuis (fedora@leemhuis.info) said:
I'd say we should work towards a scheme similar to hunspell; e.g. one source package per language. Other opinions?
I'd work towards only having one set of dictionaries first.
Well, sure, that's the real solution (and we could end world hunger while at it as well ;-) ).
But that's likely still years away (hunspell seems to be on the way to at least having one dict for OpenOffice.org, Firefox, Thunderbird and hopefully other apps in the end as well; but I use stardict for translating de <-> en -- that's something hunspell does not target afaics)
Cu thl
On Tuesday, 26 June 2007 at 18:07 +0200, Thorsten Leemhuis wrote:
But that's likely still years away (hunspell seems to be on the way to at least having one dict for OpenOffice.org, Firefox, Thunderbird and hopefully other apps in the end as well; but I use stardict for translating de <-> en -- that's something hunspell does not target afaics)
We still need to define how we handle specialist hunspell dicts. Most languages have a usual vocabulary plus specialist/unusual terms, and stuffing everything into the same dict actually decreases spellchecking efficiency. So you need a core dict and extensions that people install only if they use the associated terms.
packaging@lists.fedoraproject.org