On Fri, 2013-12-13 at 05:38 +0000, "Jóhann B. Guðmundsson" wrote:
On Fri 13 Dec 2013 04:06, Adam Williamson wrote:
it was -1'ed at go/no-go meeting in about five seconds. No-one voted +1 blocker on it.
For the first, I was not present there since I arrived later than usual due to $dayjob, but otherwise I would have voted +1 on it (which would then have been just me). Also, most of the individuals present have been voting to push the release out the door to meet that arbitrary deadline we never make anyway, which renders this argument moot.
Well, it's not an arbitrary deadline. It's the release schedule. Fedora's supposed to be strongly tied to its schedule; we're not a feature-based release project. So, it inevitably has to be the case that we have to calibrate our quality standards to what it's practical to achieve, in terms of testing and fixing, approximately within that cycle. The way Fedora is set up, it is reasonable for us to slip, oh, say up to three weeks a cycle, to fix really serious problems. But we shouldn't be slipping almost every milestone for every release, which we are. We shouldn't be putting ourselves into positions, _repeatedly_, where we have the choice of fudging our quality standards or delaying releases for multiple months. When those things are happening, what it indicates is that there is a fundamental mismatch between the quality standards we're setting ourselves, the development goals we're setting ourselves, and the resources - in terms of overall tester/developer hours, i.e. a product of the number of devs and testers we have and the release cycle we chose - we have to achieve those things.
The last few cycles we've theoretically set our standards at a certain point, and then not really got close to living up to them. If we say we're going to block on every possible partition layout, we need to test at least some reasonable representative sample of all possible partition layouts by, at latest, Beta RC1. That didn't happen. If we say we're going to block on every single app installed by default for either of two desktops, we need to actually test every single app installed by default on either of those two desktops by Beta RC1, and ideally keep doing it with every relevant change pushed stable after that. That doesn't happen.
The current state is that we're setting a standard which we clearly don't have the QA resources to stand behind sufficiently within the timeframe the project is comfortable with spending on a release, and on freeze/test/stabilize cycles. Pete knows what's coming out of the three-product proposal, but somehow I don't see it making the freezes longer or the set of deliverables any smaller.
If we want to maintain the standards of quality we claimed to be setting for F18, F19 and F20 and actually live up to them, we would need to either drastically increase the resources we dedicate to QA and bugfixing, or reduce the development goals we set. It never seems like it's on the cards to reduce Fedora's pace of development: no-one seems to have the appetite for it. No-one wants to be the person who says 'no, that is a nice sounding feature but we don't have time for it. Put it in the next release.' It never happens.
So we're left with the resources. Red Hat does not have a dozen interns lying around the place we can feed into the Fedora QA hopper. We've certainly got excellent community testers, and some who've started contributing or contributing more in recent releases and helped hugely to make them less of a disaster than they could have been. But it's not like we're getting a dozen new volunteers who'll put in multiple hours per day on testing either. Extending the release cycle or lengthening freezes seems to be as unlikely to happen as reducing the development churn; look at how this thing with trying to make the F21 cycle longer is going with FESCo.
So, the way I see it, if we can't significantly reduce the pace of Fedora development, significantly increase the number of QA and developer people we have available, or lengthen the release cycle or freezes to give us the effect of more resources dedicated to stabilization with the same number of people, we either reduce the expectations we say we have, or we keep on basically lying about what our quality expectations are and then coming up with paper-thin excuses as to why not meeting them doesn't 'really' mean we have to block, while running around like headless chickens throwing builds against the wall one day before go/no-go to fix a bug we didn't know about until 1.1 days before go/no-go. I dunno about you, but the headless chicken act and the hero validation runs are wearing on me. Maybe if we do what I'm suggesting, and we do well for a while, we'll start feeling like we're on top of everything and we have the confidence to move in the direction of increasing the standards again. But right now, honestly, can you say we're in a position where we're able to realistically meet all the standards we are claiming to set for ourselves? I certainly can't.
I was the one who proposed the desktop criteria in the first place, and some of the pushback against them was based on the worry that we wouldn't have time to enforce them. At the time I said I'd stand behind them and make sure the testing got done. Well, for a few releases I think we did, and I remember back around F15, F16 and F17 I actually spent quite a bit of time _testing the desktops_; I didn't spend four months straight running anaconda for five hours a day. I also had time to talk to the desktop SIGs and ask them for help and just generally look after the desktop testing process. From F18 on, with the extra workload from anaconda being much more unstable (in the technical sense of 'changing a lot'), with the addition of ARM as a primary arch and cloud as a first-class deliverable and UEFI testing and more USB testing and all the other expansions that have crept into our workload, I really haven't been able to do that the way I did before, and no-one else seems to have stepped in and taken over. So now we have these desktop criteria we (QA) supposedly commit to 'policing', in the sense of actually running the tests and making sure things get fixed on a reasonable timeline, and I don't think that in practice that is actually _happening_.
Now a couple of people have suggested reducing the KDE package set for the DVD install. Like adjusting the criteria, that would also have the effect of reducing the testing workload, so it seems like a reasonable approach. If that's what we want to do instead, fine. But we need the KDE SIG to agree with that approach and commit to it and land the changes, like, _now_. Comfortably in advance of F21 TC1. And we need to be sure that we (QA) can provide the resources to actually do the testing required to properly back the criteria: we need to be actually running all the desktop tests, frequently, during Alpha and Beta, and working with the desktop and KDE SIGs to fix the bugs.
The fact is it should not matter whether I install from a live image or from the DVD: the end result should be the same.
That might be your opinion of what's correct. And hey, maybe everyone agrees with you. But the practical fact of the matter is that this is not, at all, how Fedora as it exists right now works. For both GNOME and KDE, if you install from the DVD you get significantly more packages than if you install from the live image. You do not get the same result. If we all agree that you should, and we go away and _actually implement that_ within the next, say, month, then great: that helps solve all our problems. Does everyone want to do that?