There will be an outage starting at 2017-08-04 21:00 UTC, which will last
approximately 15 hours.
To convert UTC to your local time, run:
date -d '2017-08-04 21:00 UTC'
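For example, on a machine in the US Eastern timezone this would print
something like (exact output depends on your locale and timezone):
Fri Aug  4 17:00:00 EDT 2017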
Reason for outage: an important HyperKitty database schema change.
Affected services:
* HyperKitty, the mailing-list archives and reading interface
* Postorius, the mailing-list administration interface
Mailman itself will not be affected: emails sent to the lists will
continue to be delivered during the outage, but they will not be
archived at that time (the archiver will be offline). Emails delivered
during the outage will be added to the archives once the archiver is
back online.
Ticket Link: https://pagure.io/fedora-infrastructure/issue/6184
Please join #fedora-admin on irc.freenode.net or add comments to the
ticket for this outage linked above.
= Introduction =
We will use this document over the week before the meeting to gather
status, information, discussion items, and so forth, then use it in the
IRC meeting to transfer information to the meetbot logs. Please either
email me directly or add what is needed to the gobby session.
= Meeting start stuff =
#startmeeting Infrastructure (2017-07-27)
#chair smooge relrod nirik abadger1999 dgilmore threebean pingou
= Let new people say hello =
#topic New folks introductions
= Status / information / Trivia / Announcements =
(We put things here we want others on the team to know, but don't need to discuss.)
(Please use #info <the thing> - your name)
#topic announcements and information
#info PHX2 Colo Trip coming up, Aug 14-18
#info Major outage planned for Aug14->18
#info FLOCK at Cape Cod Aug29->Sep01
#info Fedora F27 Rebuild
#info bodhi ?
#info pagure ?
#info mailman ?
#info other apps?
= Things we should discuss =
We use this section to bring up discussion topics: things we want to talk
about as a group and come up with some consensus or decision on, or just
brainstorm a problem or issue. If there are none of these, we skip this
section.
(Use #topic your discussion topic - your username)
#topic (2017-07-27) Service Level Expectations (SLE)
#info What are SLEs?
#info Why do we need them?
#info Who sets them?
#info How are they followed?
#info Where do they affect things?
#info When do we put them in place?
= Apprentice office hours =
#topic Apprentice Open office hours
Here we will discuss any apprentice questions, try to match up people
looking for things to do with available tasks, and cover progress,
testing, and anything like that.
= Learn about some application or setup in infrastructure =
(In this section, each week we get one person to talk about an application
or setup that we have: just going over what it is, how to contribute,
ideas for it, etc. Whoever would like to do this, just add the info in
this section. In the event we don't find someone to teach about something,
we skip this section and just move on to open floor.)
#topic Learn about: Nothing was put here this week
= Meeting end stuff =
#topic Open Floor
Stephen J Smoogen.
There is some effort right now to make Bodhi gate updates based on
Greenwave. We would like to test Bodhi and Greenwave together in
staging and ultimately use it in production soon.
However, there is some question about how to deploy Greenwave and
WaiverDB in our infrastructure. The Greenwave/WaiverDB authors had been
planning to deploy them onto OpenShift, and have already done the work
to make them work that way. However it seems that we don't have a
production-ready OpenShift just yet. The question is: should the
Greenwave authors wait for OpenShift to be deployed in our
infrastructure, or should they go ahead and plan for a "traditional"
deployment?
They would prefer to use OpenShift since they have already put the
effort into being compatible with it, but they would also like to know
as soon as possible if that isn't going to be a reality in time so they
can start the effort to write playbooks and set up VMs for a
traditional deployment. Thus, it would help them to know soon if they
will not be able to use OpenShift.
Now that we are *done* with f25, it has been requested
that we update the daily bodhi push to update the release ref
so that anyone choosing to stay on f25 for now will get updates.
Signed-off-by: Dusty Mabe <dusty(a)dustymabe.com>
roles/bodhi2/backend/templates/atomic-config.py.j2 | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/roles/bodhi2/backend/templates/atomic-config.py.j2 b/roles/bodhi2/backend/templates/atomic-config.py.j2
index b998bde..d75269c 100644
@@ -71,7 +71,7 @@ config = dict(
- 'ref': 'fedora-atomic/25/x86_64/updates/docker-host',
+ 'ref': 'fedora-atomic/25/x86_64/docker-host',
'repos': ['fedora-25', 'fedora-25-updates', 'updates'],
I'd like to ask for one of three actions for Bodhi's staging database:
0) Go back to a "normal" postgres database.
1) Someone (who is not me, due to Fedora 27 time constraints ☺) makes
Bodhi and its staging sync playbook BDR compliant.
2) Think of other options?
I've been trying to deploy a Bodhi 2.9.0 beta to staging for a week,
starting on Monday the 10th. The staging BDR sync was broken, and due
to how busy everything was given the Fedora 26 release, it was
understandably a "back burner" item. To be clear, I'm not criticizing
the response time of anyone here, and I know it was a crazy week.
However, I want to convey the frustration one might feel from being
stuck with a broken database for a time with no "self-service" way to
get yourself unstuck.
Thanks to Patrick, that was fixed today, but then I ran into other
problems. Patrick helped me solve a lot of those too (thanks!), but
there is still a difficult one to resolve: the staging sync script (the
one that brings production data over to staging) is a manual SQL
script, and is difficult to get right. It's a lot of DROP TABLE and
DROP TYPE statements, and it turned out that it was missing some items.
When I attempted to expand the set of items to be more complete, it got
into a state that was difficult to recover from with an error message
that was not very helpful ("cache lookup failed for relation 7418164").
Very likely there is some relationship that needs to be dropped first,
but psql isn't giving me very useful information to determine what that
relationship is ☹
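For reference, the statements involved look roughly like the sketch
below (with made-up object names; the real script targets Bodhi's
actual tables and types). Postgres's CASCADE option drops dependent
objects along with the named one, which may be what that missing
relationship needs:

  psql bodhi2 <<'EOF'
  DROP TABLE IF EXISTS comments CASCADE;
  DROP TABLE IF EXISTS updates CASCADE;
  DROP TYPE IF EXISTS update_status CASCADE;
  EOF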
Before Bodhi was using BDR in stage, the delete/recreate part of the
sync script was simple: "dropdb bodhi2 && createdb -O bodhi2 bodhi2".
It was reliable, and it was guaranteed to get the DB into the same
state as production (which is very good for testing migrations). Now
that Bodhi is using BDR, we cannot use dropdb, because some database
objects need to persist: they have to be created outside of BDR by an
admin (I actually can't remember the details of exactly which objects
are affected, but I remember needing to account for this when Bodhi was
switched to BDR). This is why we have the manual DROP TABLE/TYPE
script. And with this script, it's hard to say for sure that the
staging db is in the same state as production before running the
migrations. This means that running the migrations on staging might not
give me the same experience as I would get on production, which makes
stage a little less useful than it would be otherwise. Thus, even if we
get this script working for now, it'll also need maintenance to keep it
working in the future.
Side question: Is there a better way we could sync the production
database to staging than these DROP TABLE/TYPE statements followed by
importing the production SQL? If so, that might help me a lot.
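(One hedged possibility, sketched with made-up hostnames and assuming
direct access to both database servers: dump production in custom
format and let pg_restore handle the dropping:

  # dump production (custom format keeps dependencies ordered)
  pg_dump -Fc -h db01.prod.example -U bodhi2 bodhi2 > bodhi2-prod.dump
  # restore into staging, dropping existing objects first
  pg_restore --clean --if-exists -h db01.stg.example -U bodhi2 -d bodhi2 bodhi2-prod.dump

Though note that --clean still issues DROP statements internally, so it
may hit the same BDR restrictions as the manual script.)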
In addition to the above, Bodhi itself doesn't work with BDR. It has
a number of tables that don't have primary keys (and those tables don't
necessarily have natural primary keys either as most of them are
through tables and don't otherwise need PKs), and there are also some
warnings that should be studied. Also, Bodhi's code assumes an
ACID-compliant database, and BDR removes ACID guarantees in some
circumstances. Bodhi's code will need to be studied to look for queries
where this might matter (it's possible there aren't any - we just need
to make sure and that takes time). We had talked about the possibility
of using a "distributed transaction sync" (was that what it was
called?) to make sure all nodes commit before the client is told
"success" on a commit, but upon further reading the docs on that
feature I'm not confident that's what the feature does, mostly due to
the docs being very thin about it (iirc, there's only a sentence or two
written about this). I think we need to do some testing to ensure that
is what it means before assuming it does, which, again, takes time.
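(For the primary-key part, a minimal sketch of the fix, with an
illustrative table name standing in for whichever through tables lack
PKs, would be something like:

  # add a surrogate primary key so each row has a replicable identity
  psql bodhi2 -c 'ALTER TABLE update_bug_table ADD COLUMN id SERIAL PRIMARY KEY;'

plus a matching Alembic migration in Bodhi itself.)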
These things can be fixed, but I need to be focusing on Fedora 27 goals
right now. Until Bodhi is compliant, Bodhi's staging deployment is less
useful to me because I am unable to log in to it when BDR is enabled,
which makes it hard for me to test a lot of Bodhi's functionality in
staging. And as I detailed above, I also cannot easily sync data from
production. (Patrick kindly temporarily turned off BDR for me on
staging, so I should be able to log in right now. Thanks!)
I would like to move Bodhi's staging database back to a normal Postgres
database until I have some time to make it BDR compliant. Making it BDR
compliant is not simply about getting it to have primary keys - we need
to make sure it does safe queries and/or research the distributed
transaction sync, and we need to make that SQL script that drops the
database tables work and drop everything (since I learned today that I
was missing quite a few things). I think this last part could take a
lot of time, so it's not truly an easy fix.
Another option is for someone other than myself to make Bodhi and the
staging sync playbook BDR compliant. I personally don't have time to
focus on making Bodhi BDR compliant right now, but if someone else has
the time and expertise to focus on that I would welcome the help.
Of course, I'm also open to other ideas if you have them.
I hope my tone was perceived as positive and friendly in this e-mail. I
know this is a contentious issue and I'm not trying to stir the pot or
ruffle any feathers. I am also not opposed to the pursuit of BDR, and I
think I understand the motivation of the systems team for wanting to
use it. Please assume positive intent from me if anything I wrote
disturbs you. My intention is simply to express a problem I am
experiencing and to ask for relief, or for help.
Thanks for reading ☺
I'd meant to raise this question last week but it turned out several
folks were out of pocket who'd probably want to discuss. One of the
aspects of continuous integration that impacts my team is the
storage requirement. How much storage is required for keeping test
results, composed trees and ostrees, and other artifacts? What is
their retention policy?
A policy of "keep everything ever made, forever" clearly isn't
scalable. We don't do that in the non-CI realm either, e.g. with
scratch builds. I do think that we must retain everything we
officially ship, that's well understood. But atop that, anything we
keep costs storage, and over time this storage costs money. So we
need to draw some reasonable line that balances thrift and service.
The second question is probably a good one to start with, so we can
then answer the first. We need to answer the retention question for
some combination of:
1. candidate builds that fail a CI pipeline
2. candidate builds that pass a CI pipeline
3. CI composed testables
* a tree, ISO, AMI, or other image that's a shippable unit
* an ostree change, which is more like a delta (AIUI)
4. CI generated logs
5. ...other stuff I may be forgetting
My general thoughts are that these things are kept forever:
* (2), but only if that build is promoted as an update or as part of a release
* (3), but only if the output is shipped to users
* (4), but only if corresponding to an item in (2) or (3)
Outside that, artifacts and logs are kept only for a reasonable amount
of troubleshooting time. Say 30 days, but I'm not too worried about
the actual time period. It could be adjusted based on factors we have
yet to encounter.
B. Storage - How much?
To get an idea of what this might look like, I think we might make
estimates based on:
* the number of builds currently happening per day
* how many of these builds are within the definition for an officially
shipped thing (like Atomic Host, Workstation, Server, etc.)
* The average size of the sources + binaries, summed out over the ways
we deliver them (SRPM + RPM, ostree binary, binary in another
image), and multiplied out by arches
* Then sum this out over the length of a Fedora release
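As a purely illustrative back-of-the-envelope run of those numbers
(every figure is a placeholder, not a measurement): 10 promoted builds
per day × ~100 MB per build, summed across delivery formats and arches,
is about 1 GB/day, or roughly 400 GB over a ~13-month release lifetime.
That at least lands in the range discussed below.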
This is the part I think will need information from the rel-eng and CI
contributors, working together. My assumption is there are gaping
holes in this concept, so don't take this as a full-on proposal.
Rather, I'm looking for folks to help harden the concepts and fill in
the missing pieces. I don't think we need a measurement down to the
single GB; a broad estimate in 100s of GB (or even at the 1 TB order
of magnitude) is likely good enough.
I'm setting the follow-up to infrastructure(a)lists.fedoraproject.org,
since that team has the most information about our existing storage.
Paul W. Frields
Good Morning Everyone,
As there were people on IRC asking for it, I cut a new release of
pagure-dist-git: 0.4.
Here is its changelog:
* Mon Jul 24 2017 Pierre-Yves Chibon <pingou(a)pingoured.fr> - 0.4-1
- Update to 0.4
- Adjust the generation of the configuration with the change made to groups
(requires pagure >= 3.2)
- Ship a custom repo_info template with information specific for dist-git
This is the outcome: http://pkgs.stg.fedoraproject.org/pagure/rpms/fedocal
Good Morning Everyone,
I just cut two pagure releases: 3.3 and 3.3.1 (because, as always, I
can't make a release without some mistake slipping in).
Here are their changelogs:
* Mon Jul 24 2017 Pierre-Yves Chibon <pingou(a)pingoured.fr> - 3.3.1-1
- Update to 3.3.1
- Fix typo in the alembic migration present in 3.3
* Mon Jul 24 2017 Pierre-Yves Chibon <pingou(a)pingoured.fr> - 3.3-1
- [SECURITY FIX] block private repo (read) access via ssh due to a bug on how we
generated the gitolite config - CVE-2017-1002151 (Stefan Bühler)
- Add the date_modified to projects (Clement Verna)
3.3 is already running in stg and prod, but I will update them to 3.3.1
anyway to pick up the fix for the alembic migration typo.
Good Morning Everyone,
I just cut a new release of pagure: 3.2.
Here is its changelog:
* Fri Jul 14 2017 Pierre-Yves Chibon <pingou(a)pingoured.fr> - 3.2-1
- Update to 3.2
- Use a decorator to check if a project has an issue tracker (Clement Verna)
- Optimize generating the gitolite configuration for group change
- Fix the issue_keys table for mysql
- Drop the load_from_disk script
- Fix next_url URL parameter on the login page not being used (Carlos Mogas da Silva)
- Support configuration where there are no docs folder and no tickets folder
- Show all the projects a group has access to
- Add pagination to the projects API (Matt Prahl)
- Simplify diff calculation (Carlos Mogas da Silva)
- Show the inline comment in the PR's comments by default (Clement Verna)
- Fix the URL in the API documentation for creating a new project (Matt Prahl)
It is currently happily running on stg.pagure.io and pagure.io.