I found a bug! Its impact is that the releng primary-arch 'compose' fedmsg messages were being considered invalid by our consuming services (ircbot, etc.), while the s390 compose messages were being let through.
First -- here's the patch, then the explanation:
diff --git a/inventory/host_vars/branched-composer.phx2.fedoraproject.org b/inventory/host_vars/branched-composer.phx2.fedoraproject.org
index bfe9b94..a5d3514 100644
--- a/inventory/host_vars/branched-composer.phx2.fedoraproject.org
+++ b/inventory/host_vars/branched-composer.phx2.fedoraproject.org
@@ -7,12 +7,3 @@ volgroup: /dev/vg_bvirthost08
 kojipkgs_url: kojipkgs.fedoraproject.org
 kojihub_url: koji.fedoraproject.org/kojihub
 kojihub_scheme: https
-
-# These are consumed by a task in roles/fedmsg/base/main.yml
-fedmsg_certs:
-- service: shell
-  owner: root
-  group: root
-- service: bodhi
-  owner: root
-  group: masher
diff --git a/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org b/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
index a0d17a6..9cb3409 100644
--- a/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
+++ b/inventory/host_vars/rawhide-composer.phx2.fedoraproject.org
@@ -6,12 +6,3 @@ volgroup: /dev/vg_bvirthost06
 kojipkgs_url: kojipkgs.fedoraproject.org
 kojihub_url: koji.fedoraproject.org/kojihub
 kojihub_scheme: https
-
-# These are consumed by a task in roles/fedmsg/base/main.yml
-fedmsg_certs:
-- service: shell
-  owner: root
-  group: root
-- service: bodhi
-  owner: root
-  group: masher
It is just *removing* lines from the host_vars files for rawhide-composer and branched-composer. But why?
First, those fedmsg_certs vars are already defined at the group_vars level here: https://infrastructure.fedoraproject.org/cgit/ansible.git/tree/inventory/gro...
That more complete and correct definition at the group level was being overridden by the less complete definitions at the host level (see the ansible var precedence rules: host_vars win over group_vars, and the whole list is replaced, not merged). Everything appeared to be working normally while this was the case because no hosts declared that they could send those fedmsg topics, so there was no explicit check for who could send them.
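To illustrate the override, here is a minimal Python sketch of how a host-level fedmsg_certs replaces the group-level one wholesale. It only models the precedence rule and is not how ansible itself is implemented. The can_send key, the topic names, and the group-level values are assumptions for illustration; the host-level entries are the ones removed in the patch.

```python
# Hypothetical group-level definition; the real group_vars file is not
# reproduced here, and the can_send key and topic names are assumptions.
group_vars = {
    "fedmsg_certs": [
        {"service": "shell", "owner": "root", "group": "sysadmin",
         "can_send": ["logger.log"]},
        {"service": "bodhi", "owner": "root", "group": "masher",
         "can_send": ["compose.rawhide.start", "compose.rawhide.complete"]},
    ]
}

# Host-level definition, as removed by the patch (no topics declared).
host_vars = {
    "fedmsg_certs": [
        {"service": "shell", "owner": "root", "group": "root"},
        {"service": "bodhi", "owner": "root", "group": "masher"},
    ]
}


def effective_vars(group, host):
    """Host-level keys replace group-level keys entirely; lists are not merged."""
    merged = dict(group)
    merged.update(host)
    return merged


def declared_topics(variables):
    """Collect every topic that the effective fedmsg_certs declare."""
    return sorted(
        topic
        for cert in variables.get("fedmsg_certs", [])
        for topic in cert.get("can_send", [])
    )


before_patch = declared_topics(effective_vars(group_vars, host_vars))
after_patch = declared_topics(effective_vars(group_vars, {}))
print(before_patch)  # host-level list wins, so nothing is declared
print(after_patch)   # group-level list prevails once host_vars is gone
```

With the host-level block in place the composer declares no topics at all; with it removed, the group-level declarations come through.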
It didn't matter until we added the s390 koji hub a week or so ago, which is allowed to broadcast those same topics. It had its fedmsg_certs correctly defined, and since it declared that it could send those topics -- and no other hosts made the same declarations -- the primary compose messages suddenly started being considered invalid (unauthorized).
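The failure mode can be sketched in Python as follows. This is a simplified model of the authz check described above, not fedmsg's actual code. The "service-hostname" signer naming, the can_send key, the "s390-hub" host and the topic name are assumptions for illustration.

```python
def build_policy(hosts):
    """Map each topic to the set of signers that declared they can send it."""
    policy = {}
    for hostname, certs in hosts.items():
        for cert in certs:
            signer = f"{cert['service']}-{hostname}"  # assumed naming scheme
            for topic in cert.get("can_send", []):
                policy.setdefault(topic, set()).add(signer)
    return policy


def is_authorized(policy, topic, signer):
    """A topic nobody declared gets no explicit check; otherwise the signer must be listed."""
    allowed = policy.get(topic)
    if allowed is None:
        return True
    return signer in allowed


TOPIC = "compose.rawhide.start"  # hypothetical topic name
PRIMARY = "bodhi-rawhide-composer"
S390 = "shell-s390-hub"          # hypothetical host and service

# 1. Before the s390 hub existed: nobody declares the topic.
hosts = {"rawhide-composer": [{"service": "bodhi"}]}
ok_before_s390 = is_authorized(build_policy(hosts), TOPIC, PRIMARY)

# 2. The s390 hub arrives with a correct declaration; the primary has none.
hosts["s390-hub"] = [{"service": "shell", "can_send": [TOPIC]}]
policy = build_policy(hosts)
ok_primary_broken = is_authorized(policy, TOPIC, PRIMARY)
ok_s390 = is_authorized(policy, TOPIC, S390)

# 3. After the fix: the primary picks up the group-level declaration.
hosts["rawhide-composer"] = [{"service": "bodhi", "can_send": [TOPIC]}]
ok_primary_fixed = is_authorized(build_policy(hosts), TOPIC, PRIMARY)

print(ok_before_s390, ok_primary_broken, ok_s390, ok_primary_fixed)
```

The point of the model is that a topic only becomes restricted once at least one host declares it, which is why the missing declarations went unnoticed until the s390 hub showed up.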
By removing these old crufty definitions at the host level and letting the correct definition at the group level prevail -- all those hosts should show up correctly in the fedmsg authz policy and things should start working again.
This will require a master playbook run on just the fedmsgdconfig tag to push it out, which makes this a "high touch" change to do during freeze, but I'm quite certain it is correct.
Can I get two +1s?
-Ralph
On Fri, 7 Aug 2015 14:26:06 -0400 Ralph Bean rbean@redhat.com wrote:

...snip...

> This will require a master playbook run on just the fedmsgdconfig tag to push out making this a "high touch" change to do during freeze, but I'm quite certain it is correct.

Also, might run a -t fedmsgmonitor after that... to make sure the monitoring socket stuff is right after the config restarts things.

> Can I get two +1s?
+1
kevin
Read and approved. Thank you for the long explanation.