Greetings.
At the recent flock conference Infrastructure workshop, we had a nice lively discussion on a number of items.
However, as is normal, we don't want to make decisions about things without being open and allowing input from everyone, including those that couldn't be at flock. So, I thought I would write up what we talked about and the consensus we came up with and ask for any more input from this list before we start implementing things. It's possible there's something we didn't think about or that needs more discussion, so do feel free to reply to this email with any parts you want to comment on.
* Containers in Fedora Infrastructure:
  * We want to look at moving things that make sense to containers.
  * A good initial candidate is the mirrorlist servers.
  * Would use the existing OSBS build system to build them.
  * Would run on proxies.
  * Would have haproxy list their socket as primary and old mirrorlists as secondary.
  * The container would have mirrorlist-server wsgi in it along with the pkl updated hourly.
  * Could allow us to spin up more as needed, but also should allow faster answers from proxies as they don't have to depend on or query over the vpn.
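As a rough illustration of the "wsgi plus pkl updated hourly" item, here is a minimal sketch of a WSGI app that reloads its pickle whenever the file on disk has been replaced. This is not the actual mirrorlist-server code: the `ReloadingPickle` class, the `make_app` helper, the `"mirrors"` key, and the reload-on-mtime-change approach are all assumptions made for the example.

```python
import os
import pickle


class ReloadingPickle:
    """Hypothetical helper: serve data from a pickle file and reload it
    whenever the file on disk has been replaced (e.g. by an hourly job)."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._data = None

    def get(self):
        # Compare the file's modification time with the one we last loaded.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, "rb") as fh:
                self._data = pickle.load(fh)
            self._mtime = mtime
        return self._data


def make_app(store):
    """Build a tiny WSGI app that answers from the currently loaded data."""

    def app(environ, start_response):
        data = store.get()
        # Assumed layout: a dict with a "mirrors" list of URLs.
        body = "\n".join(data.get("mirrors", [])).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    return app
```

With something like this, the hourly job only has to drop a new pkl into place, and the running container picks it up on the next request without a restart. The pickle should only ever come from our own update job, since unpickling untrusted data is unsafe.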
* Contributor resources in the fedorainfracloud
  * Once our cloud is upgraded, we can use ipsilon to let users login to the cloud and spin up instances for Fedora related needs.
  * Outgoing restrictions would be added on port 25 and the like
  * To start with users would only get 1 external floating ip.
  * Initial rollout would enable qa and packager groups, need to see if docs and i18n or other groups would have a use for it.
  * would note that we can terminate any instance for any reason.
  * Patrick would write some scripting to notify users after some time and terminate if we didn't get an answer back.
  * Long term instances should be moved to persistent infra playbooks.
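The notify-then-terminate idea could be sketched as a small decision function like the one below. This is not Patrick's script: the function name, the two thresholds, and the action names are all assumptions, since we did not settle what "some time" means.

```python
from datetime import datetime, timedelta

# Assumed thresholds; the thread does not say what "some time" means.
NOTIFY_AFTER = timedelta(days=30)
GRACE_PERIOD = timedelta(days=7)


def next_action(created, notified, acknowledged, now):
    """Decide what to do with one instance.

    created      -- datetime the instance was started
    notified     -- datetime the owner was last asked, or None
    acknowledged -- True if the owner answered the notification
    now          -- current datetime
    """
    if notified is None:
        # Not asked yet: ask once the instance is old enough.
        return "notify" if now - created >= NOTIFY_AFTER else "keep"
    if acknowledged:
        return "keep"
    # Asked but no answer: terminate once the grace period has passed.
    return "terminate" if now - notified >= GRACE_PERIOD else "wait"
```

A real script would probably also re-ask periodically after an acknowledgement; that is left out here to keep the sketch short.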
* Build setup and requirements for infrastructure applications.
  * Will get releng to set us up some side tags that we can build from src.rpm in.
  * all prod builds to be done in koji.
  * Up to maintainers what priority they place on getting into EPEL/Fedora. Encouraged for many reasons.
* FAS3 status
  * Was running in staging, but we disabled for now until we can finish a security audit.
  * Need to get python-fedora changes lined up and ready/pushed out.
  * Need to get fas3 fas_client packaged and ready to go.
  * Need more testing in staging.
  * Hopefully move production over after f25 is out.
* Fedora Infrastructure support setup
  * Talked about on the list a fair bit.
  * Support can be determined by looking at the domain:
      fedoraproject.org - full 24x7 support, monitoring, uses RFR
      stg.fedoraproject.org - 8x5 support, monitoring
      fedoracommunity.org - some support, monitoring, uses simple RFR
      fedorainfracloud.org - unsupported, apps run by contributors
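To show how "look at the domain" could work in practice, here is a minimal sketch that maps a hostname to the support levels listed above. The function name and the longest-suffix-first matching are assumptions for illustration; the level descriptions are copied from the list.

```python
# Support levels as listed above, keyed by domain suffix.
SUPPORT_BY_DOMAIN = {
    "fedoraproject.org": "full 24x7 support, monitoring, uses RFR",
    "stg.fedoraproject.org": "8x5 support, monitoring",
    "fedoracommunity.org": "some support, monitoring, uses simple RFR",
    "fedorainfracloud.org": "unsupported, apps run by contributors",
}


def support_level(hostname):
    """Return the support level for a hostname, or None if no domain matches."""
    host = hostname.lower().rstrip(".")
    # Try longer suffixes first so stg.fedoraproject.org is not
    # mistaken for plain fedoraproject.org.
    for domain in sorted(SUPPORT_BY_DOMAIN, key=len, reverse=True):
        if host == domain or host.endswith("." + domain):
            return SUPPORT_BY_DOMAIN[domain]
    return None
```

The one subtlety is that staging lives under the production domain, so the more specific suffix has to be checked before the general one.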
* Fedora CA and cert infrastructure.
  * Current CA expires in 2018.
  * Plans being worked on now to back fas3 with freeipa so we could move to kerberos tickets for koji then
  * Need to figure out what would need to happen to sigul for that.
  * Wait and see pending freeipa/fas3 integration.
* koji alternative arch proposals (on devel list, fesco ticket)
  * Not too much infrastructure work here.
  * will need to increase storage for primary koji, but can regain from secondaries once their last releases go end of life.
That's all I had notes on from the workshop, but there may well have been other items. Please do chime in with them if you think of anything, or if you have any thoughts on the above.
kevin
On Mon, Aug 8, 2016 at 4:40 PM, Kevin Fenzi kevin@scrye.com wrote:
Greetings.
At the recent flock conference Infrastructure workshop, we had a nice lively discussion on a number of items.
Sorry I missed it, the first Flock/FUDCon in a long time that I haven't at least *planned* to attend :). Good news is that I'm in good health and one piece still :D
Containers in Fedora Infrastructure:
- We want to look at moving things that make sense to containers.
- A good initial candidate is the mirrorlist servers.
- Would use the existing OSBS build system to build them.
- Would run on proxies.
- Would have haproxy list their socket as primary and old mirrorlists as secondary.
- The container would have mirrorlist-server wsgi in it along with the pkl updated hourly.
- Could allow us to spin up more as needed, but also should allow faster answers from proxies as they don't have to depend on or query over the vpn.
Sounds like a decent proposal, but I'd like to challenge with the question "why are we doing this?" Is it just because "containers, new shiny things, yay!" or is there some real advantage that having this in containers brings us? How is lifecycle management of these things accomplished? What happens when a container host dies? Lots of questions to be answered here. Not that I necessarily think that it's a bad idea, but containers are disruptive to existing operational models, and we have to be certain that we're prepared for that.
Contributor resources in the fedorainfracloud
- Once our cloud is upgraded, we can use ipsilon to let users login to the cloud and spin up instances for Fedora related needs.
- Outgoing restrictions would be added on port 25 and the like
- To start with users would only get 1 external floating ip.
- Initial rollout would enable qa and packager groups, need to see if docs and i18n or other groups would have a use for it.
- would note that we can terminate any instance for any reason.
- Patrick would write some scripting to notify users after some time and terminate if we didn't get an answer back.
Even with all of these restrictions, I think that the potential for abuse exists. I'd like to see these instances (which I think that it's a good idea to have) be locked down tighter than Fort Knox :). I think that the only thing that they should be able to externally communicate with should be the rest of Infrastructure (koji and the like) just like anybody on the Internet communicates with those services. They should be no more trusted than the computer that I'm writing this on right now is, and perhaps even less so (if that's possible and makes sense).
- Long term instances should be moved to persistent infra playbooks.
Yep.
- Build setup and requirements for infrastructure applications.
- Will get releng to set us up some side tags that we can build from src.rpm in.
- all prod builds to be done in koji.
- Up to maintainers what priority they place on getting into EPEL/Fedora. Encouraged for many reasons.
Nothing earth-shattering here :). All good ideas :).
FAS3 status
- Was running in staging, but we disabled for now until we can finish a security audit.
- Need to get python-fedora changes lined up and ready/pushed out.
- Need to get fas3 fas_client packaged and ready to go.
- Need more testing in staging.
- Hopefully move production over after f25 is out.
Fedora Infrastructure support setup
- Talked about on the list a fair bit.
- Support can be determined by looking at the domain:
    fedoraproject.org - full 24x7 support, monitoring, uses RFR
    stg.fedoraproject.org - 8x5 support, monitoring
    fedoracommunity.org - some support, monitoring, uses simple RFR
    fedorainfracloud.org - unsupported, apps run by contributors
+1
- Fedora CA and cert infrastructure.
- Current CA expires in 2018.
- Plans being worked on now to back fas3 with freeipa so we could move to kerberos tickets for koji then
- Need to figure out what would need to happen to sigul for that.
- Wait and see pending freeipa/fas3 integration.
This sounds reasonable, but what happens if for some reason the integration doesn't happen or doesn't happen in a timely manner? I don't think that renewing the cert would be a huge deal, because the users of the certs generated by that CA are a well-known quantity (packagers and releng). The support burden of swapping out the CA cert I wouldn't think would be *that* bad. I'm not sure off the top of my head, how often do the user certs expire?
On Mon, 8 Aug 2016 21:41:16 -0400 Jon Stanley jonstanley@gmail.com wrote:
On Mon, Aug 8, 2016 at 4:40 PM, Kevin Fenzi kevin@scrye.com wrote:
Greetings.
At the recent flock conference Infrastructure workshop, we had a nice lively discussion on a number of items.
Sorry I missed it, the first Flock/FUDCon in a long time that I haven't at least *planned* to attend :). Good news is that I'm in good health and one piece still :D
Yeah, sorry you couldn't make it either. ;( There were a number of regulars who were quite missed. ;(
Containers in Fedora Infrastructure:
- We want to look at moving things that make sense to containers.
- A good initial candidate is the mirrorlist servers.
- Would use the existing OSBS build system to build them.
- Would run on proxies.
- Would have haproxy list their socket as primary and old mirrorlists as secondary.
- The container would have mirrorlist-server wsgi in it along with the pkl updated hourly.
- Could allow us to spin up more as needed, but also should allow faster answers from proxies as they don't have to depend on or query over the vpn.
Sounds like a decent proposal, but I'd like to challenge with the question "why are we doing this?" Is it just because "containers, new shiny things, yay!" or is there some real advantage that having this in containers brings us?
An excellent question. :) We could of course just run a mirrorlist-server instance on each of the proxies outside containers, but I think containers give us some good wins here:
* would allow us to spin up new ones in the cloud if we know load is going to be high.
* would keep logging and such in the container, so we don't need to untangle that from the rest of the things running there.
How is lifecycle management of these things accomplished? What happens when a container host dies? Lots of questions to be answered here. Not that I necessarily think that it's a bad idea, but containers are disruptive to existing operational models, and we have to be certain that we're prepared for that.
Sure, there will be lots of questions as we move forward on it. I'd guess at first lifecycle would be somewhat manual (controlled by ansible), and if a container host dies, then we remove it from dns just as we do now.
Contributor resources in the fedorainfracloud
- Once our cloud is upgraded, we can use ipsilon to let users
login to the cloud and spin up instances for Fedora related needs.
- Outgoing restrictions would be added on port 25 and the like
- To start with users would only get 1 external floating ip.
- Initial rollout would enable qa and packager groups, need to
see if docs and i18n or other groups would have a use for it.
- would note that we can terminate any instance for any reason.
- Patrick would write some scripting to notify users after some
time and terminate if we didn't get an answer back.
Even with all of these restrictions, I think that the potential for abuse exists. I'd like to see these instances (which I think that it's a good idea to have) be locked down tighter than Fort Knox :). I think that the only thing that they should be able to externally communicate with should be the rest of Infrastructure (koji and the like) just like anybody on the Internet communicates with those services. They should be no more trusted than the computer that I'm writing this on right now is, and perhaps even less so (if that's possible and makes sense).
Well, we are opening these to packager and qa to start with. This is hardly an untrusted group. Additionally it should be easy to blacklist folks who abuse things. I guess the amount we need to lock things down depends somewhat on the use cases people come up with.
...snip...
- Fedora CA and cert infrastructure.
- Current CA expires in 2018.
- Plans being worked on now to back fas3 with freeipa so we could move to kerberos tickets for koji then
- Need to figure out what would need to happen to sigul for that.
- Wait and see pending freeipa/fas3 integration.
This sounds reasonable, but what happens if for some reason the integration doesn't happen or doesn't happen in a timely manner? I don't think that renewing the cert would be a huge deal, because the users of the certs generated by that CA are a well-known quantity (packagers and releng). The support burden of swapping out the CA cert I wouldn't think would be *that* bad. I'm not sure off the top of my head, how often do the user certs expire?
It's every 6 months for user certs. Yeah, we could extend the CA expire time if needed.
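To put the timing in concrete terms: any user cert issued in the last six months before the CA expires would nominally outlive the CA, so that is roughly when a decision on extending would be needed. A minimal sketch of that check follows. The function names are made up for the example, "6 months" is approximated as 183 days, and the CA expiry date in the usage note is a placeholder, since only the year 2018 is known here.

```python
from datetime import date, timedelta

# Assumption: "6 months" is approximated as 183 days for this sketch.
USER_CERT_LIFETIME = timedelta(days=183)


def user_cert_expiry(issued):
    """Expiry date of a user cert issued on the given date."""
    return issued + USER_CERT_LIFETIME


def outlives_ca(issued, ca_expiry):
    """True if a user cert issued on `issued` would still be valid
    after the CA itself has expired."""
    return user_cert_expiry(issued) > ca_expiry
```

For example, with a placeholder CA expiry of `date(2018, 6, 1)`, `outlives_ca(date(2018, 1, 1), date(2018, 6, 1))` is true, while a cert issued on `date(2017, 6, 1)` expires well before the CA does.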
kevin
... snip ...
Contributor resources in the fedorainfracloud
- Once our cloud is upgraded, we can use ipsilon to let users
login to the cloud and spin up instances for Fedora related needs.
- Outgoing restrictions would be added on port 25 and the like
- To start with users would only get 1 external floating ip.
- Initial rollout would enable qa and packager groups, need to
see if docs and i18n or other groups would have a use for it.
- would note that we can terminate any instance for any reason.
- Patrick would write some scripting to notify users after some
time and terminate if we didn't get an answer back.
Even with all of these restrictions, I think that the potential for abuse exists. I'd like to see these instances (which I think that it's a good idea to have) be locked down tighter than Fort Knox :). I think that the only thing that they should be able to externally communicate with should be the rest of Infrastructure (koji and the like) just like anybody on the Internet communicates with those services. They should be no more trusted than the computer that I'm writing this on right now is, and perhaps even less so (if that's possible and makes sense).
Well, we are opening these to packager and qa to start with. This is hardly an untrusted group. Additionally it should be easy to blacklist folks who abuse things. I guess the amount we need to lock things down depends somewhat on the use cases people come up with.
Note that the cloud instances are totally untrusted by our internal infrastructure. As far as the other boxes are concerned, they're just the internet.
...snip...
- Fedora CA and cert infrastructure.
- Current CA expires in 2018.
- Plans being worked on now to back fas3 with freeipa so we could move to kerberos tickets for koji then
- Need to figure out what would need to happen to sigul for that.
- Wait and see pending freeipa/fas3 integration.
This sounds reasonable, but what happens if for some reason the integration doesn't happen or doesn't happen in a timely manner? I don't think that renewing the cert would be a huge deal, because the users of the certs generated by that CA are a well-known quantity (packagers and releng). The support burden of swapping out the CA cert I wouldn't think would be *that* bad. I'm not sure off the top of my head, how often do the user certs expire?
It's every 6 months for user certs. Yeah, we could extend the CA expire time if needed.
The integration code is now live in staging, and is fairly small and self-contained.
kevin
infrastructure mailing list infrastructure@lists.fedoraproject.org https://lists.fedoraproject.org/admin/lists/infrastructure@lists.fedoraproje...