Hey folks!
Jeremy and I have been working on a proposal to migrate fedmsg from our current brokerless architecture to a broker-based architecture.
The overview and reasons for the migration are described on this page: https://fedmsg-migration-tools.readthedocs.io/en/latest/migration/overview.h...
Head there if you want the details. The plan has the following requirements:
* No flag day.
* Don't disrupt any services or applications.
* Don't break any services outside of Fedora's infrastructure relying on these messages.
The first step is to deploy a broker in Fedora to use. In order to avoid a flag day, bridges from AMQP to ZeroMQ and ZeroMQ to AMQP have been implemented and will be deployed. In order to validate that the bridges are functioning, a small service will be run during the transition period that connects to fedmsg and to the AMQP queues to compare messages.
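The comparison done by that validation service could look roughly like the sketch below. It is only an illustration, not the actual service: the function name `compare_streams` is hypothetical, and it assumes each message is a dict carrying fedmsg's `msg_id` and `msg` fields and that the bridges preserve `msg_id` unchanged.

```python
def compare_streams(zmq_messages, amqp_messages):
    """Compare two batches of messages and report the differences.

    Assumption: each message is a dict with a "msg_id" and a "msg" body,
    and the bridges keep "msg_id" identical on both sides.
    """
    zmq_by_id = {m["msg_id"]: m for m in zmq_messages}
    amqp_by_id = {m["msg_id"]: m for m in amqp_messages}
    shared = zmq_by_id.keys() & amqp_by_id.keys()
    return {
        # Seen on ZeroMQ but never on AMQP
        "missing_from_amqp": sorted(zmq_by_id.keys() - amqp_by_id.keys()),
        # Seen on AMQP but never on ZeroMQ
        "missing_from_zmq": sorted(amqp_by_id.keys() - zmq_by_id.keys()),
        # Seen on both sides, but the bodies differ
        "body_mismatch": sorted(
            i for i in shared if zmq_by_id[i]["msg"] != amqp_by_id[i]["msg"]
        ),
    }
```

A real service would have to work on unbounded streams and tolerate messages arriving at different times on each side, so it would compare within a time window rather than on two complete lists.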
After the bridges are running, applications are free to migrate. There are several options when migrating and each has advantages and disadvantages. We have written a new library called fedora-messaging that has the following features:
* A method to define message schemas and offer automatic validation of messages using those schemas.
* Boilerplate for typical publishers and consumers.
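To give a feel for the "schema plus automatic validation" idea, here is a minimal standard-library sketch. It is not the fedora-messaging API: the `Message` and `BuildCompleted` classes, the `body_schema` attribute, and the topic name are all hypothetical, and a simple field-to-type mapping stands in for a real schema language.

```python
import json


class Message:
    """Hypothetical base class: subclasses declare the fields they require."""

    topic = ""
    body_schema = {}  # field name -> expected Python type (stand-in for a schema)

    def __init__(self, body):
        self.body = body

    def validate(self):
        """Raise ValueError if the body does not match the declared schema."""
        for field, expected in self.body_schema.items():
            if field not in self.body:
                raise ValueError(f"missing field: {field}")
            if not isinstance(self.body[field], expected):
                raise ValueError(f"field {field!r} must be {expected.__name__}")

    def serialize(self):
        """Validate, then encode the body as UTF-8 JSON for publishing."""
        self.validate()
        return json.dumps(self.body).encode("utf-8")


class BuildCompleted(Message):
    """Example schema for an imaginary 'build completed' event."""

    topic = "example.build.completed"
    body_schema = {"package": str, "build_id": int}
```

With something like this, `BuildCompleted({"package": "kernel", "build_id": 42}).serialize()` returns the encoded bytes, while a body missing `build_id` raises `ValueError` before anything is sent.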
Head over to the document for a demo!
What do you think of this proposal? Any blind spots? Thanks!
Aurélien
Follow-up to this, Patrick had a few questions on IRC, which I've copied here and answered in case others are interested:
I was wondering whether you intend to continue cryptographically (x509) signing messages, or if you were planning to enforce sender per subject in another way?
Ultimately, no. RabbitMQ provides access controls[0] which I think will meet our needs.
However, the bridge from AMQP back to ZeroMQ will support signing the messages so consumers outside of Fedora Infrastructure are not broken. My expectation is we'll run this bridge long term as the way for external consumers to get events, even if we don't use ZeroMQ internally. ZeroMQ is a solid library and it feels like a good fit for the public access use-case. The bridge is ~10-20 lines of code so it's not a huge maintenance burden, either.
I would like to eventually drop the message signing completely and replace it with a ZeroMQ socket with zmq-curve[1] for authentication. That lets us stop using fedmsg completely (which is appealing because it depends on pyOpenSSL which is not long for this world).
is the plan to move the projects to the fedora-infra org in the long run, or was your plan to keep them under your personal account?
Definitely planning on moving it over if people like it.
[0] https://www.rabbitmq.com/access-control.html
[1] http://api.zeromq.org/4-2:zmq-curve
On Thu, May 24, 2018 at 10:16 AM, Aurelien Bompard abompard@fedoraproject.org wrote:
What do you think of this proposal? Any blind spots? Thanks!
This sounds like a great idea. I use ActiveMQ and RabbitMQ at work, and both are good choices for brokers. Sounds like you're going towards RabbitMQ.
- Ken
On Thu, May 24, 2018 at 11:16 AM, Aurelien Bompard < abompard@fedoraproject.org> wrote:
What do you think of this proposal? Any blind spots?
Not that I disagree, but please add/expand a section as to why AMQP (and RabbitMQ) was chosen over other messaging technologies.
Hi,
On 05/29/2018 09:31 AM, Jeffrey Ollie wrote:
On Thu, May 24, 2018 at 11:16 AM, Aurelien Bompard < abompard@fedoraproject.org> wrote:
What do you think of this proposal? Any blind spots?
Not that I disagree, but please add/expand a section as to why AMQP (and RabbitMQ) was chosen over other messaging technologies.
Thanks for the feedback, I've added a small section[0]. It is, perhaps, a little wishy-washy. I don't want to give the impression that we couldn't implement this with a different messaging protocol or a different broker. We definitely could. AMQP has shortcomings, to be sure, but the RabbitMQ extensions (mainly publisher acks) cover the most important ones in my opinion.
I did some research, but I'd definitely welcome feedback on protocols and brokers. I've read all or nearly all of the AMQP 0.9, ZeroMQ, and STOMP protocols, and I skimmed through the MQTT protocol, but I've not looked closely at the AMQP 1.0 protocol and I'm by no means a message protocol expert.
As for brokers, my only experience is with Qpid and RabbitMQ and that experience points me firmly at RabbitMQ. I don't know much about running ActiveMQ.
[0] https://fedmsg-migration-tools.readthedocs.io/en/latest/migration/overview.h...
Thanks, Jeremy
On Tue, May 29, 2018 at 12:51 PM, Jeremy Cline jeremy@jcline.org wrote:
Hi,
On 05/29/2018 09:31 AM, Jeffrey Ollie wrote:
On Thu, May 24, 2018 at 11:16 AM, Aurelien Bompard < abompard@fedoraproject.org> wrote:
What do you think of this proposal? Any blind spots?
Not that I disagree, but please add/expand a section as to why AMQP (and RabbitMQ) was chosen over other messaging technologies.
Thanks for the feedback, I've added a small section[0]. It is, perhaps, a little wishy-washy. I don't want to give the impression that we couldn't implement this with a different messaging protocol or a different broker. We definitely could. AMQP has shortcomings, to be sure, but the RabbitMQ extensions (mainly publisher acks) cover the most important ones in my opinion.
I did some research, but I'd definitely welcome feedback on protocols and brokers. I've read all or nearly all of the AMQP 0.9, ZeroMQ, and STOMP protocols, and I skimmed through the MQTT protocol, but I've not looked closely at the AMQP 1.0 protocol and I'm by no means a message protocol expert.
I think moving to a broker-based architecture is a great idea! Your document does a great job explaining the advantages it brings, and it could help increase the adoption of event-based workflows.
Regarding protocols, my preference would be for STOMP. It has very wide support, with libraries in pretty much every language, and being entirely text-based makes it *much* easier to debug than other protocols. The message delivery semantics are well-defined, and the protocol spec has the nice property of being readable in one sitting. Some brokers provide the ability to translate between protocols, so it may not be difficult to support more than one, but I would suggest STOMP as the reference protocol.
As far as brokers, I'll mention that we use ActiveMQ internally and it has been performing well for us. There may be some value to standardization there, and we may be able to provide some resources and best-practices on configuration and maintenance. I don't know much about RabbitMQ.
As for brokers, my only experience is with Qpid and RabbitMQ and that experience points me firmly at RabbitMQ. I don't know much about running ActiveMQ.
[0] https://fedmsg-migration-tools.readthedocs.io/en/latest/migration/overview.html#why-amqp-and-rabbitmq
Thanks, Jeremy
On 06/01/2018 05:45 PM, Michael Bonnet wrote:
On Tue, May 29, 2018 at 12:51 PM, Jeremy Cline jeremy@jcline.org wrote:
Hi,
On 05/29/2018 09:31 AM, Jeffrey Ollie wrote:
On Thu, May 24, 2018 at 11:16 AM, Aurelien Bompard < abompard@fedoraproject.org> wrote:
What do you think of this proposal? Any blind spots?
Not that I disagree, but please add/expand a section as to why AMQP (and RabbitMQ) was chosen over other messaging technologies.
Thanks for the feedback, I've added a small section[0]. It is, perhaps, a little wishy-washy. I don't want to give the impression that we couldn't implement this with a different messaging protocol or a different broker. We definitely could. AMQP has shortcomings, to be sure, but the RabbitMQ extensions (mainly publisher acks) cover the most important ones in my opinion.
I did some research, but I'd definitely welcome feedback on protocols and brokers. I've read all or nearly all of the AMQP 0.9, ZeroMQ, and STOMP protocols, and I skimmed through the MQTT protocol, but I've not looked closely at the AMQP 1.0 protocol and I'm by no means a message protocol expert.
I think moving to a broker-based architecture is a great idea! Your document does a great job explaining the advantages it brings, and it could help increase the adoption of event-based workflows.
Regarding protocols, my preference would be for STOMP. It has very wide support, with libraries in pretty much every language, and being entirely text-based makes it *much* easier to debug than other protocols. The message delivery semantics are well-defined, and the protocol spec has the nice property of being readable in one sitting. Some brokers provide the ability to translate between protocols, so it may not be difficult to support more than one, but I would suggest STOMP as the reference protocol.
I had a hard time justifying choosing STOMP over AMQP because most brokers just map the other protocol they focus on onto STOMP. It's true that the spec is short, but it leaves a lot up to individual implementations as far as I can tell (like how topic matching works, for example).
While debuggability is important, I'm not certain we'll ever need to dig into the wire protocol. In the unlikely event that we need to, I'd go about it the same way (e.g. tcpdump/wireshark) and Wireshark knows how to parse the AMQP protocol. Based on my super simple test (capture a single message being published) it seems very easy to inspect. Have you had a different experience here?
On Mon, Jun 4, 2018 at 8:30 AM, Jeremy Cline jeremy@jcline.org wrote:
I had a hard time justifying choosing STOMP over AMQP because most brokers just map the other protocol they focus on onto STOMP. It's true that the spec is short, but it leaves a lot up to individual implementations as far as I can tell (like how topic matching works, for example).
It's nice to give the flexibility to clients by exposing both. I haven't seen a problem with topic matching in my experience so far.
One thing I found with AMQP vs STOMP is that it's possible for AMQP clients to (accidentally) emit "binary" message bodies, and then ActiveMQ does not translate or expose these as plaintext JSON for STOMP clients. It just looks like an empty message body to STOMP clients, or possibly garbage. The solution was for clients to translate the messages to text/json prior to sending. (Of course if you never enable STOMP on your broker at all, maybe this won't be a problem :)
- Ken
It's nice to give the flexibility to clients by exposing both. I haven't seen a problem with topic matching in my experience so far.
While I like the idea of adding flexibility, it'll probably also make debugging and maintenance harder. We will keep the ZeroMQ gateway for external bus users, and we may also consider a STOMP gateway that does sanity checks on the fly if that becomes necessary.
One thing I found with AMQP vs STOMP is that it's possible for AMQP clients to (accidentally) emit "binary" message bodies, and then ActiveMQ does not translate or expose these as plaintext JSON for STOMP clients. It just looks like an empty message body to STOMP clients, or possibly garbage.
When using the fedora-messaging library, outgoing messages will be validated using a JSON schema and enforced to JSON/UTF-8. That should make it much harder to emit something broken. Received messages will also be validated, of course.
Aurélien
On 06/04/2018 06:59 PM, Ken Dreyer wrote:
On Mon, Jun 4, 2018 at 8:30 AM, Jeremy Cline jeremy@jcline.org wrote:
I had a hard time justifying choosing STOMP over AMQP because most brokers just map the other protocol they focus on onto STOMP. It's true that the spec is short, but it leaves a lot up to individual implementations as far as I can tell (like how topic matching works, for example).
It's nice to give the flexibility to clients by exposing both. I haven't seen a problem with topic matching in my experience so far.
I read up a bit on the RabbitMQ plugin for STOMP, and it sounds like it uses the same routing key rules (e.g. '*' and '#' work) as the AMQP queue bindings in Rabbit. I don't have a problem with them both being exposed in the Fedora deployment.
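For anyone unfamiliar with those routing key rules, here is a small sketch of how AMQP topic binding patterns behave: words are separated by dots, '*' matches exactly one word, and '#' matches zero or more words. The function name `topic_matches` is made up for illustration and this is not how the broker implements it internally; the topics in the usage note are only examples.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches an AMQP-style topic pattern.

    '*' stands for exactly one dot-separated word,
    '#' stands for zero or more words.
    """

    def match(p, k):
        if not p:
            # Pattern exhausted: match only if the key is exhausted too
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # Try letting '#' absorb 0, 1, 2, ... of the remaining words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))
```

So a binding of `org.fedoraproject.*.bodhi.#` would match `org.fedoraproject.prod.bodhi.update.request.testing`, while `org.*` would not match `org.fedoraproject.prod` because '*' only covers a single word.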
I don't mind the fedora-messaging library using STOMP either, but people will still need to understand the AMQP semantics because we're relying on using two different exchanges to track what messages came from ZeroMQ vs AMQP, and Rabbit maps all the STOMP interaction onto AMQP concepts anyway.
I would like to make our work easily usable by Red Hat internal infrastructure since that can only be a beneficial relationship for both Fedora and Red Hat, but I don't have a good sense of what they're doing. Mike, if your suggestion is driven by a desire to make this a useful tool internally, please let me know. My current impression is that the client and the protocol it uses are of minimal interest. I imagine the schema would be of interest, though. Am I wrong here?
One thing I found with AMQP vs STOMP is that it's possible for AMQP clients to (accidentally) emit "binary" message bodies, and then ActiveMQ does not translate or expose these as plaintext JSON for STOMP clients. It just looks like an empty message body to STOMP clients, or possibly garbage. The solution was for clients to translate the messages to text/json prior to sending. (Of course if you never enable STOMP on your broker at all, maybe this won't be a problem :)
The fedora-messaging publish/subscribe API (which just wraps an AMQP client) requires that the message matches a jsonschema, that it is UTF-8 encoded, and that the content type and encoding are properly set. If some publisher misbehaves in this way, clients using fedora-messaging will reject the message.
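As a rough illustration of the consumer-side checks described above (not the library's actual code), a receiver could refuse anything that is not declared and encoded as UTF-8 JSON. The function name `decode_body` and its parameters are hypothetical; the `application/json` content type and `utf-8` encoding values are assumptions about what a well-behaved publisher would set.

```python
import json


def decode_body(body, content_type, content_encoding):
    """Decode a received message body, or raise ValueError to reject it.

    Assumption: well-behaved publishers set content_type to
    "application/json" and content_encoding to "utf-8".
    """
    if content_type != "application/json":
        raise ValueError(f"unexpected content type: {content_type!r}")
    if (content_encoding or "").lower() not in ("utf-8", "utf8"):
        raise ValueError(f"unexpected content encoding: {content_encoding!r}")
    try:
        text = body.decode("utf-8")
    except UnicodeDecodeError as exc:
        # A "binary" body, as in the ActiveMQ case described earlier
        raise ValueError("body is not valid UTF-8") from exc
    # json.JSONDecodeError is a ValueError subclass, so malformed JSON
    # is rejected with the same exception type
    return json.loads(text)
```

A body such as `b'{"a": 1}'` with the expected properties decodes to a dict, while a binary payload or a wrong content type is rejected before any schema validation would run.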
Thanks for sharing your experience with AMQP and STOMP, it's very helpful!