The disabling of netfilter on bridges is not really "solving" this problem. The problem is that the hashing code needs fixing. Until that changes, whenever libvirtd plays with namespaces (as it does), we run the risk of falling over as we play with the size of the hashtables.
Jon.
On Sun, Jan 31, 2010 at 04:12:07AM -0500, Jon Masters wrote:
The disabling of netfilter on bridges is not really "solving" this problem. The problem is that the hashing code needs fixing. Until that changes, whenever libvirtd plays with namespaces (as it does), we run the risk of falling over as we play with the size of the hashtables.
Thanks for the heads up, Jon. I'll watch this and the internal thread for a fix.
regards, Kyle
On Mon, 2010-02-01 at 10:17 -0500, Kyle McMartin wrote:
Thanks for the heads up, Jon. I'll watch this and the internal thread for a fix.
Yeah. It's going to turn into a lot of cleaning up of conntrack IMO - the more I look at that code, the more I see problems waiting in the wings. Just try writing to the hashtable size via sysfs while the system is running if you wanna see even more boom! opportunities ;)
Jon.
On Mon, 2010-02-01 at 10:17 -0500, Kyle McMartin wrote:
Thanks for the heads up, Jon. I'll watch this and the internal thread for a fix.
Well, I sent a summary for why it happens. It happens because an IPv6 error (set via icmpv6_error) causes us to set the conntrack (ct) for an incoming skb to nf_conntrack_untracked (a catchall struct). We then try to free that like any other conntrack, back into the (now per-namespace) cache, but it's not a SL[U]B allocated struct, it's a static...boom.
The conntrack code should catch this and treat it as an error. It should also do per-namespace cache allocation and keep per-namespace hashtable metadata. I'd be *very* surprised if this isn't biting a lot more Fedora users.
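To make the failure mode above concrete, here is a minimal userspace sketch, not kernel code. It assumes a toy fixed-size pool in place of the per-namespace slab cache, and every name in it (`struct conn`, `struct conn_pool`, `pool_owns`, `pool_free`, `untracked`) is hypothetical. It only illustrates the idea that a free routine can check ownership and return an error when handed a statically allocated object, instead of corrupting the allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a conntrack entry; not the kernel's struct nf_conn. */
struct conn {
    int use;
};

#define POOL_SLOTS 4

/* Toy fixed-size pool standing in for a per-namespace slab cache (assumption). */
struct conn_pool {
    struct conn slots[POOL_SLOTS];
    bool in_use[POOL_SLOTS];
};

/* Statically allocated catch-all entry, analogous in role to nf_conntrack_untracked. */
static struct conn untracked;

/* Hand out the first free slot, or NULL if the pool is exhausted. */
static struct conn *pool_alloc(struct conn_pool *p)
{
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return &p->slots[i];
        }
    }
    return NULL;
}

/* True only if ct points exactly at one of this pool's slots. */
static bool pool_owns(const struct conn_pool *p, const struct conn *ct)
{
    uintptr_t addr = (uintptr_t)ct;
    uintptr_t lo = (uintptr_t)&p->slots[0];
    uintptr_t hi = (uintptr_t)&p->slots[POOL_SLOTS];

    return addr >= lo && addr < hi && (addr - lo) % sizeof(struct conn) == 0;
}

/*
 * Return 0 on success, or -1 if the object does not belong to this pool
 * or is not currently allocated. A static object such as `untracked`
 * is rejected here rather than being pushed back into the pool.
 */
static int pool_free(struct conn_pool *p, struct conn *ct)
{
    if (!pool_owns(p, ct))
        return -1;

    size_t i = (size_t)(ct - p->slots);
    if (!p->in_use[i])
        return -1;

    p->in_use[i] = false;
    return 0;
}
```

In this sketch, calling `pool_free(&pool, &untracked)` returns -1 and leaves the pool untouched, while freeing a pointer obtained from `pool_alloc` succeeds once and fails on a second attempt.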
Jon.
On Tue, 2010-02-02 at 12:03 -0500, Jon Masters wrote:
Well, I sent a summary for why it happens. It happens because an IPv6 error (set via icmpv6_error) causes us to set the conntrack (ct) for an incoming skb to nf_conntrack_untracked (a catchall struct). We then try to free that like any other conntrack, back into the (now per-namespace) cache, but it's not a SL[U]B allocated struct, it's a static...boom.
The conntrack code should catch this and treat it as an error. It should also do per-namespace cache allocation and keep per-namespace hashtable metadata. I'd be *very* surprised if this isn't biting a lot more Fedora users.
Oh, it is. I grabbed the bugs and updated them with a link to the upstream discussion. They're going to add per-namespace untracked ct's in addition to reworking the hashtable bits. I bet there's more aftermath of the "multiple namespace" support left to fix.
https://bugzilla.redhat.com/show_bug.cgi?id=533087
Jon.
On Tue, 2010-02-02 at 12:33 -0500, Jon Masters wrote:
Oh, it is. I grabbed the bugs and updated them with a link to the upstream discussion. They're going to add per-namespace untracked ct's in addition to reworking the hashtable bits. I bet there's more aftermath of the "multiple namespace" support left to fix.
I updated that bug with a simple hack that will refuse to free the untracked conntrack struct if we are so pathologically inclined. With that hack, my KVM box is stable and running fine.
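The patch attached to the bug is not reproduced in this thread, so the following is only a userspace sketch of the general shape such a guard might take. It assumes heap allocation via `calloc`/`free` in place of the slab cache, and the names `struct conn`, `untracked_conn`, `conn_alloc`, and `conn_free` are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a conntrack entry; not the kernel's struct nf_conn. */
struct conn {
    int refs;
};

/* Static catch-all entry that must never be handed to the allocator's free path. */
static struct conn untracked_conn = { .refs = 1 };

/* Allocate a tracked entry on the heap (stand-in for a slab allocation). */
static struct conn *conn_alloc(void)
{
    struct conn *ct = calloc(1, sizeof(*ct));

    if (ct)
        ct->refs = 1;
    return ct;
}

/*
 * Free an entry, but refuse if it is the static untracked one.
 * Returns true if memory was actually released, false if the
 * request was refused.
 */
static bool conn_free(struct conn *ct)
{
    if (ct == &untracked_conn)
        return false;   /* the guard: a static object is never freed */

    free(ct);
    return true;
}
```

A pointer comparison like this only papers over the symptom. It does not address why the untracked entry reached the free path in the first place, which is what the per-namespace rework mentioned in the thread is meant to fix.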
Patrick will do a per-namespace version of the untracked conntrack too, to join the per-namespace cachep, hashtable, and other bugs we've found as a result of this. Expect an upstream post tomorrow with the patches, which should be pulled into Fedora urgently.
Jon.
On Tue, 2010-02-02 at 14:26 -0500, Jon Masters wrote:
I updated that bug with a simple hack that will refuse to free the untracked conntrack struct if we are so pathologically inclined. With that hack, my KVM box is stable and running fine.
Patrick will do a per-namespace version of the untracked conntrack too, to join the per-namespace cachep, hashtable, and other bugs we've found as a result of this. Expect an upstream post tomorrow with the patches, which should be pulled into Fedora urgently.
http://lkml.org/lkml/2010/2/3/112
Jon.
kernel@lists.fedoraproject.org