On Tue, May 12, 2015 at 12:03:14PM +0800, d tbsky wrote:
Hi,
I use teamd to connect two Linux hosts. The connection uses three NICs
on each host, joined by direct Ethernet cables, with the roundrobin
runner. It ran fine for several months, until one day I found that the
DRBD (Distributed Replicated Block Device) connection over the link was
broken. I tested the network and found that about 1/3 of ping packets
were lost between the hosts. It looked as if one NIC had died, but
"ifconfig" showed all the NICs present with no TX/RX errors, and
"teamdctl team0 state" showed all three links up. These are production
machines, so I did not have much time to debug the situation. I powered
both hosts off and on again, and everything went back to normal.
I use roundrobin to get triple bandwidth, but if a problem with one NIC
can bring the whole link down, I need to reconsider it. I use the
default "ethtool" link watcher to detect link status. Would "arp_ping"
be better at detecting link failure?
I have heard that you should not use teamd/bonding to connect two
hosts directly, but I never found out why. Could someone give some
brief advice about it?
Connecting directly poses some challenges: some NICs have WoL
(Wake-on-LAN) enabled, so even when the other side is down, the link
stays up and you get a wrong status.
The "ethtool" link watcher relies on the NIC to report link
connectivity. If you are talking to another host and WoL is disabled,
etc., it might be good enough. But if you are talking to a switch, it
only covers the link between the host and the switch, while arp_ping
checks the full path between peers. The cost is sending and receiving
packets at a fixed interval.
You could also use arp_ping for a back-to-back connection, because it
checks not only that the link is functional but also that the peer
host can process the packet and reply.
fbl
My Linux version is Scientific Linux 7.0. The teamd version that comes
with the OS is 1.9 (teamd-1.9-15).
"teamdctl team0 config dump" shows the following:
{
    "device": "team0",
    "ports": {
        "enp3s0f0": {
            "link_watch": {
                "name": "ethtool"
            }
        },
        "enp3s0f1": {
            "link_watch": {
                "name": "ethtool"
            }
        },
        "enp5s0f1": {
            "link_watch": {
                "name": "ethtool"
            }
        }
    },
    "runner": {
        "name": "roundrobin"
    }
}
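
For comparison, here is a rough, untested sketch of how the same team
might be configured with the arp_ping link watcher discussed above. The
option names come from teamd.conf(5). The addresses 192.168.100.1 (this
host's team0 address) and 192.168.100.2 (the peer's) are hypothetical
placeholders, and the interval (in milliseconds) and missed_max values
are only illustrative, not recommendations. The link watcher is set
once at the top level so that it applies to every port:

```json
{
    "device": "team0",
    "runner": {
        "name": "roundrobin"
    },
    "link_watch": {
        "name": "arp_ping",
        "interval": 100,
        "missed_max": 30,
        "source_host": "192.168.100.1",
        "target_host": "192.168.100.2"
    },
    "ports": {
        "enp3s0f0": {},
        "enp3s0f1": {},
        "enp5s0f1": {}
    }
}
```

With settings like these, a port would be marked down after roughly
interval x missed_max of unanswered probes. One thing to verify in
testing: with a roundrobin runner on the peer, an ARP reply may leave
on a different port than the one the request arrived on, so it is worth
checking that each port still sees replies often enough to stay up.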
_______________________________________________
libteam mailing list
libteam(a)lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/libteam