Mon, Mar 14, 2016 at 12:04:31PM IST, jiri(a)resnulli.us wrote:
Mon, Mar 14, 2016 at 10:57:26AM CET, idosch(a)mellanox.com wrote:
>Mon, Mar 14, 2016 at 11:48:10AM IST, jiri(a)mellanox.com wrote:
>>Mon, Mar 14, 2016 at 09:53:24AM CET, idosch(a)mellanox.com wrote:
>>>We are seeing failures in our nightly regression where ping fails due to
>>>the rate being lower than the limit. The machine logs suggest the first
>>>few packets are dropped, but the rest go through just fine.
>>
>>Wouldn't it be better just to lower the rate in this case?
>>A 15 sec wait in case of failover does not look correct to me.
>
>Why not? The whole point is to check that the links can adjust and they
>do, but it takes some time.
>
>Anyway, these are the rates that caused failures in last night's run:
>57
>63
>76
>
>We can set it to 50 and hope for the best.
I think we should do that.

OK. I'll change that later today and do some testing, but we should
stick to this patch if 50 doesn't work for us. We need to have reliable
tests, so that when we don't see 100% at the end we know it's not one of
those arbitrary failover failures.
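
For reference, a minimal sketch of the arithmetic behind the proposed limit. It assumes the three figures quoted below are ping success rates in percent and that a run fails when its rate is below the limit. The helper names are hypothetical and are not part of LNST.

```python
# Rates from last night's failing runs, as quoted in this thread
# (assumed to be ping success rates in percent).
observed_rates = [57, 63, 76]


def passes(rate, limit):
    """Return True when a measured rate meets the limit (hypothetical helper)."""
    return rate >= limit


def failing_rates(rates, limit):
    """Return the rates that would still fail under the given limit."""
    return [r for r in rates if not passes(r, limit)]


def margin(rates, limit):
    """Return how far the worst observed rate sits above the limit."""
    return min(rates) - limit


# Rates that would still fail with the limit lowered to 50.
print(failing_rates(observed_rates, 50))
# Headroom between the worst observed run and a limit of 50.
print(margin(observed_rates, 50))
```

With a limit of 50, none of these three runs would have failed, but the worst one clears it by only 7 points, which is why the sleep patch stays as the fallback.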
>>>>
>>>>Solve that by adding a sleep() between the time we configure the links
>>>>and testing ping, so that the links are stable during the test.
>>>>
>>>>Signed-off-by: Ido Schimmel <idosch(a)mellanox.com>
>>>>---
>>>> recipes/switchdev/l2-005-bridge_bond_failover.py | 4 ++++
>>>> recipes/switchdev/l2-007-bridge_team_failover.py | 4 ++++
>>>> 2 files changed, 8 insertions(+)
>>>>
>>>>diff --git a/recipes/switchdev/l2-005-bridge_bond_failover.py b/recipes/switchdev/l2-005-bridge_bond_failover.py
>>>>index c7b4eca..c0b8406 100644
>>>>--- a/recipes/switchdev/l2-005-bridge_bond_failover.py
>>>>+++ b/recipes/switchdev/l2-005-bridge_bond_failover.py
>>>>@@ -33,21 +33,25 @@ def do_task(ctl, hosts, ifaces, aliases):
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_up()
>>>>     sw_if2.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if2.set_link_up()
>>>>     sw_if1.set_link_down()
>>>>     sw_if3.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_up()
>>>>     sw_if3.set_link_up()
>>>>     sw_if2.set_link_down()
>>>>     sw_if4.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>> do_task(ctl, [ctl.get_host("machine1"),
>>>>diff --git a/recipes/switchdev/l2-007-bridge_team_failover.py b/recipes/switchdev/l2-007-bridge_team_failover.py
>>>>index 23d1a7c..47050ff 100644
>>>>--- a/recipes/switchdev/l2-007-bridge_team_failover.py
>>>>+++ b/recipes/switchdev/l2-007-bridge_team_failover.py
>>>>@@ -39,21 +39,25 @@ def do_task(ctl, hosts, ifaces, aliases):
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_up()
>>>>     sw_if2.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if2.set_link_up()
>>>>     sw_if1.set_link_down()
>>>>     sw_if3.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>>     sw_if1.set_link_up()
>>>>     sw_if3.set_link_up()
>>>>     sw_if2.set_link_down()
>>>>     sw_if4.set_link_down()
>>>>+    sleep(15)
>>>>     tl.ping_simple(m1_lag1, m2_lag1)
>>>>
>>>> do_task(ctl, [ctl.get_host("machine1"),
>>>>--
>>>>2.4.10
>>>>
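
On the concern that a fixed 15 second wait is too long: one possible alternative, not what the patch does, is to poll a readiness check with an upper bound instead of always sleeping the full time. The sketch below is a generic helper under assumptions: `wait_until` and any predicate passed to it are hypothetical names, not LNST API, and `time.monotonic()` needs Python 3.3 or later (on Python 2, `time.time()` would have to stand in).

```python
import time


def wait_until(predicate, timeout=15.0, interval=0.5):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False if the timeout expired.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait a little before checking again so we do not busy-loop.
        time.sleep(interval)
```

A recipe could then call something like `wait_until(links_are_stable)` after `set_link_down()`, where `links_are_stable` is a placeholder for whatever check is available on the switch. In the best case this returns well before the 15 seconds are up, and in the worst case it waits about as long as the fixed sleep.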