#84: Controller hangs when configuration fails with exception on a slave
----------------------+-----------------------------
Reporter: rpazdera | Owner:
Type: defect | Status: new
Priority: major | Milestone: Stable Release
Component: lnst-ctl | Version: git
Keywords: | Blocked By:
Blocking: |
----------------------+-----------------------------
This occurred while running multiple recipes in a set. The controller
doesn't do cleanup and proceeds directly to the next recipe. However, the
previous connection is not cancelled which leaves the controller hanging.
'''Slave Log:'''
{{{
2013-08-05 09:13:48 (127.0.0.1) INFO: Recieved connection from
192.168.122.1
2013-08-05 09:13:48 (127.0.0.1) INFO: Recieved a controller
connection.
2013-08-05 09:13:49 (127.0.0.1) DEBUG: Executing: "ip link set
eth1 down"
2013-08-05 09:13:50 (127.0.0.1) DEBUG: Executing: "ip link set
eth2 down"
2013-08-05 09:13:51 (127.0.0.1) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Executing: "ip link set
eth4 down"
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:52 (127.0.0.1) DEBUG: Test cache index commited
2013-08-05 09:13:53 (127.0.0.1) INFO: Initializing 'eth' device
class
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip addr flush
eth4"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ethtool -A
eth4 rx off tx off"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip link set
eth4 up"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip addr flush
eth3"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ethtool -A
eth3 rx off tx off"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip link set
eth3 up"
2013-08-05 09:13:53 (127.0.0.1) INFO: Initializing 'team'
device class
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "ip link set
eth4 down"
2013-08-05 09:13:53 (127.0.0.1) DEBUG: Executing: "teamd -r -d
-c "{ \"runner\": {\"name\": \"roundrobin\"} }"
-t t_team0 -D"
2013-08-05 09:13:53 (127.0.0.1) ERROR:
Stderr:
----------------------------
Daemon process failed.
Failed to start daemon: Invalid argument
----------------------------
2013-08-05 09:13:53 (127.0.0.1) ERROR: Command execution failed
(exited with 1)
2013-08-05 09:13:53 (127.0.0.1) DEBUG:
Traceback (most recent call last):
File "/root/lnst/lnst/Slave/NetTestSlave.py", line 466, in
_process_msg
result = method(*msg["args"])
File "/root/lnst/lnst/Slave/NetTestSlave.py", line 163, in
configure_interface
self._netconfig.configure(if_id)
File "/root/lnst/lnst/Slave/NetConfig.py", line 93, in configure
device.configure()
File "/root/lnst/lnst/Slave/NetConfigDevice.py", line 255, in
configure
exec_cmd("teamd -r -d -c \"%s\" -t %s %s" % (teamd_config,
dev_name, dbus_option))
File "/root/lnst/lnst/Common/ExecCmd.py", line 68, in exec_cmd
raise err
ExecCmdFail: Command execution failed (exited with 1)
}}}
'''Controller log:'''
{{{
2013-08-05 11:12:06 (127.0.0.1) INFO: Processing recipe file
"recipes/smoke/tests/recipe-bond-team.xml"
2013-08-05 11:12:06 (127.0.0.1) INFO: Querying slave machine
f16.
2013-08-05 11:12:06 (127.0.0.1) INFO: Adding slave machine f16
to slave pool.
2013-08-05 11:12:06 (127.0.0.1) INFO: Querying slave machine
f18.
2013-08-05 11:12:06 (127.0.0.1) INFO: Adding slave machine f18
to slave pool.
2013-08-05 11:12:06 (127.0.0.1) INFO: Querying slave machine
rhel6.
2013-08-05 11:12:09 (127.0.0.1) WARNING: Machine 'rhel6' not
responding. Skipping.
2013-08-05 11:12:09 (127.0.0.1) DEBUG: Executing: "cd
"/home/rpazdera/work/lnst-readonly-clean/test_tools" && tar cf
"/tmp/tmpkGvoqn" "tcp_conn""
2013-08-05 11:12:09 (127.0.0.1) DEBUG: Executing: "cd
"/home/rpazdera/work/lnst-readonly-clean/test_tools" && tar cf
"/tmp/tmp8JVhOV" "multicast""
2013-08-05 11:12:09 (127.0.0.1) INFO: Provisioning initialized
2013-08-05 11:12:09 (127.0.0.1) INFO: machine 1 uses f18
2013-08-05 11:12:09 (127.0.0.1) INFO: machine 2 uses f16
2013-08-05 11:12:09 (127.0.0.1) INFO: Connecting to RPC on
machine 1 (192.168.122.12)
2013-08-05 11:11:52 (192.168.122.12) INFO: Recieved a controller
connection.
2013-08-05 11:11:53 (192.168.122.12) DEBUG: Executing: "ip link set
eth1 down"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "brctl addbr
lnstbr0"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "iptables -I
FORWARD 1 -j REJECT -i lnstbr0 -o any"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "iptables -I
FORWARD 1 -j REJECT -i any -o lnstbr0"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "iptables -I
FORWARD 1 -j ACCEPT -i lnstbr0 -o lnstbr0"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "ip6tables -I
FORWARD 1 -j REJECT -i lnstbr0 -o any"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "ip6tables -I
FORWARD 1 -j REJECT -i any -o lnstbr0"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "ip6tables -I
FORWARD 1 -j ACCEPT -i lnstbr0 -o lnstbr0"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "ip link set
lnstbr0 up"
2013-08-05 11:12:10 (127.0.0.1) INFO: Creating interface if-2
(52:54:01:00:00:01) on machine 1
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "virsh attach-
interface Fedora18 bridge lnstbr0 --mac 52:54:01:00:00:01"
2013-08-05 11:12:10 (127.0.0.1) DEBUG: Executing: "virsh domif-
setlink Fedora18 52:54:01:00:00:01 up"
2013-08-05 11:11:54 (192.168.122.12) DEBUG: Executing: "ip link set
eth2 down"
2013-08-05 11:12:11 (127.0.0.1) DEBUG: Executing: "ip link set
lnstbr0 up"
2013-08-05 11:12:11 (127.0.0.1) INFO: Creating interface if-1
(52:54:01:00:00:02) on machine 1
2013-08-05 11:12:11 (127.0.0.1) DEBUG: Executing: "virsh attach-
interface Fedora18 bridge lnstbr0 --mac 52:54:01:00:00:02"
2013-08-05 11:12:11 (127.0.0.1) DEBUG: Executing: "virsh domif-
setlink Fedora18 52:54:01:00:00:02 up"
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:11:55 (192.168.122.12) DEBUG: Test cache index commited
2013-08-05 11:12:13 (127.0.0.1) INFO: Connecting to RPC on
machine 2 (192.168.122.10)
2013-08-05 09:13:48 (192.168.122.10) INFO: Recieved a controller
connection.
2013-08-05 09:13:49 (192.168.122.10) DEBUG: Executing: "ip link set
eth1 down"
2013-08-05 09:13:50 (192.168.122.10) DEBUG: Executing: "ip link set
eth2 down"
2013-08-05 11:12:14 (127.0.0.1) DEBUG: Executing: "ip link set
lnstbr0 up"
2013-08-05 11:12:14 (127.0.0.1) INFO: Creating interface if-2
(52:54:01:00:00:03) on machine 2
2013-08-05 11:12:14 (127.0.0.1) DEBUG: Executing: "virsh attach-
interface Fedora16 bridge lnstbr0 --mac 52:54:01:00:00:03"
2013-08-05 11:12:14 (127.0.0.1) DEBUG: Executing: "virsh domif-
setlink Fedora16 52:54:01:00:00:03 up"
2013-08-05 09:13:51 (192.168.122.10) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 11:12:15 (127.0.0.1) DEBUG: Executing: "ip link set
lnstbr0 up"
2013-08-05 11:12:15 (127.0.0.1) INFO: Creating interface if-1
(52:54:01:00:00:04) on machine 2
2013-08-05 11:12:15 (127.0.0.1) DEBUG: Executing: "virsh attach-
interface Fedora16 bridge lnstbr0 --mac 52:54:01:00:00:04"
2013-08-05 11:12:15 (127.0.0.1) DEBUG: Executing: "virsh domif-
setlink Fedora16 52:54:01:00:00:04 up"
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Executing: "ip link set
eth4 down"
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 09:13:52 (192.168.122.10) DEBUG: Test cache index commited
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
if-1 on machine 1
2013-08-05 11:12:00 (192.168.122.12) INFO: Initializing 'eth' device
class
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip addr flush
eth3"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ethtool -A
eth3 rx off tx off"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip link set
eth3 up"
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
if-2 on machine 1
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip addr flush
eth2"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ethtool -A
eth2 rx off tx off"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip link set
eth2 up"
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
testiface on machine 1
2013-08-05 11:12:00 (192.168.122.12) INFO: Initializing 'bond'
device class
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "modprobe
bonding max_bonds=0"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "echo
"+t_bond0" > /sys/class/net/bonding_masters"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip link set
eth2 down"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "echo "+eth2"
/sys/class/net/t_bond0/bonding/slaves"
2013-08-05
11:12:00 (192.168.122.12) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "echo "+eth3"
/sys/class/net/t_bond0/bonding/slaves"
2013-08-05
11:12:00 (192.168.122.12) DEBUG: Executing: "ip addr add
192.168.100.240/24 dev t_bond0"
2013-08-05 11:12:00 (192.168.122.12) DEBUG: Executing: "ip link set
t_bond0 up"
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
if-1 on machine 2
2013-08-05 09:13:53 (192.168.122.10) INFO: Initializing 'eth' device
class
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip addr flush
eth4"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ethtool -A
eth4 rx off tx off"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip link set
eth4 up"
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
if-2 on machine 2
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip addr flush
eth3"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ethtool -A
eth3 rx off tx off"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip link set
eth3 up"
2013-08-05 11:12:17 (127.0.0.1) INFO: Configuring interface
testiface on machine 2
2013-08-05 09:13:53 (192.168.122.10) INFO: Initializing 'team'
device class
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip link set
eth3 down"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "ip link set
eth4 down"
2013-08-05 09:13:53 (192.168.122.10) DEBUG: Executing: "teamd -r -d
-c "{ \"runner\": {\"name\": \"roundrobin\"} }"
-t t_team0 -D"
2013-08-05 09:13:53 (192.168.122.10) ERROR:
Stderr:
----------------------------
Daemon process failed.
Failed to start daemon: Invalid argument
----------------------------
2013-08-05 09:13:53 (192.168.122.10) ERROR: Command execution failed
(exited with 1)
2013-08-05 09:13:53 (192.168.122.10) DEBUG:
Traceback (most recent call last):
File "/root/lnst/lnst/Slave/NetTestSlave.py", line 466, in
_process_msg
result = method(*msg["args"])
File "/root/lnst/lnst/Slave/NetTestSlave.py", line 163, in
configure_interface
self._netconfig.configure(if_id)
File "/root/lnst/lnst/Slave/NetConfig.py", line 93, in configure
device.configure()
File "/root/lnst/lnst/Slave/NetConfigDevice.py", line 255, in
configure
exec_cmd("teamd -r -d -c \"%s\" -t %s %s" % (teamd_config,
dev_name, dbus_option))
File "/root/lnst/lnst/Common/ExecCmd.py", line 68, in exec_cmd
raise err
ExecCmdFail: Command execution failed (exited with 1)
2013-08-05 11:12:18 (127.0.0.1) DEBUG:
Traceback (most recent call last):
File "./lnst-ctl", line 73, in get_recipe_result
packet_capture, log_ctl, pool_checks)
File "./lnst-ctl", line 54, in process_recipe
return nettestctl.run_recipe(packet_capture)
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/NetTestController.py", line 395, in run_recipe
self._prepare_network()
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/NetTestController.py", line 173, in _prepare_network
self._prepare_interface(m_id, iface_xml_data)
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/NetTestController.py", line 236, in
_prepare_interface
iface.configure()
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/Machine.py", line 424, in configure
self._get_config())
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/Machine.py", line 101, in _rpc_call
result = self._msg_dispatcher.wait_for_result(self._id)
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/NetTestController.py", line 581, in wait_for_result
self._process_message(msg)
File "/home/rpazdera/work/lnst-readonly-
clean/lnst/Controller/NetTestController.py", line 594, in _process_message
raise CommandException(msg)
CommandException: CommandException: Recieved an exception from slave:
2
2013-08-05 11:12:18 (127.0.0.1) ERROR: CommandException:
Recieved an exception from slave: 2
2013-08-05 11:12:18 (127.0.0.1) INFO: Processing recipe file
"recipes/smoke/tests/recipe-bond-vlan.xml"
2013-08-05 11:12:18 (127.0.0.1) INFO: Querying slave machine
f16.
2013-08-05 11:12:18 (127.0.0.1) INFO: Adding slave machine f16
to slave pool.
2013-08-05 11:12:18 (127.0.0.1) INFO: Querying slave machine
f18.
2013-08-05 11:12:18 (127.0.0.1) INFO: Adding slave machine f18
to slave pool.
2013-08-05 11:12:18 (127.0.0.1) INFO: Querying slave machine
rhel6.
2013-08-05 11:12:21 (127.0.0.1) WARNING: Machine 'rhel6' not
responding. Skipping.
2013-08-05 11:12:21 (127.0.0.1) DEBUG: Executing: "cd
"/home/rpazdera/work/lnst-readonly-clean/test_tools" && tar cf
"/tmp/tmp7kbgEw" "tcp_conn""
2013-08-05 11:12:21 (127.0.0.1) DEBUG: Executing: "cd
"/home/rpazdera/work/lnst-readonly-clean/test_tools" && tar cf
"/tmp/tmp2DTQ1G" "multicast""
2013-08-05 11:12:21 (127.0.0.1) INFO: Provisioning initialized
2013-08-05 11:12:21 (127.0.0.1) INFO: machine 1 uses f16
2013-08-05 11:12:21 (127.0.0.1) INFO: machine 2 uses f18
2013-08-05 11:12:21 (127.0.0.1) INFO: Connecting to RPC on
machine 1 (192.168.122.10)
}}}
--
Ticket URL: <
https://fedorahosted.org/lnst/ticket/84>
lnst <
https://fedorahosted.org/lnst/>
Linux Network Stack Test