Hi,
one of the next steps for Cockpit to take would be to discover machines
on the network and show them in the "Add Server" dialog, as a
convenience.
This is a little report of where I am with that.
- OpenSLP relies on unicast replies to multicast queries. This does not
work in the presence of NAT or normal firewalls.
The fundamental problem seems to be that the connection tracking
facility in Linux does not expect the replies: the query goes to a
multicast address, but each reply comes from the responder's unicast
address, so conntrack doesn't create a 'RELATED' connection flow for
them. Both NAT and normal firewall rules rely on connection tracking
to work.
http://www.cs.helsinki.fi/linux/linux-kernel/2001-12/0789.html
https://lkml.org/lkml/2013/5/7/614
https://lkml.org/lkml/2013/5/7/830
The proper fix would be to write a user space conntrack helper for use
with "nfct helper add". nfct is part of conntrack-tools, and we
should probably contact upstream for advice before writing any code.
- Before we have the proper fix, we can try to work around the conntrack
problem.
- In order to receive the unicast replies, we can punch a small hole in
the firewall while we listen for those replies. The hole would allow
packets from *:427 to the socket we listen on, nothing else. This is
done inside cockpitd.
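As a rough sketch of what such a hole could look like, assuming it is expressed as a firewalld direct rule: the helper below only builds the rule arguments for a given local UDP port. The name slp_hole_rule and the port 40000 are made up for illustration, and cockpitd might well talk to firewalld over D-Bus rather than call firewall-cmd.

```shell
# Hypothetical helper: print the direct-rule arguments that accept UDP
# packets from source port 427 (SLP) to the given local port, nothing else.
slp_hole_rule() {
    printf '%s\n' "ipv4 filter INPUT 0 -p udp --sport 427 --dport $1 -j ACCEPT"
}

# Intended use, as root (40000 stands for the port our socket is bound to):
#   firewall-cmd --direct --add-rule $(slp_hole_rule 40000)
#   ... send the multicast query, collect the unicast replies ...
#   firewall-cmd --direct --remove-rule $(slp_hole_rule 40000)
```

Removing the rule with the same arguments right after listening keeps the hole as small and short-lived as possible.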
- We can explicitly disable NAT for packets to the SLP multicast address
239.255.255.253. This needs to be done on the machine doing the NAT,
which is normally the host for the virtual test machines. We would
thus need to ask people to mess with an important firewall.
The challenge is to make a permanent change to the host firewall setup
that doesn't get clobbered by libvirt. Thus, the virtual network for
the virtual machines needs to use forward mode 'route'; we then
explicitly enable masquerading in the firewall (globally) and exempt
packets to the multicast address from it:
# firewall-cmd --add-masquerade
# firewall-cmd --direct --add-rule ipv4 nat POSTROUTING 0 \
-d 239.255.255.253 -j ACCEPT
(Direct rules go before the general masquerade rule. Not sure if this
is guaranteed, though.)
- We can also switch NAT on before PREPARE and switch it off before
VERIFY. This makes OpenSLP work, and we also get better isolated test
runs. We really don't want to access the Internet from tests.
(We can't have two networks and just use the right one since FreeIPA
needs the same IP address during setup as during normal operation, so
this needs to happen on the same network.)
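A minimal sketch of such a switch, assuming NAT is the global firewalld masquerading from the commands above and that toggling it is all that is needed. The names nat_switch and nat_is_on are invented here; the real vm-prep integration may look quite different.

```shell
# Dry run by default: commands are printed instead of executed.
# Run with RUN= (empty) as root to really change the firewall.
RUN=${RUN-echo}

# Hypothetical switch for vm-prep: nat_switch on|off
nat_switch() {
    case "$1" in
        on)  $RUN firewall-cmd --add-masquerade ;;
        off) $RUN firewall-cmd --remove-masquerade ;;
        *)   echo "usage: nat_switch on|off" >&2; return 2 ;;
    esac
}

# Succeeds if masquerading is currently enabled. PREPARE and VERIFY
# could use this to check that the switch is in the right position.
nat_is_on() {
    firewall-cmd --query-masquerade >/dev/null 2>&1
}
```

PREPARE would then refuse to start unless nat_is_on succeeds, and VERIFY unless it fails.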
- OpenSLP also has a startup problem where it enters the failed state
reliably on every boot. This should be easily fixable.
- We also need to open the firewall for incoming SLP queries. Trivial.
- Using Avahi would just work. We could list all instances of
_workstation._tcp.local, say, or _cockpit-ssh._tcp.local.
Avahi also has dynamic notifications, so the list of servers in the
"Add Server" dialog would be 'live' without any extra magic like
polling or out-of-band notifications.
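To get a feel for what Avahi would give us, here is a small sketch using the avahi-browse command line tool. This is only for illustration: cockpitd would presumably use the Avahi client API or D-Bus instead. The helper name resolved_hosts is made up, and the field positions reflect my understanding of the --parsable output.

```shell
# Print "hostname address port" for each resolved service found in
# avahi-browse --parsable output read from stdin.
# Resolved entries start with "="; "+" and "-" lines are add/remove events.
resolved_hosts() {
    awk -F';' '$1 == "=" { print $7, $8, $9 }'
}

# Intended use (--terminate exits after the initial dump; dropping it
# keeps avahi-browse running and printing events as machines come and go):
#   avahi-browse --parsable --resolve --terminate _workstation._tcp | resolved_hosts
```

The long-running mode without --terminate is what corresponds to the 'live' list in the dialog.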
So:
* NAT on/off switch in vm-prep.
* PREPARE and VERIFY check that the switch is in the right position.
* Bugs in OpenSLP startup are worked around.
* cockpitd learns to punch holes and to talk SLP.
* The "Add Server" dialog shows the result of findsrvs service:service-agent.
* Tests for the above.