On 08/01/2014 01:39 AM, Stef Walter wrote:
On 01.08.2014 07:21, Trevor Jay wrote:
> On Thu, Jul 31, 2014 at 10:38:59PM +0200, Stef Walter wrote:
>> I've heard this concept slung around, but never saw it in real
>> life. What does a Docker privileged container look like and how
>> does it work? Any documentation? A trivial google search doesn't
>> seem to turn up anything definitive.
>>
> "Normal" containers run with a munged network (i.e. 172.* address),
> dropped kernel capabilities, and under a limited SELinux security
> context (i.e. system_u:system_r:svirt_lxc_net_t:s0:c712,c869). Docker
> containers started with the "--privileged" option still run with a
> munged network, but have fewer (maybe even no, I'd have to check the
> source) kernel capabilities dropped and run under a more lenient
> SELinux context, something like
> system_u:system_r:docker_t:s0. I can't remember exactly off the top
> of my head.
Yes, a --privileged container can be thought of as a container with NO
security containment. The SELinux transition is to unconfined_t, and no
capabilities are dropped.
The problem with them is that the processes are still contained via
namespaces.
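One way to see this distinction from inside a container is through the
standard procfs interfaces. As a rough sketch (the exact labels and
bitmasks will vary by system, and /proc/self/attr/current is only
meaningful with SELinux enabled):

```shell
# Effective capability bitmask of the current process: a full set
# suggests nothing was dropped; normal containers show a reduced mask.
grep CapEff /proc/self/status

# SELinux context of the current process (if SELinux is enabled):
# svirt_lxc_net_t vs docker_t/unconfined_t is the distinction above.
cat /proc/self/attr/current 2>/dev/null || true

# Namespaces: two processes share a namespace exactly when these
# symlinks resolve to the same inode.
readlink /proc/self/ns/mnt
```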
> To give a concrete example: normal containers probably can't access
> the /dev/ pseudo-filesystem the way Cockpit (I assume) needs to. I
> would expect that a "--privileged" container could.
Interesting.
Cockpit needs to do stuff like this on the main host:
* Access the file system (e.g. the journal, and much more)
* Access the D-Bus system bus and activate things there, like udisksd,
NetworkManager, systemd parts ... also some of Cockpit's dependencies,
like storaged, are activated on the main system bus.
* Read access to the cgroup tree
* Connect to the Docker socket
* Run host commands like shutdown
* Authenticate against the host PAM stack and user database
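Purely as a sketch of the plumbing this list implies — the image name
and the exact mount points here are assumptions, not a supported
invocation — reaching several of those host resources via bind mounts
might look like:

```shell
# Hypothetical: bind-mount the host interfaces Cockpit would need.
docker run --privileged \
  -v /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket \
  -v /var/log/journal:/var/log/journal:ro \
  -v /sys/fs/cgroup:/sys/fs/cgroup:ro \
  -v /var/run/docker.sock:/var/run/docker.sock \
  cockpit
```

Even then, items like PAM authentication and running host commands
would not be covered by bind mounts alone.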
The actual networking in use for running Cockpit isn't *that* important,
as we would connect out to NetworkManager anyway to do configuration.
But we would need to be able to ask the kernel about the throughput of
the various interfaces.
Also, when you connect out remotely to have Cockpit look at multiple
machines, it does so via SSH. So we would need to somehow add that SSH
subprocess into an appropriate privileged container. Or perhaps stop
using SSH for this purpose ...
In addition, Cockpit starts a real PAM/systemd/audit session once you
are logged in, and the logged-in processes run under the unconfined_t
SELinux context (similar to what you would get for a shell). So the
semantics of this would need to be figured out.
Lots of work. Would be interested in the results if you end up playing
with this.
Would there be a way to *only* add a mount namespace containing the
entire host file system, plus the Cockpit data/binaries bind-mounted
in? That would still run into a few of the problems above, but many of
them would just work.
Stef
_______________________________________________
cockpit-devel mailing list
cockpit-devel(a)lists.fedorahosted.org
https://lists.fedorahosted.org/mailman/listinfo/cockpit-devel

We have kicked around the idea of a "super-privileged container" where
you could switch only a limited number of namespaces. The idea would be
to switch just the mnt namespace and otherwise keep the current
namespaces of the host, then mount all the host file systems under,
say, /sysimage.
Something like:

docker run --privileged --namespace-add=all --namespace-drop=mnt -v /:/sysimage cockpit
Then the cockpit daemon could run and see all of /. Theoretically, if
it were statically linked, it could just chroot to /sysimage inside the
container. Or understand that everything is offset under /sysimage.
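A minimal sketch of that chroot variant, assuming the host root really
is visible at /sysimage inside the container (this needs root, so it is
illustrative only):

```shell
# Re-root into the host image so host paths resolve normally again.
# After this, /etc, /var/log/journal, etc. are the host's files,
# not the container image's.
chroot /sysimage /bin/sh -c 'cat /etc/hostname'
```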
Bottom line: the container processes would be allowed to see the host's
/proc and communicate over FIFO files in /run to talk to Docker. They
should also be able to communicate with systemd via D-Bus ...
Problem is, we don't have this yet. We have brought it up with the
Docker folks and they like the idea, but there could be complications.
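For the D-Bus part, once the host's system bus socket is reachable from
inside the container, talking to systemd is ordinary D-Bus traffic. For
example, a simple liveness check using the standard
org.freedesktop.DBus.Peer interface (nothing Docker-specific):

```shell
# Ping systemd's well-known name on the system bus; this succeeds only
# if the host system bus (and systemd) are reachable from here.
dbus-send --system --print-reply \
  --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 \
  org.freedesktop.DBus.Peer.Ping
```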
--net="host", I think, will not change the network and UTS namespaces.
None of the other namespaces have host-sharing implemented.
sh-4.3# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.199.0.1      0.0.0.0         UG        0 0          0 wlan0
10.0.0.0        0.0.0.0         255.0.0.0       U         0 0          0 tun0
10.5.30.160     0.0.0.0         255.255.255.255 UH        0 0          0 tun0
10.11.5.19      0.0.0.0         255.255.255.255 UH        0 0          0 tun0
10.199.0.0      0.0.0.0         255.255.240.0   U         0 0          0 wlan0
172.16.0.0      0.0.0.0         255.255.0.0     U         0 0          0 tun0
172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 docker0
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0
209.132.183.55  10.199.0.1      255.255.255.255 UGH       0 0          0 wlan0
sh-4.3# hostname
redsox.boston.devel.redhat.com
sh-4.3# docker run --rm -ti -v /usr/bin/netstat:/usr/bin/netstat --net=host fedora /bin/sh
sh-4.2# netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         10.199.0.1      0.0.0.0         UG        0 0          0 wlan0
10.0.0.0        0.0.0.0         255.0.0.0       U         0 0          0 tun0
10.5.30.160     0.0.0.0         255.255.255.255 UH        0 0          0 tun0
10.11.5.19      0.0.0.0         255.255.255.255 UH        0 0          0 tun0
10.199.0.0      0.0.0.0         255.255.240.0   U         0 0          0 wlan0
172.16.0.0      0.0.0.0         255.255.0.0     U         0 0          0 tun0
172.17.0.0      0.0.0.0         255.255.0.0     U         0 0          0 docker0
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0
209.132.183.55  10.199.0.1      255.255.255.255 UGH       0 0          0 wlan0
sh-4.2# hostname
redsox