On Wed, Apr 02, 2014 at 12:02:41PM +0300, "Thomas B. Rücker" wrote:
we're using SSSD in combination with active directory and have received
complaints from users about a corner case in our setup.
Our AD servers are only reachable from within our corporate network,
connection attempts from the outside are dropped by firewalls. This
leads to the following scenario:
- user takes machine (e.g. laptop) outside the corporate network
- user tries to authenticate (or in some cases also tries to "ls" which
causes uid/gid lookup)
- sssd will try to reach the configured servers for up to 30s
This is not so clear to me. Are you saying that it takes up to 30
seconds for SSSD to realize it's offline and switch to offline mode?
- sssd goes (back) into offline mode and uses cached credentials and
authenticates the user
I'm using a very similar setup on my laptop where I authenticate against
LDAP and Kerberos servers inside Red Hat's internal network. I see a
couple of seconds of lag sometimes, but not 30s as you describe.
This will however NOT happen if sssd gets told by the IP stack that a
connection to the target IP is not possible (e.g. "ip route add
blackhole 192.0.2.23/32" or one of the routers along the way generates
an ICMP unreachable). In such cases sssd will go immediately into
offline mode and use cached credentials.
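The difference between a dropped and a rejected connection can be observed from the client side with a plain TCP connect. The following is a minimal sketch, not anything SSSD itself does; the function name `probe` and the three result labels are made up for illustration.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify one TCP connection attempt (hypothetical helper, not part of SSSD).

    Returns 'reachable', 'rejected' or 'timeout'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except socket.timeout:
        # No answer within the timeout: typical when a firewall
        # silently drops the packets.
        return "timeout"
    except OSError:
        # Immediate failure: connection refused, ICMP unreachable,
        # a blackhole route, or a name that cannot be resolved.
        return "rejected"
```

Calling, for example, `probe("192.0.2.23", 389)` from outside the corporate network should return 'timeout' only after the full timeout has elapsed when packets are dropped, while a blackhole route or an ICMP unreachable should give 'rejected' almost immediately. Port 389 is only an assumed example here.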
So I suspect the dropping of packets instead of rejecting makes the
difference.
I'm aware that this is overall sensible behaviour, but what I would
hope to fine tune is how sssd stays in offline mode. Currently it seems
like it will leave offline mode when it tries to reconnect (hardcoded
30s?). That leads to a flip flop scenario where it seems to be 30s
offline and 30s "online/connecting" and users have a fairly high chance
to hit a time during which their authentication will seemingly stall.
Newer versions have the 'offline_timeout' option available. For the
later versions, I would suggest to fine tune the timeouts, so the
offline detection is faster.
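Before tuning anything it helps to see which timeout-related options are already set. Below is a rough sketch that lists them from sssd.conf-style text. The option names other than 'offline_timeout' are my assumption of what is relevant (please check the sssd.conf, sssd-ldap and sssd-krb5 man pages for your version), and the sample values are placeholders, not recommendations.

```python
import configparser

# Assumed list of timeout-related options; availability depends on the
# SSSD version, so verify each name against the man pages.
TIMEOUT_OPTIONS = (
    "offline_timeout",
    "ldap_network_timeout",
    "ldap_opt_timeout",
    "dns_resolver_timeout",
    "krb5_auth_timeout",
)


def timeout_settings(conf_text):
    """Return {(section, option): value} for every known timeout option set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    found = {}
    for section in parser.sections():
        for option in TIMEOUT_OPTIONS:
            if parser.has_option(section, option):
                found[(section, option)] = parser.get(section, option)
    return found


# Illustrative configuration only; domain name and values are made up.
SAMPLE = """
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ldap
ldap_network_timeout = 3
dns_resolver_timeout = 3
"""

for (section, option), value in timeout_settings(SAMPLE).items():
    print(f"[{section}] {option} = {value}")
```

Options that do not show up in the output are running with their defaults, so those defaults are the ones to look up first.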
So my question is:
Is there a better way to deal with this in the sssd context?
If not we'll probably have to implement separate connection checking and
inject and remove blackhole routes accordingly. Not the nicest of
workarounds in my book.
Can you enable debugging and see where the biggest lag is? Maybe we
could see what exactly takes the longest and lower the appropriate
timeout.
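Once debugging is enabled (by raising 'debug_level' in the relevant sections), one way to find the biggest lag is to look for the largest time gaps between consecutive log lines. This is only a sketch: it assumes every line starts with a parenthesised timestamp such as "(Wed Apr  2 12:02:41 2014)" without microseconds, so please compare that against your own logs first. The helper name `largest_gaps` is hypothetical.

```python
import re
from datetime import datetime

# Assumed timestamp prefix; adjust if your logs look different.
STAMP = re.compile(r"^\((\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4})\)")


def largest_gaps(lines, top=3):
    """Return up to `top` (seconds, line) pairs, largest gap first.

    Each pair holds the delay between a line and the next timestamped
    line, together with the line that preceded the delay.
    """
    stamped = []
    for line in lines:
        match = STAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
            stamped.append((when, line.rstrip("\n")))
    gaps = [
        ((later - earlier).total_seconds(), text)
        for (earlier, text), (later, _) in zip(stamped, stamped[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

Feeding it the lines of a domain log (for example via `open(path).readlines()`) shows which step was followed by the longest wait, which is usually the timeout worth lowering.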
PS: We're using sssd on many distributions, but our main distro at the
moment is ubuntu 12.04 with sssd 1.8.6 and we'll be rolling out 14.04 in
addition, which has sssd 1.11.3.
I remember in 1.9 we fixed a bug where we would attempt to resolve
kpasswd in addition to kdc on authentication. I can't find the commit
right now, but it would be nice if you could check some newer version
and see if the situation is somewhat better recently.