On Wed, Apr 06, 2016 at 05:30:46AM -0000, adam.kosseck(a)defence.gov.au wrote:
Hi,
I've built a large number of RHEL 6 servers across multiple AD domains with identical
SSSD/krb5 configurations.
SSSD authentication works fine for most of these servers, but every once in a while on
various servers it just seems to stop working. If users are cached they continue to work,
but non-cached users are denied access.
If I log in with a local account and restart the SSSD service it usually starts working
again.
Below are logs taken from an Oracle server on a management network, for which I have been
unable to get SSSD to work at all (local kinit and net ads join commands work ok – but
SSSD authentication fails).
I cleared my logs, cleared my cache, raised the SSSD log levels to 7, then started the
service & executed "getent passwd firstname.lastname" and then stopped SSSD
when it failed.
Firstly I'd like to work out why this server isn't working with SSSD, then work
out why SSSD appears to be flakey - any help would be greatly appreciated :)
Hi Adam, see some comments inline..
/etc/krb5.conf
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = DOMAIN.SUBDOMAIN.COM
dns_lookup_realm = true
dns_lookup_kdc = true
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
rdns = false
[realms]
# DOMAIN.SUBDOMAIN.COM = {
#kdc = dc.domain.subdomain.com
#admin_server = dc.domain.subdomain.com
#}
[domain_realm]
#.domain.subdomain.com = DOMAIN.SUBDOMAIN.COM
#domain.subdomain.com = DOMAIN.SUBDOMAIN.COM
/etc/sssd/sssd.conf
[sssd]
config_file_version = 2
debug_level = 1
domains = domain.subdomain.com
services = nss, pam, ssh, sudo
[domain/domain.subdomain.com]
debug_level = 1
id_provider = ad
access_provider = ad
auth_provider = ad
chpass_provider = ad
# Permits offline logins:
cache_credentials = true
default_shell = /bin/bash
fallback_homedir = /home/%d/%u
ldap_schema = rfc2307bis
I would recommend sticking with the default "ldap_schema = ad" here.
#Allows users to login without specifying FQDN
default_domain_suffix = domain.subdomain.com
This option belongs to the [sssd] section, but I don't think it's
needed, unless you use use_fully_qualified_names = true.
#performance related (+ avoids RHEL 6.6 bug)
ldap_referrals = false
This value is already the default for the AD provider, so you can remove
the line as well.
#Don't use SELinux
selinux_provider = none
The selinux provider is only implemented for IPA, so you can remove this
line as well. Please note that the "selinux_provider" has little to do
with SELinux settings on the host, it's only used for setting a login
label for IPA users.
#Ignore root forest domain, and don't update DNS records dynamically.
subdomains_provider = none
I think this setting, in combination with ID mapping, might be the issue
here. When ID mapping is enabled (which is the default for the AD
provider), SSSD needs to know the SID of the domain to perform the
mapping. The SID of the domain SSSD is enrolled with is read by
the subdomains provider (which is maybe a bit counter-intuitive, given
it's called the subdomains provider). We have a ticket open to add an
option to only check the master domain:
https://fedorahosted.org/sssd/ticket/2828
instead of completely disabling the subdomains provider, but it's not
implemented yet.
But given that you also set the schema to rfc2307bis, I'm not sure whether you
actually wanted to use ID mapping or POSIX attributes.
In the domain log, I see that SSSD tried to detect whether POSIX attributes
are replicated in the Global Catalog, and then the account request
failed. I'm not sure why that is. Can you send logs with a higher debug
level and with the schema set to 'ad'?
dyndns_update = false
[ssh]
debug_level = 1
[nss]
debug_level = 1
[pam]
debug_level = 1
[sudo]
debug_level = 1
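Pulling the inline comments above together, a trimmed sssd.conf for the test run might look roughly like the sketch below. It is only an illustration and not a verified fix. It assumes ID mapping is what is wanted. The value debug_level = 9 is an assumed stand-in for "a higher debug level", and the commented-out ldap_id_mapping line is an assumption about what you would set if POSIX attributes from AD were wanted instead. Dropping "subdomains_provider = none" also means SSSD will no longer ignore the other domains in the forest, which the original config was trying to avoid.

```ini
[sssd]
config_file_version = 2
domains = domain.subdomain.com
services = nss, pam, ssh, sudo
# default_domain_suffix belongs in this section, if it is wanted at all:
# default_domain_suffix = domain.subdomain.com

[domain/domain.subdomain.com]
# Assumed value, raised for troubleshooting only
debug_level = 9
id_provider = ad
access_provider = ad
auth_provider = ad
chpass_provider = ad
# Permits offline logins:
cache_credentials = true
default_shell = /bin/bash
fallback_homedir = /home/%d/%u
dyndns_update = false
# ldap_schema is left at the AD provider default ("ad").
# ldap_referrals and selinux_provider are removed, per the comments above.
# subdomains_provider = none is dropped so the domain SID can be read.
# Assumption: uncomment only if POSIX attributes stored in AD should be
# used instead of ID mapping.
# ldap_id_mapping = false
```

After changing the file, clear the cache and restart SSSD the same way as before, so the new logs start from a clean state.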
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]]
[sdap_posix_check_next] (0x0400): Searching for POSIX attributes with base
[DC=DOMAIN,DC=SUBDOMAIN,DC=COM]
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [sdap_get_generic_ext_step]
(0x0400): calling ldap_search_ext with
[(|(&(uidNumber=*)(objectclass=user))(&(gidNumber=*)(objectclass=group)))][DC=DOMAIN,DC=SUBDOMAIN,DC=COM].
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [objectclass]
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [uidNumber]
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [sdap_get_generic_ext_step]
(0x1000): Requesting attrs: [gidNumber]
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [be_run_online_cb] (0x0080):
Going online. Running callbacks.
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]]
[sdap_get_generic_op_finished] (0x0400): Search result: Success(0), no errmsg set
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [sdap_posix_check_done]
(0x1000): Cycled through all bases
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [disable_gc] (0x0040): POSIX
attributes were requested but are not present on the server side. Global Catalog lookups
will be disabled
The logs with a higher debug level would hopefully show what's going on
here.
(Wed Apr 6 14:51:29 2016) [sssd[be[domain.subdomain.com]]] [acctinfo_callback]
(0x0100): Request processed. Returned 3,0,Success