Hi All
We've got SSSD 1.13.0 installed as part of a CentOS 7.2.1511 installation.
We've used realmd to join the host concerned to our 2008R2 AD system. This went really well, and consequently we've been using SSSD to provide login services and kerberos integration for our fairly large hadoop system.
The authconfig that's implicitly run as part of realmd produces the following sssd.conf:
[sssd]
domains = <joined domain>
config_file_version = 2
services = nss, pam

[pam]
debug_level = 0x0080

[nss]
timeout = 20
force_timeout = 600
debug_level = 0x0080

[domain/<joined domain>]
ad_domain = <joined domain>
krb5_realm = <JOINED DOMAIN>
realmd_tags = manages-system joined-with-samba
cache_credentials = true
id_provider = ad
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = False
fallback_homedir = /home/%u@%d
access_provider = simple
simple_allow_groups = <AD group allowing logins>
krb5_use_kdc_info = False
entry_cache_timeout = 300
debug_level = 0x0080
ad_server = <active directory server>
As I've said - this works really well. We did have some stability issues initially, but they've been fixed by defining the 'ad_server' rather than using autodiscovery.
Logins work fine, kerberos TGTs are issued on login, and password changes are honoured correctly.
However, in day-to-day use we have noticed a few anomalies that we just can't track down.
Firstly (this has happened a few times), a user will change their AD password (via a Windows PC).
Subsequent logins - sometimes with specific client software - fail with
pam_sss(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=<remote PC name> user=<username>
pam_sss(sshd:auth): received for user <username>: 17 (failure setting user credentials)
So in this example, the person concerned has changed their AD password. Further attempts to access this system via SSH work fine. However, using SFTP doesn't work (the above is output into /var/log/secure).
There are no local controls on sftp logins, and the user concerned was working fine (using both sftp and ssh) until they updated their password.
There is no separate sftp daemon running, and it only affects one individual currently (but we have seen some very similar instances before)
The second issue we have is around phantom groups in AD.
Hadoop uses an id -Gn command to see group membership for authorisation.
With some users - we've seen 6 currently - we see certain groups failing to be looked up:
id -Gn <username>
id: cannot find name for group ID xxxxyyyyy
<group name> <group name> <group name> <group name> <etc...>
The xxxxyyyyy indicates:
xxxx = hashed realm name
yyyyy = RID from group in AD
We can't find any group with that number on the AD side!
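When searching AD for the object behind such a GID, it can help to recover the RID arithmetically instead of reading it off the decimal digits. The following is a minimal sketch, assuming ldap_id_mapping = True with the default ldap_idmap_range_min and ldap_idmap_range_size (both 200000); the function name and the sample GID are hypothetical. If the ranges were changed in sssd.conf, or the RID is larger than the range size, the result may not be the real RID.

```python
# Assumed SSSD ID-mapping defaults; check ldap_idmap_range_min and
# ldap_idmap_range_size in sssd.conf if they were changed.
RANGE_MIN = 200000
RANGE_SIZE = 200000


def split_mapped_gid(gid, range_min=RANGE_MIN, range_size=RANGE_SIZE):
    """Return (slice_index, rid) for a GID produced by algorithmic ID mapping.

    The slice index identifies the domain's ID range (chosen by hashing the
    domain SID); the remainder inside that range is the RID.
    """
    if gid < range_min:
        raise ValueError("GID %d is below the ID-mapping range" % gid)
    offset = gid - range_min
    return offset // range_size, offset % range_size


if __name__ == "__main__":
    # Hypothetical GID, not taken from the thread.
    slice_index, rid = split_mapped_gid(1234001105)
    print("slice", slice_index, "RID", rid)
```

The RID can then be appended to the domain SID to search AD (including SIDHistory values) for the matching object.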
We can work around this by adding a local group (into /etc/group) for the GIDs affected. This means the id -Gn runs correctly, and the hadoop namenode can function correctly - but this is a workaround and we'd like to get to the bottom of the issue.
Rather than flooding this post now with logfiles, just thought I'd see if this looked familiar to anyone. Happy to upload any logs, amend logging levels, etc.
Many thanks
Simon
The first step in debugging any strangeness is usually https://docs.pagure.org/SSSD.sssd/users/troubleshooting.html
On 14 Mar 2018, at 16:18, simonc99@hotmail.com wrote:
Hi All
We've got SSSD 1.13.0 installed as part of a Centos 7.2.1511 installation.
But this is quite an old version. I would first check whether the 7.4 version behaves any better. As for the 'phantom groups', there was a fix that sounds related; it will be released in 7.4.6 in a couple of weeks and actually even sooner in 7.5.0.
Thanks Jakub. Unfortunately we're on a mandated linux distro (RHEL/CentOS 7) so we're tied into their releases. Currently 1.15.2. We've definitely still got the groups issue.
id -G returns a list of numeric GIDs with no errors; id -Gn doesn't. We get at least one missing group with 'cannot find name for group ID' <group id>
This looks like the getgrgid() call as mentioned in the troubleshooting document - thanks. Now I'm thinking, could it be that something is broken in the AD setup?
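The difference between id -G and id -Gn can be reproduced outside of id. The following is a rough sketch, not a substitute for the SSSD logs: it fetches a user's numeric group list and then reports each GID for which getgrgid() finds no entry. The function names are hypothetical; the calls used are from Python's standard pwd, grp and os modules.

```python
import grp
import os
import pwd
import sys


def unresolved_gids(gids, lookup=grp.getgrgid):
    """Return the GIDs for which the lookup raises KeyError (no name found)."""
    missing = []
    for gid in gids:
        try:
            lookup(gid)
        except KeyError:
            missing.append(gid)
    return missing


def groups_of(username):
    """Return the numeric group list for a user, roughly what `id -G` prints."""
    primary_gid = pwd.getpwnam(username).pw_gid
    return os.getgrouplist(username, primary_gid)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 script.py <username>
    for gid in unresolved_gids(groups_of(sys.argv[1])):
        print("no name for GID", gid)
```

Run against an affected user, this should list the same GIDs that id -Gn complains about, which makes it easier to collect them across many accounts.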
Sorry for any confusion - I mentioned CentOS, but didn't mention that we're also running a later version of Red Hat (same symptoms, i.e. what looks like a failing getgrgid() for certain users) - hence the later 1.15 release mentioned above...
Thanks Jakub - appreciate it.
Regarding your group issue, do you have, or have you had, trusted domains, and could the mystery group be from another domain? Long shot, but we had the same error when it was trying to resolve the foreign group memberships.
Thanks Max - we have had some domain merging in the past. This is an old AD with a lot of history!
We'll have a good look into this - much appreciated.
So - as an update, setting
ignore_group_members = True
Resolves our broken group issues (as well as giving a nice performance boost)
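On a large cluster it may be worth checking that every host actually carries the setting. The following is a minimal sketch, assuming the option belongs in each [domain/...] section; the sample configuration, the example.com domain name and the function name are hypothetical. Interpolation is disabled because values such as fallback_homedir contain % characters, and Python's boolean parsing accepts more spellings (yes/no, on/off) than should be relied on in sssd.conf.

```python
import configparser

# Hypothetical sample; in practice read the text of /etc/sssd/sssd.conf.
SAMPLE = """\
[sssd]
domains = example.com
config_file_version = 2
services = nss, pam

[domain/example.com]
id_provider = ad
fallback_homedir = /home/%u@%d
ignore_group_members = True
"""


def domains_listing_members(conf_text):
    """Return domain sections where ignore_group_members is not enabled."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return [
        name
        for name in parser.sections()
        if name.startswith("domain/")
        and not parser.getboolean(name, "ignore_group_members", fallback=False)
    ]


if __name__ == "__main__":
    print(domains_listing_members(SAMPLE))
```

An empty result means every domain section has the option enabled; any section name returned still has SSSD resolving group members.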
Hope this helps someone!
Simon
Hi, faced the same issue with some accounts that were migrated from another domain in an Active Directory Inter-Forest scenario. Removing SIDHistory for those accounts solved the problem!