All (but particularly Sumit since he wrote the comments on
https://bugzilla.redhat.com/show_bug.cgi?id=1984591),
There are at least two problems created by this recently-introduced sssd
bug. One problem is solvable by the suggested work-around, the other is
not. The work-around suggested is:
[domain/name.of.joined.domain]
ad_enabled_domains = dom1.example.com, dom2.example.com, dom3.example.com
This restricts sssd to querying only the desired AD domains.
What is the bug?
The sssd-ad man page says "The AD provider can be used to get user
information and authenticate users from trusted domains. Currently
only trusted domains in the same forest are recognized.".
What is happening is that untrusted AD domains are being discovered, and
a very specific type of untrusted domain at that: the joined domain has
no trust with the other domain, but the other domain trusts the joined
domain. That is a one-way trust, in the wrong direction. To the joined
domain, this is an untrusted domain and should not be discovered.
This is actually very common in corporate environments. You may have a
main AD domain, call it CORP.COMPANY.COM. Then for testing and new
production evaluation, you might have a test AD domain called
LAB-TEST.COMPANY.COM. CORP.COMPANY.COM is tightly controlled, with full
audits and corporate security. LAB-TEST.COMPANY.COM is a test AD domain;
it’s the wild, wild west!

So LAB-TEST.COMPANY.COM trusts the main AD domain (in order that users
can log into this test domain with their CORP accounts). But
CORP.COMPANY.COM does not trust LAB-TEST.COMPANY.COM, nor should it!!
(That’s the wild, wild west; doing so would compromise corporate
security.)

Thus, a server joined to CORP.COMPANY.COM should discover
CORP.COMPANY.COM and any domains trusted by CORP.COMPANY.COM. It should
*NOT* discover LAB-TEST.COMPANY.COM, as CORP.COMPANY.COM does not trust
this domain.

A server joined to LAB-TEST.COMPANY.COM should discover
LAB-TEST.COMPANY.COM and all domains trusted by LAB-TEST.COMPANY.COM,
including CORP.COMPANY.COM, as LAB-TEST.COMPANY.COM trusts
CORP.COMPANY.COM.

The bug is that a server joined to CORP.COMPANY.COM discovers
LAB-TEST.COMPANY.COM, which it shouldn’t.
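To make the expected behaviour concrete, here is a minimal sketch in Python of the trust-direction rule described above. It is only a model of my expectation, not sssd code: the TRUSTS set and the expected_domains() function are names I made up for illustration, and only direct trusts are considered.

```python
# Hypothetical model of directional AD trusts; this is not sssd code.
# A pair (a, b) means "a trusts b": accounts from b may be used in a.
TRUSTS = {
    ("LAB-TEST.COMPANY.COM", "CORP.COMPANY.COM"),
}


def expected_domains(joined, trusts):
    """Return the joined domain plus every domain it directly trusts."""
    found = {joined}
    for truster, trusted in trusts:
        if truster == joined:
            found.add(trusted)
    return found


# A CORP client should see only CORP; a LAB-TEST client should see both.
print(sorted(expected_domains("CORP.COMPANY.COM", TRUSTS)))
print(sorted(expected_domains("LAB-TEST.COMPANY.COM", TRUSTS)))
```

The first call yields only CORP.COMPANY.COM, while the second yields both domains. The behaviour I am reporting corresponds to the first call also returning LAB-TEST.COMPANY.COM.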
What problems does this cause?
Two problems.
1. Many of these untrusted discovered “lab” domains are accessible
only from specific network locations. That is, they’re firewalled off to a
particular lab. So sssd attempts to query these inaccessible AD domains
and takes a long time to time out. This problem can be worked around by
the suggested work-around in the Bugzilla:
[domain/corp.company.com]
ad_enabled_domains = corp.company.com
With that in place, LAB-TEST.COMPANY.COM is still erroneously
discovered, but it is no longer searched, and sssd is fast again.
2. Bogus messages in the /var/log/sssd/sssd_nss.log file. Even with no
debug level set in the [nss] stanza, these error messages appear multiple
times a second. They quickly fill up the /var/log filesystem.
[root@auspdfdlobv01 sssd]# cat sssd_nss.log |grep "The Data Provider returned an error"
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
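To put a number on "multiple times a second", something like the following Python sketch could count these errors per timestamp. It assumes each log entry sits on a single line in the file (as sssd writes it, before mail wrapping); ERROR_RE and errors_per_second() are names I made up, and the sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches the timestamp of the "Data Provider returned an error" entries.
ERROR_RE = re.compile(
    r"^\((?P<ts>[^)]+)\): \[nss\] \[sss_dp_get_reply\] \(0x0010\): "
    r"The Data Provider returned an error"
)


def errors_per_second(lines):
    """Count matching error entries, keyed by their timestamp string."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            counts[match.group("ts")] += 1
    return counts


# Three sample entries taken from the log excerpt, one per line.
ENTRY = (
    "(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): "
    "The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]"
)
for timestamp, count in errors_per_second([ENTRY] * 3).items():
    print(timestamp, count)
```

On a real system the lines would come from the log file itself (for example by iterating over an open file object) rather than from the ENTRY sample.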
From debug level 9, it is clear that these errors arise from queries to
the erroneously-discovered untrusted domains. Here’s one instance of the
above with debug level 9 turned on. emeaicmd.geodll.company.com is one of
these erroneously-discovered untrusted lab domains, and it happens to be
firewalled off from this particular AD client:
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x1000): Got reply from Data Provider - DP error code: 0 errno: 0 error message: Success
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9: Looking up [oracle@company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9: Object [oracle@company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache_add_to_domain] (0x0400): CR #9: Adding [oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [is_user_local_by_name] (0x0400): User oracle@company.com is a local user
(2021-10-07 9:50:02): [nss] [sss_ncache_set_str] (0x0400): Adding [NCE/USER/company.com/oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [cache_req_validate_domain_type] (0x2000): Request type POSIX-only for domain EMEAICMD.geodll.company.com type POSIX is valid
(2021-10-07 9:50:02): [nss] [cache_req_set_domain] (0x0400): CR #9: Using domain [EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_prepare_domain_data] (0x0400): CR #9: Preparing input data for domain [EMEAICMD.geodll.company.com] rules
(2021-10-07 9:50:02): [nss] [cache_req_search_send] (0x0400): CR #9: Looking up oracle@emeaicmd.geodll.company.com
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #9: Checking negative cache for [oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/USER/EMEAICMD.geodll.company.com/oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #9: [oracle@emeaicmd.geodll.company.com] is not present in negative cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9: Looking up [oracle@emeaicmd.geodll.company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #9: Object [oracle@emeaicmd.geodll.company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_dp] (0x0400): CR #9: Looking up [oracle@emeaicmd.geodll.company.com] in data provider
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Issuing request for [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_get_account_msg] (0x0400): Creating request for [EMEAICMD.geodll.company.com][0x3][BE_REQ_INITGROUPS][name=oracle@emeaicmd.geodll.company.com:-]
(2021-10-07 9:50:02): [nss] [sbus_add_timeout] (0x2000): 0x564d6ccd6670
(2021-10-07 9:50:02): [nss] [sss_dp_internal_get_send] (0x0400): Entering request [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12: Looking up [oracle@company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12: Object [oracle@company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache_add_to_domain] (0x0400): CR #12: Adding [oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [is_user_local_by_name] (0x0400): User oracle@company.com is a local user
(2021-10-07 9:50:02): [nss] [sss_ncache_set_str] (0x0400): Adding [NCE/USER/company.com/oracle@company.com] to negative cache
(2021-10-07 9:50:02): [nss] [cache_req_validate_domain_type] (0x2000): Request type POSIX-only for domain EMEAICMD.geodll.company.com type POSIX is valid
(2021-10-07 9:50:02): [nss] [cache_req_set_domain] (0x0400): CR #12: Using domain [EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_prepare_domain_data] (0x0400): CR #12: Preparing input data for domain [EMEAICMD.geodll.company.com] rules
(2021-10-07 9:50:02): [nss] [cache_req_search_send] (0x0400): CR #12: Looking up oracle@emeaicmd.geodll.company.com
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #12: Checking negative cache for [oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_ncache_check_str] (0x2000): Checking negative cache for [NCE/USER/EMEAICMD.geodll.company.com/oracle@emeaicmd.geodll.company.com]
(2021-10-07 9:50:02): [nss] [cache_req_search_ncache] (0x0400): CR #12: [oracle@emeaicmd.geodll.company.com] is not present in negative cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12: Looking up [oracle@emeaicmd.geodll.company.com] in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_cache] (0x0400): CR #12: Object [oracle@emeaicmd.geodll.company.com] was not found in cache
(2021-10-07 9:50:02): [nss] [cache_req_search_dp] (0x0400): CR #12: Looking up [oracle@emeaicmd.geodll.company.com] in data provider
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Issuing request for [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_issue_request] (0x0400): Identical request in progress: [0x564d6be36a70:3:oracle@emeaicmd.geodll.company.com@EMEAICMD.geodll.company.com]
(2021-10-07 9:50:02): [nss] [sss_dp_req_destructor] (0x0400): Deleting request: [0x564d6be36a70:3:oracle@company.com@company.com]
(2021-10-07 9:50:02): [nss] [sbus_remove_timeout] (0x2000): 0x564d6ccd6670
(2021-10-07 9:50:02): [nss] [sbus_dispatch] (0x4000): dbus conn: 0x564d6ccc9300
(2021-10-07 9:50:02): [nss] [sbus_dispatch] (0x4000): Dispatching.
(2021-10-07 9:50:02): [nss] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.DBus.Error.Failed]
The suggested work-around does not resolve problem #2.
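For anyone who wants to check which domains the nss responder is still sending requests to, here is a rough Python sketch that tallies the domain named in each "Creating request for [...]" entry of a debug-level log. It assumes one log entry per line, as in the file on disk; REQUEST_RE and requested_domains() are hypothetical names of mine, and the sample entries are taken from the excerpt above.

```python
import re
from collections import Counter

# Captures the first bracketed value after "Creating request for",
# which in the excerpt above is the domain the request is sent to.
REQUEST_RE = re.compile(
    r"\[sss_dp_get_account_msg\].*Creating request for \[([^\]]+)\]"
)


def requested_domains(lines):
    """Count data provider requests per domain (lower-cased)."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1).strip().lower()] += 1
    return counts


SAMPLE = [
    "(2021-10-07 9:50:02): [nss] [sss_dp_get_account_msg] (0x0400): "
    "Creating request for [EMEAICMD.geodll.company.com][0x3]"
    "[BE_REQ_INITGROUPS][name=oracle@emeaicmd.geodll.company.com:-]",
    "(2021-10-07 9:50:02): [nss] [sbus_add_timeout] (0x2000): 0x564d6ccd6670",
]
for domain, count in requested_domains(SAMPLE).items():
    print(domain, count)
```

Any domain that shows up here but is not listed in ad_enabled_domains would be a candidate source of the error messages.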
BTW, here is a listing of the domains discovered on that sssd client:
[root@auspdfdlobv01 ~]# sssctl domain-list
amer.company.com
company.com
japn.company.com
emea.company.com
apac.company.com
EMEAICMD.geodll.company.com
geodll.company.com
EMEAICM.GEODLL.COMPANY.COM
alienware.com
corp.svcs
perotsystems.net
companyservices.dmz
Beer.Town
production.online.company.com
jp-poclab.companypoc.com
emea-poclab.companypoc.com
oldev.preol.company.com
olqa.preol.company.com
ap-poclab.companypoc.com
[root@auspdfdlobv01 ~]#
This sssd client is joined to amer.company.com, so the only trusted
domains are the first 5: the parent domain and the 4 regional domains.
All the other domains below those are untrusted. More specifically, they
trust company.com, but company.com does not trust them (a one-way trust,
in the wrong direction). Some look like the real wild, wild west
(Beer.Town?).
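As a quick way to spot such entries on other clients, a small Python sketch like the one below could compare the sssctl domain-list output against the domains that are actually expected. The TRUSTED set reflects my statement above about this client; unexpected_domains() and the shortened LISTING sample are made up for illustration.

```python
# Domains this amer.company.com client is expected to see (from the text).
TRUSTED = {
    "amer.company.com",
    "company.com",
    "japn.company.com",
    "emea.company.com",
    "apac.company.com",
}


def unexpected_domains(domain_list_output, trusted):
    """Return listed domains that are not in the trusted set."""
    trusted_lower = {name.lower() for name in trusted}
    result = []
    for line in domain_list_output.splitlines():
        name = line.strip()
        if name and name.lower() not in trusted_lower:
            result.append(name)
    return result


# Shortened sample of the listing above, one domain per line.
LISTING = """amer.company.com
company.com
japn.company.com
emea.company.com
apac.company.com
EMEAICMD.geodll.company.com
Beer.Town
"""
print(unexpected_domains(LISTING, TRUSTED))
```

On a live client the listing text could be captured from the sssctl domain-list command instead of the LISTING sample.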
Spike