Sumit,

OK, I had time today to collect all the logs you wanted.  I have the /var/log/sssd/* logs from a run where sssd_be core dumps and from a run where it doesn't, all captured at debug_level = 9.

The core dump appears in /var/log/messages as follows:

Sep 25 14:18:19 ol8test01 systemd-coredump[84828]: Process 84817 (sssd_be) of user 0 dumped core.#012#012Stack trace of thread 84817:#012#0  0x00007f5068a613c0 ad_get_account_domain_search (libsss_ad.so)#012#1  0x00007f5068a61552 ad_get_account_domain_connect_done (libsss_ad.so)#012#2  0x00007f506841ff82 sdap_id_op_connect_done (libsss_ldap_common.so)#012#3  0x00007f5068415f9a sdap_auth_done (libsss_ldap_common.so)#012#4  0x00007f50706560f9 tevent_common_invoke_immediate_handler (libtevent.so.0)#012#5  0x00007f5070656127 tevent_common_loop_immediate (libtevent.so.0)#012#6  0x00007f507065bf1f epoll_event_loop_once (libtevent.so.0)#012#7  0x00007f507065a1bb std_event_loop_once (libtevent.so.0)#012#8  0x00007f5070655395 _tevent_loop_once (libtevent.so.0)#012#9  0x00007f507065563b tevent_common_loop_wait (libtevent.so.0)#012#10 0x00007f507065a14b std_event_loop_wait (libtevent.so.0)#012#11 0x00007f50738f7a07 server_loop (libsss_util.so)#012#12 0x000055fe299ae38b main (sssd_be)#012#13 0x00007f506fe46813 __libc_start_main (libc.so.6)#012#14 0x000055fe299ae54e _start (sssd_be)
Sep 25 14:19:49 ol8test01 realmd[84809]: * /usr/bin/systemctl restart sssd.service
Sep 25 14:19:49 ol8test01 systemd-logind[1226]: Failed to start session scope session-81343.scope: Process with ID 84885 does not exist.
Sep 25 14:19:49 ol8test01 systemd[1]: Stopping System Security Services Daemon...

So the segfault occurs at 14:18:19 in the /var/log/sssd/* logs.
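
The systemd-coredump entry is hard to read because syslog has escaped each newline as '#012' (the octal code for a line feed). A minimal sketch for turning such a message back into one frame per line follows; the function names are made up for illustration, and the sample string is a shortened copy of the first two frames from the entry above.

```python
import re

def split_escaped_trace(message):
    """Split a syslog message on '#012' (escaped newline) and drop empty lines."""
    return [part.strip() for part in message.split("#012") if part.strip()]

def parse_frames(lines):
    """Pull (index, address, function, module) out of lines shaped like
    '#0  0x00007f... some_function (some_library.so)'."""
    frame_re = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+)\s+(\S+)\s+\((\S+)\)$")
    frames = []
    for line in lines:
        match = frame_re.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2),
                           match.group(3), match.group(4)))
    return frames

# Shortened copy of the message body logged at 14:18:19.
sample = (
    "Process 84817 (sssd_be) of user 0 dumped core.#012#012"
    "Stack trace of thread 84817:#012"
    "#0  0x00007f5068a613c0 ad_get_account_domain_search (libsss_ad.so)#012"
    "#1  0x00007f5068a61552 ad_get_account_domain_connect_done (libsss_ad.so)"
)

lines = split_escaped_trace(sample)
frames = parse_frames(lines)
for index, address, function, module in frames:
    print(f"#{index:<2} {address} {function} ({module})")
```

Run against the full entry, the first frame printed is the faulting function, ad_get_account_domain_search in libsss_ad.so.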

I included the good sssd.conf file and the bad sssd.conf file.   The only difference is that in the bad sssd.conf file, each [domain/XXXX] stanza has these lines removed:

    ldap_sasl_authid = XXX
    ldap_search_base = XXX

But still -- that shouldn't cause a segfault.
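
For anyone who wants to confirm that these two options really are the only difference, here is a rough sketch that compares two sssd.conf texts section by section using Python's configparser. The helper names are hypothetical, and the sample stanzas are placeholders (including the id_provider line), not the real files from the tarball.

```python
import configparser

def load_conf(text):
    """Parse INI-style sssd.conf text; interpolation is disabled so that
    '%' characters in values are taken literally."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser

def removed_options(good_text, bad_text):
    """Return {section: [options present in good but missing in bad]}."""
    good, bad = load_conf(good_text), load_conf(bad_text)
    removed = {}
    for section in good.sections():
        good_keys = set(good.options(section))
        bad_keys = set(bad.options(section)) if bad.has_section(section) else set()
        missing = sorted(good_keys - bad_keys)
        if missing:
            removed[section] = missing
    return removed

# Placeholder stanzas; values are elided as XXX just as in this mail.
GOOD = """
[domain/AMER.COMPANY.COM]
id_provider = ad
ldap_sasl_authid = XXX
ldap_search_base = XXX
"""

BAD = """
[domain/AMER.COMPANY.COM]
id_provider = ad
"""

diff = removed_options(GOOD, BAD)
for section, options in diff.items():
    print(section, "->", ", ".join(options))
```

With the real files, every [domain/...] stanza should show up here with exactly those two options listed.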

Here is the Dropbox link to the log tarballs and the sssd.conf files.

https://www.dropbox.com/sh/4pvsnlo7ab8azt6/AAAXkBg99wCd-A6tZsxJZm33a?dl=0

BTW, this occurs only on RHEL8.  With that same sssd.conf file on RHEL7, it does not segfault.

Spike

On Mon, Sep 23, 2019 at 2:48 PM Spike White <spikewhitetx@gmail.com> wrote:
Yes, as you say -- our adcli invocation must add host/<fqdn>@<REALM> to the userPrincipalName.

Here's the attributes associated with a random server AD joined via adcli/sssd:

dn: CN=ACMORASTG01,OU=Servers,OU=UNIX,DC=amer,DC=company,DC=com
cn: ACMORASTG01
distinguishedName: CN=ACMORASTG01,OU=Servers,OU=UNIX,DC=amer,DC=company,DC=com
name: ACMORASTG01
sAMAccountName: ACMORASTG01$
dNSHostName: acmorastg01.company.com
userPrincipalName: host/acmorastg01.company.com@AMER.COMPANY.COM
servicePrincipalName: RestrictedKrbHost/acmorastg01.company.com
servicePrincipalName: RestrictedKrbHost/ACMORASTG01
servicePrincipalName: host/acmorastg01.company.com
servicePrincipalName: host/ACMORASTG01

I'll try to get the logs before and after, share them via dropbox.

Spike


On Mon, Sep 23, 2019 at 6:41 AM Sumit Bose <sbose@redhat.com> wrote:
On Mon, Sep 16, 2019 at 05:47:04PM -0500, Spike White wrote:
> All,
>
> This was a case where 'realm permit' of a user was causing a back-end sssd
> process (sssd_be) to core dump.  (sigsegv).   I reported this to this group
> a few months ago.  We're working this case with the Linux OS vendor.  Turns
> out, if we explicitly add:
>
> ldap_sasl_authid = host/<HOST>@<HOST's REALM>
>
> to each [domain/XXX.COMPANY.COM] stanza in /etc/sssd/sssd.conf file, it no
> longer core dumps.
>
> That is, we have these child AD domains defined in sssd.conf
>
> [domain/AMER.COMPANY.COM]
>
> [domain/EMEA.COMPANY.COM]
>
> [domain/APAC.COMPANY.COM]
>
> However, our host is registered in only one child domain.  Say AMER for a
> server amerhost1 in North America.   So we'd set:
>
> ldap_sasl_authid = host/amerhost1@AMER.COMPANY.COM  in each domain stanza
> above.

Hi,

it would be good to see some before and after debug logs.


If ldap_sasl_authid is not set SSSD tries to determine it from the
keytab with a priority as given in the sssd-ldap man page:

               hostname@REALM
               netbiosname$@REALM
               host/hostname@REALM
               *$@REALM
               host/*@REALM
               host/*

For a domain other than AMER.COMPANY.COM all patterns with '@REALM' would
not match since the realm in the keytab will be AMER.COMPANY.COM. The
last entry would match 'host/amerhost1@AMER.COMPANY.COM', but maybe there
is another entry earlier in the keytab which matches first? The
logs would show which principal was selected with ldap_sasl_authid set.
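
The fallback described above can be illustrated with a small sketch. This is not SSSD's actual code: the function names, the use of shell-style wildcard matching, the loop order (patterns first, then keytab entries), and the two keytab entries are all assumptions made only to show why a non-AMER domain could end up on the last 'host/*' pattern.

```python
from fnmatch import fnmatchcase

def candidate_patterns(hostname, netbios_name, realm):
    """Patterns in the priority order listed in the sssd-ldap man page."""
    return [
        f"{hostname}@{realm}",
        f"{netbios_name}$@{realm}",
        f"host/{hostname}@{realm}",
        f"*$@{realm}",
        f"host/*@{realm}",
        "host/*",
    ]

def select_principal(keytab_principals, hostname, netbios_name, realm):
    """Return (principal, pattern) for the first pattern that matches any
    keytab entry, or None if nothing matches."""
    for pattern in candidate_patterns(hostname, netbios_name, realm):
        for principal in keytab_principals:
            if fnmatchcase(principal, pattern):
                return principal, pattern
    return None

# Hypothetical keytab for a host joined to AMER only.
keytab = [
    "AMERHOST1$@AMER.COMPANY.COM",
    "host/amerhost1@AMER.COMPANY.COM",
]

amer = select_principal(keytab, "amerhost1", "AMERHOST1", "AMER.COMPANY.COM")
emea = select_principal(keytab, "amerhost1", "AMERHOST1", "EMEA.COMPANY.COM")
print("AMER:", amer)
print("EMEA:", emea)
```

In this sketch the AMER domain picks 'AMERHOST1$@AMER.COMPANY.COM' via the netbiosname$@REALM pattern, while the EMEA domain falls through every '@REALM' pattern and lands on 'host/amerhost1@AMER.COMPANY.COM' via 'host/*'.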

What is a bit puzzling is that by default
'host/amerhost1@AMER.COMPANY.COM' is a service principal and AD does not
allow service principals for authentication. So I assume that you either
added 'host/amerhost1@AMER.COMPANY.COM' to the userPrincipalName
attribute of the host object or configured AD to allow service
principals for authentication.

The second thing which is puzzling, if the wrong principal was chosen
for authentication, authentication will just fail and the backend should
switch into offline mode.

And finally, according to the case you've opened, the crash happened in
the process which handles the AMER.COMPANY.COM domain and not in one of
the others which might have chosen a wrong principal.

So, if you can attach to the case the logs with 'debug_level=9' in all
[domain/...] sections of sssd.conf, once with ldap_sasl_authid set and
once without, it might help to understand why SSSD fails without
ldap_sasl_authid set.

bye,
Sumit

>
> Why does this prevent sssd_be from core dumping?  Not a clue!  But sssd
> performs flawlessly once this is added.
>
> Spike
>
>
> On Thu, Aug 8, 2019 at 9:09 AM Spike White <spikewhitetx@gmail.com> wrote:
>
> > Here is the bugzilla link to the ticket:
> >
> >    https://bugzilla.redhat.com/show_bug.cgi?id=1738375
> >
> >    So it appears a BZ has been created.
> >
> > Spike
> >
> > On Tue, Jul 16, 2019 at 3:32 PM Jakub Hrozek <jhrozek@redhat.com> wrote:
> >
> >> On Tue, Jul 16, 2019 at 12:32:29PM -0500, Spike White wrote:
> >> > The following case has been opened with RHEL support on this.  It was
> >> > opened this morning:
> >> >
> >> > (SEV 4) Case #02427449 ('realm permit group@DOMAIN' causing background
> >> > process sssd_be to segfault.)
> >>
> >> Thank you, comment added. I hope a BZ would be created soon.
> >> _______________________________________________
> >> sssd-users mailing list -- sssd-users@lists.fedorahosted.org
> >> To unsubscribe send an email to sssd-users-leave@lists.fedorahosted.org
> >> Fedora Code of Conduct:
> >> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> >> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> >> List Archives:
> >> https://lists.fedorahosted.org/archives/list/sssd-users@lists.fedorahosted.org
> >>
> >
