On 09/27/2016 07:38 AM, Richard Collins wrote:
Running Red Hat Enterprise Linux Server release 6.5 (Santiago) -
2.6.32-431.el6.x86_64
SSSD version: sssd-1.13.3-22.el6_8.4.x86_64
I'm seeing (seemingly random?) shutdown/termination of sssd across
multiple nodes, all with the same configuration. To my knowledge there
is no process going around killing things, we even have a scheduled job
to check sssd status and restart every 5 minutes if unavailable:
/var/log/sssd/sssd.log:284469:(Mon Sep 26 12:21:29 2016) [sssd]
[monitor_quit_signal] (0x2000): Received shutdown command
/var/log/sssd/sssd.log:318707:(Mon Sep 26 16:19:19 2016) [sssd]
[monitor_quit_signal] (0x2000): Received shutdown command
/var/log/sssd/sssd.log:321889:(Mon Sep 26 16:43:12 2016) [sssd]
[monitor_quit_signal] (0x2000): Received shutdown command
/var/log/sssd/sssd.log:474327:(Tue Sep 27 10:29:39 2016) [sssd]
[monitor_quit_signal] (0x2000): Received shutdown command
/var/log/sssd/sssd.log:475205:(Tue Sep 27 10:34:36 2016) [sssd]
[monitor_quit_signal] (0x2000): Received shutdown command
The monitor_quit_signal function should only be called when the SSSD
monitor process receives SIGINT or SIGTERM. It looks like you already
have debug_level = 9 in the [sssd] (monitor) section of sssd.conf, so I
would hope to see some more useful messages in /var/log/sssd/sssd.log
around the same timeframe as above.
If that is not the case, you could try running a systemtap script like
the one here to determine if there is an unexpected script or process
sending these signals:
https://sourceware.org/systemtap/examples/process/sigkill.stp
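Before reaching for systemtap, it may help to list exactly when the monitor logged a shutdown, so the times can be compared against cron entries and other scheduled jobs. The following is only a rough sketch: the helper name shutdown_times is made up for illustration, and it assumes each debug message occupies one line in the real log file and starts with a timestamp in the same format as the excerpts above (parsed with English day and month names).

```python
import re
from datetime import datetime

# Timestamp as it appears in the log excerpts, e.g. "(Mon Sep 26 16:43:12 2016)".
STAMP = re.compile(r"\((\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \d{4})\)")


def shutdown_times(lines):
    """Return a datetime for every monitor_quit_signal message in `lines`.

    Hypothetical helper: assumes one debug message per line.
    """
    times = []
    for line in lines:
        if "[monitor_quit_signal]" not in line:
            continue
        match = STAMP.search(line)
        if match:
            # Assumes a C/English locale for the %a and %b fields.
            times.append(datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y"))
    return times


# Example use against the real log (path taken from the report above):
#   with open("/var/log/sssd/sssd.log") as log:
#       for when in shutdown_times(log):
#           print(when.isoformat())
```

If the printed times line up with a scheduled job, that job is the first thing to inspect. If they do not, the systemtap approach is still the way to identify the sender of the signal.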
Right before each shutdown, there are lots of the following
nss_cmd_getbynam and sss_ncache_check_str entries for 'root' in
sssd_nss.log:
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [nss_cmd_getbynam] (0x0400):
Running command [38][SSS_NSS_INITGR] with input [root].
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [sss_parse_name_for_domains]
(0x0200): name 'root' matched without domain, user is root
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [nss_cmd_getbynam] (0x0100):
Requesting info for [root] from [<ALL>]
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [sss_ncache_check_str] (0x2000):
Checking negative cache for [NCE/USER/MYDOMAIN/root]
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [nss_cmd_initgroups_search]
(0x0400): User [root] does not exist in [MYDOMAIN]! (negative cache)
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [nss_cmd_initgroups_search]
(0x0080): No matching domain found for [root], fail!
(Mon Sep 26 16:43:11 2016) [sssd[nss]] [reset_idle_timer] (0x4000): Idle
timer re-set for client [0xf7e120][24]
(Mon Sep 26 16:43:12 2016) [sssd[nss]] [sss_responder_ctx_destructor]
(0x0400): Responder is being shut down
(Mon Sep 26 16:43:12 2016) [sssd[nss]] [client_destructor] (0x2000):
Terminated client [0xf7e120][24]
(Mon Sep 26 16:43:12 2016) [sssd[nss]] [client_destructor] (0x2000):
Terminated client [0xf840e0][23]
(Mon Sep 26 16:43:12 2016) [sssd[nss]] [client_destructor] (0x2000):
Terminated client [0xf7b500][22]
You have 'filter_users = root' in sssd.conf, so these messages about
'root' are expected. When the monitor shuts down it terminates its
child processes, which is why the NSS responder is being shut down
here.
Corresponding AD log for same period:
(Mon Sep 26 16:43:10 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x4000): dbus conn: 0x142aa90
(Mon Sep 26 16:43:10 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x4000): Dispatching.
(Mon Sep 26 16:43:10 2016) [sssd[be[MYDOMAIN]]] [sbus_message_handler]
(0x2000): Received SBUS method org.freedesktop.sssd.service.ping on path
/org/freedesktop/sssd/service
(Mon Sep 26 16:43:10 2016) [sssd[be[MYDOMAIN]]]
[sbus_get_sender_id_send] (0x2000): Not a sysbus message, quit
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x1440c50/0x143e080
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x1440c50/0x143e030
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x4000): dbus conn: 0x143eb00
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x0080): Connection is not open for dispatching.
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [be_client_destructor]
(0x0400): Removed SUDO client
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x1444030/0x14420b0
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x1444030/0x1442060
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x4000): dbus conn: 0x1443250
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x0080): Connection is not open for dispatching.
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [be_client_destructor]
(0x0400): Removed PAM client
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x143d070/0x142c0d0
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x143d070/0x142aeb0
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x4000): dbus conn: 0x143c570
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_dispatch]
(0x0080): Connection is not open for dispatching.
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [be_client_destructor]
(0x0400): Removed NSS client
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [be_ptask_destructor]
(0x0400): Terminating periodic task [SUDO Smart Refresh]
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [be_ptask_destructor]
(0x0400): Terminating periodic task [SUDO Full Refresh]
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sdap_handle_release]
(0x2000): Trace: sh[0x14f9ff0], connected[1], ops[(nil)],
ldap[0x1449c10], destructor_lock[0], release_memory[0]
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]]
[remove_connection_callback] (0x4000): Successfully removed connection
callback.
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [sbus_remove_watch]
(0x2000): 0x142f250/0x1417480
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [remove_socket_symlink]
(0x4000): The symlink points to
[/var/lib/sss/pipes/private/sbus-dp_MYDOMAIN.11328]
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [remove_socket_symlink]
(0x4000): The path including our pid is
[/var/lib/sss/pipes/private/sbus-dp_MYDOMAIN.11328]
(Mon Sep 26 16:43:12 2016) [sssd[be[MYDOMAIN]]] [remove_socket_symlink]
(0x4000): Removed the symlink
AD controllers are WIN2012R2
SSSD is configured with a single domain (MYDOMAIN)
######begin sssd.conf (redacted)#####
[sssd]
config_file_version = 2
services = nss, pam, sudo
domains = MYDOMAIN
debug_level = 9
[nss]
default_shell = /bin/bash
debug_level = 9
filter_users = root
filter_groups = root
[pam]
debug_level = 9
[sudo]
debug_level = 9
[domain/MYDOMAIN]
id_provider = ldap
access_provider = simple
cache_credentials = false
debug_level = 9
ldap_server = _srv_
ldap_search_base = #########
ldap_id_use_start_tls = true
ldap_tls_reqcert = allow
ldap_default_bind_dn = #########
ldap_default_authtok_type = password
ldap_default_authtok = #########
ldap_user_search_base = ou=BusinessUnits,dc=mydomain
ldap_user_object_class = user
ldap_id_mapping = true
ldap_schema = ad
ldap_group_search_base = #########
ldap_group_object_class = group
ldap_referrals = false
enumerate = false
override_homedir = /export/home/%u
ldap_group_nesting_level = 5
ldap_use_tokengroups = false
simple_allow_groups = sasi,sasadmin,sasmgt ldap_access_order = expire
ldap_account_expire_policy = ad
######end sssd.conf#####
For the most part this sssd.conf looks okay to me, except for:
ldap_server = _srv_
I could not find this option in the man page; it looks to be invalid or
deprecated. The documented option for listing LDAP servers is ldap_uri,
which also accepts _srv_ to request DNS service discovery.
simple_allow_groups = sasi,sasadmin,sasmgt ldap_access_order = expire
ldap_account_expire_policy = ad
Are these three options all defined on the same line, or did the email
formatting merge them onto one line?
I would fix both of these and see if that helps.
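To rule out the email client as the cause, the file on disk can be checked for lines where a second option has ended up inside another option's value. This is a heuristic sketch only: the function name merged_option_lines is hypothetical, and the check simply looks for whitespace followed by something shaped like "name =" inside a value, so it can miss cases or flag unusual but valid values.

```python
import re

# Whitespace, then a lowercase option-like name, then "=".
EMBEDDED = re.compile(r"\s([a-z][a-z0-9_]*)\s*=")


def merged_option_lines(conf_text):
    """Return (line_number, embedded_option) for values that seem to hold a second option.

    Hypothetical helper; a heuristic, not a real sssd.conf parser.
    """
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines, comments, and section headers.
        if not stripped or stripped.startswith(("#", ";", "[")):
            continue
        _key, sep, value = stripped.partition("=")
        if not sep:
            continue
        # Strip first so the space after the real "=" is not mistaken for a separator.
        match = EMBEDDED.search(value.strip())
        if match:
            hits.append((number, match.group(1)))
    return hits


# Example use (standard config path, an assumption about this system):
#   with open("/etc/sssd/sssd.conf") as conf:
#       print(merged_option_lines(conf.read()))
```

An empty result suggests the options are on separate lines in the real file and only the email wrapped them together.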