On Tue, Apr 14, 2015 at 03:37:33PM +0200, Jean-Baptiste Denis wrote:
> Perhaps, maybe the backend is signalling to the frontend too soon to
> check the cache again after the initial update.
In function sysdb_initgroups_with_views in file src/db/sysdb_search.c, we
wrapped the ldb_wait call :
    ret = ldb_request(domain->sysdb->ldb, req);
    if (ret == LDB_SUCCESS) {
        DEBUG(SSSDBG_TRACE_FUNC, "XXSYSDB before %d %s %d\n", ret, name,
              res->count);
        ret = ldb_wait(req->handle, LDB_WAIT_ALL);
        DEBUG(SSSDBG_TRACE_FUNC, "XXSYSDB after %d %s %d\n", ret, name,
              res->count);
    }
In some cases (we guess the ones that cause the problem on the client side), we
only have one result after the ldb_wait call :
/var/log/sssd/sssd_nss.log:(Tue Apr 14 15:25:32:222973 2015) [sssd[nss]]
[sysdb_initgroups_with_views] (0x0400): XXSYSDB before 0 jbdenis 1
/var/log/sssd/sssd_nss.log:(Tue Apr 14 15:25:32:223031 2015) [sssd[nss]]
[sysdb_initgroups_with_views] (0x0400): XXSYSDB after 0 jbdenis 1
We suppose that when everything is fine on the client side, we've got six
results after ldb_wait :
/var/log/sssd/sssd_nss.log:(Tue Apr 14 15:25:32:438755 2015) [sssd[nss]]
[sysdb_initgroups_with_views] (0x0400): XXSYSDB before 0 jbdenis 1
/var/log/sssd/sssd_nss.log:(Tue Apr 14 15:25:32:439140 2015) [sssd[nss]]
[sysdb_initgroups_with_views] (0x0400): XXSYSDB after 0 jbdenis 6
Do you think this is relevant to our problem here ? If indeed the backend is
signalling to the frontend too soon, where could we check that ? Do you have a
hint ?
I think this means the frontend (responder) either checks too soon or
the back end wrote incomplete data.
The responder is the sssd_nss process. When the getgrouplist() request
arrives, the cache validity is checked. If the cache is empty or too
old, the sssd_nss process queries the sssd_be process to update the
cache. When the sssd_be process is done, it sends a dbus signal (over a
private unix socket, not the system bus) that the cache is up-to-date
and the
I wonder if adding another sysdb_initgroups call into
sdap_get_initgr_recv() would verify when/if the groups were written?
> But I'm not sure how to help you without a local reproducer :-/
Yep, we understand. We're trying to build a test case, but no luck so far.
Thank you for your help.
I tried to write a simple program that just calls getgrouplist() in many
concurrent threads to simulate your behaviour, but couldn't reproduce
the problem...
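
A minimal sketch of such a test, assuming glibc and POSIX threads, might
look like the following. It is not the program referred to above: the names
run_concurrent_lookups, count_groups and ITERATIONS are made up for
illustration. It takes one reference group count up front and reports how
often a concurrent getgrouplist() call returns a different count (for
example 1 instead of 6). Note the assumption that the reference call itself
sees a complete cache.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <grp.h>
#include <pthread.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

#define ITERATIONS 100  /* lookups per thread; arbitrary value for this sketch */

/* Hypothetical helper: number of groups getgrouplist() reports, or -1. */
static int count_groups(const char *user, gid_t gid)
{
    int ngroups = 64;
    gid_t *groups = malloc(ngroups * sizeof(*groups));
    if (groups == NULL) return -1;

    if (getgrouplist(user, gid, groups, &ngroups) == -1) {
        /* Buffer too small: ngroups now holds the required size. */
        gid_t *tmp = realloc(groups, ngroups * sizeof(*groups));
        if (tmp == NULL) { free(groups); return -1; }
        groups = tmp;
        if (getgrouplist(user, gid, groups, &ngroups) == -1) {
            free(groups);
            return -1;
        }
    }
    free(groups);
    return ngroups;
}

struct worker_arg {
    const char *user;
    gid_t gid;
    int expected;    /* reference count taken before the threads start */
    int mismatches;  /* lookups whose count differed from the reference */
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < ITERATIONS; i++) {
        if (count_groups(a->user, a->gid) != a->expected)
            a->mismatches++;
    }
    return NULL;
}

/* Returns the total number of mismatching lookups, or -1 on setup failure. */
int run_concurrent_lookups(const char *user, int nthreads)
{
    assert(user != NULL);
    if (nthreads <= 0) return -1;

    struct passwd *pw = getpwnam(user);  /* main thread only */
    if (pw == NULL) return -1;
    gid_t gid = pw->pw_gid;

    int expected = count_groups(user, gid);
    if (expected < 0) return -1;

    pthread_t *tids = calloc(nthreads, sizeof(*tids));
    struct worker_arg *args = calloc(nthreads, sizeof(*args));
    if (tids == NULL || args == NULL) { free(tids); free(args); return -1; }

    int started = 0;
    for (int i = 0; i < nthreads; i++) {
        args[i] = (struct worker_arg){ user, gid, expected, 0 };
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0) break;
        started++;
    }

    int total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].mismatches;
    }
    free(tids);
    free(args);
    return total;
}
```

Calling run_concurrent_lookups("jbdenis", 32) from a small main() and
printing the result would give the number of short answers. To exercise
the responder/back end path at all, the cached entry has to be expired or
invalidated while the threads are running; otherwise every lookup is served
straight from the cache.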