URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: opened
PR body: """ This patch set basically changes the behaviour of the cleanup_user() and cleanup_groups() functions used by ldap_id_cleanup().
With the current behaviour the searches for the users/groups to be cleaned up are always done in the cache, while the up to date information about their expiration is kept in the timestamp cache.
So, in order to fix [3389](https://pagure.io/SSSD/sssd/issue/3369), let's perform the search in the timestamp cache whenever it's possible.
I'm really not sure whether the dn_filter I'm building is totally correct, so I'd like to ask whoever review this patchset to take a careful look that the last patch, specially in the cleanup_dn_filter() function.
The way I'm testing this issue is: - I have a LDAP server with a few users and have the following options configured on sssd.conf: ``` enumerate = true ldap_enumeration_refresh_timeout = 35 ldap_purge_cache_timeout = 60 entry_cache_timeout = 20 ``` """
To pull the PR as Git branch: git remote add ghsssd https://github.com/SSSD/sssd git fetch ghsssd pull/289/head:pr289 git checkout pr289
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: +Changes requested
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ Seems that I broke integration tests. So, I'm adding the "Changes requested" label till I have it fixed. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304615054
URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: synchronized
To pull the PR as Git branch: git remote add ghsssd https://github.com/SSSD/sssd git fetch ghsssd pull/289/head:pr289 git checkout pr289
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ I've updated the patchset (and CI passed locally for me) but I'll remove the "Changes Requested" label when I get the CI results. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304650818
URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: edited
Changed field: body Original value: """ This patch set basically changes the behaviour of the cleanup_user() and cleanup_groups() functions used by ldap_id_cleanup().
With the current behaviour the searches for the users/groups to be cleaned up are always done in the cache, while the up to date information about their expiration is kept in the timestamp cache.
So, in order to fix [3389](https://pagure.io/SSSD/sssd/issue/3369), let's perform the search in the timestamp cache whenever it's possible.
I'm really not sure whether the dn_filter I'm building is totally correct, so I'd like to ask whoever review this patchset to take a careful look that the last patch, specially in the cleanup_dn_filter() function.
The way I'm testing this issue is: - I have a LDAP server with a few users and have the following options configured on sssd.conf: ``` enumerate = true ldap_enumeration_refresh_timeout = 35 ldap_purge_cache_timeout = 60 entry_cache_timeout = 20 ``` """
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ There's a one not related failure: http://sssd-ci.duckdns.org/logs/job/70/55/summary.html """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304691503
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: -Changes requested
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ There are some warnings: ``` Error: UNUSED_VALUE (CWE-563): sssd-1.15.3/src/db/sysdb_ops.c:5078: value_overwrite: Overwriting previous write to "ret" with value "0". sssd-1.15.3/src/db/sysdb_ops.c:5071: returned_value: Assigning value from "sysdb_error_to_errno(ret)" to "ret" here, but that stored value is overwritten before it can be used. # 5069| ret = ldb_modify(dom->sysdb->ldb_ts, msg); # 5070| if (ret != LDB_SUCCESS) { # 5071|-> ret = sysdb_error_to_errno(ret); # 5072| DEBUG(SSSDBG_MINOR_FAILURE, # 5073| "Could not mark an entry as expired in the timestamp cache\n");
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:172: var_decl: Declaring variable "tmp_ctx" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:215: uninit_use_in_call: Using uninitialized value "tmp_ctx" when calling "_talloc_free". # 213| # 214| done: # 215|-> talloc_zfree(tmp_ctx); # 216| return ret; # 217| }
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:240: var_decl: Declaring variable "count" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:312: uninit_use_in_call: Using uninitialized value "count" when calling "sss_debug_fn". # 310| } # 311| # 312|-> DEBUG(SSSDBG_FUNC_DATA, "Found %zu expired user entries!\n", count); # 313| # 314| if (count == 0) {
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:240: var_decl: Declaring variable "count" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:314: uninit_use: Using uninitialized value "count". # 312| DEBUG(SSSDBG_FUNC_DATA, "Found %zu expired user entries!\n", count); # 313| # 314|-> if (count == 0) { # 315| ret = EOK; # 316| goto done;
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:239: var_decl: Declaring variable "msgs" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:329: uninit_use: Using uninitialized value "msgs". # 327| # 328| for (i = 0; i < count; i++) { # 329|-> name = ldb_msg_find_attr_as_string(msgs[i], SYSDB_NAME, NULL); # 330| if (!name) { # 331| DEBUG(SSSDBG_OP_FAILURE, "Entry %s has no Name Attribute ?!?\n",
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:461: var_decl: Declaring variable "count" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:519: uninit_use_in_call: Using uninitialized value "count" when calling "sss_debug_fn". # 517| } # 518| # 519|-> DEBUG(SSSDBG_FUNC_DATA, "Found %zu expired group entries!\n", count); # 520| # 521| if (count == 0) {
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:461: var_decl: Declaring variable "count" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:521: uninit_use: Using uninitialized value "count". # 519| DEBUG(SSSDBG_FUNC_DATA, "Found %zu expired group entries!\n", count); # 520| # 521|-> if (count == 0) { # 522| ret = EOK; # 523| goto done;
Error: UNINIT (CWE-457): sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:460: var_decl: Declaring variable "msgs" without initializer. sssd-1.15.3/src/providers/ldap/ldap_id_cleanup.c:529: uninit_use: Using uninitialized value "msgs". # 527| char *sanitized_dn; # 528| # 529|-> dn = ldb_dn_get_linearized(msgs[i]->dn); # 530| if (!dn) { # 531| DEBUG(SSSDBG_CRIT_FAILURE, "Cannot linearize DN!\n"); ``` """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304888982
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ btw I started reviewing the patches, feel free to do local changes, but I'll let you know when I'm done with the review by setting the changes requested label """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304889125
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ So the primary question I have is why not do what you're doing in the ldap_cleanup module in the sysdb search?
I checked where are the sysdb_search_users and sysdb_search_groups functions called and unfortunately some are in the critical paths which are already quite slow, so adding the timestamp search before persistent cache search might slow them even further..but then again, it might be cleaner to add a sysdb_search_by_orig_dn which would always search only the persistent cache.
There are two reasons I propose this: 1. the sysdb_search_users/sysdb_search_groups interface is generic and the caller should expect to return correct results regardless of whether the filter contains a timestamp attribute or not 2. I don't think the sysdb.h interface should expose the timestamp cache at all.
I agree that adding the search by orig dn is also a bit of a hack, because we already have too many search functions, but it seems like a more correct solution. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304901159
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: +Changes requested
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ On Tue, May 30, 2017 at 4:44 PM, Jakub Hrozek notifications@github.com wrote:
So the primary question I have is why not do what you're doing in the ldap_cleanup module in the sysdb search?
The main answer for this question is to avoid touching code that could affect even more parts of the code (some of them which I'm not even aware on how to test that those changes are not too intrusive).
I checked where are the sysdb_search_users and sysdb_search_groups functions called and unfortunately some are in the critical paths which are already quite slow, so adding the timestamp search before persistent cache search might slow them even further..but then again, it might be cleaner to add a sysdb_search_by_orig_dn which would always search only the persistent cache.
Sorry Jakub, I'm not sure if I'm following you here. What would sysdb_search_by_orig_dn() do? Would it do exactly what I'm doing in the "LDAP_ID_CLEANUP: Check the ts_cache during the users/groups cleanup" patch and just be used from ldap_cleanup()?
There are two reasons I propose this:
- the sysdb_search_users/sysdb_search_groups interface is generic and
the caller should expect to return correct results regardless of whether the filter contains a timestamp attribute or not 2. I don't think the sysdb.h interface should expose the timestamp cache at all.
We don't disagree here, with any of the points. Just keep in mind that sysdb_search_users() and sysdb_search_groups() do return correct results based on the timestamp they have. The problem here is how often we update the timestamp in the persistent cache.
I agree that adding the search by orig dn is also a bit of a hack, because we already have too many search functions, but it seems like a more correct solution.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-304901159, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4et-ew4vivfmrzG6N8ziQF7GxDlekks5r_CtngaJpZM4NpESe .
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-304994051
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ On Tue, May 30, 2017 at 01:12:14PM -0700, fidencio wrote:
On Tue, May 30, 2017 at 4:44 PM, Jakub Hrozek notifications@github.com wrote:
So the primary question I have is why not do what you're doing in the ldap_cleanup module in the sysdb search?
The main answer for this question is to avoid touching code that could affect even more parts of the code (some of them which I'm not even aware on how to test that those changes are not too intrusive).
Yes, that's a fair point. If you prefer, we can apply less intrusive patches first, but work on the 'prettier' solution in the meantime.
I checked where are the sysdb_search_users and sysdb_search_groups functions called and unfortunately some are in the critical paths which are already quite slow, so adding the timestamp search before persistent cache search might slow them even further..but then again, it might be cleaner to add a sysdb_search_by_orig_dn which would always search only the persistent cache.
Sorry Jakub, I'm not sure if I'm following you here. What would sysdb_search_by_orig_dn() do? Would it do exactly what I'm doing in the "LDAP_ID_CLEANUP: Check the ts_cache during the users/groups cleanup" patch and just be used from ldap_cleanup()?
Yeah, sorry, that was unclear, it wasn't an explanation, but more of a brain dump /o\ let me try again.
So your patches essentially expanded on sysdb_search_users/sysdb_search_groups. I was checking yesterday how is sysdb_search_{users/groups} used today and whether it would be acceptable to just do what you did in the ldap_id_cleanup.c module inside the search functions themselves or if the extra search in the timestamp cache might slow some operations down.
And I noticed that almost all uses of these two search functions except the cleanup module are searches by the original DN. At the same time, these searches are done to find group members, so I would prefer to keep them as fast as possible, because those codepaths are known to be a bottleneck already.
Therefore the proposal was to add a sysdb_search_by_orig_dn, replace the search_groups/search_users which search by DN with sysdb_search_by_orig_dn and make the two search functions look up in the timestamp cache as well.
This would keep the interface from exposing the search-in-ts-cache family of functions, while avoid regressing performance.
Does this make better sense?
As I said, I think you have a point with making minimal changes per patch. So I wonder if a good approach would be to add additional patch atop this set (it doesn't matter if it's a part of this PR or additional as long as it's done eventually) that implements the by-orig-dn search. The upstream would accept both and downstream can cherry-pick the minimal changes.
There are two reasons I propose this:
- the sysdb_search_users/sysdb_search_groups interface is generic and
the caller should expect to return correct results regardless of whether the filter contains a timestamp attribute or not 2. I don't think the sysdb.h interface should expose the timestamp cache at all.
We don't disagree here, with any of the points. Just keep in mind that sysdb_search_users() and sysdb_search_groups() do return correct results based on the timestamp they have. The problem here is how often we update the timestamp in the persistent cache.
Yes, I'm trying to think about the sysdb layer as a library. If I was a library user of sysdb and didn't know its internals, it would be suprising if a generic search function behaved differently depending on what attributes you search with.
I agree that adding the search by orig dn is also a bit of a hack, because we already have too many search functions, but it seems like a more correct solution.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-304901159, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4et-ew4vivfmrzG6N8ziQF7GxDlekks5r_CtngaJpZM4NpESe .
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-304994051
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-305101630
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ On Wed, May 31, 2017 at 8:55 AM, Jakub Hrozek notifications@github.com wrote:
On Tue, May 30, 2017 at 01:12:14PM -0700, fidencio wrote:
On Tue, May 30, 2017 at 4:44 PM, Jakub Hrozek notifications@github.com wrote:
So the primary question I have is why not do what you're doing in the ldap_cleanup module in the sysdb search?
The main answer for this question is to avoid touching code that could affect even more parts of the code (some of them which I'm not even aware on how to test that those changes are not too intrusive).
Yes, that's a fair point. If you prefer, we can apply less intrusive patches first, but work on the 'prettier' solution in the meantime.
I don't think there's a need for this. Let me take advantage of your guidance and go for the prettier solution.
I checked where are the sysdb_search_users and sysdb_search_groups functions called and unfortunately some are in the critical paths
which are
already quite slow, so adding the timestamp search before persistent
cache
search might slow them even further..but then again, it might be
cleaner to
add a sysdb_search_by_orig_dn which would always search only the
persistent
cache.
Sorry Jakub, I'm not sure if I'm following you here. What would sysdb_search_by_orig_dn() do? Would it do exactly what I'm doing in the "LDAP_ID_CLEANUP: Check the ts_cache during the users/groups cleanup"
patch
and just be used from ldap_cleanup()?
Yeah, sorry, that was unclear, it wasn't an explanation, but more of a brain dump /o\ let me try again.
So your patches essentially expanded on sysdb_search_users/sysdb_search_groups. I was checking yesterday how is sysdb_search_{users/groups} used today and whether it would be acceptable to just do what you did in the ldap_id_cleanup.c module inside the search functions themselves or if the extra search in the timestamp cache might slow some operations down.
And I noticed that almost all uses of these two search functions except the cleanup module are searches by the original DN. At the same time, these searches are done to find group members, so I would prefer to keep them as fast as possible, because those codepaths are known to be a bottleneck already.
Therefore the proposal was to add a sysdb_search_by_orig_dn, replace the search_groups/search_users which search by DN with sysdb_search_by_orig_dn and make the two search functions look up in the timestamp cache as well.
This would keep the interface from exposing the search-in-ts-cache family of functions, while avoid regressing performance.
Does this make better sense?
Okay, yes, it does make sense.
As I said, I think you have a point with making minimal changes per patch. So I wonder if a good approach would be to add additional patch atop this set (it doesn't matter if it's a part of this PR or additional as long as it's done eventually) that implements the by-orig-dn search. The upstream would accept both and downstream can cherry-pick the minimal changes.
By what you've described the proper work, although more intrusive, doesn't look very complicated.
Let me go for the approach you suggested and re-submit thisPR.
Thanks for the suggestions,
There are two reasons I propose this:
- the sysdb_search_users/sysdb_search_groups interface is generic and
the caller should expect to return correct results regardless of
whether
the filter contains a timestamp attribute or not 2. I don't think the sysdb.h interface should expose the timestamp cache at all.
We don't disagree here, with any of the points. Just keep in mind that sysdb_search_users() and sysdb_search_groups() do return correct results based on the timestamp they have. The problem here is how often we update the timestamp in the persistent cache.
Yes, I'm trying to think about the sysdb layer as a library. If I was a library user of sysdb and didn't know its internals, it would be suprising if a generic search function behaved differently depending on what attributes you search with.
I agree that adding the search by orig dn is also a bit of a hack,
because
we already have too many search functions, but it seems like a more
correct
solution.
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-304901159, or
mute
the thread <https://github.com/notifications/unsubscribe-auth/AAG4et-ew
4vivfmrzG6N8ziQF7GxDlekks5r_CtngaJpZM4NpESe>
.
-- You are receiving this because you commented. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-304994051
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-305101630, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4ejbIaGWAHlMsTpCHNhI01ya1t3FKks5r_Q7vgaJpZM4NpESe .
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-305146818
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ @jhrozek:
So, I've implemented most of your suggestion but now I'm facing an issue about making sysdb_search_users() and sysdb_search_groups() looking at the timestamp cache.
What I've done in the ldap_cleanup() code was: ``` in case a timestamp cache exists: check the timestamp cache search in the normal cache for the results gotten in the timestamp cache otherwise check the normal cache merge the results with the timestamp cache ```
The very same logic cannot be applied in the sysdb_search_users() and sysdb_search_groups() as there are some places in the code using those functions filtering by "attributes" that are not part of the timestamp cache. By doing this, the timestamp cache returns an ENOENT and we do *not* want to go an search the cache in this case. The reason we don't want to do that is because timestamps may not be up-to-date in the normal cache, so we would search in the timestamp cache, get an ENOENT as result, search in the normal cache and find users/groups (which is exactly what we're trying to solve).
Considering this limitation I'd like to check with you what you think about these two possible approaches:: - Introduce a new function that only will be used by the ldap_cleanup() for now. The changes already done for the work you suggested could still be merged, but for sure not backported. - Introduce a way to check, in the timestamp cache methods, that we're not searching for something that's not supported (and return a new error when it happens) so we for sure know that we can search in the normal cache on these cases.
The first approach seems way saner and less hacky than the second one, IMO.
I'd really like to hear some suggestions from you (and from other developers as well) about these 2 approaches or even about something else that I've missed/wasn't able to come up with. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-306734209
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ On Wed, Jun 07, 2017 at 01:58:26AM -0700, fidencio wrote:
@jhrozek:
So, I've implemented most of your suggestion but now I'm facing an issue about making sysdb_search_users() and sysdb_search_groups() looking at the timestamp cache.
What I've done in the ldap_cleanup() code was:
in case a timestamp cache exists: check the timestamp cache search in the normal cache for the results gotten in the timestamp cache otherwise check the normal cache merge the results with the timestamp cache
I /thought/ (but I was wrong, see below) we wanted to do: 1) search the timestamp cache using the provided filter. This returns a set of results 2) for each DN from the timestamp results, search the DN in the persistent cache 3) always search the persistent cache 4) merge in the timestamp attributes into results from 3)
But as you pointed out, 3 is flawed because we are searching based on the attributes that are not in the same cache..oops..
The very same logic cannot be applied in the sysdb_search_users() and sysdb_search_groups() as there are some places in the code using those functions filtering by "attributes" that are not part of the timestamp cache. By doing this, the timestamp cache returns an ENOENT and we do *not* want to go an search the cache in this case. The reason we don't want to do that is because timestamps may not be up-to-date in the normal cache, so we would search in the timestamp cache, get an ENOENT as result, search in the normal cache and find users/groups (which is exactly what we're trying to solve).
Hmm, I didn't think about this. I'm sorry for leading you down the wrong path.
Considering this limitation I'd like to check with you what you think about these two possible approaches::
- Introduce a new function that only will be used by the ldap_cleanup() for now. The changes already done for the work you suggested could still be merged, but for sure not backported.
Yes, this is the fastest way to fix the bug. Would you like me to continue with the first round of patches?
- Introduce a way to check, in the timestamp cache methods, that we're not searching for something that's not supported (and return a new error when it happens) so we for sure know that we can search in the normal cache on these cases.
Yes, the safest way would be to explode the filter ourselves at least to learn during development that the search might be not precise.
The first approach seems way saner and less hacky than the second one, IMO.
I'd really like to hear some suggestions from you (and from other developers as well) about these 2 approaches or even about something else that I've missed/wasn't able to come up with.
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-306734209
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-306774398
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ On Wed, Jun 7, 2017 at 2:04 PM, Jakub Hrozek notifications@github.com wrote:
On Wed, Jun 07, 2017 at 01:58:26AM -0700, fidencio wrote:
@jhrozek:
So, I've implemented most of your suggestion but now I'm facing an issue
about making sysdb_search_users() and sysdb_search_groups() looking at the timestamp cache.
What I've done in the ldap_cleanup() code was:
in case a timestamp cache exists: check the timestamp cache search in the normal cache for the results gotten in the timestamp cache otherwise check the normal cache merge the results with the timestamp cache
I /thought/ (but I was wrong, see below) we wanted to do:
- search the timestamp cache using the provided filter. This returns
a set of results 2) for each DN from the timestamp results, search the DN in the persistent cache 3) always search the persistent cache 4) merge in the timestamp attributes into results from 3)
But as you pointed out, 3 is flawed because we are searching based on the attributes that are not in the same cache..oops..
The very same logic cannot be applied in the sysdb_search_users() and sysdb_search_groups() as there are some places in the code using those functions filtering by "attributes" that are not part of the timestamp cache. By doing this, the timestamp cache returns an ENOENT and we do
*not*
want to go an search the cache in this case. The reason we don't want to do that is because timestamps may not be up-to-date in the normal cache, so we would search in the timestamp cache, get an ENOENT as result,
search
in the normal cache and find users/groups (which is exactly what we're trying to solve).
Hmm, I didn't think about this. I'm sorry for leading you down the wrong path.
That's no problem at all. The most part of the work related to this patch set can (and most likely) will be used upstream. So, it's just a matter of which patches we're going to backport.
Considering this limitation I'd like to check with you what you think
about these two possible approaches::
- Introduce a new function that only will be used by the ldap_cleanup()
for now. The changes already done for the work you suggested could still be merged, but for sure not backported.
Yes, this is the fastest way to fix the bug. Would you like me to continue with the first round of patches?
Nops. I'd prefer to add a new function to sysdb that does exactly what I've done in ldap_cleanup(), Something like: sysdb_search_{users,groups}_by_timestamp() (feel free to suggest a better naming) and then we can simply use it in ldap_cleanup().
It doesn't look intrusive and still looks like a better solution than my first patchset (which exposes somethings from the timestamp cache).
Is this okay for you?
- Introduce a way to check, in the timestamp cache methods, that we're
not searching for something that's not supported (and return a new error when it happens) so we for sure know that we can search in the normal cache on these cases.
Yes, the safest way would be to explode the filter ourselves at least to learn during development that the search might be not precise.
The first approach seems way saner and less hacky than the second one,
IMO.
I'd really like to hear some suggestions from you (and from other
developers as well) about these 2 approaches or even about something else that I've missed/wasn't able to come up with.
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-306734209
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-306774398, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4ekLCbrVgHJIulucTX1bc0_-teXATks5sBpHfgaJpZM4NpESe .
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-306776441
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ On Wed, Jun 07, 2017 at 05:14:23AM -0700, fidencio wrote:
On Wed, Jun 7, 2017 at 2:04 PM, Jakub Hrozek notifications@github.com wrote:
On Wed, Jun 07, 2017 at 01:58:26AM -0700, fidencio wrote:
@jhrozek:
So, I've implemented most of your suggestion but now I'm facing an issue
about making sysdb_search_users() and sysdb_search_groups() looking at the timestamp cache.
What I've done in the ldap_cleanup() code was:
in case a timestamp cache exists: check the timestamp cache search in the normal cache for the results gotten in the timestamp cache otherwise check the normal cache merge the results with the timestamp cache
I /thought/ (but I was wrong, see below) we wanted to do:
- search the timestamp cache using the provided filter. This returns
a set of results 2) for each DN from the timestamp results, search the DN in the persistent cache 3) always search the persistent cache 4) merge in the timestamp attributes into results from 3)
But as you pointed out, 3 is flawed because we are searching based on the attributes that are not in the same cache..oops..
The very same logic cannot be applied in the sysdb_search_users() and sysdb_search_groups() as there are some places in the code using those functions filtering by "attributes" that are not part of the timestamp cache. By doing this, the timestamp cache returns an ENOENT and we do
*not*
want to go an search the cache in this case. The reason we don't want to do that is because timestamps may not be up-to-date in the normal cache, so we would search in the timestamp cache, get an ENOENT as result,
search
in the normal cache and find users/groups (which is exactly what we're trying to solve).
Hmm, I didn't think about this. I'm sorry for leading you down the wrong path.
That's no problem at all. The most part of the work related to this patch set can (and most likely) will be used upstream. So, it's just a matter of which patches we're going to backport.
Considering this limitation I'd like to check with you what you think
about these two possible approaches::
- Introduce a new function that only will be used by the ldap_cleanup()
for now. The changes already done for the work you suggested could still be merged, but for sure not backported.
Yes, this is the fastest way to fix the bug. Would you like me to continue with the first round of patches?
Nops. I'd prefer to add a new function to sysdb that does exactly what I've done in ldap_cleanup(), Something like: sysdb_search_{users,groups}_by_timestamp() (feel free to suggest a better naming) and then we can simply use it in ldap_cleanup().
It doesn't look intrusive and still looks like a better solution than my first patchset (which exposes somethings from the timestamp cache).
Is this okay for you?
Sure.
- Introduce a way to check, in the timestamp cache methods, that we're
not searching for something that's not supported (and return a new error when it happens) so we for sure know that we can search in the normal cache on these cases.
Yes, the safest way would be to explode the filter ourselves at least to learn during development that the search might be not precise.
The first approach seems way saner and less hacky than the second one,
IMO.
I'd really like to hear some suggestions from you (and from other
developers as well) about these 2 approaches or even about something else that I've missed/wasn't able to come up with.
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-306734209
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-306774398, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4ekLCbrVgHJIulucTX1bc0_-teXATks5sBpHfgaJpZM4NpESe .
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/SSSD/sssd/pull/289#issuecomment-306776441
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-306778982
URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: synchronized
To pull the PR as Git branch: git remote add ghsssd https://github.com/SSSD/sssd git fetch ghsssd pull/289/head:pr289 git checkout pr289
URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: synchronized
To pull the PR as Git branch: git remote add ghsssd https://github.com/SSSD/sssd git fetch ghsssd pull/289/head:pr289 git checkout pr289
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ @jhrozek, new patchset has been pushed. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-307025551
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: -Changes requested
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ On Mon, Jun 12, 2017 at 11:23:45PM -0700, fidencio wrote:
On Mon, Jun 12, 2017 at 5:48 PM, Jakub Hrozek notifications@github.com wrote:
*@jhrozek* requested changes on this pull request.
In src/db/sysdb_ops.c https://github.com/SSSD/sssd/pull/289#discussion_r121449525:
}
goto immediately;
- }
- if (ret != EOK) {
goto done;
- }
- ret = cleanup_dn_filter(tmp_ctx, &ts_res, SYSDB_UC, sub_filter, &dn_filter);
- if (ret != EOK) {
goto done;
- }
- ret = sysdb_search_ts_matches(tmp_ctx, domain->sysdb, attrs,
&ts_res, dn_filter, &res);
I wonder if we need to merge back the attributes with sss_merge_ldb_results? Because now we have only main-sysdb attributes of the results that matched in the timestamp cache, but not the values of the timestamp attributes.
See the sss_merge_ldb_results usage in sysdb_enumpwent_filter.
btw this is not too important at this moment, because the caller (cleanup function) doesn't care about the timestamp attribute values, only matches the entries by timestamp attributes, but it might be a good idea to fix this for the future..
I'm sorry for the late reply. Since there was the outage to sssd-devel notifications, I missed the update.
Correct me if I'm mistaken, but I really don't think we should do that. If we fire the search in the sysdb we'll get wrong results (the same ones that originally caused the issue) and we, for sure, don't want those wrong results merged with the correct ones.
Does that make sense? Or did I misunderstood something from your comment?
No, I think you got it exactly right, but I think it's not what happens.
Let's annotate the code and please tell me if I missed something: ``` 3578 ret = sysdb_search_ts_users(tmp_ctx, domain, sub_filter, NULL, &ts_res); 3579 if (ret == ERR_NO_TS) { /* Removed for brevity */ 3591 goto immediately; 3592 } else if (ret != EOK) { 3593 goto done; 3594 }
Here we have a bunch of results by searching the TS DB.
3595 3596 ret = cleanup_dn_filter(tmp_ctx, &ts_res, SYSDB_UC, sub_filter, &dn_filter); 3597 if (ret != EOK) { 3598 goto done; 3599 }
This gives us a filter along the lines of name=sub_filter and a large OR of DNs that matched the timestamp DB filter.
3600 3601 ret = sysdb_search_ts_matches(tmp_ctx, domain->sysdb, attrs, 3602 &ts_res, dn_filter, &res);
This function returns the peristent database records that match the dn_filter and store them into res.
3603 if (ret != EOK) { 3604 goto done; 3605 } 3606 3607 msgs_count = res->count;
Finally, res is returned.
3608 msgs = res->msgs; ```
So the problem is not that we match the timestamp db, that is correct. The problem is that we always only return the entries from the persistent db. It is currently not a problem, because the caller only requests name, ID and memberof, all of which are in the persistent db, that's why I think we may push the patches. But it might be good to at least file a ticket to fix this properly later.
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308481947
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ Okay. Now I see.
Please, feel free to push the patches as they're and a ticket will be opened for improving this. """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308598101
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ the remaining issue is tracked in https://pagure.io/SSSD/sssd/issue/3432 """
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308672327
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
jhrozek commented: """ * master: * 05e579691b51ac2f81ab0c828ff6fe57bd86a8b6 * 41708e1e500e7cada3d3e606aa2b8b9869a5c734 * a71f1a655dcc2ca6dc16bb8eb1c4c9e24cfe2c3e * 9883d1e2913ff0c1db479f1ece8148e03155c7f3 * 8ad57e17779b3ec60246ac58c1691ee15745084c * 347be58e1769ba90b49a7e5ec1678ef66987f6cd * 01c6bb9b47401f9f14c4cfe5c5f03fce2e63629b
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308674040
URL: https://github.com/SSSD/sssd/pull/289 Author: fidencio Title: #289: Check the timestamp cache during the users/groups cleanup Action: closed
To pull the PR as Git branch: git remote add ghsssd https://github.com/SSSD/sssd git fetch ghsssd pull/289/head:pr289 git checkout pr289
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: +Accepted
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
Label: +Pushed
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
lslebodn commented: """ Is there a reason why there is missing unit test for new public function?
I would also appreciate integration test for this bug. we need to prevent regressions in future.
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308676966
URL: https://github.com/SSSD/sssd/pull/289 Title: #289: Check the timestamp cache during the users/groups cleanup
fidencio commented: """ On Thu, Jun 15, 2017 at 11:19 AM, lslebodn notifications@github.com wrote:
Is there a reason why there is missing unit test for new public function?
Lukáš,
I've started changing the cmocka/test_ldap_id_cleanup.c in order to add the unit test, but the tests there are using the files provider, which doesn't have/support the timestamp cache.
I actually started extending the files provider to support the timestamp cache and then having a test for those, but ends up it's not actually desired (extending the files provider, not the tests).
So, I've decided to go without the tests, at least for now and hope that it wouldn't block those patches.
I would also appreciate integration test for this bug. we need to prevent regressions in future.
The integration test is something that can be done (and will) as soon as the fire on other places is controlled. I'll file a ticket about this. Is it okay for you?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/SSSD/sssd/pull/289#issuecomment-308676966, or mute the thread https://github.com/notifications/unsubscribe-auth/AAG4ei4ZZILZWGaqNmkSx65-GEe_7Y_eks5sEPcGgaJpZM4NpESe .
"""
See the full comment at https://github.com/SSSD/sssd/pull/289#issuecomment-308708682
sssd-devel@lists.fedorahosted.org