Hi Claas,
I do not recall a specific change 1.4.4 vs 2.0 that could explain this.
Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips.
regards thierry
On 3/7/23 14:54, Claas Vieler wrote:
Hello, we have a search performance problem when we migrated from 1.4.4.19 to 2.0.17. Our environment has about 100k entries, about 15k users and about 10k groups. Also big groups with thousand of users, also users with thousand of group membership. So I would call it a small instance On 1.4.x query perfomance ist fine: ldapsearch for "(uniqueMember=cn=sampleuser,ou=People,dc=example,dc=com) dn " via LDAPI on 1.4.x takes approx 0,01-0,03 sec. This user is member of approx. 500 groups. I tested two migration methods:
- via replication
After initializing replica, the same query takes about _8_ sec. So I reindexed db (dsctl .. db2index) and get durations for the query from 2-3 sec. 2. via ldif export/import after importing, the same query takes about 2-3 sec But even with 2-3 sec, we talk about 2.x perfomance ten time slower than 1.4.x. Is this a know issue? I compared all cache-settings and found no differences. I have no more ideas how to optimize this. Should we wait for 2.x when its adopted to new lmdb? thanks Claas
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
Hi Claas, I do not recall a specific change 1.4.4 vs 2.0 that could explain this. Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips. regards thierry
we need to see the exact filter that's being used, as well as the access logs lines of the slow query to really help here.
-- Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management SUSE Labs, Australia
Hi Class,
First, thank you sooo much for your tests. This is really helpful.
So my understanding is that this same req was
* [10, 30]ms in 1.4 * [900, 1700]ms in 2.x o A possibility is that the filter evaluation (against the 532 returned entry) is the responsible of the 1700ms (without manageDSAit
In short it looks like there is a significant (>30 times slower) regression in RHDS12 vs RHDS11 with that testcase. In RHDS12, the handling of referral adds a 2 times slower but it is possibly fixed with https://github.com/389ds/389-ds-base/issues/5598.
best regards thierry
On 3/13/23 17:18, Claas Vieler wrote:
Hello William, sorry, your mail was stuck in my spam filter, so I doesnt see it here are the logs with and without option manageDSAit (as Thierry mentioned) without manageDSAit: [13/Mar/2023:16:16:06.583644293 +0100] conn=32 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:06.586619267 +0100] conn=32 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:06.589037720 +0100] conn=32 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:06.591155242 +0100] conn=32 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000078559 optime=0.004658221 etime=0.004734544 dn="cn=root" [13/Mar/2023:16:16:06.591326840 +0100] conn=32 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=example,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:08.321020181 +0100] conn=32 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000114773 optime=1.729694222 etime=1.729803880 [13/Mar/2023:16:16:08.321992532 +0100] conn=32 op=2 UNBIND [13/Mar/2023:16:16:08.327041073 +0100] conn=32 op=2 fd=64 closed error
- U1
with manageDSAit: [13/Mar/2023:16:16:22.324132867 +0100] conn=33 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:22.326616612 +0100] conn=33 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:22.328594648 +0100] conn=33 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:22.331154393 +0100] conn=33 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000055269 optime=0.004608598 etime=0.004661499 dn="cn=root" [13/Mar/2023:16:16:22.331366318 +0100] conn=33 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=expample,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:23.244139238 +0100] conn=33 op=2 UNBIND [13/Mar/2023:16:16:23.244725555 +0100] conn=33 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000081512 optime=0.913360154 etime=0.913438519 [1 *Gesendet:* Mittwoch, 08. März 2023 um 01:11 Uhr *Von:* "William Brown" william.brown@suse.com *An:* "389-users@lists.fedoraproject.org" 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
Hi Claas, I do not recall a specific change 1.4.4 vs 2.0 that could explain this. Do you confirm that 'uniqueMember' is indexed in equality on both ?
What are the SRCH records in the access logs (notes=A ?).
On 2.0, it lasts 2sec, you may try to capture few pstacks that would
give some tips.
regards thierry
we need to see the exact filter that's being used, as well as the access logs lines of the slow query to really help here.
-- Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
Hi Claas,
Good, that means that the 2x manageDSAit is now fixed. I tried to reproduce locally (2.x) and I think I succeeded:
[14/Mar/2023:16:45:54.283507824 +0100] conn=1 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=uid=group_entry1-0001,ou=people,dc=example,dc=com)" attrs="distinguishedName" [14/Mar/2023:16:45:55.046440071 +0100] conn=1 op=1 RESULT err=0 tag=101 nentries=1000 wtime=0.000199792 optime=0.762938352 etime=0.763134856
There is 1000 groups, with each 1000 members so they are large, and uid=group_entry1_0001 belongs to all groups. The search last 0.7s that is much more than what we had in 1.4 (TBC).
Something surprising is that the server bypass the filter evaluation (when returning the entries). So it does not look like the filter contribute to the slowness.
best regards thierry
On 3/14/23 14:25, Claas Vieler wrote:
Hallo Thierry, got newest version from https://github.com/389ds/389-ds-base dc565fd https://github.com/389ds/389-ds-base/commit/dc565fdacbde6e1fd333213d707aa2c5bca9cadf(389-Directory/2.3.2 B2023.073.0958 ) I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and without option) but a little difference to 1.4.x ist still present :) ( 0.0x sec vs 0.9 sec) thanks and best regards Claas *Gesendet:* Montag, 13. März 2023 um 17:55 Uhr *Von:* "Thierry Bordaz" tbordaz@redhat.com *An:* 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
Hi Class,
First, thank you sooo much for your tests. This is really helpful.
So my understanding is that this same req was
- [10, 30]ms in 1.4
- [900, 1700]ms in 2.x o A possibility is that the filter evaluation (against the 532 returned entry) is the responsible of the 1700ms (without manageDSAit
In short it looks like there is a significant (>30 times slower) regression in RHDS12 vs RHDS11 with that testcase. In RHDS12, the handling of referral adds a 2 times slower but it is possibly fixed with https://github.com/389ds/389-ds-base/issues/5598.
best regards thierry
On 3/13/23 17:18, Claas Vieler wrote:
Hello William, sorry, your mail was stuck in my spam filter, so I doesnt see it here are the logs with and without option manageDSAit (as Thierry mentioned) without manageDSAit: [13/Mar/2023:16:16:06.583644293 +0100] conn=32 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:06.586619267 +0100] conn=32 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:06.589037720 +0100] conn=32 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:06.591155242 +0100] conn=32 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000078559 optime=0.004658221 etime=0.004734544 dn="cn=root" [13/Mar/2023:16:16:06.591326840 +0100] conn=32 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=example,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:08.321020181 +0100] conn=32 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000114773 optime=1.729694222 etime=1.729803880 [13/Mar/2023:16:16:08.321992532 +0100] conn=32 op=2 UNBIND [13/Mar/2023:16:16:08.327041073 +0100] conn=32 op=2 fd=64 closed error - U1 with manageDSAit: [13/Mar/2023:16:16:22.324132867 +0100] conn=33 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:22.326616612 +0100] conn=33 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:22.328594648 +0100] conn=33 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:22.331154393 +0100] conn=33 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000055269 optime=0.004608598 etime=0.004661499 dn="cn=root" [13/Mar/2023:16:16:22.331366318 +0100] conn=33 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=expample,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:23.244139238 +0100] conn=33 op=2 UNBIND [13/Mar/2023:16:16:23.244725555 +0100] conn=33 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000081512 optime=0.913360154 etime=0.913438519 [1 *Gesendet:* Mittwoch, 08. März 2023 um 01:11 Uhr *Von:* "William Brown" <william.brown@suse.com> *An:* "389-users@lists.fedoraproject.org" <389-users@lists.fedoraproject.org> *Betreff:* [389-users] Re: 2.x query performance problem > > Hi Claas, > I do not recall a specific change 1.4.4 vs 2.0 that could explain this. > Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). > On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips. > regards > thierry we need to see the exact filter that's being used, as well as the access logs lines of the slow query to really help here. -- Sincerely, William Brown Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
_______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
Hi Class,
Indeed the same task is much faster (28 times) in 1.4.4
[14/Mar/2023:18:57:58.984319394 +0100] conn=4 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=uid=group_entry1-0001,ou=people,dc=example,dc=com)" attrs="distinguishedName" [14/Mar/2023:18:57:59.010935256 +0100] conn=4 op=1 RESULT err=0 tag=101 nentries=1000 wtime=0.000209520 optime=0.026622409 etime=0.026828069
The reason seems to be the fix https://github.com/389ds/389-ds-base/issues/5170 that force filter evaluation (even if it can be bypass) on returned entries. So a side effect of the fix is that when there is a large returned set of entries (~500 in your example) the filter evaluation is significant. Especially with attribute with a large valueset (like uniquemember).
The fix prevent to return invalid entries but the performance hit was not detected. Need to revisit this part of the fix.
best regards thierry
On 3/14/23 17:21, Thierry Bordaz wrote:
Hi Claas,
Good, that means that the 2x manageDSAit is now fixed. I tried to reproduce locally (2.x) and I think I succeeded:
[14/Mar/2023:16:45:54.283507824 +0100] conn=1 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=uid=group_entry1-0001,ou=people,dc=example,dc=com)" attrs="distinguishedName" [14/Mar/2023:16:45:55.046440071 +0100] conn=1 op=1 RESULT err=0 tag=101 nentries=1000 wtime=0.000199792 optime=0.762938352 etime=0.763134856
There is 1000 groups, with each 1000 members so they are large, and uid=group_entry1_0001 belongs to all groups. The search last 0.7s that is much more than what we had in 1.4 (TBC).
Something surprising is that the server bypass the filter evaluation (when returning the entries). So it does not look like the filter contribute to the slowness.
best regards thierry
On 3/14/23 14:25, Claas Vieler wrote:
Hallo Thierry, got newest version from https://github.com/389ds/389-ds-base dc565fd https://github.com/389ds/389-ds-base/commit/dc565fdacbde6e1fd333213d707aa2c5bca9cadf(389-Directory/2.3.2 B2023.073.0958 ) I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and without option) but a little difference to 1.4.x ist still present :) ( 0.0x sec vs 0.9 sec) thanks and best regards Claas *Gesendet:* Montag, 13. März 2023 um 17:55 Uhr *Von:* "Thierry Bordaz" tbordaz@redhat.com *An:* 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
Hi Class,
First, thank you sooo much for your tests. This is really helpful.
So my understanding is that this same req was
- [10, 30]ms in 1.4
- [900, 1700]ms in 2.x o A possibility is that the filter evaluation (against the 532 returned entry) is the responsible of the 1700ms (without manageDSAit
In short it looks like there is a significant (>30 times slower) regression in RHDS12 vs RHDS11 with that testcase. In RHDS12, the handling of referral adds a 2 times slower but it is possibly fixed with https://github.com/389ds/389-ds-base/issues/5598.
best regards thierry
On 3/13/23 17:18, Claas Vieler wrote:
Hello William, sorry, your mail was stuck in my spam filter, so I doesnt see it here are the logs with and without option manageDSAit (as Thierry mentioned) without manageDSAit: [13/Mar/2023:16:16:06.583644293 +0100] conn=32 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:06.586619267 +0100] conn=32 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:06.589037720 +0100] conn=32 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:06.591155242 +0100] conn=32 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000078559 optime=0.004658221 etime=0.004734544 dn="cn=root" [13/Mar/2023:16:16:06.591326840 +0100] conn=32 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=example,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:08.321020181 +0100] conn=32 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000114773 optime=1.729694222 etime=1.729803880 [13/Mar/2023:16:16:08.321992532 +0100] conn=32 op=2 UNBIND [13/Mar/2023:16:16:08.327041073 +0100] conn=32 op=2 fd=64 closed error - U1 with manageDSAit: [13/Mar/2023:16:16:22.324132867 +0100] conn=33 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:22.326616612 +0100] conn=33 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:22.328594648 +0100] conn=33 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:22.331154393 +0100] conn=33 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000055269 optime=0.004608598 etime=0.004661499 dn="cn=root" [13/Mar/2023:16:16:22.331366318 +0100] conn=33 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=expample,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:23.244139238 +0100] conn=33 op=2 UNBIND [13/Mar/2023:16:16:23.244725555 +0100] conn=33 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000081512 optime=0.913360154 etime=0.913438519 [1 *Gesendet:* Mittwoch, 08. März 2023 um 01:11 Uhr *Von:* "William Brown" <william.brown@suse.com> *An:* "389-users@lists.fedoraproject.org" <389-users@lists.fedoraproject.org> *Betreff:* [389-users] Re: 2.x query performance problem > > Hi Claas, > I do not recall a specific change 1.4.4 vs 2.0 that could explain this. > Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). > On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips. > regards > thierry we need to see the exact filter that's being used, as well as the access logs lines of the slow query to really help here. -- Sincerely, William Brown Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
_______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
got newest version from https://github.com/389ds/389-ds-base dc565fd (389-Directory/2.3.2 B2023.073.0958 ) I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and without option) but a little difference to 1.4.x ist still present :) ( 0.0x sec vs 0.9 sec)
Can we see the access log between the 1.4.x and 2.x version? There still seems to be a difference here which is curious :(
-- Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management SUSE Labs, Australia
Hi Claas,
Rereading that thread I have a doubt regarding cache priming. The search returns ~500 groups. The first lookup of those groups is significantly longer because of entry cache priming. Could you confirm that if you do twice the same search (1.4 and 2.x), the second search in 1.4 is much faster that the second search on 2.x ?
best regards thierry
On 3/16/23 09:38, Claas Vieler wrote:
Hello William I cant see any difference expect duration best regards Claas 389-Directory/2.3.2 B2023.073.0958 [16/Mar/2023:08:24:51.321404978 +0100] conn=51 fd=66 slot=66 connection from local to /run/slapd-389ds.socket [16/Mar/2023:08:24:51.323985845 +0100] conn=51 AUTOBIND dn="cn=root" [16/Mar/2023:08:24:51.325995690 +0100] conn=51 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [16/Mar/2023:08:24:51.328098136 +0100] conn=51 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000082030 optime=0.004197632 etime=0.004276581 dn="cn=root" [16/Mar/2023:08:24:51.328272655 +0100] conn=51 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser1,ou=People,dc=example,dc=com)" attrs="distinguishedName" [16/Mar/2023:08:24:52.285988416 +0100] conn=51 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000077055 optime=0.957714945 etime=0.957784949 [16/Mar/2023:08:24:52.286275743 +0100] conn=51 op=2 UNBIND [16/Mar/2023:08:24:52.291936625 +0100] conn=51 op=2 fd=66 Disconnect - Cleanly Closed Connection - U1 389-Directory/1.4.4.19 B2022.313.1200 [16/Mar/2023:09:10:20.353075132 +0100] conn=101 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [16/Mar/2023:09:10:20.355714488 +0100] conn=101 AUTOBIND dn="cn=root" [16/Mar/2023:09:10:20.357681511 +0100] conn=101 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [16/Mar/2023:09:10:20.359700165 +0100] conn=101 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000036305 optime=0.004064382 etime=0.004098191 dn="cn=root" [16/Mar/2023:09:10:20.359896870 +0100] conn=101 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser1,ou=People,dc=example,dc=com)" attrs="distinguishedName" [16/Mar/2023:09:10:20.367652447 +0100] conn=101 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000077477 optime=0.007755733 etime=0.007830994 [16/Mar/2023:09:10:20.369055287 +0100] conn=101 op=2 UNBIND [16/Mar/2023:09:10:20.371940374 +0100] conn=101 op=2 fd=64 closed error - U1 *Gesendet:* Mittwoch, 15. März 2023 um 03:41 Uhr *Von:* "William Brown" william.brown@suse.com *An:* "389-users@lists.fedoraproject.org" 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
got newest version from https://github.com/389ds/389-ds-base dc565fd
(389-Directory/2.3.2 B2023.073.0958 )
I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and
without option)
but a little difference to 1.4.x ist still present :) ( 0.0x sec vs
0.9 sec)
Can we see the access log between the 1.4.x and 2.x version? There still seems to be a difference here which is curious :(
-- Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
Hello Thierry, sorry for my short reply, i am not in the office this week. I can confirm that all my test run several times, the first run is always slower because of the empty cache. The mentioned results are all average values without first run. The problem occured on all tested 2.x versions. I found it on 2.0.17 but can also observe it also on newest 2.3.2. Next week I can send you the dse.ldif for analysis.
best regards Claas
Gesendet: Montag, 27. März 2023 um 11:26 Uhr Von: "Thierry Bordaz" tbordaz@redhat.com An: "General discussion list for the 389 Directory server project." 389-users@lists.fedoraproject.org, "Claas Vieler" claas21@online.ms Betreff: [389-users] Re: 2.x query performance problem Hi Claas, Rereading that thread I have a doubt regarding cache priming. The search returns ~500 groups. The first lookup of those groups is significantly longer because of entry cache priming. Could you confirm that if you do twice the same search (1.4 and 2.x), the second search in 1.4 is much faster that the second search on 2.x ? best regards thierry
On 3/16/23 09:38, Claas Vieler wrote:
Hello William
I cant see any difference expect duration best regards Claas
389-Directory/2.3.2 B2023.073.0958 [16/Mar/2023:08:24:51.321404978 +0100] conn=51 fd=66 slot=66 connection from local to /run/slapd-389ds.socket [16/Mar/2023:08:24:51.323985845 +0100] conn=51 AUTOBIND dn="cn=root" [16/Mar/2023:08:24:51.325995690 +0100] conn=51 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [16/Mar/2023:08:24:51.328098136 +0100] conn=51 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000082030 optime=0.004197632 etime=0.004276581 dn="cn=root" [16/Mar/2023:08:24:51.328272655 +0100] conn=51 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser1,ou=People,dc=example,dc=com)" attrs="distinguishedName" [16/Mar/2023:08:24:52.285988416 +0100] conn=51 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000077055 optime=0.957714945 etime=0.957784949 [16/Mar/2023:08:24:52.286275743 +0100] conn=51 op=2 UNBIND [16/Mar/2023:08:24:52.291936625 +0100] conn=51 op=2 fd=66 Disconnect - Cleanly Closed Connection - U1 389-Directory/1.4.4.19 B2022.313.1200 [16/Mar/2023:09:10:20.353075132 +0100] conn=101 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [16/Mar/2023:09:10:20.355714488 +0100] conn=101 AUTOBIND dn="cn=root" [16/Mar/2023:09:10:20.357681511 +0100] conn=101 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [16/Mar/2023:09:10:20.359700165 +0100] conn=101 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000036305 optime=0.004064382 etime=0.004098191 dn="cn=root" [16/Mar/2023:09:10:20.359896870 +0100] conn=101 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser1,ou=People,dc=example,dc=com)" attrs="distinguishedName" [16/Mar/2023:09:10:20.367652447 +0100] conn=101 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000077477 optime=0.007755733 etime=0.007830994 [16/Mar/2023:09:10:20.369055287 +0100] conn=101 op=2 UNBIND [16/Mar/2023:09:10:20.371940374 +0100] conn=101 op=2 fd=64 closed error - U1
Gesendet: Mittwoch, 15. März 2023 um 03:41 Uhr Von: "William Brown" william.brown@suse.com[mailto:william.brown@suse.com] An: "389-users@lists.fedoraproject.org"[mailto:389-users@lists.fedoraproject.org] 389-users@lists.fedoraproject.org[mailto:389-users@lists.fedoraproject.org] Betreff: [389-users] Re: 2.x query performance problem
got newest version from https://github.com/389ds/389-ds-base%5Bhttps://github.com/389ds/389-ds-base] dc565fd (389-Directory/2.3.2 B2023.073.0958 ) I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and without option) but a little difference to 1.4.x ist still present :) ( 0.0x sec vs 0.9 sec)
Can we see the access log between the 1.4.x and 2.x version? There still seems to be a difference here which is curious :(
-- Sincerely,
William Brown
Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org[mailto:389-users@lists.fedoraproject.org] To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org[mailto:389-users-leave@lists.fedoraproject.org] Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/%5Bhttps://docs...] List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines%5Bhttps://fedoraproje...] List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....] Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue%5Bhttps://pagure.io/fedora...] _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org[mailto:389-users@lists.fedoraproject.org] To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org[mailto:389-users-leave@lists.fedoraproject.org] Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/%5Bhttps://docs...] List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines%5Bhttps://fedoraproje...] List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....] Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue%5Bhttps://pagure.io/fedora...] _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/%5Bhttps://docs...] List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines%5Bhttps://fedoraproje...] List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....] Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue%5Bhttps://pagure.io/fedora...]
Hi Claas,
I was wrong stating that I reproduced locally similar issue. I was not using 2.0.17 and was reproducing an issue related to nsslapd-idlistscanlimit that is much larger in recent 2.x versions
I tried to mimic "Our environment has about 100k entries, about 15k users and about 10k groups. Also big groups with thousand of users, also users with thousand of group membership. So I would call it a small instance".
I made a given user an uniquemember of 1000 groups. Each group having 1000 users. Then the search retrieves the 1000 groups DN.
Could you share your dse.ldif (send me directly) ? Also could you confirm you see the same perf hit with regular connection bound as "cn=directory manager" ?
best regards thierry
On 3/14/23 14:25, Claas Vieler wrote:
Hallo Thierry, got newest version from https://github.com/389ds/389-ds-base dc565fd https://github.com/389ds/389-ds-base/commit/dc565fdacbde6e1fd333213d707aa2c5bca9cadf(389-Directory/2.3.2 B2023.073.0958 ) I can confirm, manageDSAit makes no difference any more in query time, got etimes with 0,9 sec after import and reindexing (with and without option) but a little difference to 1.4.x ist still present :) ( 0.0x sec vs 0.9 sec) thanks and best regards Claas *Gesendet:* Montag, 13. März 2023 um 17:55 Uhr *Von:* "Thierry Bordaz" tbordaz@redhat.com *An:* 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
Hi Class,
First, thank you sooo much for your tests. This is really helpful.
So my understanding is that this same req was
- [10, 30]ms in 1.4
- [900, 1700]ms in 2.x o A possibility is that the filter evaluation (against the 532 returned entry) is the responsible of the 1700ms (without manageDSAit
In short it looks like there is a significant (>30 times slower) regression in RHDS12 vs RHDS11 with that testcase. In RHDS12, the handling of referral adds a 2 times slower but it is possibly fixed with https://github.com/389ds/389-ds-base/issues/5598.
best regards thierry
On 3/13/23 17:18, Claas Vieler wrote:
Hello William, sorry, your mail was stuck in my spam filter, so I doesnt see it here are the logs with and without option manageDSAit (as Thierry mentioned) without manageDSAit: [13/Mar/2023:16:16:06.583644293 +0100] conn=32 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:06.586619267 +0100] conn=32 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:06.589037720 +0100] conn=32 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:06.591155242 +0100] conn=32 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000078559 optime=0.004658221 etime=0.004734544 dn="cn=root" [13/Mar/2023:16:16:06.591326840 +0100] conn=32 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=example,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:08.321020181 +0100] conn=32 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000114773 optime=1.729694222 etime=1.729803880 [13/Mar/2023:16:16:08.321992532 +0100] conn=32 op=2 UNBIND [13/Mar/2023:16:16:08.327041073 +0100] conn=32 op=2 fd=64 closed error - U1 with manageDSAit: [13/Mar/2023:16:16:22.324132867 +0100] conn=33 fd=64 slot=64 connection from local to /var/lib/dirsrv/slapd-389ds/slapd-389ds.socket [13/Mar/2023:16:16:22.326616612 +0100] conn=33 AUTOBIND dn="cn=root" [13/Mar/2023:16:16:22.328594648 +0100] conn=33 op=0 BIND dn="cn=root" method=sasl version=3 mech=EXTERNAL [13/Mar/2023:16:16:22.331154393 +0100] conn=33 op=0 RESULT err=0 tag=97 nentries=0 wtime=0.000055269 optime=0.004608598 etime=0.004661499 dn="cn=root" [13/Mar/2023:16:16:22.331366318 +0100] conn=33 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uniqueMember=cn=testuser,ou=People,dc=expample,dc=com)" attrs="distinguishedName" [13/Mar/2023:16:16:23.244139238 +0100] conn=33 op=2 UNBIND [13/Mar/2023:16:16:23.244725555 +0100] conn=33 op=1 RESULT err=0 tag=101 nentries=532 wtime=0.000081512 optime=0.913360154 etime=0.913438519 [1 *Gesendet:* Mittwoch, 08. März 2023 um 01:11 Uhr *Von:* "William Brown" <william.brown@suse.com> *An:* "389-users@lists.fedoraproject.org" <389-users@lists.fedoraproject.org> *Betreff:* [389-users] Re: 2.x query performance problem > > Hi Claas, > I do not recall a specific change 1.4.4 vs 2.0 that could explain this. > Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). > On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips. > regards > thierry we need to see the exact filter that's being used, as well as the access logs lines of the slow query to really help here. -- Sincerely, William Brown Senior Software Engineer, Identity and Access Management SUSE Labs, Australia _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
_______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
Hi Class,
pstack is a gdb wrapper command to dump backtrace of all threads. installing gdb you may get it.
I suspect that the culprit could be the evaluation of the filter over the matching entries (~500 groups owning cn=sampleuser). Using ldapsearch could you reproduce with '-e manageDSAit' option and check if there is still a diff between 1.4.4 and 2.0. Another investigation is to put a breakpoint on slapi_vattr_filter_test. With such filter it should not be called.
Just for confirmation, you indexed 'uniqueMember' did you indexed in 'eq' ?
best regards thierry
On 3/10/23 14:47, Claas Vieler wrote:
Hello Thierry, I can confirm index on 'uniqueMember' for both versions. I also tried to recreate and reindex 'uniqueMember', same result. SRCH-records are inconspicuous, except high optime (no notes..) What exactly do you want to see in pstacks. Do you mean the output from pstack-tool? regards Claas *Gesendet:* Dienstag, 07. März 2023 um 15:38 Uhr *Von:* "Thierry Bordaz" tbordaz@redhat.com *An:* 389-users@lists.fedoraproject.org *Betreff:* [389-users] Re: 2.x query performance problem
Hi Claas,
I do not recall a specific change 1.4.4 vs 2.0 that could explain this.
Do you confirm that 'uniqueMember' is indexed in equality on both ? What are the SRCH records in the access logs (notes=A ?). On 2.0, it lasts 2sec, you may try to capture few pstacks that would give some tips.
regards thierry
On 3/7/23 14:54, Claas Vieler wrote:
Hello, we have a search performance problem when we migrated from 1.4.4.19 to 2.0.17. Our environment has about 100k entries, about 15k users and about 10k groups. Also big groups with thousand of users, also users with thousand of group membership. So I would call it a small instance On 1.4.x query perfomance ist fine: ldapsearch for "(uniqueMember=cn=sampleuser,ou=People,dc=example,dc=com) dn " via LDAPI on 1.4.x takes approx 0,01-0,03 sec. This user is member of approx. 500 groups. I tested two migration methods: 1. via replication After initializing replica, the same query takes about _8_ sec. So I reindexed db (dsctl .. db2index) and get durations for the query from 2-3 sec. 2. via ldif export/import after importing, the same query takes about 2-3 sec But even with 2-3 sec, we talk about 2.x perfomance ten time slower than 1.4.x. Is this a know issue? I compared all cache-settings and found no differences. I have no more ideas how to optimize this. Should we wait for 2.x when its adopted to new lmdb? thanks Claas _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
_______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam, report it:https://pagure.io/fedora-infrastructure/new_issue
389-users@lists.fedoraproject.org