Rich and the list, thank you for your continued support.
We are still seeing index issues with the memberOf plugin. We are not sure at this point whether this is related to our software or to the plugin configuration. I see two .db4 files for the memberOf plugin (see below); is this correct? The 389-admin GUI shows only memberof as indexed. This is what I see when I check for index corruption:
-rw------- 1 ldap-ds ldap-ds 4005888 Oct 20 13:01 memberOf.db4
-rw------- 1 ldap-ds ldap-ds 3915776 Nov 23 07:58 memberof.db4
When I try to check the index values, looking up the following key fails with both the memberof and the memberOf file. What am I missing?
dbscan -f /var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4 -k "dc=xxx,dc=com"
Can't find key 'dc=xxx,dc=com'
The same happens with the memberOf.db4 file.
Thank you,
Isabella
On 11/10/2015 11:12 AM, ghiureai wrote:
Rich, thank you for all your support over the last day. Unfortunately, there is a strong feeling in the developer team that "the multimaster replication is creating issues with the UI" (I do not entirely agree, since the issues cannot be reproduced or fully described). It has been decided to move down to master/slave replication. In that case, do I still need to exclude the memberOf plugin from replication?
Thanks a lot,
Isabella
On 11/10/2015 09:23 AM, Rich Megginson wrote:
On 11/10/2015 10:14 AM, Adrian Damian wrote:
Rich,
Thanks for your help. Let me jump in with more details.
We've seen index corruption on a number of occasions. It seems to affect searchable attributes for which there are indexes. Queries on an attribute in LDAP that used to work suddenly stopped working. They would return incomplete results or no results at all, although the data on the server was the same. The fix in those situations was to drop the index corresponding to the attribute and re-create it.
So in this case, you have some sort of LDAP search client, and you are doing a search for '(indexed_attribute=known_value)' and you are not seeing a result, and this is what you mean by "index corruption"?
Are you aware of the dbscan tool? https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/10/ht...
This tool allows you to examine the index file in the database directly.
dbscan -f /var/lib/dirsrv/slapd-instance_name/db/userRoot/indexed_attribute.db4 -k known_value
This will allow you to look at the indexed_attribute index directly for the value "known_value".
We've run the db fix script that the LDAP distribution comes with
What db fix script? Do you have a link to it, or a link to the product documentation for the script?
and there are no reports of corruption when this problem occurs. That makes it very hard to detect. We don't know what else to look for when we run into this again and more importantly, we don't know what triggers it and how to prevent it.
Mind you, we are currently doing active development, changing both the software clients that access the LDAP servers and the configurations of the servers. It is possible that writes went to both masters in the replication configuration when the problem occurred, but because multiple clients were accessing the servers concurrently it is hard to figure out what triggered the issue.
Adrian
On 11/09/2015 05:06 PM, Rich Megginson wrote:
On 11/09/2015 05:47 PM, Ghiurea, Isabella wrote:
Hi Rich,
Thank you for your feedback; it is always greatly appreciated when it comes from 389-DS RH support. We are not using VMs, just plain hardware. Here is the description I got from the developer team of the issues they are seeing when running integration tests with multimaster replication: "index corruption: put content, run tests: OK, do more stuff (reads, writes, etc.), run tests: FAIL, notice "missing attributes", rebuild index(ices), run tests: OK."
What does this mean? What program is printing these index corruption messages? Is it some tool provided by Red Hat?
Unfortunately, I understand this issue cannot be reproduced on a regular basis; no more details can be provided at this time.
All reads and writes are going only to the master replication DS, not the slave. I totally agree with you that this is the way to configure and maintain Directory Server in an operation-critical environment: multimaster replication with only one master for writes. Here is the DS version:
rpm -qa | grep 389-ds
389-ds-console-doc-1.2.6-1.el6.noarch
389-ds-base-libs-1.2.11.15-34.el6_5.x86_64
389-ds-1.2.2-1.el6.noarch
389-ds-base-1.2.11.15-34.el6_5.x86_64
This is quite an old version of 389-ds-base. I suggest upgrading to RHEL 6.7 with latest patches.
389-ds-console-1.2.6-1.el6.noarch
Thank you Isabella
FWD:
We have configured multimaster replication with fractional replication (memberOf plugin excluded). From time to time we see index corruption with some indexes, and there is a strong feeling among the developers that this is related to the DS multimaster replication internal settings.
What version of 389? rpm -q 389-ds-base
I'm assuming you are not using IPA.
What does "index corruption" mean? What exactly do you see?
Are you running in virtual machines? If so, what kind? vmware? kvm? Are you using virtual disks or dedicated physical devices/paravirt?
We are writing to only one DS, the same server at all times, but reading from all the DSs configured for multimaster.
Are you seeing "index corruption" on the write master or on all servers?
Are other people seeing this kind of issue with a multimaster replication configuration? Should we start avoiding this replication configuration altogether?
This is the recommended way to deploy. If this is not working for you, either you have a configuration problem, or there is some sort of vm or hardware problem, or there is a serious bug that requires fixing ASAP.
We chose multimaster for the fast and reliable option to switch between master DSs; moving one step down to master/slave may require some downtime when switching DSs back.
Isabella
Hi Rich,
Thank you for your feedback; it is always greatly appreciated when it comes from 389-DS RH support. We are not using VMs, just plain hardware. Here is the description I got from the developer team of the issues they are seeing when running tests with multimaster replication: index corruption: put content, run tests: OK, do more stuff (reads, writes, etc.), run tests: FAIL, notice "missing attributes", rebuild index(ices), run tests: OK.
I believe the reads and writes right now go only to the master replication DS, not the slave. I totally agree with you that this is the way to configure and maintain DS in an operational environment: multimaster replication with one master for writes. I would appreciate more comments and input.
rpm -qa | grep 389-ds
389-ds-console-doc-1.2.6-1.el6.noarch
389-ds-base-libs-1.2.11.15-34.el6_5.x86_64
389-ds-1.2.2-1.el6.noarch
389-ds-base-1.2.11.15-34.el6_5.x86_64
389-ds-console-1.2.6-1.el6.noarch
389-dsgw-1.1.11-1.el6.x86_64
From: ghiureai [isabella.ghiurea@nrc-cnrc.gc.ca]
Sent: Monday, November 09, 2015 1:05 PM
To: 389-users@lists.fedoraproject.org
Subject: multimaster replication and index corruption
Hi List,
We have configured multimaster replication with fractional replication (memberOf plugin excluded). From time to time we see index corruption with some indexes, and there is a strong feeling among the developers that this is related to the DS multimaster replication internal settings. We are writing to only one DS, the same server at all times, but reading from all the DSs configured for multimaster. Are other people seeing this kind of issue with a multimaster replication configuration? Should we start avoiding this replication configuration altogether? We chose multimaster for the fast and reliable option to switch between master DSs; moving one step down to master/slave may require some downtime when switching DSs back.
Isabella
-- 389 users mailing list 389-users@lists.fedoraproject.org https://admin.fedoraproject.org/mailman/listinfo/389-users
On 11/24/2015 10:02 AM, ghiureai wrote:
Rich and the list, thank you for your continued support.
We are still seeing index issues with the memberOf plugin. We are not sure at this point whether this is related to our software or to the plugin configuration. I see two .db4 files for the memberOf plugin (see below); is this correct? The 389-admin GUI shows only memberof as indexed. This is what I see when I check for index corruption:
-rw------- 1 ldap-ds ldap-ds 4005888 Oct 20 13:01 memberOf.db4
-rw------- 1 ldap-ds ldap-ds 3915776 Nov 23 07:58 memberof.db4
That's very bad. I thought we fixed that case issue with db files a long time ago.
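One way to check whether other indexes have the same problem is to look for file names in the database directory that differ only by letter case. The following is a minimal sketch, not a tool shipped with 389-ds; the function names `find_case_collisions` and `scan_db_dir` are made up for illustration, and the directory path is the one quoted in this thread.

```python
import os
from collections import defaultdict


def find_case_collisions(names):
    """Group file names that are identical except for letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    # Keep only the groups with more than one spelling, sorted for stable output.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


def scan_db_dir(db_dir):
    """List the .db4 files in db_dir and report case-only collisions."""
    names = [n for n in os.listdir(db_dir) if n.endswith(".db4")]
    return find_case_collisions(names)
```

Calling `scan_db_dir("/var/lib/dirsrv/slapd-ldap/db/userRoot")` as a user who can read that directory should report the memberOf.db4/memberof.db4 pair along with any other attribute affected in the same way.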
When I try to check the index values, looking up the following key fails with both the memberof and the memberOf file. What am I missing?
dbscan -f /var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4 -k "dc=xxx,dc=com"
Can't find key 'dc=xxx,dc=com'
Not sure. Try running
dbscan -f /var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4
first, to see what the keys look like.
The same happens with the memberOf.db4 file.
Thank you,
Isabella
On 11/24/2015 06:11 PM, Rich Megginson wrote:
On 11/24/2015 10:02 AM, ghiureai wrote:
Rich and the list, thank you for your continued support.
We are still seeing index issues with the memberOf plugin. We are not sure at this point whether this is related to our software or to the plugin configuration. I see two .db4 files for the memberOf plugin (see below); is this correct? The 389-admin GUI shows only memberof as indexed. This is what I see when I check for index corruption:
-rw------- 1 ldap-ds ldap-ds 4005888 Oct 20 13:01 memberOf.db4
-rw------- 1 ldap-ds ldap-ds 3915776 Nov 23 07:58 memberof.db4
That's very bad. I thought we fixed that case issue with db files a long time ago.
When I try to check the index values, looking up the following key fails with both the memberof and the memberOf file. What am I missing?
dbscan -f /var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4 -k "dc=xxx,dc=com"
Can't find key 'dc=xxx,dc=com'
Not sure. Try running
dbscan -f /var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4
first, to see what the keys look like.
I think all the keys have a prefix indicating presence, substring or equality, so the key should more likely be "=dc=xxx,dc=com".
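The suggestion above can be sketched as a small helper that prepends the equality prefix before the value is passed to `dbscan -k`. This is only an illustration: the "=" prefix is taken from the reply above and has not been checked against the server, the helper names `equality_key` and `dbscan_command` are hypothetical, and the stored keys may also be normalized in ways not shown here, so listing the keys without `-k` first remains the safer check.

```python
import shlex

# Assumption from the reply above: equality index keys start with "=".
EQUALITY_PREFIX = "="


def equality_key(value):
    """Return the assumed equality-index key for an attribute value."""
    return EQUALITY_PREFIX + value


def dbscan_command(index_file, value):
    """Build the dbscan argument list for an equality lookup of value."""
    return ["dbscan", "-f", index_file, "-k", equality_key(value)]


cmd = dbscan_command(
    "/var/lib/dirsrv/slapd-ldap/db/userRoot/memberof.db4", "dc=xxx,dc=com"
)
# Print a shell-safe command line that can be pasted into a terminal.
print(" ".join(shlex.quote(part) for part in cmd))
```

The argument list can also be passed directly to `subprocess.run` on the directory server host, which avoids any shell quoting issues with the key.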
The same happens with the memberOf.db4 file.
Thank you,
Isabella