Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions 1) is it normal to have so many entries in cn=changelog? On another replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
HTH,
Mark
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
-- Directory Server Development Team
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote:
Hi,
When my dirsrv was trying to compact the databases I was getting this error
[07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - Compacting databases ... [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - Compacting DB start: userRoot [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - compactdb: compact userRoot - 417 pages freed [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - Compacting DB start: ipaca [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - compactdb: compact ipaca - 419 pages freed [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - Compacting DB start: changelog [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - NSMMReplicationPlugin - changelog program - cl5CompactDBs - compacting replication changelogs... [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - compactdb: failed to compact changelog; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock table is out of available lock entries [08/Aug/2021:00:00:54.876618702 +0200] - ERR - NSMMReplicationPlugin - changelog program - cl5CompactDBs - Failed to compact a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate memory [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - Compacting databases finished.
There are about 402k entries in cn=changelog.
I have a few questions
- is it normal to have so many entries in cn=changelog? On another
replica I have almost 3M entries Isn't this cleaned up? 2) the number of locks is 50000 (there are two config items). Should I increase that number? If so, increase to what? 3) is there maybe something else going on, causing the exhaustion of the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
Hmm, that maxage was already at 2 days
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote:
On 8/9/21 8:09 AM, Kees Bakker wrote: > Hi, > > When my dirsrv was trying to compact the databases I was getting > this > error > > [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - > Compacting databases ... > [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - > Compacting DB start: userRoot > [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - > compactdb: compact userRoot - 417 pages freed > [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - > Compacting DB start: ipaca > [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - > compactdb: compact ipaca - 419 pages freed > [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - > Compacting DB start: changelog > [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - > NSMMReplicationPlugin - changelog program - cl5CompactDBs - > compacting > replication changelogs... > [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock > table is out of available lock entries > [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - > compactdb: failed to compact changelog; db error - 12 Cannot > allocate > memory > [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock > table is out of available lock entries > [08/Aug/2021:00:00:54.876618702 +0200] - ERR - > NSMMReplicationPlugin - > changelog program - cl5CompactDBs - Failed to compact > a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate > memory > [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - > Compacting databases finished. > > There are about 402k entries in cn=changelog. > > I have a few questions > 1) is it normal to have so many entries in cn=changelog? On another > replica I have almost 3M entries Isn't this cleaned up? > 2) the number of locks is 50000 (there are two config items). > Should I > increase that number? If so, increase to what? > 3) is there maybe something else going on, causing the > exhaustion of > the locks?
Ok, so by default there is no changelog trimming enabled. So the changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
I recommend setting up the changelog max age to 7 days:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Once that the trimming cleans up the changelog the database compaction should work without issue. You can also increase the database locks to 1 million (that does require a restart of the server to take effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
HTH,
Mark
[1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote:
On 09-08-2021 16:00, Mark Reynolds wrote: > On 8/9/21 8:09 AM, Kees Bakker wrote: >> Hi, >> >> When my dirsrv was trying to compact the databases I was getting >> this >> error >> >> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >> Compacting databases ... >> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >> Compacting DB start: userRoot >> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >> compactdb: compact userRoot - 417 pages freed >> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >> Compacting DB start: ipaca >> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >> compactdb: compact ipaca - 419 pages freed >> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >> Compacting DB start: changelog >> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >> compacting >> replication changelogs... >> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >> table is out of available lock entries >> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >> compactdb: failed to compact changelog; db error - 12 Cannot >> allocate >> memory >> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >> table is out of available lock entries >> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >> NSMMReplicationPlugin - >> changelog program - cl5CompactDBs - Failed to compact >> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >> memory >> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >> Compacting databases finished. >> >> There are about 402k entries in cn=changelog. >> >> I have a few questions >> 1) is it normal to have so many entries in cn=changelog? On another >> replica I have almost 3M entries Isn't this cleaned up? >> 2) the number of locks is 50000 (there are two config items). >> Should I >> increase that number? If so, increase to what? >> 3) is there maybe something else going on, causing the >> exhaustion of >> the locks? > > Ok, so by default there is no changelog trimming enabled. So the > changelog will grow without bounds, which is bad.
How much of this [1] applies? Indeed it says "By default the Replication Changelog does not use any trimming by default." So, how is the trimming actually enabled? I have these entries
dn: cn=changelog5,cn=config cn: changelog5 nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb nsslapd-changelogmaxage: 30d objectClass: top objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
> > I recommend setting up the changelog max age to 7 days: > > https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... > > > > > Once that the trimming cleans up the changelog the database > compaction > should work without issue. You can also increase the database > locks to > 1 million (that does require a restart of the server to take > effect).
Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
> > HTH, > > Mark > [1] https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
On 11-08-2021 11:49, Kees Bakker wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote: > On 09-08-2021 16:00, Mark Reynolds wrote: >> On 8/9/21 8:09 AM, Kees Bakker wrote: >>> Hi, >>> >>> When my dirsrv was trying to compact the databases I was getting >>> this >>> error >>> >>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>> Compacting databases ... >>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: userRoot >>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>> compactdb: compact userRoot - 417 pages freed >>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: ipaca >>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>> compactdb: compact ipaca - 419 pages freed >>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: changelog >>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>> compacting >>> replication changelogs... >>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>> table is out of available lock entries >>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>> compactdb: failed to compact changelog; db error - 12 Cannot >>> allocate >>> memory >>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>> table is out of available lock entries >>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>> NSMMReplicationPlugin - >>> changelog program - cl5CompactDBs - Failed to compact >>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>> memory >>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>> Compacting databases finished. >>> >>> There are about 402k entries in cn=changelog. >>> >>> I have a few questions >>> 1) is it normal to have so many entries in cn=changelog? On another >>> replica I have almost 3M entries Isn't this cleaned up? >>> 2) the number of locks is 50000 (there are two config items). >>> Should I >>> increase that number? If so, increase to what? >>> 3) is there maybe something else going on, causing the >>> exhaustion of >>> the locks? >> >> Ok, so by default there is no changelog trimming enabled. So the >> changelog will grow without bounds, which is bad. > > How much of this [1] applies? Indeed it says "By default the > Replication Changelog does not use any trimming by default." > So, how is the trimming actually enabled? I have these entries > > dn: cn=changelog5,cn=config > cn: changelog5 > nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb > nsslapd-changelogmaxage: 30d > objectClass: top > objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1]. The queue function is now called with a relative time.
[1] https://github.com/389ds/389-ds-base/pull/4450 [2] https://github.com/389ds/389-ds-base/issues/4384
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
> > >> >> I recommend setting up the changelog max age to 7 days: >> >> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >> >> >> >> >> Once that the trimming cleans up the changelog the database >> compaction >> should work without issue. You can also increase the database >> locks to >> 1 million (that does require a restart of the server to take >> effect). > > Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
> >> >> HTH, >> >> Mark >> > [1] > https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html > >
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 8/11/21 8:30 AM, Kees Bakker wrote:
On 11-08-2021 11:49, Kees Bakker wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote: > On 8/9/21 11:20 AM, Kees Bakker wrote: >> On 09-08-2021 16:00, Mark Reynolds wrote: >>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>> Hi, >>>> >>>> When my dirsrv was trying to compact the databases I was getting >>>> this >>>> error >>>> >>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>> Compacting databases ... >>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: userRoot >>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>> compactdb: compact userRoot - 417 pages freed >>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: ipaca >>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>> compactdb: compact ipaca - 419 pages freed >>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: changelog >>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>> compacting >>>> replication changelogs... >>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - >>>> BDB2055 Lock >>>> table is out of available lock entries >>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>> allocate >>>> memory >>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - >>>> BDB2055 Lock >>>> table is out of available lock entries >>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>> NSMMReplicationPlugin - >>>> changelog program - cl5CompactDBs - Failed to compact >>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot >>>> allocate >>>> memory >>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>> Compacting databases finished. >>>> >>>> There are about 402k entries in cn=changelog. >>>> >>>> I have a few questions >>>> 1) is it normal to have so many entries in cn=changelog? On >>>> another >>>> replica I have almost 3M entries Isn't this cleaned up? >>>> 2) the number of locks is 50000 (there are two config items). >>>> Should I >>>> increase that number? If so, increase to what? >>>> 3) is there maybe something else going on, causing the >>>> exhaustion of >>>> the locks? >>> >>> Ok, so by default there is no changelog trimming enabled. So the >>> changelog will grow without bounds, which is bad. >> >> How much of this [1] applies? Indeed it says "By default the >> Replication Changelog does not use any trimming by default." >> So, how is the trimming actually enabled? I have these entries >> >> dn: cn=changelog5,cn=config >> cn: changelog5 >> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >> nsslapd-changelogmaxage: 30d >> objectClass: top >> objectClass: extensibleobject > > So trimming is set, but it's set to 30 days (30d), that > could/should be > shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1]. The queue function is now called with a relative time.
[1] https://github.com/389ds/389-ds-base/pull/4450 [2] https://github.com/389ds/389-ds-base/issues/4384
Yes I think you found a bug. Are you building the server yourself? Can I send you a code patch to try?
Mark
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
> >> >> >>> >>> I recommend setting up the changelog max age to 7 days: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >>> >>> >>> >>> >>> >>> Once that the trimming cleans up the changelog the database >>> compaction >>> should work without issue. You can also increase the database >>> locks to >>> 1 million (that does require a restart of the server to take >>> effect). >> >> Let's hope that 1 million is enough :-) > > It might not be, you can keep bumping it up if needed. There is no > limit, or negative impact on db performance. > > Mark > >> >>> >>> HTH, >>> >>> Mark >>> >> [1] >> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >> >> >> > > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 11-08-2021 14:38, Mark Reynolds wrote:
On 8/11/21 8:30 AM, Kees Bakker wrote:
On 11-08-2021 11:49, Kees Bakker wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote: > On 09-08-2021 18:43, Mark Reynolds wrote: >> On 8/9/21 11:20 AM, Kees Bakker wrote: >>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>> Hi, >>>>> >>>>> When my dirsrv was trying to compact the databases I was getting >>>>> this >>>>> error >>>>> >>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>> Compacting databases ... >>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>> Compacting DB start: userRoot >>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>> compactdb: compact userRoot - 417 pages freed >>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>> Compacting DB start: ipaca >>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>> compactdb: compact ipaca - 419 pages freed >>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>> Compacting DB start: changelog >>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>> compacting >>>>> replication changelogs... >>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - >>>>> BDB2055 Lock >>>>> table is out of available lock entries >>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>> allocate >>>>> memory >>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - >>>>> BDB2055 Lock >>>>> table is out of available lock entries >>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>> NSMMReplicationPlugin - >>>>> changelog program - cl5CompactDBs - Failed to compact >>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot >>>>> allocate >>>>> memory >>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>> Compacting databases finished. >>>>> >>>>> There are about 402k entries in cn=changelog. >>>>> >>>>> I have a few questions >>>>> 1) is it normal to have so many entries in cn=changelog? On >>>>> another >>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>> 2) the number of locks is 50000 (there are two config items). >>>>> Should I >>>>> increase that number? If so, increase to what? >>>>> 3) is there maybe something else going on, causing the >>>>> exhaustion of >>>>> the locks? >>>> >>>> Ok, so by default there is no changelog trimming enabled. So the >>>> changelog will grow without bounds, which is bad. >>> >>> How much of this [1] applies? Indeed it says "By default the >>> Replication Changelog does not use any trimming by default." >>> So, how is the trimming actually enabled? I have these entries >>> >>> dn: cn=changelog5,cn=config >>> cn: changelog5 >>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>> nsslapd-changelogmaxage: 30d >>> objectClass: top >>> objectClass: extensibleobject >> >> So trimming is set, but it's set to 30 days (30d), that >> could/should be >> shortened to "7d". > > Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1]. The queue function is now called with a relative time.
[1] https://github.com/389ds/389-ds-base/pull/4450 [2] https://github.com/389ds/389-ds-base/issues/4384
Yes I think you found a bug. Are you building the server yourself? Can I send you a code patch to try?
No, I'm "just" a CentOS user, with access to the Github source code. And the last time I build an rpm was more than 20 years ago in the early RedHat days.
Also, I don't want to mess too much with our production system.
Mark
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
> > The first entry in cn=changelog is more than two months old. (May > 26th > is about > the time I installed this server as a FreeIPA replica.) > > dn: changenumber=1,cn=changelog > objectClass: top > objectClass: changelogentry > objectClass: extensibleObject > targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 > changeNumber: 1 > targetDn: > idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com > changeTime: 20210526122901Z > changeType: modify > changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== > > Is there a way to see trimming in action? I'm afraid it is not doing > anything.
What diagnostic can I enable to see the trimming in action?
> >> >>> >>> >>>> >>>> I recommend setting up the changelog max age to 7 days: >>>> >>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >>>> >>>> >>>> >>>> >>>> >>>> Once that the trimming cleans up the changelog the database >>>> compaction >>>> should work without issue. You can also increase the database >>>> locks to >>>> 1 million (that does require a restart of the server to take >>>> effect). >>> >>> Let's hope that 1 million is enough :-) >> >> It might not be, you can keep bumping it up if needed. There is no >> limit, or negative impact on db performance. >> >> Mark >> >>> >>>> >>>> HTH, >>>> >>>> Mark >>>> >>> [1] >>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >>> >>> >>> >> >> -- >> Directory Server Development Team >> > _______________________________________________ > 389-users mailing list -- 389-users@lists.fedoraproject.org > To unsubscribe send an email to > 389-users-leave@lists.fedoraproject.org > Fedora Code of Conduct: > https://docs.fedoraproject.org/en-US/project/code-of-conduct/ > List Guidelines: > https://fedoraproject.org/wiki/Mailing_list_guidelines > List Archives: > https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... > > > Do not reply to spam on the list, report it: > https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
On 8/11/21 8:52 AM, Kees Bakker wrote:
On 11-08-2021 14:38, Mark Reynolds wrote:
On 8/11/21 8:30 AM, Kees Bakker wrote:
On 11-08-2021 11:49, Kees Bakker wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote: > On 8/9/21 1:16 PM, Kees Bakker wrote: >> On 09-08-2021 18:43, Mark Reynolds wrote: >>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>> Hi, >>>>>> >>>>>> When my dirsrv was trying to compact the databases I was >>>>>> getting >>>>>> this >>>>>> error >>>>>> >>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> Compacting databases ... >>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> Compacting DB start: userRoot >>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> compactdb: compact userRoot - 417 pages freed >>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> Compacting DB start: ipaca >>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> compactdb: compact ipaca - 419 pages freed >>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> Compacting DB start: changelog >>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>> compacting >>>>>> replication changelogs... >>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - >>>>>> BDB2055 Lock >>>>>> table is out of available lock entries >>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>> allocate >>>>>> memory >>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - >>>>>> BDB2055 Lock >>>>>> table is out of available lock entries >>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>> NSMMReplicationPlugin - >>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot >>>>>> allocate >>>>>> memory >>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - >>>>>> bdb_compact - >>>>>> Compacting databases finished. >>>>>> >>>>>> There are about 402k entries in cn=changelog. >>>>>> >>>>>> I have a few questions >>>>>> 1) is it normal to have so many entries in cn=changelog? On >>>>>> another >>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>> Should I >>>>>> increase that number? If so, increase to what? >>>>>> 3) is there maybe something else going on, causing the >>>>>> exhaustion of >>>>>> the locks? >>>>> >>>>> Ok, so by default there is no changelog trimming enabled. So >>>>> the >>>>> changelog will grow without bounds, which is bad. >>>> >>>> How much of this [1] applies? Indeed it says "By default the >>>> Replication Changelog does not use any trimming by default." >>>> So, how is the trimming actually enabled? I have these entries >>>> >>>> dn: cn=changelog5,cn=config >>>> cn: changelog5 >>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>> nsslapd-changelogmaxage: 30d >>>> objectClass: top >>>> objectClass: extensibleobject >>> >>> So trimming is set, but it's set to 30 days (30d), that >>> could/should be >>> shortened to "7d". >> >> Somehow I doubt that this will do anything. > > Ahhh, I thought this was the Replication Changelog, but it is the > "Retro" Changelog. Ok so check this out: > > https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... > > > > > Or update config directly using ldapmodify: > > # ldapmodify -D ... > dn: cn=Retro Changelog Plugin,cn=plugins,cn=config > changetype: modify > replace: nsslapd-changelogmaxage: 7d > nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1]. The queue function is now called with a relative time.
[1] https://github.com/389ds/389-ds-base/pull/4450 [2] https://github.com/389ds/389-ds-base/issues/4384
Yes I think you found a bug. Are you building the server yourself? Can I send you a code patch to try?
No, I'm "just" a CentOS user, with access to the Github source code. And the last time I build an rpm was more than 20 years ago in the early RedHat days.
Also, I don't want to mess too much with our production system.
Understood. What version of 389-ds-base are you using?
Thanks, Mark
Mark
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
> > > HTH, > > Mark > >> >> The first entry in cn=changelog is more than two months old. (May >> 26th >> is about >> the time I installed this server as a FreeIPA replica.) >> >> dn: changenumber=1,cn=changelog >> objectClass: top >> objectClass: changelogentry >> objectClass: extensibleObject >> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >> changeNumber: 1 >> targetDn: >> idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com >> changeTime: 20210526122901Z >> changeType: modify >> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >> >> Is there a way to see trimming in action? I'm afraid it is not >> doing >> anything.
What diagnostic can I enable to see the trimming in action?
>> >>> >>>> >>>> >>>>> >>>>> I recommend setting up the changelog max age to 7 days: >>>>> >>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Once that the trimming cleans up the changelog the database >>>>> compaction >>>>> should work without issue. You can also increase the database >>>>> locks to >>>>> 1 million (that does require a restart of the server to take >>>>> effect). >>>> >>>> Let's hope that 1 million is enough :-) >>> >>> It might not be, you can keep bumping it up if needed. There >>> is no >>> limit, or negative impact on db performance. >>> >>> Mark >>> >>>> >>>>> >>>>> HTH, >>>>> >>>>> Mark >>>>> >>>> [1] >>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >>>> >>>> >>>> >>>> >>> >>> -- >>> Directory Server Development Team >>> >> _______________________________________________ >> 389-users mailing list -- 389-users@lists.fedoraproject.org >> To unsubscribe send an email to >> 389-users-leave@lists.fedoraproject.org >> Fedora Code of Conduct: >> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ >> List Guidelines: >> https://fedoraproject.org/wiki/Mailing_list_guidelines >> List Archives: >> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... >> >> >> >> Do not reply to spam on the list, report it: >> https://pagure.io/fedora-infrastructure > > -- > Directory Server Development Team >
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
On 11-08-2021 15:01, Mark Reynolds wrote:
On 8/11/21 8:52 AM, Kees Bakker wrote:
On 11-08-2021 14:38, Mark Reynolds wrote:
On 8/11/21 8:30 AM, Kees Bakker wrote:
On 11-08-2021 11:49, Kees Bakker wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote: > On 09-08-2021 19:25, Mark Reynolds wrote: >> On 8/9/21 1:16 PM, Kees Bakker wrote: >>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>> Hi, >>>>>>> >>>>>>> When my dirsrv was trying to compact the databases I was >>>>>>> getting >>>>>>> this >>>>>>> error >>>>>>> >>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> Compacting databases ... >>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> Compacting DB start: userRoot >>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> Compacting DB start: ipaca >>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> Compacting DB start: changelog >>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>> compacting >>>>>>> replication changelogs... >>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - >>>>>>> BDB2055 Lock >>>>>>> table is out of available lock entries >>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>> allocate >>>>>>> memory >>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - >>>>>>> BDB2055 Lock >>>>>>> table is out of available lock entries >>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>> NSMMReplicationPlugin - >>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot >>>>>>> allocate >>>>>>> memory >>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - >>>>>>> bdb_compact - >>>>>>> Compacting databases finished. >>>>>>> >>>>>>> There are about 402k entries in cn=changelog. >>>>>>> >>>>>>> I have a few questions >>>>>>> 1) is it normal to have so many entries in cn=changelog? On >>>>>>> another >>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>> Should I >>>>>>> increase that number? If so, increase to what? >>>>>>> 3) is there maybe something else going on, causing the >>>>>>> exhaustion of >>>>>>> the locks? >>>>>> >>>>>> Ok, so by default there is no changelog trimming enabled. So >>>>>> the >>>>>> changelog will grow without bounds, which is bad. >>>>> >>>>> How much of this [1] applies? Indeed it says "By default the >>>>> Replication Changelog does not use any trimming by default." >>>>> So, how is the trimming actually enabled? I have these entries >>>>> >>>>> dn: cn=changelog5,cn=config >>>>> cn: changelog5 >>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>> nsslapd-changelogmaxage: 30d >>>>> objectClass: top >>>>> objectClass: extensibleobject >>>> >>>> So trimming is set, but it's set to 30 days (30d), that >>>> could/should be >>>> shortened to "7d". >>> >>> Somehow I doubt that this will do anything. >> >> Ahhh, I thought this was the Replication Changelog, but it is the >> "Retro" Changelog. Ok so check this out: >> >> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >> >> >> >> >> Or update config directly using ldapmodify: >> >> # ldapmodify -D ... >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> changetype: modify >> replace: nsslapd-changelogmaxage: 7d >> nsslapd-changelogmaxage: 7d > > (( Typo? The replace line is without the value )) yes, sorry... > > Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1]. The queue function is now called with a relative time.
[1] https://github.com/389ds/389-ds-base/pull/4450 [2] https://github.com/389ds/389-ds-base/issues/4384
Yes I think you found a bug. Are you building the server yourself? Can I send you a code patch to try?
No, I'm "just" a CentOS user, with access to the Github source code. And the last time I build an rpm was more than 20 years ago in the early RedHat days.
Also, I don't want to mess too much with our production system.
Understood. What version of 389-ds-base are you using?
1.4.3.23 (CentOS 8 Stream)
Thanks, Mark
Mark
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
> > dn: cn=Retro Changelog Plugin,cn=plugins,cn=config > cn: Retro Changelog Plugin > nsslapd-attribute: nsuniqueid:targetUniqueId > nsslapd-changelogmaxage: 2d > nsslapd-include-suffix: cn=dns,dc=example,dc=com > nsslapd-plugin-depends-on-named: Class of Service > nsslapd-plugin-depends-on-type: database > nsslapd-pluginDescription: Retrocl Plugin > nsslapd-pluginEnabled: on > nsslapd-pluginId: retrocl > nsslapd-pluginInitfunc: retrocl_plugin_init > nsslapd-pluginPath: libretrocl-plugin > nsslapd-pluginType: object > nsslapd-pluginVendor: 389 Project > nsslapd-pluginVersion: 1.4.3.231 > nsslapd-pluginbetxn: on > nsslapd-pluginprecedence: 25 > objectClass: top > objectClass: nsSlapdPlugin > objectClass: extensibleObject > >> >> >> HTH, >> >> Mark >> >>> >>> The first entry in cn=changelog is more than two months old. (May >>> 26th >>> is about >>> the time I installed this server as a FreeIPA replica.) >>> >>> dn: changenumber=1,cn=changelog >>> objectClass: top >>> objectClass: changelogentry >>> objectClass: extensibleObject >>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>> changeNumber: 1 >>> targetDn: >>> idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com >>> changeTime: 20210526122901Z >>> changeType: modify >>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>> >>> Is there a way to see trimming in action? I'm afraid it is not >>> doing >>> anything. > > What diagnostic can I enable to see the trimming in action? > >>> >>>> >>>>> >>>>> >>>>>> >>>>>> I recommend setting up the changelog max age to 7 days: >>>>>> >>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht... >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> Once that the trimming cleans up the changelog the database >>>>>> compaction >>>>>> should work without issue. You can also increase the database >>>>>> locks to >>>>>> 1 million (that does require a restart of the server to take >>>>>> effect). >>>>> >>>>> Let's hope that 1 million is enough :-) >>>> >>>> It might not be, you can keep bumping it up if needed. There >>>> is no >>>> limit, or negative impact on db performance. >>>> >>>> Mark >>>> >>>>> >>>>>> >>>>>> HTH, >>>>>> >>>>>> Mark >>>>>> >>>>> [1] >>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >>>>> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> Directory Server Development Team >>>> >>> _______________________________________________ >>> 389-users mailing list -- 389-users@lists.fedoraproject.org >>> To unsubscribe send an email to >>> 389-users-leave@lists.fedoraproject.org >>> Fedora Code of Conduct: >>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ >>> List Guidelines: >>> https://fedoraproject.org/wiki/Mailing_list_guidelines >>> List Archives: >>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... >>> >>> >>> >>> Do not reply to spam on the list, report it: >>> https://pagure.io/fedora-infrastructure >> >> -- >> Directory Server Development Team >>
>
Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
Hi, I suspect that Kees discovered a new bug ...
As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) ) The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
Regards Pierre,
On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker keesb@ghs.com wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote:
On 8/9/21 11:20 AM, Kees Bakker wrote: > On 09-08-2021 16:00, Mark Reynolds wrote: >> On 8/9/21 8:09 AM, Kees Bakker wrote: >>> Hi, >>> >>> When my dirsrv was trying to compact the databases I was getting >>> this >>> error >>> >>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>> Compacting databases ... >>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: userRoot >>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>> compactdb: compact userRoot - 417 pages freed >>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: ipaca >>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>> compactdb: compact ipaca - 419 pages freed >>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>> Compacting DB start: changelog >>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>> compacting >>> replication changelogs... >>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055
Lock
>>> table is out of available lock entries >>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>> compactdb: failed to compact changelog; db error - 12 Cannot >>> allocate >>> memory >>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055
Lock
>>> table is out of available lock entries >>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>> NSMMReplicationPlugin - >>> changelog program - cl5CompactDBs - Failed to compact >>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>> memory >>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>> Compacting databases finished. >>> >>> There are about 402k entries in cn=changelog. >>> >>> I have a few questions >>> 1) is it normal to have so many entries in cn=changelog? On
another
>>> replica I have almost 3M entries Isn't this cleaned up? >>> 2) the number of locks is 50000 (there are two config items). >>> Should I >>> increase that number? If so, increase to what? >>> 3) is there maybe something else going on, causing the >>> exhaustion of >>> the locks? >> >> Ok, so by default there is no changelog trimming enabled. So the >> changelog will grow without bounds, which is bad. > > How much of this [1] applies? Indeed it says "By default the > Replication Changelog does not use any trimming by default." > So, how is the trimming actually enabled? I have these entries > > dn: cn=changelog5,cn=config > cn: changelog5 > nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb > nsslapd-changelogmaxage: 30d > objectClass: top > objectClass: extensibleobject
So trimming is set, but it's set to 30 days (30d), that could/should be shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld,
cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May 26th is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
> > >> >> I recommend setting up the changelog max age to 7 days: >> >>
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
>> >> >> >> >> Once that the trimming cleans up the changelog the database >> compaction >> should work without issue. You can also increase the database >> locks to >> 1 million (that does require a restart of the server to take >> effect). > > Let's hope that 1 million is enough :-)
It might not be, you can keep bumping it up if needed. There is no limit, or negative impact on db performance.
Mark
> >> >> HTH, >> >> Mark >> > [1] >
https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
> >
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to
389-users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ...
As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-(
I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869
Mark
Regards Pierre,
On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <keesb@ghs.com mailto:keesb@ghs.com> wrote:
On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog> >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html <https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
--
389 Directory Server Development Team
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ...
As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-(
I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published.
Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see. -- Kees
Mark
Regards Pierre,
On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <keesb@ghs.com mailto:keesb@ghs.com> wrote:
On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog> >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html <https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
--
389 Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 10/11/21 2:29 PM, Kees Bakker wrote:
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ...
As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-(
I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published.
Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see.
The module build was just done on October 9th. I think it takes a little more time for that to appear in EPEL ...
Regards, Mark
-- Kees
Mark
Regards Pierre,
On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker keesb@ghs.com wrote:
On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
--
389 Directory Server Development Team
-- Directory Server Development Team
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
On Mon, Oct 11, 2021 at 8:33 PM Mark Reynolds mreynolds@redhat.com wrote:
On 10/11/21 2:29 PM, Kees Bakker wrote:
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ...
As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-(
I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published.
Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see.
The module build was just done on October 9th. I think it takes a little more time for that to appear in EPEL ...
Here's the bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a Module is currently in testing. You can update to it using: # dnf --enablerepo='epel-testing-modular' update 389-ds-base cockpit-389-ds HTH
Regards, Mark
-- Kees
Mark
Regards Pierre,
On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker keesb@ghs.com wrote:
On 09-08-2021 20:23, Mark Reynolds wrote:
On 8/9/21 1:47 PM, Kees Bakker wrote:
On 09-08-2021 19:25, Mark Reynolds wrote:
On 8/9/21 1:16 PM, Kees Bakker wrote:
On 09-08-2021 18:43, Mark Reynolds wrote: > On 8/9/21 11:20 AM, Kees Bakker wrote: >> On 09-08-2021 16:00, Mark Reynolds wrote: >>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>> Hi, >>>> >>>> When my dirsrv was trying to compact the databases I was getting >>>> this >>>> error >>>> >>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>> Compacting databases ... >>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: userRoot >>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>> compactdb: compact userRoot - 417 pages freed >>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: ipaca >>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>> compactdb: compact ipaca - 419 pages freed >>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>> Compacting DB start: changelog >>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>> compacting >>>> replication changelogs... >>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055
Lock
>>>> table is out of available lock entries >>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>> allocate >>>> memory >>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055
Lock
>>>> table is out of available lock entries >>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>> NSMMReplicationPlugin - >>>> changelog program - cl5CompactDBs - Failed to compact >>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot
allocate
>>>> memory >>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>> Compacting databases finished. >>>> >>>> There are about 402k entries in cn=changelog. >>>> >>>> I have a few questions >>>> 1) is it normal to have so many entries in cn=changelog? On
another
>>>> replica I have almost 3M entries Isn't this cleaned up? >>>> 2) the number of locks is 50000 (there are two config items). >>>> Should I >>>> increase that number? If so, increase to what? >>>> 3) is there maybe something else going on, causing the >>>> exhaustion of >>>> the locks? >>> >>> Ok, so by default there is no changelog trimming enabled. So the >>> changelog will grow without bounds, which is bad. >> >> How much of this [1] applies? Indeed it says "By default the >> Replication Changelog does not use any trimming by default." >> So, how is the trimming actually enabled? I have these entries >> >> dn: cn=changelog5,cn=config >> cn: changelog5 >> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >> nsslapd-changelogmaxage: 30d >> objectClass: top >> objectClass: extensibleobject > > So trimming is set, but it's set to 30 days (30d), that > could/should be > shortened to "7d".
Somehow I doubt that this will do anything.
Ahhh, I thought this was the Replication Changelog, but it is the "Retro" Changelog. Ok so check this out:
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
Or update config directly using ldapmodify:
# ldapmodify -D ... dn: cn=Retro Changelog Plugin,cn=plugins,cn=config changetype: modify replace: nsslapd-changelogmaxage: 7d nsslapd-changelogmaxage: 7d
(( Typo? The replace line is without the value ))
yes, sorry...
Hmm, that maxage was already at 2 days
Ok, perhaps you are running into a known bug where indexes created by plugins are not enabled. Or might be a config issue?
Try this first. Set in the retro changelog entry:
nsslapd-changelog-trim-interval: 300 <--- 5 minutes
(Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.)
Restart the server to make sure it was applied. You can enable "plugin" logging to see what it does.
nsslapd-errorlog-level: 65536
wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". This will tell us what the plugin is doing when it goes to trim the database.
IMPORTANT, disable plugin logging immediately after you get your data. Set the errorlog level to 0 to go back to default logging.
If the logging shows it's not finding any candidates then it might be the indexing issue I mentioned above. In that case you just need to reindex the retro changelog:
Before changing anything I enabled Plug-in debugging which revealed something interesting
[root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed
This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config
Look at cur_time. That is the "uptime" of the system, not UTC.
[root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
Next, assuming this code from retrocl_trim.c is making the decision to trim
first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld,
cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; }
There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong?
# dsctl slapd-YOUR_INSTANCE stop
# dsctl slapd-YOUR_INSTANCE db2index changelog
# dsctl slapd-YOUR_INSTANCE start
Then wait 5 minutes and see if anything was trimmed.
Mark
dn: cn=Retro Changelog Plugin,cn=plugins,cn=config cn: Retro Changelog Plugin nsslapd-attribute: nsuniqueid:targetUniqueId nsslapd-changelogmaxage: 2d nsslapd-include-suffix: cn=dns,dc=example,dc=com nsslapd-plugin-depends-on-named: Class of Service nsslapd-plugin-depends-on-type: database nsslapd-pluginDescription: Retrocl Plugin nsslapd-pluginEnabled: on nsslapd-pluginId: retrocl nsslapd-pluginInitfunc: retrocl_plugin_init nsslapd-pluginPath: libretrocl-plugin nsslapd-pluginType: object nsslapd-pluginVendor: 389 Project nsslapd-pluginVersion: 1.4.3.231 nsslapd-pluginbetxn: on nsslapd-pluginprecedence: 25 objectClass: top objectClass: nsSlapdPlugin objectClass: extensibleObject
HTH,
Mark
The first entry in cn=changelog is more than two months old. (May
26th
is about the time I installed this server as a FreeIPA replica.)
dn: changenumber=1,cn=changelog objectClass: top objectClass: changelogentry objectClass: extensibleObject targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 changeNumber: 1 targetDn: idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com changeTime: 20210526122901Z changeType: modify changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
Is there a way to see trimming in action? I'm afraid it is not doing anything.
What diagnostic can I enable to see the trimming in action?
> >> >> >>> >>> I recommend setting up the changelog max age to 7 days: >>> >>>
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/ht...
>>> >>> >>> >>> >>> Once that the trimming cleans up the changelog the database >>> compaction >>> should work without issue. You can also increase the database >>> locks to >>> 1 million (that does require a restart of the server to take >>> effect). >> >> Let's hope that 1 million is enough :-) > > It might not be, you can keep bumping it up if needed. There is no > limit, or negative impact on db performance. > > Mark > >> >>> >>> HTH, >>> >>> Mark >>> >> [1] >>
https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
>> >> > > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to
389-users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
--
389 Directory Server Development Team
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Directory Server Development Team
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 11-10-2021 20:58, Viktor Ashirov wrote:
On Mon, Oct 11, 2021 at 8:33 PM Mark Reynolds <mreynolds@redhat.com mailto:mreynolds@redhat.com> wrote:
On 10/11/21 2:29 PM, Kees Bakker wrote:
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ... As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-( I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869 <https://github.com/389ds/389-ds-base/issues/4869>
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published. Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see.
The module build was just done on October 9th. I think it takes a little more time for that to appear in EPEL ...
Here's the bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a Module is currently in testing. You can update to it using: # dnf --enablerepo='epel-testing-modular' update 389-ds-base cockpit-389-ds HTH
Thanks. I'll give that a try.
Regards, Mark
-- Kees
Mark
Regards Pierre, On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <keesb@ghs.com <mailto:keesb@ghs.com>> wrote: On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog> >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html <https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> -- -- 389 Directory Server Development Team
-- Directory Server Development Team _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
_______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
-- Directory Server Development Team _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
-- Viktor
On 12-10-2021 14:29, Kees Bakker wrote:
On 11-10-2021 20:58, Viktor Ashirov wrote:
On Mon, Oct 11, 2021 at 8:33 PM Mark Reynolds <mreynolds@redhat.com mailto:mreynolds@redhat.com> wrote:
On 10/11/21 2:29 PM, Kees Bakker wrote:
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ... As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-( I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869 <https://github.com/389ds/389-ds-base/issues/4869>
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published. Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see.
The module build was just done on October 9th. I think it takes a little more time for that to appear in EPEL ...
Here's the bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a Module is currently in testing. You can update to it using: # dnf --enablerepo='epel-testing-modular' update 389-ds-base cockpit-389-ds HTH
Thanks. I'll give that a try.
Snif. I installed this 1.4.4.17 update but ns-slapd now hits a deadlock and becomes unresponsive.
Regards, Mark
-- Kees
Mark
Regards Pierre, On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <keesb@ghs.com <mailto:keesb@ghs.com>> wrote: On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog> >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog <https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html <https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html> >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure> -- -- 389 Directory Server Development Team
-- Directory Server Development Team _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
_______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
-- Directory Server Development Team _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org <mailto:389-users@lists.fedoraproject.org> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org <mailto:389-users-leave@lists.fedoraproject.org> Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ <https://docs.fedoraproject.org/en-US/project/code-of-conduct/> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines <https://fedoraproject.org/wiki/Mailing_list_guidelines> List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org <https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org> Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure <https://pagure.io/fedora-infrastructure>
-- Viktor
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 10/18/21 5:18 AM, Kees Bakker wrote:
On 12-10-2021 14:29, Kees Bakker wrote:
On 11-10-2021 20:58, Viktor Ashirov wrote:
On Mon, Oct 11, 2021 at 8:33 PM Mark Reynolds mreynolds@redhat.com wrote:
On 10/11/21 2:29 PM, Kees Bakker wrote:
On 11-08-2021 15:21, Mark Reynolds wrote:
On 8/11/21 9:09 AM, Pierre Rogier wrote:
Hi, I suspect that Kees discovered a new bug ... As the retro changelog is persistent across reboot, it is normal that their timestamp are absolute, so IMHO there is no way relative time should be used/seen in trimming computation. (in other words cur_time is wrong. ) My suspicion is that we got a regression when the slapi_eq timers where rewritten to use relative time instead of absolute time) (cur_time is provided by slapi_eq API and is now relative ( it should rather be initialized to time(0) )
Looking at the code we are mixing the use of the same monotonic time for the interval (correct) and the age of an entry (not correct).
The funny thing is that nobody have seen that retrocl trimming was broken for so long ...
That's because it silently just builds up and no one notices :-( I opened a ticket for this: https://github.com/389ds/389-ds-base/issues/4869
This has been fixed a while ago. And it was cherry-picked in several release branches. For me that would be 1.4.4. There is a tag 1.4.4.17 but it hasn't been published. Does anyone know how the release process goes with epel - Modular? That's were I'm expecting it to show up. There hasn't been any update in weeks, as far as I can see.
The module build was just done on October 9th. I think it takes a little more time for that to appear in EPEL ...
Here's the bodhi update: https://bodhi.fedoraproject.org/updates/FEDORA-EPEL-MODULAR-2021-d15385ef0a
Module is currently in testing. You can update to it using: # dnf --enablerepo='epel-testing-modular' update 389-ds-base cockpit-389-ds HTH
Thanks. I'll give that a try.
Snif. I installed this 1.4.4.17 update but ns-slapd now hits a deadlock and becomes unresponsive.
How are you reproducing this deadlock? Please provide some stack traces:
https://www.port389.org/docs/389ds/FAQ/faq.html#sts=Debugging%C2%A0Hangs
Thanks,
Mark
Regards, Mark
-- Kees
Mark
Regards Pierre, On Wed, Aug 11, 2021 at 11:50 AM Kees Bakker <keesb@ghs.com> wrote: On 09-08-2021 20:23, Mark Reynolds wrote: > On 8/9/21 1:47 PM, Kees Bakker wrote: >> On 09-08-2021 19:25, Mark Reynolds wrote: >>> On 8/9/21 1:16 PM, Kees Bakker wrote: >>>> On 09-08-2021 18:43, Mark Reynolds wrote: >>>>> On 8/9/21 11:20 AM, Kees Bakker wrote: >>>>>> On 09-08-2021 16:00, Mark Reynolds wrote: >>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> When my dirsrv was trying to compact the databases I was getting >>>>>>>> this >>>>>>>> error >>>>>>>> >>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases ... >>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: userRoot >>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact userRoot - 417 pages freed >>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: ipaca >>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE - bdb_compact - >>>>>>>> compactdb: compact ipaca - 419 pages freed >>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting DB start: changelog >>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE - >>>>>>>> NSMMReplicationPlugin - changelog program - cl5CompactDBs - >>>>>>>> compacting >>>>>>>> replication changelogs... >>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR - bdb_compact - >>>>>>>> compactdb: failed to compact changelog; db error - 12 Cannot >>>>>>>> allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb - BDB2055 Lock >>>>>>>> table is out of available lock entries >>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR - >>>>>>>> NSMMReplicationPlugin - >>>>>>>> changelog program - cl5CompactDBs - Failed to compact >>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12 Cannot allocate >>>>>>>> memory >>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE - bdb_compact - >>>>>>>> Compacting databases finished. >>>>>>>> >>>>>>>> There are about 402k entries in cn=changelog. >>>>>>>> >>>>>>>> I have a few questions >>>>>>>> 1) is it normal to have so many entries in cn=changelog? On another >>>>>>>> replica I have almost 3M entries Isn't this cleaned up? >>>>>>>> 2) the number of locks is 50000 (there are two config items). >>>>>>>> Should I >>>>>>>> increase that number? If so, increase to what? >>>>>>>> 3) is there maybe something else going on, causing the >>>>>>>> exhaustion of >>>>>>>> the locks? >>>>>>> >>>>>>> Ok, so by default there is no changelog trimming enabled. So the >>>>>>> changelog will grow without bounds, which is bad. >>>>>> >>>>>> How much of this [1] applies? Indeed it says "By default the >>>>>> Replication Changelog does not use any trimming by default." >>>>>> So, how is the trimming actually enabled? I have these entries >>>>>> >>>>>> dn: cn=changelog5,cn=config >>>>>> cn: changelog5 >>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb >>>>>> nsslapd-changelogmaxage: 30d >>>>>> objectClass: top >>>>>> objectClass: extensibleobject >>>>> >>>>> So trimming is set, but it's set to 30 days (30d), that >>>>> could/should be >>>>> shortened to "7d". >>>> >>>> Somehow I doubt that this will do anything. >>> >>> Ahhh, I thought this was the Replication Changelog, but it is the >>> "Retro" Changelog. Ok so check this out: >>> >>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/managing_replication-using_the_retro_changelog_plug_in#Using_the_Retro_Changelog_Plug_in-Trimming_the_Retro_Changelog >>> >>> >>> Or update config directly using ldapmodify: >>> >>> # ldapmodify -D ... >>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >>> changetype: modify >>> replace: nsslapd-changelogmaxage: 7d >>> nsslapd-changelogmaxage: 7d >> >> (( Typo? The replace line is without the value )) > yes, sorry... >> >> Hmm, that maxage was already at 2 days > > Ok, perhaps you are running into a known bug where indexes created by > plugins are not enabled. Or might be a config issue? > > Try this first. Set in the retro changelog entry: > > nsslapd-changelog-trim-interval: 300 <--- 5 minutes (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's default is 300.) > > Restart the server to make sure it was applied. You can enable "plugin" > logging to see what it does. > > nsslapd-errorlog-level: 65536 > > wait over 5 minutes, then grep the errors log for "DSRetroclPlugin". > This will tell us what the plugin is doing when it goes to trim the > database. > > IMPORTANT, disable plugin logging immediately after you get your data. > Set the errorlog level to 0 to go back to default logging. > > If the logging shows it's not finding any candidates then it might be > the indexing issue I mentioned above. In that case you just need to > reindex the retro changelog: Before changing anything I enabled Plug-in debugging which revealed something interesting [root@linge ~]# grep trim /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108 [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702600 [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin - cltrim: ldrc=0, first_time=1622032141, cur_time=3702900 [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin - retrocl_housekeeping - changelog does not need to be trimmed This shows that the interval is 300 (even if nsslapd-changelogtrim-interval is not explicitly set in cn=changelog5,cn=config Look at cur_time. That is the "uptime" of the system, not UTC. [root@linge ~]# uptime 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16 Next, assuming this code from retrocl_trim.c is making the decision to trim first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc); slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME, "cltrim: ldrc=%d, first_time=%ld, cur_time=%ld\n", ldrc, first_time, cur_time); if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L && first_time + ts.ts_c_max_age < cur_time) { must_trim = 1; } There is a mixup of absolute (first_time) and relative (cur_time) time. Which of the two is wrong? > > # dsctl slapd-YOUR_INSTANCE stop > > # dsctl slapd-YOUR_INSTANCE db2index changelog > > # dsctl slapd-YOUR_INSTANCE start > > Then wait 5 minutes and see if anything was trimmed. > > Mark > >> >> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config >> cn: Retro Changelog Plugin >> nsslapd-attribute: nsuniqueid:targetUniqueId >> nsslapd-changelogmaxage: 2d >> nsslapd-include-suffix: cn=dns,dc=example,dc=com >> nsslapd-plugin-depends-on-named: Class of Service >> nsslapd-plugin-depends-on-type: database >> nsslapd-pluginDescription: Retrocl Plugin >> nsslapd-pluginEnabled: on >> nsslapd-pluginId: retrocl >> nsslapd-pluginInitfunc: retrocl_plugin_init >> nsslapd-pluginPath: libretrocl-plugin >> nsslapd-pluginType: object >> nsslapd-pluginVendor: 389 Project >> nsslapd-pluginVersion: 1.4.3.231 >> nsslapd-pluginbetxn: on >> nsslapd-pluginprecedence: 25 >> objectClass: top >> objectClass: nsSlapdPlugin >> objectClass: extensibleObject >> >>> >>> >>> HTH, >>> >>> Mark >>> >>>> >>>> The first entry in cn=changelog is more than two months old. (May 26th >>>> is about >>>> the time I installed this server as a FreeIPA replica.) >>>> >>>> dn: changenumber=1,cn=changelog >>>> objectClass: top >>>> objectClass: changelogentry >>>> objectClass: extensibleObject >>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4 >>>> changeNumber: 1 >>>> targetDn: >>>> idnsname=_ldap._tcp,idnsname=example.com <http://example.com>.,cn=dns,dc=example,dc=com >>>> changeTime: 20210526122901Z >>>> changeType: modify >>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA== >>>> >>>> Is there a way to see trimming in action? I'm afraid it is not doing >>>> anything. >> >> What diagnostic can I enable to see the trimming in action? >> >>>> >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> I recommend setting up the changelog max age to 7 days: >>>>>>> >>>>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/administration_guide/trimming_the_replication_changelog >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Once that the trimming cleans up the changelog the database >>>>>>> compaction >>>>>>> should work without issue. You can also increase the database >>>>>>> locks to >>>>>>> 1 million (that does require a restart of the server to take >>>>>>> effect). >>>>>> >>>>>> Let's hope that 1 million is enough :-) >>>>> >>>>> It might not be, you can keep bumping it up if needed. There is no >>>>> limit, or negative impact on db performance. >>>>> >>>>> Mark >>>>> >>>>>> >>>>>>> >>>>>>> HTH, >>>>>>> >>>>>>> Mark >>>>>>> >>>>>> [1] >>>>>> https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html >>>>>> >>>>>> >>>>> >>>>> -- >>>>> Directory Server Development Team >>>>> >>>> _______________________________________________ >>>> 389-users mailing list -- 389-users@lists.fedoraproject.org >>>> To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org >>>> Fedora Code of Conduct: >>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/ >>>> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines >>>> List Archives: >>>> https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org >>>> >>>> Do not reply to spam on the list, report it: >>>> https://pagure.io/fedora-infrastructure >>> >>> -- >>> Directory Server Development Team >>> >> > -- > Directory Server Development Team > _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure -- -- 389 Directory Server Development Team
-- Directory Server Development Team _______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
_______________________________________________ 389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
-- Directory Server Development Team _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
-- Viktor
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
389-users mailing list --389-users@lists.fedoraproject.org To unsubscribe send an email to389-users-leave@lists.fedoraproject.org Fedora Code of Conduct:https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines:https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives:https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.... Do not reply to spam on the list, report it:https://pagure.io/fedora-infrastructure
389-users@lists.fedoraproject.org