On 11-08-2021 11:49, Kees Bakker wrote:
> On 09-08-2021 20:23, Mark Reynolds wrote:
>> On 8/9/21 1:47 PM, Kees Bakker wrote:
>>> On 09-08-2021 19:25, Mark Reynolds wrote:
>>>> On 8/9/21 1:16 PM, Kees Bakker wrote:
>>>>> On 09-08-2021 18:43, Mark Reynolds wrote:
>>>>>> On 8/9/21 11:20 AM, Kees Bakker wrote:
>>>>>>> On 09-08-2021 16:00, Mark Reynolds wrote:
>>>>>>>> On 8/9/21 8:09 AM, Kees Bakker wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> When my dirsrv was trying to compact the databases I
was getting
>>>>>>>>> this
>>>>>>>>> error
>>>>>>>>>
>>>>>>>>> [07/Aug/2021:23:59:02.715984489 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> Compacting databases ...
>>>>>>>>> [07/Aug/2021:23:59:02.765932397 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> Compacting DB start: userRoot
>>>>>>>>> [07/Aug/2021:23:59:03.518175414 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> compactdb: compact userRoot - 417 pages freed
>>>>>>>>> [07/Aug/2021:23:59:03.576427786 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> Compacting DB start: ipaca
>>>>>>>>> [07/Aug/2021:23:59:03.659941533 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> compactdb: compact ipaca - 419 pages freed
>>>>>>>>> [07/Aug/2021:23:59:03.718445310 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> Compacting DB start: changelog
>>>>>>>>> [08/Aug/2021:00:00:40.807571334 +0200] - NOTICE -
>>>>>>>>> NSMMReplicationPlugin - changelog program -
cl5CompactDBs -
>>>>>>>>> compacting
>>>>>>>>> replication changelogs...
>>>>>>>>> [08/Aug/2021:00:00:54.309357211 +0200] - ERR - libdb
-
>>>>>>>>> BDB2055 Lock
>>>>>>>>> table is out of available lock entries
>>>>>>>>> [08/Aug/2021:00:00:54.726504736 +0200] - ERR -
bdb_compact -
>>>>>>>>> compactdb: failed to compact changelog; db error - 12
Cannot
>>>>>>>>> allocate
>>>>>>>>> memory
>>>>>>>>> [08/Aug/2021:00:00:54.801571421 +0200] - ERR - libdb
-
>>>>>>>>> BDB2055 Lock
>>>>>>>>> table is out of available lock entries
>>>>>>>>> [08/Aug/2021:00:00:54.876618702 +0200] - ERR -
>>>>>>>>> NSMMReplicationPlugin -
>>>>>>>>> changelog program - cl5CompactDBs - Failed to
compact
>>>>>>>>> a797bb0b-be1d11eb-88c0b677-613aa2ad; db error - 12
Cannot
>>>>>>>>> allocate
>>>>>>>>> memory
>>>>>>>>> [08/Aug/2021:00:00:57.253006449 +0200] - NOTICE -
bdb_compact -
>>>>>>>>> Compacting databases finished.
>>>>>>>>>
>>>>>>>>> There are about 402k entries in cn=changelog.
>>>>>>>>>
>>>>>>>>> I have a few questions
>>>>>>>>> 1) is it normal to have so many entries in
cn=changelog? On
>>>>>>>>> another
>>>>>>>>> replica I have almost 3M entries Isn't this
cleaned up?
>>>>>>>>> 2) the number of locks is 50000 (there are two config
items).
>>>>>>>>> Should I
>>>>>>>>> increase that number? If so, increase to what?
>>>>>>>>> 3) is there maybe something else going on, causing
the
>>>>>>>>> exhaustion of
>>>>>>>>> the locks?
>>>>>>>>
>>>>>>>> Ok, so by default there is no changelog trimming enabled.
So the
>>>>>>>> changelog will grow without bounds, which is bad.
>>>>>>>
>>>>>>> How much of this [1] applies? Indeed it says "By default
the
>>>>>>> Replication Changelog does not use any trimming by
default."
>>>>>>> So, how is the trimming actually enabled? I have these
entries
>>>>>>>
>>>>>>> dn: cn=changelog5,cn=config
>>>>>>> cn: changelog5
>>>>>>> nsslapd-changelogdir: /var/lib/dirsrv/slapd-EXAMPLE-COM/cldb
>>>>>>> nsslapd-changelogmaxage: 30d
>>>>>>> objectClass: top
>>>>>>> objectClass: extensibleobject
>>>>>>
>>>>>> So trimming is set, but it's set to 30 days (30d), that
>>>>>> could/should be
>>>>>> shortened to "7d".
>>>>>
>>>>> Somehow I doubt that this will do anything.
>>>>
>>>> Ahhh, I thought this was the Replication Changelog, but it is the
>>>> "Retro" Changelog. Ok so check this out:
>>>>
>>>>
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11...
>>>>
>>>>
>>>>
>>>> Or update config directly using ldapmodify:
>>>>
>>>> # ldapmodify -D ...
>>>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
>>>> changetype: modify
>>>> replace: nsslapd-changelogmaxage: 7d
>>>> nsslapd-changelogmaxage: 7d
>>>
>>> (( Typo? The replace line is without the value ))
>> yes, sorry...
>>>
>>> Hmm, that maxage was already at 2 days
>>
>> Ok, perhaps you are running into a known bug where indexes created by
>> plugins are not enabled. Or might be a config issue?
>>
>> Try this first. Set in the retro changelog entry:
>>
>> nsslapd-changelog-trim-interval: 300 <--- 5 minutes
> (Sorry to be picky, but it is nsslapd-changelogtrim-interval. It's
> default is 300.)
>>
>> Restart the server to make sure it was applied. You can enable
>> "plugin"
>> logging to see what it does.
>>
>> nsslapd-errorlog-level: 65536
>>
>> wait over 5 minutes, then grep the errors log for "DSRetroclPlugin".
>> This will tell us what the plugin is doing when it goes to trim the
>> database.
>>
>> IMPORTANT, disable plugin logging immediately after you get your data.
>> Set the errorlog level to 0 to go back to default logging.
>>
>> If the logging shows it's not finding any candidates then it might be
>> the indexing issue I mentioned above. In that case you just need to
>> reindex the retro changelog:
>
> Before changing anything I enabled Plug-in debugging which revealed
> something interesting
>
> [root@linge ~]# grep trim
> /var/log/dirsrv/slapd-GHS-NL/errors-plugin-debug-20210811-1108
> [11/Aug/2021:11:01:49.792450548 +0200] - DEBUG - DSRetroclPlugin -
> cltrim: ldrc=0, first_time=1622032141, cur_time=3702600
> [11/Aug/2021:11:01:49.842339467 +0200] - DEBUG - DSRetroclPlugin -
> retrocl_housekeeping - changelog does not need to be trimmed
> [11/Aug/2021:11:06:49.953401407 +0200] - DEBUG - DSRetroclPlugin -
> cltrim: ldrc=0, first_time=1622032141, cur_time=3702900
> [11/Aug/2021:11:06:50.012249703 +0200] - DEBUG - DSRetroclPlugin -
> retrocl_housekeeping - changelog does not need to be trimmed
>
> This shows that the interval is 300 (even if
> nsslapd-changelogtrim-interval is not explicitly set in
> cn=changelog5,cn=config
>
> Look at cur_time. That is the "uptime" of the system, not UTC.
>
> [root@linge ~]# uptime
> 11:21:42 up 42 days, 20:49, 2 users, load average: 0.15, 0.14, 0.16
>
> Next, assuming this code from retrocl_trim.c is making the decision
> to trim
>
> first_time = retrocl_getchangetime(SLAPI_SEQ_FIRST, &ldrc);
> slapi_log_err(SLAPI_LOG_PLUGIN, RETROCL_PLUGIN_NAME,
> "cltrim: ldrc=%d, first_time=%ld,
> cur_time=%ld\n",
> ldrc, first_time, cur_time);
> if (LDAP_SUCCESS == ldrc && first_time > (time_t)0L
&&
> first_time + ts.ts_c_max_age < cur_time) {
> must_trim = 1;
> }
>
> There is a mixup of absolute (first_time) and relative (cur_time)
> time. Which of the two is wrong?
This is probably introduced in commit a5029c8091 for Pull Request [1].
The queue function is now called with a relative time.
[1]
https://github.com/389ds/389-ds-base/pull/4450
[2]
https://github.com/389ds/389-ds-base/issues/4384
Yes I think you found a bug. Are you building the server yourself? Can
I send you a code patch to try?
Mark
>
>>
>> # dsctl slapd-YOUR_INSTANCE stop
>>
>> # dsctl slapd-YOUR_INSTANCE db2index changelog
>>
>> # dsctl slapd-YOUR_INSTANCE start
>>
>> Then wait 5 minutes and see if anything was trimmed.
>>
>> Mark
>>
>>>
>>> dn: cn=Retro Changelog Plugin,cn=plugins,cn=config
>>> cn: Retro Changelog Plugin
>>> nsslapd-attribute: nsuniqueid:targetUniqueId
>>> nsslapd-changelogmaxage: 2d
>>> nsslapd-include-suffix: cn=dns,dc=example,dc=com
>>> nsslapd-plugin-depends-on-named: Class of Service
>>> nsslapd-plugin-depends-on-type: database
>>> nsslapd-pluginDescription: Retrocl Plugin
>>> nsslapd-pluginEnabled: on
>>> nsslapd-pluginId: retrocl
>>> nsslapd-pluginInitfunc: retrocl_plugin_init
>>> nsslapd-pluginPath: libretrocl-plugin
>>> nsslapd-pluginType: object
>>> nsslapd-pluginVendor: 389 Project
>>> nsslapd-pluginVersion: 1.4.3.231
>>> nsslapd-pluginbetxn: on
>>> nsslapd-pluginprecedence: 25
>>> objectClass: top
>>> objectClass: nsSlapdPlugin
>>> objectClass: extensibleObject
>>>
>>>>
>>>>
>>>> HTH,
>>>>
>>>> Mark
>>>>
>>>>>
>>>>> The first entry in cn=changelog is more than two months old. (May
>>>>> 26th
>>>>> is about
>>>>> the time I installed this server as a FreeIPA replica.)
>>>>>
>>>>> dn: changenumber=1,cn=changelog
>>>>> objectClass: top
>>>>> objectClass: changelogentry
>>>>> objectClass: extensibleObject
>>>>> targetuniqueid: 1e89ebcb-e20111e8-8820f96e-fc0c60a4
>>>>> changeNumber: 1
>>>>> targetDn:
>>>>> idnsname=_ldap._tcp,idnsname=example.com.,cn=dns,dc=example,dc=com
>>>>> changeTime: 20210526122901Z
>>>>> changeType: modify
>>>>> changes:: YWRkOiBzUlZSZWNvc...gplbnRyeXVzbjogMTEyCi0KAA==
>>>>>
>>>>> Is there a way to see trimming in action? I'm afraid it is not
doing
>>>>> anything.
>>>
>>> What diagnostic can I enable to see the trimming in action?
>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> I recommend setting up the changelog max age to 7 days:
>>>>>>>>
>>>>>>>>
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11...
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Once that the trimming cleans up the changelog the
database
>>>>>>>> compaction
>>>>>>>> should work without issue. You can also increase the
database
>>>>>>>> locks to
>>>>>>>> 1 million (that does require a restart of the server to
take
>>>>>>>> effect).
>>>>>>>
>>>>>>> Let's hope that 1 million is enough :-)
>>>>>>
>>>>>> It might not be, you can keep bumping it up if needed. There is
no
>>>>>> limit, or negative impact on db performance.
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> HTH,
>>>>>>>>
>>>>>>>> Mark
>>>>>>>>
>>>>>>> [1]
>>>>>>>
https://directory.fedoraproject.org/docs/389ds/FAQ/changelog-trimming.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Directory Server Development Team
>>>>>>
>>>>> _______________________________________________
>>>>> 389-users mailing list -- 389-users(a)lists.fedoraproject.org
>>>>> To unsubscribe send an email to
>>>>> 389-users-leave(a)lists.fedoraproject.org
>>>>> Fedora Code of Conduct:
>>>>>
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
>>>>> List Guidelines:
>>>>>
https://fedoraproject.org/wiki/Mailing_list_guidelines
>>>>> List Archives:
>>>>>
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproje...
>>>>>
>>>>>
>>>>> Do not reply to spam on the list, report it:
>>>>>
https://pagure.io/fedora-infrastructure
>>>>
>>>> --
>>>> Directory Server Development Team
>>>>
>>>
>> --
>> Directory Server Development Team
>>
> _______________________________________________
> 389-users mailing list -- 389-users(a)lists.fedoraproject.org
> To unsubscribe send an email to 389-users-leave(a)lists.fedoraproject.org
> Fedora Code of Conduct:
>
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
>
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproje...
> Do not reply to spam on the list, report it:
>
https://pagure.io/fedora-infrastructure
_______________________________________________
389-users mailing list -- 389-users(a)lists.fedoraproject.org
To unsubscribe send an email to 389-users-leave(a)lists.fedoraproject.org
Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines:
https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproje...
Do not reply to spam on the list, report it:
https://pagure.io/fedora-infrastructure