Good afternoon
Our ns-slapd crashed earlier today, with a segfault in libback-ldbm.so while the system was running a bdb_db_compact_one_db action.
Is there any way to track down or diagnose what might have caused the segfault? Some type of DB integrity check, perhaps?
Nelson
Hi,
By any chance, do you know if the crash (SIGSEGV) dumped a core? If so, you could install the debuginfo RPMs and analyze the core with gdb to find the reason for the crash.
I am not sure the crash is due to DB corruption or breakage, but the crash will clearly trigger a recovery. Is the suffix (userRoot) replicated? Is it a supplier or a hub? I have the feeling it crashed while compacting the changelog. bdb_db_compact_one_db is possibly missing a test that the 'db' (changelog) exists before dereferencing it.
regards thierry
On 3/18/21 8:08 AM, Nelson Bartley wrote:
Good afternoon
Our ns-slapd crashed earlier today, with a segfault in libback-ldbm.so while the system was running a bdb_db_compact_one_db action.
Is there any way to track down or diagnose what might have caused the segfault? Some type of DB integrity check, perhaps?
Nelson
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
On 18 Mar 2021, at 18:46, thierry bordaz tbordaz@redhat.com wrote:
Hi,
By any chance, do you know if the crash (SIGSEGV) dumped a core? If so, you could install the debuginfo RPMs and analyze the core with gdb to find the reason for the crash.
I am not sure the crash is due to DB corruption or breakage, but the crash will clearly trigger a recovery. Is the suffix (userRoot) replicated? Is it a supplier or a hub? I have the feeling it crashed while compacting the changelog. bdb_db_compact_one_db is possibly missing a test that the 'db' (changelog) exists before dereferencing it.
It could be worth adding a log message and an ASSERT here so that we can catch this case if it is happening.
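To make the suggestion concrete, here is a rough sketch of the kind of guard being discussed: check the handle, log, and optionally assert in debug builds before touching it. Everything in it is hypothetical (the sketch_db type, the sketch_compact_one_db function, the SKETCH_* return codes and the SKETCH_DEBUG_ASSERT macro are invented for illustration). It is not the actual bdb_db_compact_one_db code, and it has not been confirmed that a missing check is what caused this crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a database handle; NOT the real Berkeley DB type. */
typedef struct sketch_db {
    const char *name;
    int compact_calls; /* counts how often "compaction" ran, for illustration */
} sketch_db;

/* Hypothetical return codes used only by this sketch. */
enum { SKETCH_OK = 0, SKETCH_NO_DB = -1 };

/*
 * Hypothetical compaction step: refuse to dereference a missing handle.
 * A missing handle is logged and reported to the caller instead of crashing.
 */
static int
sketch_compact_one_db(sketch_db *db, const char *label)
{
    if (db == NULL) {
        /* Log first, so the case is visible even in builds without asserts. */
        fprintf(stderr, "compact: no database handle for '%s', skipping\n",
                label ? label : "(unknown)");
#ifdef SKETCH_DEBUG_ASSERT
        /* In a debug build (assumed opt-in macro), stop here for inspection. */
        assert(db != NULL);
#endif
        return SKETCH_NO_DB;
    }

    /* Only reached with a valid handle; real compaction work would go here. */
    db->compact_calls++;
    return SKETCH_OK;
}
```

With a valid handle the function returns SKETCH_OK and increments the counter. With a NULL handle it logs to stderr and returns SKETCH_NO_DB, unless SKETCH_DEBUG_ASSERT is defined, in which case the assert fires so the case can be caught under a debugger.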
regards thierry
Sincerely,
William Brown
Senior Software Engineer, 389 Directory Server
SUSE Labs, Australia