Trevor,
I have not seen this before, but I also have not seen what happens when
you add invalid schema.
To try to get the server back up and running, remove the
/var/lib/dirsrv/slapd-YOUR_INSTANCE/db/__db.00* files. First make sure the
ns-slapd process is not running (kill it if you have to), then remove
those files and try starting the server again.
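A minimal Python sketch of the steps above, in case it helps to script them. The function names and the dry_run/is_running parameters are made up for illustration, the check for a running ns-slapd assumes pgrep is installed, and the db directory is whatever matches your instance. Run it with dry_run=True first to see what would be deleted.

```python
import glob
import os
import subprocess


def find_region_files(db_dir):
    """Return the sorted __db.00* files found directly under db_dir."""
    return sorted(glob.glob(os.path.join(db_dir, "__db.00*")))


def ns_slapd_running():
    """Return True if pgrep finds an ns-slapd process (assumes pgrep exists)."""
    result = subprocess.run(
        ["pgrep", "-x", "ns-slapd"], stdout=subprocess.DEVNULL
    )
    return result.returncode == 0


def remove_region_files(db_dir, dry_run=True, is_running=ns_slapd_running):
    """List the region files in db_dir and, unless dry_run, delete them.

    Refuses to delete anything while ns-slapd still appears to be running.
    """
    files = find_region_files(db_dir)
    if dry_run:
        return files
    if is_running():
        raise RuntimeError("ns-slapd is still running; stop it first")
    for path in files:
        os.remove(path)
    return files


if __name__ == "__main__":
    # Placeholder path: substitute your own instance name.
    for path in remove_region_files("/var/lib/dirsrv/slapd-YOUR_INSTANCE/db"):
        print("would remove", path)
```

Only the __db.00* files are matched; everything else in the db directory is left alone.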
HTH,
Mark
On 6/25/20 1:42 PM, Fong, Trevor wrote:
Hi There,
We tried to add a new schema dynamically using
/usr/lib64/dirsrv/slapd-eldapp1/schema-reload.pl
Unfortunately, (and unknown to us at the time) the objectClass
definition misspelt a couple of the attribute names.
The schema reload process should have picked that up and refused it,
but it didn't and so proceeded to update entries using the new schema.
That's when we started getting errors like the following in the error log:
[19/Jun/2020:10:28:08.390882389 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.399523527 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.404890880 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.430284251 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.466371449 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.495859651 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.522007224 -0700] - ERR - libdb - BDB0151 fsync:
Input/output error
[19/Jun/2020:10:28:08.546930415 -0700] - ERR - libdb - BDB4519
txn_checkpoint: failed to flush the buffer cache: Input/output error
[19/Jun/2020:10:28:08.569781853 -0700] - CRIT - checkpoint_threadmain
- Serious Error---Failed to checkpoint database, err=5 (Input/output
error)
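To get a quick picture of how many errors of each kind are in the log, something like the Python sketch below could be used. It assumes each entry sits on a single line in the real error log (the entries above are wrapped by email) and that entries follow the "[timestamp] - LEVEL - component - message" shape seen in the excerpt. The function name and regular expression are illustrative only.

```python
import re
from collections import Counter

# Matches entries shaped like: [timestamp] - LEVEL - component - message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] - (?P<level>[A-Z]+) - (?P<component>\S+) - (?P<msg>.*)$"
)


def tally_errors(lines):
    """Count log entries per (level, component); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[(match.group("level"), match.group("component"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[19/Jun/2020:10:28:08.390882389 -0700] - ERR - libdb - BDB0151 fsync: Input/output error",
        "[19/Jun/2020:10:28:08.546930415 -0700] - ERR - libdb - BDB4519 txn_checkpoint: failed to flush the buffer cache: Input/output error",
        "[19/Jun/2020:10:28:08.569781853 -0700] - CRIT - checkpoint_threadmain - Serious Error---Failed to checkpoint database, err=5 (Input/output error)",
    ]
    for key, count in tally_errors(sample).items():
        print(key, count)
```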
I tried restarting dirsrv and that's when it started giving errors
about the unknown (misspelt) attributes in the new objectClass.
I fixed those errors in the schema and restarted dirsrv.
I saw the following message in the error log:
NOTICE - dblayer_start - Detected Disorderly Shutdown last time
Directory Server was running, recovering database.
There was no further log, but the CPU utilization for ns-slapd was at
99.9% so I just let it run over night hoping that it wasn't stuck in a
loop.
But there was no improvement the next morning, so I ordered a RAM
increase from 4 GB to 16 GB, hoping that would fix it. I let it run for a
while with no indication of progress.
I also tried to run db2ldif to try to dump the db to an ldif file, but
got the same "recovering database" message. That's where it is now -
I'll let it run for a few hours and hope it does something.
Would anyone be able to offer any further advice?
Is there any way to see how it's getting along with the database recovery?
Is this db well and truly hosed?
Unfortunately this system was spec'd for development so no backups
were running so recovery from backup is not an option.
Thanks,
Trev
--
389 Directory Server Development Team