Hi there, this is 389-ds-base-1.3.5.10-11 (part of ipa-server-4.4.0-12.0.1) and suddenly daily backup has started to fail with messages:
2019-01-28T04:10:04Z INFO Backing up ipaca in REALM-COM to LDIF 2019-01-28T04:10:04Z INFO Waiting for LDIF to finish 2019-01-28T04:10:05Z DEBUG File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 171, in execute return_value = self.run() File "/usr/lib/python2.7/site-packages/ipaserver/install/ipa_backup.py", line 300, in run self.db2ldif(instance, 'ipaca', online=options.online) File "/usr/lib/python2.7/site-packages/ipaserver/install/ipa_backup.py", line 425, in db2ldif shutil.move(ldiffile, os.path.join(self.dir, ldifname)) File "/usr/lib64/python2.7/shutil.py", line 301, in move copy2(src, real_dst) File "/usr/lib64/python2.7/shutil.py", line 130, in copy2 copyfile(src, dst) File "/usr/lib64/python2.7/shutil.py", line 82, in copyfile with open(src, 'rb') as fsrc: 2019-01-28T04:10:05Z DEBUG The ipa-backup command failed, exception: IOError: [Errno 2] No such file or directory: u'/var/ lib/dirsrv/slapd-REALM-COM/ldif/REALM-COM-ipaca.ldif' 2019-01-28T04:10:05Z ERROR [Errno 2] No such file or directory: u'/var/lib/dirsrv/slapd-REALM-COM/ldif/REALM-COM-ipaca.ldif' 2019-01-28T04:10:05Z ERROR The ipa-backup command failed. See /var/log/ipabackup.log for more information
And service start fails with messages:
[02/Feb/2019:22:47:37.889779410 -0800] 389-Directory/1.3.5.10 B2016.309.1527 starting up [02/Feb/2019:22:47:37.906422534 -0800] default_mr_indexer_create: warning - plugin [caseIgnoreIA5Match] does not handle caseExactIA5Match [02/Feb/2019:22:47:37.921288555 -0800] WARNING: userRoot: entry cache size 10485760 B is less than db size 16932864 B; We recommend to increase the entry cache size nsslapd-cachememsize. [02/Feb/2019:22:47:37.921943984 -0800] WARNING: ipaca: entry cache size 10485760 B is less than db size 1757741056 B; We recommend to increase the entry cache size nsslapd-cachememsize. [02/Feb/2019:22:47:37.922701343 -0800] WARNING: changelog: entry cache size 2097152 B is less than db size 82935808 B; We recommend to increase the entry cache size nsslapd-cachememsize. [02/Feb/2019:22:47:37.925215059 -0800] Detected Disorderly Shutdown last time Directory Server was running, recovering database. [02/Feb/2019:22:47:37.926177620 -0800] libdb: BDB1546 unable to join the environment
thanks in advance for any help, Zarko
[02/Feb/2019:22:47:37.921288555 -0800] WARNING: userRoot: entry cache size 10485760 B is less than db size 16932864 B; We recommend to increase the entry cache size nsslapd-cachememsize. [02/Feb/2019:22:47:37.921943984 -0800] WARNING: ipaca: entry cache size 10485760 B is less than db size 1757741056 B; We recommend to increase the entry cache size nsslapd-cachememsize. [02/Feb/2019:22:47:37.922701343 -0800] WARNING: changelog: entry cache size 2097152 B is less than db size 82935808 B; We recommend to increase the entry cache size nsslapd-cachememsize.
These cache sizes are really low, so consider autotuning in the future to improve your performance.
[02/Feb/2019:22:47:37.925215059 -0800] Detected Disorderly Shutdown last time Directory Server was running, recovering database. [02/Feb/2019:22:47:37.926177620 -0800] libdb: BDB1546 unable to join the environment
https://access.redhat.com/solutions/2265101
This has a solution, but sadly I don’t have a red hat account: perhaps one of our redhat maintainers can look an share the answer.
If this is urgent, I would advise restoring a backup, and doing a re-init of the server (I think IPA should have tools for this)
Another option is to contact the IPA list about the issue.
I hope this helps,
thanks in advance for any help, Zarko _______________________________________________ 389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
— Sincerely,
William Brown Software Engineer, 389 Directory Server SUSE Labs
Thanks for reply William, I believe I restored db with command "ipa-restore -d --data /path/ipa-data-2018-12-30-20-12-00" but 389ds still fails to start .
Restore final logs are:
[03/Feb/2019:22:49:54.229404539 -0800] import ipaca: Closing files... [03/Feb/2019:22:50:21.082241726 -0800] All database threads now stopped [03/Feb/2019:22:50:21.091829357 -0800] import ipaca: Import complete. Processed 207184 entries in 148 seconds. (1399.89 entries/sec)
2019-02-04T06:50:21Z DEBUG stderr= 2019-02-04T06:50:21Z INFO Starting Directory Server 2019-02-04T06:50:21Z DEBUG Starting external process 2019-02-04T06:50:21Z DEBUG args=/bin/systemctl start dirsrv@X-COM.service 2019-02-04T06:51:52Z DEBUG Process finished, return code=1 2019-02-04T06:51:52Z DEBUG File "/usr/lib/python2.7/site-packages/ipapython/admintool.py", line 171, in execute return_value = self.run() File "/usr/lib/python2.7/site-packages/ipaserver/install/ipa_restore.py", line 398, in run dirsrv.start(capture_output=False) File "/usr/lib/python2.7/site-packages/ipaplatform/redhat/services.py", line 157, in start instance_name, capture_output=capture_output, wait=wait) File "/usr/lib/python2.7/site-packages/ipaplatform/base/services.py", line 285, in start skip_output=not capture_output) File "/usr/lib/python2.7/site-packages/ipapython/ipautil.py", line 494, in run raise CalledProcessError(p.returncode, arg_string, str(output)) 2019-02-04T06:51:52Z DEBUG The ipa-restore command failed, exception: CalledProcessError: Command '/bin/systemctl start dirsrv@X-COM.service' returned non-zero exit status 1 2019-02-04T06:51:52Z ERROR Command '/bin/systemctl start dirsrv@X-COM.service' returned non-zero exit status 1 2019-02-04T06:51:52Z ERROR The ipa-restore command failed. See /var/log/iparestore.log for more information
And 389ds status is
# systemctl status -l dirsrv@x-COM.service dirsrv@x-COM.service - 389 Directory Server x-COM. Loaded: loaded (/usr/lib/systemd/system/dirsrv@.service; enabled; vendor preset: disabled) Active: failed (Result: timeout) since Sun 2019-02-03 22:56:33 PST; 5min ago Process: 3533 ExecStart=/usr/sbin/ns-slapd -D /etc/dirsrv/slapd-%i -i /var/run/dirsrv/slapd-%i.pid (code=killed, signal=TERM) Process: 3526 ExecStartPre=/usr/sbin/ds_systemd_ask_password_acl /etc/dirsrv/slapd-%i/dse.ldif (code=exited, status=0/SUCCESS) Main PID: 3533 (code=killed, signal=TERM)
ns-slapd[3533]: [03/Feb/2019:22:55:04.336762406 -0800] attrcrypt - All prepared ciphers are not available. Please disable attribute encryption. ns-slapd[3533]: [03/Feb/2019:22:55:04.346874658 -0800] attrcrypt - attrcrypt_unwrap_key: failed to unwrap key for cipher AES ns-slapd[3533]: [03/Feb/2019:22:55:04.347913178 -0800] attrcrypt - attrcrypt_cipher_init: symmetric key failed to unwrap with the private key; Cert might have been renewed since the key is wrapped. To recover the encrypted contents, keep the wrapped symmetric key value. ns-slapd[3533]: [03/Feb/2019:22:55:04.350743010 -0800] attrcrypt - attrcrypt_unwrap_key: failed to unwrap key for cipher 3DES ns-slapd[3533]: [03/Feb/2019:22:55:04.351680066 -0800] attrcrypt - attrcrypt_cipher_init: symmetric key failed to unwrap with the private key; Cert might have been renewed since the key is wrapped. To recover the encrypted contents, keep the wrapped symmetric key value. ns-slapd[3533]: [03/Feb/2019:22:55:04.352567754 -0800] attrcrypt - All prepared ciphers are not available. Please disable attribute encryption. systemd[1]: dirsrv@x-COM.service start operation timed out. Terminating. systemd[1]: Failed to start 389 Directory Server x-COM.. systemd[1]: Unit dirsrv@x-COM.service entered failed state. systemd[1]: dirsrv@x-COM.service failed.
On 4 Feb 2019, at 17:07, Zarko D zarko@etcfstab.com wrote:
Thanks for reply William, I believe I restored db with command "ipa-restore -d --data /path/ipa-data-2018-12-30-20-12-00" but 389ds still fails to start .
<snip>
ns-slapd[3533]: [03/Feb/2019:22:55:04.336762406 -0800] attrcrypt - All prepared ciphers are not available. Please disable attribute encryption. ns-slapd[3533]: [03/Feb/2019:22:55:04.346874658 -0800] attrcrypt - attrcrypt_unwrap_key: failed to unwrap key for cipher AES ns-slapd[3533]: [03/Feb/2019:22:55:04.347913178 -0800] attrcrypt - attrcrypt_cipher_init: symmetric key failed to unwrap with the private key; Cert might have been renewed since the key is wrapped. To recover the encrypted contents, keep the wrapped symmetric key value. ns-slapd[3533]: [03/Feb/2019:22:55:04.350743010 -0800] attrcrypt - attrcrypt_unwrap_key: failed to unwrap key for cipher 3DES ns-slapd[3533]: [03/Feb/2019:22:55:04.351680066 -0800] attrcrypt - attrcrypt_cipher_init: symmetric key failed to unwrap with the private key; Cert might have been renewed since the key is wrapped. To recover the encrypted contents, keep the wrapped symmetric key value. ns-slapd[3533]: [03/Feb/2019:22:55:04.352567754 -0800] attrcrypt - All prepared ciphers are not available. Please disable attribute encryption. systemd[1]: dirsrv@x-COM.service start operation timed out. Terminating. systemd[1]: Failed to start 389 Directory Server x-COM.. systemd[1]: Unit dirsrv@x-COM.service entered failed state. systemd[1]: dirsrv@x-COM.service failed. _______________________________________________
So this looks like an issue in the IPA restore tool. attrcrypt is deeply tied to the *nss database* in /etc/dirsrv/slapd-x-COM, however, most people fundamentally misunderstand and think it relates to the *certificates* (apparently no one listens to me :) ). I would hazard a guess the ipa backup tool dumps certs from the db but doesn’t backup the secmod.db file, which has caused this problem on restore.
So as a result, it looks like when you restored the IPA from backup, it probably re-built/removed/changed the nss db content which has effectively erased your attribute encryption key.
Now I think attribute encryption does very little to protect from real threats anyway, but I don’t control the IPA project, so I don’t know why they thought it was reasonable to enable it.
In this case, you need to use your *latest* nss db from /etc/dirsrv/slapd- (IE from BEFORE you started the restore process …), and put it back into the location. I think you need key3.db, cert8.db and secmod.db files to fix this. Provided you can copy those back in, then it “should” startup ….
Alternately, if you want to do this “the DS way”, you could on another master ipa server do “db2ldif -r” (-r is important!), then on the restoring master in this state, do an ldif2db. There are some attr encryption parameters to check. I think you need to just disable attribute encryption in db2ldif -r call, and then it “should work” on the import.
Finally, another option is to disable attribute encryption, and then do a full-reinint via the replication process, but given this is IPA there are lots of possible edgecases that I don’t understand here, and I don’t know the ipa tools well enough to tell you how to trigger the replication re-init. The above db2ldif process is basically the same thing anyway :)
Otherwise, I’m sorry, but the best advice is to remove the IPA replica and rebuild it. I’m really really sorry, but it looks like IPA’s assumptions around attrcrypt here may have really hurt you :(
— Sincerely,
William Brown Software Engineer, 389 Directory Server SUSE Labs
William, I've tried restoring key3.db, cert8.db and secmod.db but it didn't help. Since IPA server is a VM, I was able to try different things on VM's clone, and fortunately (with the luck) I fixed the problem. Basically I tried different daily backups that I have, and with one of them, after restore was done, 389ds start failed with messages like:
libdb: BDB2506 file changelog/id2entry.db has LSN 21643/9352420, past end of log at 21243/6180 libdb: BDB2507 Commonly caused by moving a database from one database environment libdb: BDB2508 to another without clearing the database LSNs, or by removing all of libdb: BDB2509 the log files from a database environment dbp->open("changelog/id2entry.db") failed: Invalid argument (22)
And after just removing a file changelog/id2entry.db, I was able to quickly start 389ds.
On 5 Feb 2019, at 04:50, Zarko D zarko@etcfstab.com wrote:
William, I've tried restoring key3.db, cert8.db and secmod.db but it didn't help. Since IPA server is a VM, I was able to try different things on VM's clone, and fortunately (with the luck) I fixed the problem. Basically I tried different daily backups that I have, and with one of them, after restore was done, 389ds start failed with messages like:
libdb: BDB2506 file changelog/id2entry.db has LSN 21643/9352420, past end of log at 21243/6180 libdb: BDB2507 Commonly caused by moving a database from one database environment libdb: BDB2508 to another without clearing the database LSNs, or by removing all of libdb: BDB2509 the log files from a database environment dbp->open("changelog/id2entry.db") failed: Invalid argument (22)
And after just removing a file changelog/id2entry.db, I was able to quickly start 389ds.
If you remove the changelog id2entry, you’ll need to re-init from another master (if it’s standalone, probably no issue).
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
— Sincerely,
William Brown Software Engineer, 389 Directory Server SUSE Labs
Correct, I've run the command:
ipa-replica-manage -v re-initialize --from=<another-master>
... it completed successfully. The remaining problem is that this 'fixed' server doesn't get updates from another master, the error is : "Error (16) : Incremental update connection error. Backing off, will retry update later." As per this doc: https://directory.fedoraproject.org/docs/389ds/FAQ/replication-update-status... .. this is considered temporary, I am afraid it's permanent in my case.
For now my work around is having cronjob that repeats "ipa-replica-manage -v re-initialize --from=<another-master> " and this gets updates.
On 5 Feb 2019, at 09:00, Zarko D zarko@etcfstab.com wrote:
Correct, I've run the command:
ipa-replica-manage -v re-initialize --from=<another-master>
... it completed successfully. The remaining problem is that this 'fixed' server doesn't get updates from another master, the error is : "Error (16) : Incremental update connection error. Backing off, will retry update later." As per this doc: https://directory.fedoraproject.org/docs/389ds/FAQ/replication-update-status... .. this is considered temporary, I am afraid it's permanent in my case.
For now my work around is having cronjob that repeats "ipa-replica-manage -v re-initialize --from=<another-master> " and this gets updates.
This doesn’t exactly sound like a sustainable situation … I believe you can enable verbose rreplication logging by setting the nsslapd-errorlog-level to 8192 which could give us more insight.
— Sincerely,
William Brown Software Engineer, 389 Directory Server SUSE Labs
This doesn’t exactly sound like a sustainable situation …
Yes, beside replica failing, I also can't do any work on this server, add user/host fails, but at least I have a RO server with all updates.
rreplication logging by setting the nsslapd-errorlog-level to 8192 which could
This reads how to increase log level: https://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#troubleshooting
I am not sure if I have present nsslapd-errorlog-level configuration attribute. What ldapsearch command would be to discover this?
On 9 Feb 2019, at 07:39, Zarko D zarko@etcfstab.com wrote:
This doesn’t exactly sound like a sustainable situation …
Yes, beside replica failing, I also can't do any work on this server, add user/host fails, but at least I have a RO server with all updates.
:S not a great situation.
rreplication logging by setting the nsslapd-errorlog-level to 8192 which could
This reads how to increase log level: https://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#troubleshooting
I am not sure if I have present nsslapd-errorlog-level configuration attribute. What ldapsearch command would be to discover this?
Change your search base to cn=config, and use cn=Directory Manager as the binddn.
Alternately, stop the server, and edit /etc/dirsrv/slapd-instance/dse.ldif, and put the attribute into cn=config section
389-users mailing list -- 389-users@lists.fedoraproject.org To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject....
— Sincerely,
William Brown Software Engineer, 389 Directory Server SUSE Labs
389-users@lists.fedoraproject.org