My primary IPA server has failed. I was running a Python script against IPA doing some user management when everything went unresponsive. I couldn't even get in at a console to check what was going on, so I ended up rebooting it. After the reboot, dirsrv wouldn't start because dse.ldif was missing. I copied this file over from a replica IPA server, so dirsrv starts now. However, other services seem unable to connect to LDAP properly. DNS isn't resolving when querying the primary, even though ipactl shows named is running. smb and winbind won't start, and that also appears to be a problem connecting to LDAP. Is there a way to check the integrity of my LDAP database? Or should I try to copy the LDAP database from my working replica to the primary?
Kristian Petersen via FreeIPA-users wrote:
My primary IPA server has failed. I was running a Python script against IPA doing some user management when everything went unresponsive. I couldn't even get in at a console to check what was going on, so I ended up rebooting it. After the reboot, dirsrv wouldn't start because dse.ldif was missing. I copied this file over from a replica IPA server, so dirsrv starts now. However, other services seem unable to connect to LDAP properly. DNS isn't resolving when querying the primary, even though ipactl shows named is running. smb and winbind won't start, and that also appears to be a problem connecting to LDAP. Is there a way to check the integrity of my LDAP database? Or should I try to copy the LDAP database from my working replica to the primary?
There should have been a dse.ldif.startOK, which would have been the better file to restore from. Since you have already started the server, it has probably been overwritten and the old values lost, but it is worth checking.
I know that at least nsslapd-localhost stores the hostname. Replication agreements are also stored per host in cn=config, which is not replicated.
If the database were corrupted, 389-ds should detect it. I suspect that the dse.ldif from another master is the culprit.
rob
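To illustrate the point about host-specific values, here is a minimal Python sketch that reads a dse.ldif and reports the nsslapd-localhost value and the target hosts of any replication agreements it contains. It is not part of any IPA or 389-ds tooling; the function names are hypothetical, and it assumes that agreement entries carry the nsds5replicationagreement object class and the nsDS5ReplicaHost attribute. The path passed on the command line (typically under /etc/dirsrv/slapd-INSTANCE/) is also an assumption about your layout.

```python
import socket
import sys


def unfold_ldif(text):
    """Join LDIF continuation lines (a line starting with one space continues the previous line)."""
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def parse_entries(text):
    """Split LDIF text into a list of {attribute: [values]} dicts with lowercased keys.

    Base64 values ("attr:: ...") are kept undecoded; that is enough for this check.
    """
    entries, current = [], {}
    for line in unfold_ldif(text):
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        if line.startswith("#") or ":" not in line:
            continue
        attr, _, value = line.partition(":")
        current.setdefault(attr.strip().lower(), []).append(value.strip())
    if current:
        entries.append(current)
    return entries


def check_dse(text):
    """Return (nsslapd-localhost value or None, list of replication agreement hosts)."""
    localhost = None
    agreement_hosts = []
    for entry in parse_entries(text):
        dn = entry.get("dn", [""])[0].lower()
        if dn == "cn=config":
            localhost = entry.get("nsslapd-localhost", [None])[0]
        classes = [c.lower() for c in entry.get("objectclass", [])]
        if "nsds5replicationagreement" in classes:
            agreement_hosts.extend(entry.get("nsds5replicahost", []))
    return localhost, agreement_hosts


if __name__ == "__main__":
    # Usage: python check_dse.py /path/to/dse.ldif [expected-hostname]
    path = sys.argv[1]
    expected = sys.argv[2] if len(sys.argv) > 2 else socket.getfqdn()
    with open(path, encoding="utf-8") as fh:
        localhost, hosts = check_dse(fh.read())
    print(f"nsslapd-localhost: {localhost} (expected {expected})")
    if localhost is None or localhost.lower() != expected.lower():
        print("WARNING: this dse.ldif may have come from a different host")
    for host in hosts:
        print(f"replication agreement points at: {host}")
```

If nsslapd-localhost names the replica rather than the primary, or an agreement points back at the primary itself, that would be consistent with the file having been copied from another master.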
I got it working by taking a different dse.ldif file that was already on the primary server (the one I was trying to fix) and using it to overwrite the one I had copied from the replica. The startOK file had already been overwritten, as you suspected, but the copy I used was only a little older than the original startOK would have been. Thanks for the pointers.
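For anyone in the same situation, a small sketch like the following can help find which local copies of dse.ldif exist and how old each one is before choosing one to restore. It only lists files and changes nothing. The function name is hypothetical, and the directory you pass (typically the instance's configuration directory under /etc/dirsrv/) is an assumption about your layout.

```python
from datetime import datetime
from pathlib import Path
import sys


def list_dse_candidates(config_dir):
    """Return (path, mtime, size) for every dse.ldif* file in config_dir, newest first."""
    candidates = []
    for path in Path(config_dir).glob("dse.ldif*"):
        if path.is_file():
            stat = path.stat()
            candidates.append((path, stat.st_mtime, stat.st_size))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    # Usage: python list_dse.py /path/to/instance/config/dir
    for path, mtime, size in list_dse_candidates(sys.argv[1]):
        stamp = datetime.fromtimestamp(mtime).isoformat(timespec="seconds")
        note = "  (empty, probably not usable)" if size == 0 else ""
        print(f"{stamp}  {size:>8} bytes  {path.name}{note}")
```

Stop dirsrv and keep a copy of the current dse.ldif before overwriting it with any of the candidates.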
On Thu, Jan 2, 2020 at 11:41 AM Rob Crittenden rcritten@redhat.com wrote:
freeipa-users@lists.fedorahosted.org