Hi everyone,
I'm stuck with a broken replica. I had a setup with two ipa server in
replica (ipa-server-4.6.4 on CentOS 7.6), let's say "idc01" and "idc02".
Due to heavy load idc01 crashed many times, and was not working anymore.
So I tried to redo the replica again. At first I tried to
"ipa-replica-manage re-initialize", with no success.
Now I'm trying to redo from scratch the replica setup: on idc02 I
removed the segments (ipa topologysegment-del, for both ca and domain
suffix), on idc01 I removed everything (ipa-server-install --uninstall),
then I joined domain (ipa-client-install), and everything is working so far.
When doing "ipa-replica-install" on idc01 I get:
[...]
[28/41]: setting up initial replication
Starting replication, please wait until this has completed.
Update in progress, 22 seconds elapsed
[ldap://idc02.my.dom.ain:389] reports: Update failed! Status: [Error
(-11) connection error: Unknown connection error (-11) - Total update
aborted]
And on idc02 (the working server), in
/var/log/dirsrv/slapd-MY-DOM-AIN/errors I find lines stating:
[20/Mar/2019:09:28:06.545187923 +0100] - INFO - NSMMReplicationPlugin -
repl5_tot_run - Beginning total update of replica
"agmt="cn=meToidc01.my.dom.ain" (idc01:389)".
[20/Mar/2019:09:28:26.528046160 +0100] - ERR - NSMMReplicationPlugin -
perform_operation - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Failed
to send extended operation: LDAP error -1 (Can't contact LDAP server)
[20/Mar/2019:09:28:26.530763939 +0100] - ERR - NSMMReplicationPlugin -
repl5_tot_log_operation_failure - agmt="cn=meToidc01.my.dom.ain"
(idc01:389): Received error -1 (Can't contact LDAP server): for total
update operation
[20/Mar/2019:09:28:26.532678072 +0100] - ERR - NSMMReplicationPlugin -
release_replica - agmt="cn=meToidc01.my.dom.ain" (idc01:389): Unable to
send endReplication extended operation (Can't contact LDAP server)
[20/Mar/2019:09:28:26.534307539 +0100] - ERR - NSMMReplicationPlugin -
repl5_tot_run - Total update failed for replica
"agmt="cn=meToidc01.my.dom.ain" (idc01:389)", error (-11)
[20/Mar/2019:09:28:26.561763168 +0100] - INFO - NSMMReplicationPlugin -
bind_and_check_pwp - agmt="cn=meToidc01.my.dom.ain" (idc01:389):
Replication bind with GSSAPI auth resumed
[20/Mar/2019:09:28:26.582389258 +0100] - WARN - NSMMReplicationPlugin -
repl5_inc_run - agmt="cn=meToidc01.my.dom.ain" (idc01:389): The remote
replica has a different database generation ID than the local database.
You may have to reinitialize the remote replica, or the local replica.
It seems that idc02 remembers something about the old replica.
Any hint?
Thank you in advance,
Giulio