We have three IPA servers, one of which throws an ERROR during ipa-healthcheck's "ReplicationCheck" test. ipa-healthcheck reports no errors when run from the other two replicas. Looking back through the logs, this started about ten days ago, so it is not the transient issue the output suggests:
[root@ipa1.id.example.com]# ipa-healthcheck --failures-only
[
  {
    "source": "ipahealthcheck.ds.replication",
    "check": "ReplicationCheck",
    "result": "ERROR",
    "uuid": "2b971ca3-678e-4c26-86a0-5b352027e7e8",
    "when": "20211201180013Z",
    "duration": "0.687812",
    "kw": {
      "key": "DSREPLLE0003",
      "items": [
        "Replication",
        "Agreement"
      ],
      "msg": "The replication agreement (catoipa2.id.example.com) under \"o=ipaca\" is not in synchronization.\nStatus message: error (18) can't acquire replica (incremental update transient warning. backing off, will retry update later.)"
    }
  },
  {
    "source": "ipahealthcheck.ds.replication",
    "check": "ReplicationCheck",
    "result": "ERROR",
    "uuid": "99436870-bc98-4ce8-84b1-c0b0806945c8",
    "when": "20211201180013Z",
    "duration": "0.687829",
    "kw": {
      "key": "DSREPLLE0003",
      "items": [
        "Replication",
        "Agreement"
      ],
      "msg": "The replication agreement (catoipa3.id.example.com) under \"o=ipaca\" is not in synchronization.\nStatus message: error (18) can't acquire replica (incremental update transient warning. backing off, will retry update later.)"
    }
  }
]
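As an aside, since the healthcheck output is JSON, the failing agreements can be pulled out with a few lines of scripting, which is handy for monitoring. A minimal sketch (the sample data below is abbreviated from the output above, and the regex on the message wording is an assumption, not part of any ipa-healthcheck API):

```python
import json
import re

# Abbreviated sample of `ipa-healthcheck --failures-only` output, as shown above
report = json.loads(r"""
[
  {
    "source": "ipahealthcheck.ds.replication",
    "check": "ReplicationCheck",
    "result": "ERROR",
    "kw": {
      "key": "DSREPLLE0003",
      "msg": "The replication agreement (catoipa2.id.example.com) under \"o=ipaca\" is not in synchronization.\nStatus message: error (18) can't acquire replica (incremental update transient warning. backing off, will retry update later.)"
    }
  }
]
""")

for entry in report:
    if entry["result"] != "ERROR":
        continue
    msg = entry["kw"]["msg"]
    # The agreement name appears in parentheses in the message text
    agmt = re.search(r"agreement \((.+?)\)", msg)
    print(entry["kw"]["key"], agmt.group(1) if agmt else "?")
    # → DSREPLLE0003 catoipa2.id.example.com
```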
389-ds error logs show a slew of these:
[30/Nov/2021:23:41:35.277399980 -0800] - ERR - NSMMReplicationPlugin - send_updates - agmt="cn=caToipa3.id.example.com" (ipa2:389): Missing data encountered. If the error persists the replica must be reinitialized.
[30/Nov/2021:23:41:38.288003253 -0800] - ERR - agmt="cn=caToipa3.id.example.com" (ipa3:389) - clcache_load_buffer - Can't locate CSN 6197e149000000060000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[30/Nov/2021:23:41:38.289713999 -0800] - ERR - NSMMReplicationPlugin - send_updates - agmt="cn=caToipa3.id.example.com" (ipa3:389): Missing data encountered. If the error persists the replica must be reinitialized.
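As a side note, the CSN the changelog can no longer locate tells us when the missing change originated: a 389-ds CSN is 20 hex digits laid out as an 8-digit Unix timestamp, a 4-digit sequence number, a 4-digit replica ID, and a 4-digit sub-sequence number. A quick sketch decoding the one from the log (decode_csn is a hypothetical helper, not a 389-ds tool):

```python
from datetime import datetime, timezone

def decode_csn(csn):
    """Split a 389-ds CSN into (timestamp, seqnum, replica_id, subseq)."""
    ts = int(csn[0:8], 16)        # seconds since the Unix epoch
    seq = int(csn[8:12], 16)      # sequence number within that second
    rid = int(csn[12:16], 16)     # replica ID that originated the change
    subseq = int(csn[16:20], 16)  # sub-sequence number
    return ts, seq, rid, subseq

ts, seq, rid, subseq = decode_csn("6197e149000000060000")
print(datetime.fromtimestamp(ts, tz=timezone.utc), "rid", rid)
# → 2021-11-19 17:39:21+00:00 rid 6
```

That mid-November date lines up with when the errors first appeared, consistent with the change having since been trimmed from the changelog, which would explain why incremental updates cannot find it.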
That would seem to suggest that running "ipa-replica-manage re-initialize --from $SERVER_TO_PULL_FROM" may resolve the issue, but before we try that, is there anything else we should look at?
Thanks,
Scott