On 7/29/21 9:06 AM, Rob Crittenden wrote:
Scott Serr via FreeIPA-users wrote:
> Apologies for the length/verbosity of the last message.
>
> I've read there can be a situation on IPA startup where the KDC server
> isn't fully up, but LDAP is up. At that point in time LDAP can get
> swamped causing failures to connect to the KDC? This appears to be my
> problem.
>
>
> https://pagure.io/freeipa/issue/8544
>
> Please, anyone, let me know if this is the wrong conclusion.
>
> Thank you to the IPA folks answering questions on this list.
I'm not sure this is the same issue. In the ticket the system clock was
wildly out of date (2 hours) and chrony couldn't sync time prior to IPA
starting because DNS wasn't running.
So 389 started, then the KDC, and it got a ticket; then DNS started, which
allowed the time sync to happen, and the time moved a lot.
So see if your systems' times are wildly out of sync during startup.
If you run ipactl restart after time is synced this should rule out
timing issues, as each process should get a new ticket. I'm pretty sure
that 389 uses a MEMORY ccache.
rob
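
One way to check for the kind of clock jump described above is to look at the offset chrony reports right after boot. The following is only a minimal sketch: it assumes the usual "System time" line printed by `chronyc tracking`, and it compares against 300 seconds, the default Kerberos clock-skew tolerance. The function names and the sample text are hypothetical.

```python
import re

# Assumption: 300 s is the default Kerberos clock-skew tolerance
# (the krb5.conf "clockskew" setting); adjust if yours differs.
MAX_SKEW_SECONDS = 300.0

# Matches e.g. "System time     : 0.000012 seconds fast of NTP time"
SYSTEM_TIME = re.compile(
    r"^System time\s*:\s*([0-9.]+) seconds (fast|slow) of NTP time",
    re.MULTILINE,
)


def system_time_offset(tracking_output: str) -> float:
    """Return the signed offset in seconds from `chronyc tracking` text.

    Positive means the local clock is ahead of NTP time, negative behind.
    """
    match = SYSTEM_TIME.search(tracking_output)
    if match is None:
        raise ValueError("no 'System time' line found")
    value = float(match.group(1))
    return value if match.group(2) == "fast" else -value


def skew_is_safe(offset: float, limit: float = MAX_SKEW_SECONDS) -> bool:
    """True if the clock is within the assumed Kerberos tolerance."""
    return abs(offset) < limit


if __name__ == "__main__":
    # Hypothetical sample: a clock two hours behind, as in the ticket.
    sample = "System time     : 7200.000000000 seconds slow of NTP time\n"
    offset = system_time_offset(sample)
    print(offset, skew_is_safe(offset))
```

On a live server the text could come from `subprocess.run(["chronyc", "tracking"], capture_output=True, text=True).stdout`, run before and after `ipactl restart`.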
Thanks, Rob, for explaining that the situation in the issue is different. As
in the issue, my servers are VMs. The virtual HW clock might be way off
before ntp/chrony runs on boot; I will investigate this.
Although the VM hosts (Nutanix HCI) are very fast, these VMs are
sometimes very busy. It sounds like this shouldn't cause a
KDC/replication issue on startup, though.
Scott
>> Scott
>>
>>
>> On 7/28/21 2:58 PM, Scott Serr via FreeIPA-users wrote:
>>> I'm running 5 IPA servers with 4.9.2 (the latest on CentOS 8).
>>>
>>> Synchronization stopped yesterday and also 3 days ago. It
>>> actually stopped yesterday after I stopped / modified / started "ipa1"
>>> to keep rotated logs longer so I could track down what happened
>>> 3 days ago.
>>>
>>> 2021-07-27 17:22:46 ipactl stop
>>> 2021-07-27 17:22:59 emacs dse.ldif # Modify access and error log
>>> rotation values
>>> 2021-07-27 17:23:45 ipactl start
>>>
>>> Below seems to be what kicked off the bad behavior. I've seen a few
>>> posts about removing the keys out of dse.ldif when this happens. I'm
>>> a bit leery of doing this, as I don't fully understand what is going
>>> on. (Is it comparable to clearing out known_hosts entries when using ssh?)
>>>
>>> [27/Jul/2021:17:23:49.818525015 -0600] - ERR - attrcrypt_unwrap_key -
>>> Failed to unwrap key for cipher AES
>>> [27/Jul/2021:17:23:49.820422259 -0600] - ERR - attrcrypt_cipher_init -
>>> Symmetric key failed to unwrap with the private key; Cert might have
>>> been renewed since the
>>> key is wrapped. To recover the encrypted contents, keep the wrapped
>>> symmetric key value.
>>> [27/Jul/2021:17:23:50.040967207 -0600] - ERR - attrcrypt_unwrap_key -
>>> Failed to unwrap key for cipher 3DES
>>> [27/Jul/2021:17:23:50.043074553 -0600] - ERR - attrcrypt_cipher_init -
>>> Symmetric key failed to unwrap with the private key; Cert might have
>>> been renewed since the
>>> key is wrapped. To recover the encrypted contents, keep the wrapped
>>> symmetric key value.
>>> [27/Jul/2021:17:23:50.044268421 -0600] - ERR - attrcrypt_init - All
>>> prepared ciphers are not available. Please disable attribute encryption.
>>> [27/Jul/2021:17:23:50.263786473 -0600] - ERR - attrcrypt_unwrap_key -
>>> Failed to unwrap key for cipher AES
>>> [27/Jul/2021:17:23:50.266090934 -0600] - ERR - attrcrypt_cipher_init -
>>> Symmetric key failed to unwrap with the private key; Cert might have
>>> been renewed since the key is wrapped. To recover the encrypted
>>> contents, keep the wrapped symmetric key value.
>>> [27/Jul/2021:17:23:50.470918523 -0600] - ERR - attrcrypt_unwrap_key -
>>> Failed to unwrap key for cipher 3DES
>>> [27/Jul/2021:17:23:50.472915669 -0600] - ERR - attrcrypt_cipher_init -
>>> Symmetric key failed to unwrap with the private key; Cert might have
>>> been renewed since the key is wrapped. To recover the encrypted
>>> contents, keep the wrapped symmetric key value.
>>> [27/Jul/2021:17:23:50.474282471 -0600] - ERR - attrcrypt_init - All
>>> prepared ciphers are not available. Please disable attribute encryption.
>>> [27/Jul/2021:17:23:50.891048127 -0600] - ERR - schema-compat-plugin -
>>> scheduled schema-compat-plugin tree scan in about 5 seconds after the
>>> server startup!
>>>
>>> Then ipa1 can't talk to the replicas (ipa2,ipa3,ipa5,ipa6) shown below:
>>>
>>> [27/Jul/2021:17:23:51.081696109 -0600] - ERR - set_krb5_creds - Could
>>> not get initial credentials for principal
>>> [ldap/ipa1.hpc.example.com@HPC.EXAMPLE.COM] in keytab
>>> [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
>>> requested realm)
>>> [27/Jul/2021:17:23:51.086755379 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=meToipa4.hpc.example.com" (ipa4:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:23:51.091748474 -0600] - ERR - set_krb5_creds - Could
>>> not get initial credentials for principal
>>> [ldap/ipa1.hpc.example.com@HPC.EXAMPLE.COM] in keytab
>>> [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
>>> requested realm)
>>> [27/Jul/2021:17:23:51.093430455 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp -
>>> agmt="cn=ipa1.hpc.example.com-to-ipa6.hpc.example.com" (ipa6:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:23:51.094725291 -0600] - ERR - schema-compat-plugin -
>>> schema-compat-plugin tree scan will start in about 5 seconds!
>>> [27/Jul/2021:17:23:51.096059194 -0600] - ERR - set_krb5_creds - Could
>>> not get initial credentials for principal
>>> [ldap/ipa1.hpc.example.com@HPC.EXAMPLE.COM] in keytab
>>> [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
>>> requested realm)
>>> [27/Jul/2021:17:23:51.097152619 -0600] - INFO - slapd_daemon - slapd
>>> started. Listening on All Interfaces port 389 for LDAP requests
>>> [27/Jul/2021:17:23:51.098356748 -0600] - INFO - slapd_daemon -
>>> Listening on All Interfaces port 636 for LDAPS requests
>>> [27/Jul/2021:17:23:51.099577958 -0600] - INFO - slapd_daemon -
>>> Listening on /var/run/slapd-HPC-EXAMPLE-COM.socket for LDAPI requests
>>> [27/Jul/2021:17:23:51.100701349 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=caToipa3.hpc.example.com" (ipa3:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:23:51.101782194 -0600] - ERR - set_krb5_creds - Could
>>> not get initial credentials for principal
>>> [ldap/ipa1.hpc.example.com@HPC.EXAMPLE.COM] in keytab
>>> [FILE:/etc/dirsrv/ds.keytab]: -1765328228 (Cannot contact any KDC for
>>> requested realm)
>>> [27/Jul/2021:17:23:51.103848248 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=caToipa5.hpc.example.com" (ipa5:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:23:58.152621025 -0600] - ERR - schema-compat-plugin -
>>> Finished plugin initialization.
>>> [27/Jul/2021:17:24:21.201225830 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=meToipa2.hpc.example.com" (ipa2:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:24:21.203158794 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp -
>>> agmt="cn=ipa1.hpc.example.com-to-ipa6.hpc.example.com" (ipa6:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:24:21.204833314 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=meToipa3.hpc.example.com" (ipa3:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:24:21.206099975 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=meToipa5.hpc.example.com" (ipa5:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>> [27/Jul/2021:17:54:03.675297221 -0600] - ERR - NSMMReplicationPlugin -
>>> bind_and_check_pwp - agmt="cn=caToipa2.hpc.example.com" (ipa2:389) -
>>> Replication bind with GSSAPI auth failed: LDAP error -1 (Can't contact
>>> LDAP server) ()
>>>
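
To see at a glance which agreements are failing and whether the KDC error precedes them, the error log quoted above could be summarized with a short script. This is only a sketch: it assumes each log entry sits on a single line (as in the real errors file, not the wrapped form shown in this mail), and the function names are hypothetical. The regular expressions are written against the messages quoted in this thread and may need adjusting for other 389-ds versions.

```python
import re
from collections import Counter

# Failed GSSAPI replication bind, as logged by the replication plugin.
FAILED_BIND = re.compile(
    r'^\[(?P<ts>[^\]]+)\] - ERR - NSMMReplicationPlugin - '
    r'bind_and_check_pwp - agmt="(?P<agmt>[^"]+)" \((?P<peer>[^)]+)\) - '
    r'Replication bind with GSSAPI auth failed'
)

# Failure to obtain a ticket from the keytab, with the numeric krb5 code.
KRB5_FAILURE = re.compile(
    r'set_krb5_creds - Could not get initial credentials for principal '
    r'\[[^\]]*\] in keytab \[[^\]]*\]: (?P<code>-?\d+) \('
)


def failed_binds_per_agreement(lines):
    """Count failed GSSAPI replication binds per agreement name."""
    counts = Counter()
    for line in lines:
        match = FAILED_BIND.match(line)
        if match:
            counts[match.group("agmt")] += 1
    return counts


def krb5_error_codes(lines):
    """Count the numeric krb5 error codes reported by set_krb5_creds."""
    counts = Counter()
    for line in lines:
        match = KRB5_FAILURE.search(line)
        if match:
            counts[int(match.group("code"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize_errors.py <path to the dirsrv errors file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        log_lines = handle.read().splitlines()
    print(failed_binds_per_agreement(log_lines).most_common())
    print(krb5_error_codes(log_lines).most_common())
```

In the excerpt above this would report code -1765328228 (Cannot contact any KDC for requested realm) alongside the per-agreement bind failures, which makes it easier to compare one startup against another.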
>>> After realizing I had a problem this morning, I rebooted ipa1 but it
>>> did not help syncing. I re-initialized ipa1 from ipa3, this got them
>>> all authenticating to each other and in sync.
>>>
>>> [28/Jul/2021:08:09:10.347094254 -0600] - INFO - NSMMReplicationPlugin
>>> - bind_and_check_pwp - agmt="cn=caToipa3.hpc.inl.gov" (ipa3:389):
>>> Replication bind with GSSAPI auth resumed
>>> [28/Jul/2021:08:09:10.449170075 -0600] - INFO - NSMMReplicationPlugin
>>> - bind_and_check_pwp - agmt="cn=meToipa3.hpc.inl.gov" (ipa3:389):
>>> Replication bind with GSSAPI auth resumed
>>> [....]
>>>
>>> I changed the Directory Manager password with "dsconf" -- but that was
>>> between the first failure and the second. Could that be causing
>>> problems? What direction to go from here? Thank you!
>>>
>>> Scott
>>>
>>>
>>> _______________________________________________
>>> FreeIPA-users mailing list -- freeipa-users@lists.fedorahosted.org
>>> To unsubscribe send an email to freeipa-users-leave@lists.fedorahosted.org
>>> Fedora Code of Conduct:
>>> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
>>> List Guidelines:
>>> https://fedoraproject.org/wiki/Mailing_list_guidelines
>>> List Archives:
>>> https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedoraho...
>>> Do not reply to spam on the list, report it:
>>> https://pagure.io/fedora-infrastructure