Hello Wiliam,
Thank you for you response, and sorry for my late one.
Here are more logs regarding this issue:
Kerberos client debugging:
```
$ KRB5_TRACE=/dev/stderr ldapwhoami -QY GSSAPI -H
ldap://ldap.example.net
[743517] 1610373687.808962: ccselect module realm chose cache
FILE:/tmp/krb5cc_0_jia7kOLl7r with client principal me(a)EXAMPLE.NET for server principal
ldap/ldap.example.net(a)EXAMPLE.NET
[743517] 1610373687.808963: Getting credentials me(a)EXAMPLE.NET ->
ldap/ldap.example.net(a)EXAMPLE.NET using ccache FILE:/tmp/krb5cc_0_jia7kOLl7r
[743517] 1610373687.808964: Retrieving me(a)EXAMPLE.NET ->
ldap/ldap.example.net(a)EXAMPLE.NET from FILE:/tmp/krb5cc_0_jia7kOLl7r with result:
-1765328243/Matching credential not found (filename: /tmp/krb5cc_0_jia7kOLl7r)
[743517] 1610373687.808965: Retrieving me(a)EXAMPLE.NET -> krbtgt/EXAMPLE.NET(a)EXAMPLE.NET
from FILE:/tmp/krb5cc_0_jia7kOLl7r with result: 0/Success
[743517] 1610373687.808966: Starting with TGT for client realm: me(a)EXAMPLE.NET ->
krbtgt/EXAMPLE.NET(a)EXAMPLE.NET
[743517] 1610373687.808967: Requesting tickets for ldap/ldap.example.net(a)EXAMPLE.NET,
referrals on
[743517] 1610373687.808968: Generated subkey for TGS request: aes256-cts/2501
[743517] 1610373687.808969: etypes requested in TGS request: aes256-cts, aes256-sha2,
camellia256-cts, aes128-cts, aes128-sha2, camellia128-cts
[743517] 1610373687.808971: Encoding request body and padata into FAST request
[743517] 1610373687.808972: Sending request (1565 bytes) to
EXAMPLE.NET
[743517] 1610373687.808973: Resolving hostname
krb5.example.net
[743517] 1610373687.808974: Initiating TCP connection to stream XXXX:XXXX:XXXX::XXXX:88
[743517] 1610373687.808975: Sending TCP request to stream XXXX:XXXX:XXXX::XXXX:88
[743517] 1610373687.808976: Received answer (1524 bytes) from stream
XXXX:XXXX:XXXX::XXXX:88
[743517] 1610373687.808977: Terminating TCP connection to stream XXXX:XXXX:XXXX::XXXX:88
[743517] 1610373687.808978: Response was not from master KDC
[743517] 1610373687.808979: Decoding FAST response
[743517] 1610373687.808980: FAST reply key: aes256-cts/47B8
[743517] 1610373687.808981: TGS reply is for me(a)EXAMPLE.NET ->
ldap/ldap.example.net(a)EXAMPLE.NET with session key aes256-cts/1C40
[743517] 1610373687.808982: TGS request result: 0/Success
[743517] 1610373687.808983: Received creds for desired service
ldap/ldap.example.net(a)EXAMPLE.NET
[743517] 1610373687.808984: Storing me(a)EXAMPLE.NET -> ldap/ldap.example.net(a)EXAMPLE.NET
in FILE:/tmp/krb5cc_0_jia7kOLl7r
[743517] 1610373687.808986: Creating authenticator for me(a)EXAMPLE.NET ->
ldap/ldap.example.net(a)EXAMPLE.NET, seqnum 365654266, subkey aes256-cts/D659, session key
aes256-cts/1C40
ldap_sasl_interactive_bind_s: Invalid credentials (49)
```
I really don't think the problem comes from the client side, as the provided ticket
(
me(a)EXAMPLE.NET to
ldap/ldap.example.net(a)EXAMPLE.NET) is the expected one for a
rDNS-disabled client.
On the server side, I enabled the Kerberos trace and the following debug levels according
to RHDS' documentation[1]:
- Connection management (8)
- Default (16384)
- Plug-ins (65536)
```
$ journalctl -o cat -fu dirsrv@EXAMPLE-NET
[11/Jan/2021:15:01:27.803102542 +0100] - DEBUG - connection_reset - new connection on 109
[11/Jan/2021:15:01:27.870901407 +0100] - DEBUG -
connection_table_dump_activity_to_errors_log - activity on 109r
[11/Jan/2021:15:01:27.872064228 +0100] - DEBUG - handle_pr_read_ready - read activity on
109
[11/Jan/2021:15:01:27.876039477 +0100] - DEBUG - connection_read_operation - connection 60
read 512 bytes
[11/Jan/2021:15:01:27.879751786 +0100] - DEBUG - connection_read_operation - connection 60
read 512 bytes
[11/Jan/2021:15:01:27.883218161 +0100] - DEBUG - connection_read_operation - connection 60
read 221 bytes
[11/Jan/2021:15:01:27.884108822 +0100] - DEBUG - connection_threadmain - conn 60 read
operation successfully - thread_turbo_flag 0 more_data 0 ops_initiated 1 refcnt 2 flags 0
[11/Jan/2021:15:01:27.887552392 +0100] - DEBUG - connection_check_activity_level - conn 60
activity level = 0
[11/Jan/2021:15:01:27.890966858 +0100] - DEBUG - connection_make_readable_nolock - making
readable conn 60 fd=109
GSSAPI server step 1
[11/Jan/2021:15:01:27.915025014 +0100] - DEBUG - ipa-lockout-plugin - postop returning 0:
success
[11/Jan/2021:15:01:27.919556667 +0100] - DEBUG - connection_threadmain - conn 60 check
more_data 0 thread_turbo_flag 0repl_conn_bef 0, repl_conn_now 0
[11/Jan/2021:15:01:27.922600271 +0100] - DEBUG - clear_signal - Listener got signaled
[11/Jan/2021:15:01:27.925681010 +0100] - DEBUG -
connection_table_dump_activity_to_errors_log - activity on 109r
[11/Jan/2021:15:01:27.928830961 +0100] - DEBUG - handle_pr_read_ready - read activity on
109
[11/Jan/2021:15:01:27.932079774 +0100] - DEBUG - connection_read_operation - connection 60
read 7 bytes
[11/Jan/2021:15:01:27.932853608 +0100] - DEBUG - connection_threadmain - conn 60 read
operation successfully - thread_turbo_flag 0 more_data 0 ops_initiated 2 refcnt 2 flags 0
[11/Jan/2021:15:01:27.935914149 +0100] - DEBUG - disconnect_server_nomutex_ext - Setting
conn 60 fd=109 to be disconnected: reason -4991
[11/Jan/2021:15:01:27.938827852 +0100] - DEBUG - connection_threadmain - conn 60 check
more_data 0 thread_turbo_flag 0repl_conn_bef 0, repl_conn_now 0
[11/Jan/2021:15:01:27.939658714 +0100] - DEBUG - clear_signal - Listener got signaled
[11/Jan/2021:15:01:27.941758780 +0100] - DEBUG -
connection_table_move_connection_out_of_active_list - Moved conn 109 out of active list
and freed
```
I am not sure the enabling of the Kerberos trace actually has an effect here, since the
server is on the receiving end. The only GSSAPI-related message is from the Cyrus SASL
library[2].
I'm afraid there is not much interesting here except the "Setting conn 60 fd=109
to be disconnected: reason -4991" message. I checked the "reason" code and
it's just "SLAPD_DISCONNECT_UNBIND"[2]. I would have expected it to rather
be "SLAPD_DISCONNECT_SASL_FAIL". Especially since the access logs are showing a
SASL error:
/var/log/dirsrv/slapd-EXAMPLE-NET/access
```
[11/Jan/2021:15:01:27.809864509 +0100] conn=60 fd=109 slot=109 connection from
XXX.XXX.XXX.XXX to XXX.XXX.XXX.XXX
[11/Jan/2021:15:01:27.894055412 +0100] conn=60 op=0 BIND dn="" method=sasl
version=3 mech=GSSAPI
[11/Jan/2021:15:01:27.914996481 +0100] conn=60 op=0 RESULT err=49 tag=97 nentries=0
etime=0.039387053 - SASL(-13): authentication failure: GSSAPI Failure:
gss_accept_sec_context
[11/Jan/2021:15:01:27.935890700 +0100] conn=60 op=1 UNBIND
[11/Jan/2021:15:01:27.938812067 +0100] conn=60 op=1 fd=109 closed - U1
```
The failing call to gss_accept_sec_context() also accurs in the Cyrus SASL library[4].
I gave another try using the trace logging level (1):
```
$ journalctl -o cat -fu dirsrv(a)EXAMPLE.NET | grep -Ei
'(sasl|gssapi|principal|ticket|krb|ipa01.example.net|ldap.example.net)'
[11/Jan/2021:18:15:46.846080362 +0100] - DEBUG - ids_sasl_check_bind - =>
[11/Jan/2021:18:15:46.849669244 +0100] - DEBUG - ids_sasl_mech_supported - =>
[11/Jan/2021:18:15:46.851552322 +0100] - DEBUG - ids_sasl_listmech - =>
[11/Jan/2021:18:15:46.990676325 +0100] - DEBUG - ids_sasl_listmech - sasl library mechs:
GSS-SPNEGO,GSSAPI,DIGEST-MD5,CRAM-MD5,LOGIN,PLAIN,ANONYMOUS
[11/Jan/2021:18:15:46.993696181 +0100] - DEBUG - ids_sasl_listmech - <=
[11/Jan/2021:18:15:46.996543312 +0100] - DEBUG - ids_sasl_mech_supported - <=
GSSAPI server step 1
[11/Jan/2021:18:15:47.145438011 +0100] - DEBUG - ids_sasl_log - (2): GSSAPI Error:
Unspecified GSS failure. Minor code may provide more information (Request ticket server
ldap/ldap.example.net(a)EXAMPLE.NET found in keytab but does not match server principal
ldap/ipa01.example.net@)
[11/Jan/2021:18:15:47.440693520 +0100] - DEBUG - ids_sasl_check_bind - <=
```
The following error message is quite interesting, as it confirms the Kerberos library[5]
found the matching key in the keytab, but will not use it because it is not configured to
do so:
"Request ticket server ldap/ldap.example.net(a)EXAMPLE.NET found in keytab but does not
match server principal ldap/ipa01.example.net@"
Looking at the code, it looks like this is the call stack 389ds is going through to
configure the server's identity
- main.c:main[6]
- bind.c:init_saslmechanisms[7]
- saslbind.c:ids_sasl_init[8]
- localhost.c:get_localhost_DNS[9] (set the "serverfqdn" static global
variable)
- libglobs.c:config_get_localhost[10] (retrieve the FQDN from
"nsslapd-localhost")
The "serverfqdn" variable is then used to call sasl_server_new()[11]. It would
explain why setting "nsslapd-localhost" with the alias causes authentication
against the alias principal to work, but breaks the canonical FQDN one.
This is a major issue for us, as rDNS resolution will be disabled on most of our Kerberos
clients in the future.
Would it be possible that the code was modified, affecting this behaviour, since you wrote
the documentation?
Best regards,
Julien Rische
CERN
[1]
https://access.redhat.com/documentation/en-us/red_hat_directory_server/10...
[2]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[3]
https://github.com/cyrusimap/cyrus-sasl/blob/cyrus-sasl-2.1/plugins/gssap...
[4]
https://github.com/cyrusimap/cyrus-sasl/blob/cyrus-sasl-2.1/plugins/gssap...
[5]
https://github.com/krb5/krb5/blob/krb5-1.18.2-final/src/lib/krb5/krb/rd_r...
[6]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[7]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[8]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[9]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[10]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...
[11]
https://pagure.io/389-ds-base/blob/389-ds-base-1.3.10/f/ldap/servers/slap...