On Thu, Aug 30, 2012 at 08:33:51AM +0200, Olaf Gellert wrote:
> Hi Jakub,
> thanks for your answer.
> Jakub Hrozek wrote:
>> Maybe it would be beneficial to either reuse ldap_opt_timeout for the bind timeout value or introduce a new timeout. I filed https://fedorahosted.org/sssd/ticket/1501 to track this.
> thanks.
>> I am far more concerned about the provider going offline without asking the secondary LDAP server. I'll try to reproduce the issue locally.
> If I can help you with anything, just say what you need.
Hi Olaf,
I think I may have found your problem. In the extremely rare case where the initial connection to the LDAP server succeeds but the bind request then times out, SSSD does not try the next server.
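To illustrate the behaviour I would expect, here is a simplified Python sketch. It is not the SSSD code and not the patch itself; every name in it (try_servers, ConnectError, BindTimeout, the fake_* helpers and the example hostnames) is made up for illustration. The point is only that a bind timeout should be treated like a failed connection, so the next server in the list still gets a chance:

```python
# Illustrative sketch only; this is NOT SSSD code. All names are hypothetical.

class ConnectError(Exception):
    """The connection to the server could not be established."""

class BindTimeout(Exception):
    """The connection succeeded, but the bind request timed out."""

def try_servers(servers, connect, bind):
    """Return the first server that both connects and binds, else None.

    connect(server) and bind(conn) are caller-supplied callables that
    raise ConnectError or BindTimeout on failure.
    """
    for server in servers:
        try:
            conn = connect(server)
            bind(conn)
        except (ConnectError, BindTimeout):
            # A bind timeout is handled like a connect failure:
            # move on to the next server instead of giving up.
            continue
        return server
    return None  # every server failed; only now would we go offline

# Fake callables simulating the reported situation: the first server
# accepts the connection but never answers the bind.
def fake_connect(server):
    return server

def fake_bind(conn):
    if conn == "ldap://xxx1.example":
        raise BindTimeout(conn)

print(try_servers(["ldap://xxx1.example", "ldap://xxx2.example"],
                  fake_connect, fake_bind))
# prints "ldap://xxx2.example"
```

If BindTimeout were left out of the except clause, the first server's timeout would abort the loop, which is roughly the failure mode described above.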
If you tell me the exact version you are running (the whole output of rpm -q sssd), I can prepare a scratch build for you to test if my patch fixes your issue.
However, I'm curious about how you could end up in a situation like this. Can you run the following test for me?
ldapsearch -x -H ldap://xxx1.domain.de \
    -D "uid=abcdefg,ou=People,o=ldap,o=root" \
    -w "thepassword" \
    -b uid=abcdefg,ou=People,o=ldap,o=root -s base
I used the sanitized values from your original report; please substitute the real ones, and also add -Z or similar depending on your real configuration. The above command should trigger a codepath in the libldap API similar to the one the SSSD uses. Does it succeed in your environment? Are there any interesting messages in the server log?