On Mon, Apr 21, 2014 at 10:05:58AM -0400, Stephen Gallagher wrote:
On 04/17/2014 04:13 AM, Jakub Hrozek wrote:
> On Wed, Apr 16, 2014 at 10:47:10PM -0400, Simo Sorce wrote:
>> On Wed, 2014-04-16 at 19:49 -0400, Dmitri Pal wrote:
>>
>>> I had some interesting experience during Red Hat summit. The
>>> network was significantly overloaded. The VPN was slow and
>>> probably bleeding packets on the way like crazy. Any access to
>>> internal web page took a while and was happening in multiple
>>> steps. When screen was locking it was taking about 30 sec (I
>>> have not measured but that was a feeling) to log in. I am not
>>> sure we can do much about it but the flaky network is probably
>>> going to lead to some timeouts and bad user experience.
>>
>> I think this may be a recent regression. We are never supposed to
>> wait more than a handful of seconds, but I am noticing that with
>> latest RHEL6 updates my RHEL desktop also sometimes gets stuck a
>> while on authentication (VPN). I have not experienced this in F20
>> (but my domain controller is local).
>
> Simo, if you can reproduce the error locally, would you mind
> enabling debug logs or trying out the 6.6 preview packages?
>
> I only have headless VMs with RHEL6 and I'm not sure I could
> reproduce the bug there. But it sounds like something we should
> fix, so any debug information would be welcome, at least to know
> where to start with local debugging.
>
> btw when I tried to reproduce the bug Thomas was seeing, I saw
> some blocking DNS calls in openldap's initialization path, but that
> was on F-20.
OpenLDAP isn't supposed to be calling DNS at all. That's the entire
reason we open the port ourselves now and then pass the FD to it. If
it suddenly started running DNS, that's probably a regression in the
openldap libraries.

I had a bit of time to dig into the issue today. Here is a snippet of
the backtrace I see after starting an IPA client with a faulty DNS
entry in /etc/resolv.conf:
#8 0x00007fda39c9e163 in __gethostbyname_r (
name=name@entry=0x7fff2548d140 "client.example.com",
resbuf=resbuf@entry=0x7fff2548d120, buffer=0x1f33590 "\177",
buflen=buflen@entry=992,
result=result@entry=0x7fff2548d118, h_errnop=h_errnop@entry=0x7fff2548d10c)
at ../nss/getXXbyYY_r.c:266
#9 0x00007fda3bb1b3de in ldap_pvt_gethostbyname_a (
name=name@entry=0x7fff2548d140 "client.example.com",
resbuf=resbuf@entry=0x7fff2548d120, buf=buf@entry=0x7fff2548d110,
result=result@entry=0x7fff2548d118, herrno_ptr=herrno_ptr@entry=0x7fff2548d10c)
at util-int.c:350
#10 0x00007fda3bb1b5d0 in ldap_pvt_get_fqdn (
name=0x7fff2548d140 "client.example.com", name@entry=0x0)
at util-int.c:748
#11 0x00007fda3bb19b47 in ldap_int_initialize (
gopts=gopts@entry=0x7fda3bd40000 <ldap_int_global_options>,
dbglvl=dbglvl@entry=0x0)
at init.c:645
#12 0x00007fda3bb1a627 in ldap_set_option (ld=0x0, option=24582, invalue=0x7fff2548d2b0)
at options.c:446
#13 0x00007fda30951cf6 in setup_tls_config (basic_opts=0x1f30450)
at src/providers/ldap/sdap.c:533
#14 0x00007fda308214b3 in ldap_id_init_internal (bectx=0x1f12b40, ops=0x1f12cb0,
pvt_data=0x7fff2548d5e8) at src/providers/ldap/ldap_init.c:146
#15 0x00007fda30821ba0 in sssm_ldap_id_init (bectx=0x1f12b40, ops=0x1f12cb0,
pvt_data=0x1f12cb8) at src/providers/ldap/ldap_init.c:199
#16 0x000000000041b227 in load_backend_module (ctx=0x1f12b40, bet_type=BET_ID,
bet_info=0x1f12ca8, default_mod_name=0x0) at src/providers/data_provider_be.c:2346
#17 0x000000000041ce4c in be_process_init (mem_ctx=0x1f0ba80,
be_domain=0x1f093f0 "localipaldap", ev=0x1f0a630, cdb=0x1f0bb90)
at src/providers/data_provider_be.c:2520
#18 0x000000000041fde6 in main (argc=3, argv=0x7fff2548e008)
at src/providers/data_provider_be.c:2743
Do you agree this is an openldap bug? I don't like that ldap_set_option()
triggers a blocking DNS resolution call.
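
For anyone who wants to measure the stall without going through libldap,
here is a minimal sketch that only uses glibc. It assumes that frames
#8-#10 boil down to "gethostname(), then a blocking gethostbyname_r() on
the result", which is my reading of the backtrace and not something I
checked against the openldap sources. The helper names
time_blocking_lookup() and time_own_fqdn_lookup() are made up for this
example, and the 8192-byte buffer is an arbitrary size.

```c
#define _GNU_SOURCE           /* gethostbyname_r() is a glibc extension */
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the seconds spent inside one blocking
 * gethostbyname_r() call for `name`. If `found` is not NULL, it is set
 * to 1 when the name resolved and to 0 otherwise. */
double time_blocking_lookup(const char *name, int *found)
{
    struct hostent he, *result = NULL;
    char buf[8192];           /* arbitrary scratch size for the resolver */
    int herr = 0;
    struct timespec t0, t1;

    assert(name != NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int rc = gethostbyname_r(name, &he, buf, sizeof(buf), &result, &herr);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (found != NULL) {
        *found = (rc == 0 && result != NULL);
    }
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Hypothetical helper: times the lookup of this machine's own host name
 * and prints the outcome. Returns -1.0 if gethostname() fails. */
double time_own_fqdn_lookup(void)
{
    char host[256];
    int found = 0;

    if (gethostname(host, sizeof(host)) != 0) {
        return -1.0;
    }
    host[sizeof(host) - 1] = '\0';   /* gethostname() may not terminate */

    double secs = time_blocking_lookup(host, &found);
    printf("%s: %s after %.3f s\n", host,
           found ? "resolved" : "not resolved", secs);
    return secs;
}
```

Calling time_own_fqdn_lookup() from a small main() on the affected client,
once with the faulty resolv.conf and once with a working one, should show
whether the delay comes from the resolver alone. If the first run blocks
for several seconds, any caller that reaches this path synchronously
(such as the ldap_set_option() call in frame #12) will stall for the same
amount of time.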