I disabled LDAP paging in sssd.conf and let the setup run for a while.
No crashes since.
It does worry me though, that some other application could crash the
server by using result paging.
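
A minimal sketch of the sssd.conf change mentioned above, assuming the
ldap_disable_paging option described in the sssd-ldap man page. The
domain section name "default" is a placeholder, not the actual name
used on this server:

    [domain/default]
    # placeholder section name; use your own LDAP domain section
    # assumption: stops SSSD from sending the LDAP paging control
    ldap_disable_paging = true

SSSD has to be restarted for the change to take effect.
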
On 18. 11. 2013 17:05, Rich Megginson wrote:
On 11/18/2013 07:01 AM, Mitja Mihelič wrote:
> On 15. 11. 2013 21:46, Rich Megginson wrote:
>> On 11/15/2013 02:58 AM, Mitja Mihelič wrote:
>>>
>>>
>>> On 14. 11. 2013 22:08, Rich Megginson wrote:
>>>> On 11/14/2013 08:50 AM, Mitja Mihelič wrote:
>>>>> One of the consumers has crashed again and I have attached the
>>>>> stacktrace.
>>>>> Four hours later it crashed again.
>>>>>
>>>>> I do hope there is something in the stacktraces, so that
>>>>> something can be done to prevent future crashes.
>>>>
>>>> Unfortunately, not enough. Looks like there is still some
>>>> mismatch between the version of the package and the version of the
>>>> debuginfo package.
>>>>
>>>> rpm -q 389-ds-base 389-ds-base-debuginfo openldap
>>>> openldap-debuginfo db4 db4-debuginfo nss nss-debuginfo nspr
>>>> nspr-debuginfo glibc glibc-debuginfo
>>> The suggested debuginfo packages were not installed at the time
>>> when the stacktraces were made. They are installed now. I have
>>> recreated the stacktraces and attached them.
>>
>> The crash looks related to paged searches. We have changed this
>> code somewhat in the next version. Can you try the latest version
>> in the EPEL6 testing repo? 389-ds-base-1.2.11.23-3
>>
>> http://port389.org/wiki/Download
> Before installing packages from the testing repo, are there any other
> changes I could do?
>
> Since you mentioned a relation to paged searches, perhaps this is
> related to our use of SSSD on a server that queries the 389DS.
> Currently it uses paging of results, as it is enabled by default and
> page size is set to 1000 results.
> On the 389DS nsslapd-sizelimit is set to 2000.
>
> Every 5 minutes SSSD issues this search query:
> SRCH base="dc=TIER2,dc=COMPANY,dc=si" scope=2
> filter="(&(objectClass=posixAccount)(uid=*)(uidNumber=*)(gidNumber=*))"
> attrs="objectClass uid userPassword uidNumber gidNumber gecos
> homeDirectory loginShell krbprincipalname cn modifyTimestamp
> modifyTimestamp shadowLastChange shadowMin shadowMax shadowWarning
> shadowInactive shadowExpire shadowFlag krblastpwdchange
> krbpasswordexpiration pwdattribute authorizedService accountexpires
> useraccountcontrol nsAccountLock host logindisabled
> loginexpirationtime loginallowedtimemap"
>
> The first 1000 entries are returned.
> conn=1276 op=3 RESULT err=0 tag=101 nentries=1000 etime=34.129000
> notes=U,P
>
> Then the exact same search is issued again, and 999 are returned.
> conn=1276 op=4 RESULT err=4 tag=101 nentries=999 etime=1.056000 notes=U,P
>
> err=4 is understandable, since nsslapd-sizelimit = 2000.
>
> Should I disable result paging for SSSD?
You could try that, yes. The problem seems related to paging.
> Perhaps even set nsslapd-sizelimit to -1? (I would not like to do this)
>
> Regards, Mitja
>
>>
>>>
>>>>
>>>> Also, if you are seeing the message:
>>>> ber_flush skipped because the connection was marked to be closed
>>>> or abandoned
>>>>
>>>> This means you are running with the CONNS error log level, which
>>>> means you may have a lot of useful information in your errors
>>>> log. Would you be able to provide that?
>>> I can provide the error logs, but will need to anonymize our user
>>> data. How large a time interval do you need?
>>>>
>>>>>
>>>>> The last log message in errors log was both times:
>>>>> ber_flush skipped because the connection was marked to be closed
>>>>> or abandoned
>>>>>
>>>>> The following versions of the 389ds packages were installed at the time:
>>>>> 389-admin-1.1.29-1.el6.x86_64
>>>>> 389-admin-console-1.1.8-1.el6.noarch
>>>>> 389-admin-console-doc-1.1.8-1.el6.noarch
>>>>> 389-adminutil-1.1.15-1.el6.x86_64
>>>>> 389-console-1.1.7-1.el6.noarch
>>>>> 389-ds-1.2.2-1.el6.noarch
>>>>> 389-ds-base-1.2.11.15-22.el6_4.x86_64
>>>>> 389-ds-base-libs-1.2.11.15-22.el6_4.x86_64
>>>>> 389-ds-console-1.2.6-1.el6.noarch
>>>>> 389-ds-console-doc-1.2.6-1.el6.noarch
>>>>> 389-dsgw-1.1.10-1.el6.x86_64
>>>>>
>>>>> Regards, Mitja
>>>>>
>>>>>
>>>>> On 17. 07. 2013 09:52, Mitja Mihelič wrote:
>>>>>>
>>>>>>
>>>>>> It may be best if I removed all 389DS related data from both of
>>>>>> the consumer servers and start fresh. If they crash again I will
>>>>>> send the relevant stack traces.