On 11/18/2013 07:01 AM, Mitja Mihelič wrote:
On 15. 11. 2013 21:46, Rich Megginson wrote:
> On 11/15/2013 02:58 AM, Mitja Mihelič wrote:
>>
>>
>> On 14. 11. 2013 22:08, Rich Megginson wrote:
>>> On 11/14/2013 08:50 AM, Mitja Mihelič wrote:
>>>> One of the consumers has crashed again and I have attached the
>>>> stacktrace.
>>>> Four hours later it crashed again.
>>>>
>>>> I do hope there is something in the stacktraces, so that something
>>>> can be done to prevent future crashes.
>>>
>>> Unfortunately, not enough. Looks like there is still some mismatch
>>> between the version of the package and the version of the debuginfo
>>> package.
>>>
>>> rpm -q 389-ds-base 389-ds-base-debuginfo openldap
>>> openldap-debuginfo db4 db4-debuginfo nss nss-debuginfo nspr
>>> nspr-debuginfo glibc glibc-debuginfo
>> The suggested debuginfo packages were not installed at the time when
>> the stacktraces were made. They are installed now. I have recreated
>> the stacktraces and attached them.
>
> The crash looks related to paged searches. We have changed this code
> somewhat in the next version. Can you try the latest version in the
> EPEL6 testing repo? 389-ds-base-1.2.11.23-3
>
http://port389.org/wiki/Download
Before installing packages from the testing repo, are there any other
changes I could make?
Since you mentioned a relation to paged searches, perhaps this is
related to our use of SSSD on a server that queries the 389DS.
Currently it uses paging of results, as it is enabled by default and
page size is set to 1000 results.
On the 389DS nsslapd-sizelimit is set to 2000.
Every 5 minutes SSSD issues this search query:
SRCH base="dc=TIER2,dc=COMPANY,dc=si" scope=2
filter="(&(objectClass=posixAccount)(uid=*)(uidNumber=*)(gidNumber=*))"
attrs="objectClass
uid userPassword uidNumber gidNumber gecos homeDirectory loginShell
krbprincipalname cn modifyTimestamp modifyTimestamp shadowLastChange
shadowMin shadowMax shadowWarning shadowInactive shadowExpire
shadowFlag krblastpwdchange krbpasswordexpiration pwdattribute
authorizedService accountexpires useraccountcontrol nsAccountLock host
logindisabled loginexpirationtime loginallowedtimemap"
The first 1000 entries are returned.
conn=1276 op=3 RESULT err=0 tag=101 nentries=1000 etime=34.129000 notes=U,P
Then the exact same search is issued again, and 999 are returned.
conn=1276 op=4 RESULT err=4 tag=101 nentries=999 etime=1.056000 notes=U,P
err=4 is understandable, since nsslapd-sizelimit = 2000.
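To see how often this pattern shows up, the access log can be scanned for paged searches that end with err=4. The following is only a rough sketch, not something taken from the server or SSSD: the helper names (parse_result, paged_sizelimit_hits) are made up for illustration, and it assumes that RESULT lines have the layout quoted above and that "P" in notes= marks a paged search. err=4 is the LDAP sizeLimitExceeded result code.

```python
import re

# Matches RESULT lines in the access log layout quoted above.
# Assumption: the field order is conn, op, RESULT, err, tag, nentries, etime, notes.
RESULT_RE = re.compile(
    r"conn=(?P<conn>\d+) op=(?P<op>\d+) RESULT "
    r"err=(?P<err>\d+) tag=(?P<tag>\d+) nentries=(?P<nentries>\d+) "
    r"etime=(?P<etime>[\d.]+)(?: notes=(?P<notes>\S+))?"
)


def parse_result(line):
    """Return a dict of fields for a RESULT line, or None if the line does not match."""
    m = RESULT_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    notes = fields["notes"].split(",") if fields["notes"] else []
    return {
        "conn": int(fields["conn"]),
        "op": int(fields["op"]),
        "err": int(fields["err"]),
        "nentries": int(fields["nentries"]),
        "etime": float(fields["etime"]),
        "notes": notes,
    }


def paged_sizelimit_hits(lines):
    """Yield RESULT records that are paged (notes contains P) and ended with err=4."""
    for line in lines:
        rec = parse_result(line)
        if rec and rec["err"] == 4 and "P" in rec["notes"]:
            yield rec


# The two RESULT lines from this thread, used as sample input.
sample = [
    "conn=1276 op=3 RESULT err=0 tag=101 nentries=1000 etime=34.129000 notes=U,P",
    "conn=1276 op=4 RESULT err=4 tag=101 nentries=999 etime=1.056000 notes=U,P",
]
hits = list(paged_sizelimit_hits(sample))
for rec in hits:
    print(rec["conn"], rec["op"], rec["nentries"])
```

For a real log, the lines of the access log file would be passed in instead of sample; the regular expression uses search(), so a leading timestamp on each line should not matter.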
Should I disable result paging for SSSD?

You could try that, yes. The problem seems related to paging.

Perhaps even set nsslapd-sizelimit to -1? (I would not like to do
this)
Regards, Mitja
>
>>
>>>
>>> Also, if you are seeing the message:
>>> ber_flush skipped because the connection was marked to be closed or
>>> abandoned
>>>
>>> This means you are running with the CONNS error log level, which
>>> means you may have a lot of useful information in your errors log.
>>> Would you be able to provide that?
>> I can provide the error logs, but will need to anonymize our user
>> data. How large a time interval do you need?
>>>
>>>>
>>>> The last message in the errors log was the same both times:
>>>> ber_flush skipped because the connection was marked to be closed
>>>> or abandoned
>>>>
>>>> The following versions of the 389ds packages were installed at the time:
>>>> 389-admin-1.1.29-1.el6.x86_64
>>>> 389-admin-console-1.1.8-1.el6.noarch
>>>> 389-admin-console-doc-1.1.8-1.el6.noarch
>>>> 389-adminutil-1.1.15-1.el6.x86_64
>>>> 389-console-1.1.7-1.el6.noarch
>>>> 389-ds-1.2.2-1.el6.noarch
>>>> 389-ds-base-1.2.11.15-22.el6_4.x86_64
>>>> 389-ds-base-libs-1.2.11.15-22.el6_4.x86_64
>>>> 389-ds-console-1.2.6-1.el6.noarch
>>>> 389-ds-console-doc-1.2.6-1.el6.noarch
>>>> 389-dsgw-1.1.10-1.el6.x86_64
>>>>
>>>> Regards, Mitja
>>>>
>>>>
>>>> On 17. 07. 2013 09:52, Mitja Mihelič wrote:
>>>>>
>>>>>
>>>>> It may be best if I removed all 389DS related data from both of
>>>>> the consumer servers and start fresh. If they crash again I will
>>>>> send the relevant stack traces.
>>>>
>>>
>>
>