On Wed, 07 Jan 2015 15:25:30 -0500
Dmitri Pal <dpal(a)redhat.com> wrote:
> On 01/07/2015 03:05 PM, Simo Sorce wrote:
>> On Tue, 06 Jan 2015 09:59:08 -0500
>> Dmitri Pal <dpal(a)redhat.com> wrote:
>>
>>> On 01/06/2015 05:54 AM, Jakub Hrozek wrote:
>>>> On Tue, Jan 06, 2015 at 11:31:55AM +0100, Pavel Březina wrote:
>>>>>>> *Users*
>>>>>>> Do we want also to have methods ListDomainUsers() and
>>>>>>> ListUsers() without the name filter?
>>>>>> To list all? What about using '*' for that?
>>>>> We can implement it this way internally, but exposing an easier
>>>>> way to the consumers is nice, imho.
>>>> I'm not too opposed, although I prefer minimal APIs.
>>>>
>>>>> However, do we actually want to allow listing all users? As
>>>>> Dmitri suggested we may want to require the minimum filter
>>>>> length since the number of users may be very high. The maximum
>>>>> D-Bus message is 128MiB so I think we are good there but I think
>>>>> it can be very time consuming to return all users without some
>>>>> sort of paging.
>>>> This feature is internally dependent on enumerate=true, where we
>>>> already store all standard POSIX attributes (struct passwd, struct
>>>> group) in-memory, do you think the D-Bus "enumeration" provides
>>>> that much overhead?
>>>>
>>>> Paging would be really complex, we'd need to store the full
>>>> results in-memory per-client anyway and then pass around some
>>>> kind of cookie to resume iteration..
>>>>
>>>> In a centralized environment, I wouldn't expect the listing
>>>> commands to be used that commonly. Greeters or login managers
>>>> (gdm) would typically use the cached users instead. Some
>>>> applications (Hi, RHEV-M!) choose to display all their users in
>>>> some kind of table and then I would expect them to implement
>>>> paging themselves:
>>>>
>>>>     for letter in a..z:
>>>>         users = ListUsersByNameFilter($letter)
>>>>
>>>>>>> Do we want some other filter options as well?
>>>>>> In the design I wanted to keep the filtering simple. Unless we
>>>>>> receive some other requirements..
>>>>> Yes, you suggested allowing only the asterisk. Implementing full
>>>>> regular expressions efficiently, as Dmitri suggested, would be
>>>>> quite problematic since ldb doesn't support regex lookup, so we
>>>>> would have to do this ourselves and would therefore lose indices,
>>>>> or am I wrong?
>>>> I guess we'd have to grab all the entries and filter them
>>>> ourselves..
>>>>
>>>> (Yes, this is the reason I chose the asterisk notation in the
>>>> first place)
>>> Several points:
>>>
>>> - IMO having a full regular expression support will be an overhead.
>>> - "Begins with" filtering with * to indicate the remaining part is
>>> good enough
>> reasonable
>>
>>> - I do not think we should rely on enumeration. I think we should
>>> do a lookup since these operations will be rare.
>> Nope.
>> This is the same thinking the implementers of the nss interface went
>> through. And then users started using enumeration all the time.
>>
>> It may be ok to let users programmatically force full enumeration
>> somehow. But the default should be to return only what is in cache.
>> If you do otherwise people will test applications using * with 3
>> users on the system and then fail spectacularly when there are
>> actually 100K users in the directory.
> I think there is a problem with the approach you suggest.
>
> Say there is an application that allows you to list groups starting
> with a letter. It is used to define roles for application
> administration.
>
> Assume that there is a group "Agroup". It is a new group that does
> not have users yet. But it was created for use with the application
> so that roles can be associated with this group. The admin of the
> application thus wants to start using it.
> This group will never be found if the "A*" query is run against the
> cache, because it will never end up in the cache. That would force the
> admin to turn on full enumeration in SSSD. This is bad.
What matters is that '*' does not do enumeration; if 'ABC*' causes an
online lookup, it is fine imo.
> IMO there should be a way for those queries to actually go online. We
> can, however, not process all results. We can explicitly say "first
> 10 or 20 results and that is it". It can be an argument of the call
> with the default being a value in sssd.conf.
It may make sense to think of a "ranged" interface for wildcard
lookups, where you have to explicitly provide the range you want, capped
to a max length defined in sssd.conf, and if you exceed that size you
get an error.
This way users are forced to think about how many results they want/can
process.
So the wildcard interface will be different from exact match interface.
Simo.
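The "ranged" wildcard interface described above could look roughly like
the sketch below. It is only an illustration, not SSSD code: the names
list_by_prefix, RangeTooLarge and MAX_WILDCARD_RESULTS are hypothetical,
an in-memory list stands in for the directory, and it assumes that
"exceeding the size" means asking for more entries than the configured
cap.

```python
class RangeTooLarge(Exception):
    """Raised when the caller asks for more entries than the configured cap."""


# Hypothetical stand-in for a maximum that would be read from sssd.conf.
MAX_WILDCARD_RESULTS = 30


def list_by_prefix(names, prefix_filter, limit, max_results=MAX_WILDCARD_RESULTS):
    """Return at most `limit` names matching a 'prefix*' filter.

    The caller must always state how many results it wants. Asking for
    more than `max_results` is an error instead of a silent truncation.
    """
    if not prefix_filter.endswith("*"):
        raise ValueError("wildcard lookups need a trailing '*'")
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    if limit > max_results:
        raise RangeTooLarge(
            "requested %d entries, maximum is %d" % (limit, max_results)
        )
    prefix = prefix_filter[:-1]
    # Sort so that repeated calls return a stable, predictable slice.
    matches = sorted(name for name in names if name.startswith(prefix))
    return matches[:limit]
```

With this shape, a caller that tested with 3 users and passes a huge
limit gets an explicit error right away, and the exact-match interface
can stay separate and unchanged.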
how about:
entries[] = lookup(string filter, unsigned total);
Filter can be:
- string without asterisk (then there will be an exact match search online)
- string with asterisk or just asterisk (in this case function would
search for "total" number of results only)
if total is 0 then a configured maximum value from sssd.conf will be
used. Default 30 would probably be enough.
In this case * will go online but for only 30 results max.
It is similar to the -z option in ldapsearch.
Does that make sense?
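
To make the proposed semantics concrete, here is a minimal sketch. It is
not an implementation of anything in SSSD: an in-memory list stands in
for the online directory, DEFAULT_MAX_RESULTS is a hypothetical
stand-in for the sssd.conf value, and it assumes "begins with" matching
where only the text before the first asterisk is used as the prefix.

```python
# Hypothetical stand-in for the configured maximum from sssd.conf.
DEFAULT_MAX_RESULTS = 30


def lookup(directory, filter_, total=0):
    """Sketch of entries[] = lookup(string filter, unsigned total)."""
    if "*" not in filter_:
        # No asterisk: exact match search; `total` is not used.
        return [name for name in directory if name == filter_]

    if total == 0:
        # total == 0 means "use the configured maximum".
        total = DEFAULT_MAX_RESULTS

    # Assumption: everything before the first '*' is the prefix,
    # so a bare '*' matches every entry.
    prefix = filter_.split("*", 1)[0]
    results = []
    for name in directory:
        if name.startswith(prefix):
            results.append(name)
            if len(results) == total:
                break  # stop early, similar to a size limit on the search
    return results
```

For example, lookup(directory, "*") returns at most 30 entries, while
lookup(directory, "Agroup") returns only the exact match. The sketch
does not cap an explicit total at the configured maximum, since that
case is not spelled out above.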
--
Thank you,
Dmitri Pal
Sr. Engineering Manager IdM portfolio
Red Hat, Inc.