On 01/07/2015 03:41 PM, Simo Sorce wrote:
> On Wed, 07 Jan 2015 15:25:30 -0500
> Dmitri Pal <dpal(a)redhat.com> wrote:
>
>> On 01/07/2015 03:05 PM, Simo Sorce wrote:
>>> On Tue, 06 Jan 2015 09:59:08 -0500
>>> Dmitri Pal <dpal(a)redhat.com> wrote:
>>>
>>>> On 01/06/2015 05:54 AM, Jakub Hrozek wrote:
>>>>> On Tue, Jan 06, 2015 at 11:31:55AM +0100, Pavel Březina wrote:
>>>>>>>> *Users*
>>>>>>>> Do we want also to have methods ListDomainUsers() and
>>>>>>>> ListUsers() without the name filter?
>>>>>>> To list all? What about using '*' for that?
>>>>>> We can implement it this way internally, but exposing an easier
>>>>>> way to the consumers is nice, imho.
>>>>> I'm not too opposed, although I prefer minimal APIs.
>>>>>
>>>>>> However, do we actually want to allow to list all users? As
>>>>>> Dmitri suggested we may want to require the minimum filter
>>>>>> length since the number of users may be very high. The maximum
>>>>>> D-Bus message is 128MiB so I think we are good there but I
>>>>>> think it can be very time consuming to return all users
>>>>>> without some sort of paging.
>>>>> This feature is internally dependent on enumerate=true, where we
>>>>> already store all standard POSIX attributes (struct passwd,
>>>>> struct group) in-memory, do you think the D-Bus "enumeration"
>>>>> provides that much overhead?
>>>>>
>>>>> Paging would be really complex, we'd need to store the full
>>>>> results in-memory per-client anyway and then pass around some
>>>>> kind of cookie to resume iteration..
>>>>>
>>>>> In a centralized environment, I wouldn't expect the listing
>>>>> commands to be used that commonly. Greeters or login managers
>>>>> (gdm) would typically use the cached users instead. Some
>>>>> applications (Hi, RHEV-M!) choose to display all their users in
>>>>> some kind of table and then I would expect them to implement
>>>>> paging themselves:
>>>>>
>>>>>     for letter in a..z:
>>>>>         users = ListUsersByNameFilter($letter)
>>>>>
>>>>>>>> Do we want some other filter options as well?
>>>>>>> In the design I wanted to keep the filtering simple. Unless we
>>>>>>> receive some other requirements..
>>>>>> Yes, you suggested to allow only asterisk. Implementing full
>>>>>> regular expressions efficiently, as Dmitri suggested, would be
>>>>>> quite problematic since ldb doesn't support regex lookup, thus
>>>>>> we would have to do this ourselves and therefore we would lose
>>>>>> indices, or am I wrong?
>>>>> I guess we'd have to grab all the entries and filter them
>>>>> ourselves..
>>>>>
>>>>> (Yes, this is the reason I chose the asterisk notation in the
>>>>> first place)
>>>> Several points:
>>>>
>>>> - IMO having a full regular expression support will be an
>>>> overhead.
>>>> - "Begins with" filtering with * to indicate the remaining part
>>>> is good enough
>>> reasonable
>>>
>>>> - I do not think we should rely on enumeration. I think we should
>>>> do a lookup since these operations will be rare.
>>> Nope.
>>> This is the same thinking the implementers of the nss interface
>>> went through. And then users started using enumeration all the
>>> time.
>>>
>>> It may be ok to let users programmatically force full enumeration
>>> somehow. But the default should be to return only what is in
>>> cache. If you do otherwise people will test applications using *
>>> with 3 users on the system and then fail spectacularly when there
>>> are actually 100K users in the directory.
>> I think there is a problem with the approach you suggest.
>>
>> Say there is an application that allows you to list groups starting
>> with a letter. It is used to define roles for application
>> administration.
>>
>> Assume that there is a group "Agroup". It is a new group that does
>> not have users yet. But it was created for use with the application
>> so that roles can be associated with this group. The admin of the
>> application thus wants to start using it.
>> This group will never be looked up if "A*" query will be run
>> against cache because it will not end up in cache. That would
>> force admin to turn full enumeration on SSSD. This is bad.
> What matters is the '*' does not do enumeration, if 'ABC*' causes an
> online lookup it is fine imo.
>
>> IMO there should be a way for those queries to actually go online.
>> We can, however, not process all results. We can explicitly say
>> "first 10 or 20 results and that is it". It can be an argument of
>> the call with the default being a value in sssd.conf.
> It may make sense to think of a "ranged" interface for wildcard
> lookups, where you have to explicitly provide the range you want
> capped to a max length defined in sssd conf, and if you exceed that
> size you get an error.
>
> This way users are forced to think about how many results they
> want/can process.
>
> So the wildcard interface will be different from exact match
> interface.
>
> Simo.
>
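For illustration, the "ranged" wildcard interface described above could be
sketched roughly as follows. This is only a minimal sketch under stated
assumptions, not SSSD code: list_by_prefix, RangeTooLargeError and MAX_RANGE
are hypothetical names, the in-memory list stands in for whatever backend
would actually be queried, and MAX_RANGE stands in for the limit that would
be read from sssd.conf.

```python
class RangeTooLargeError(ValueError):
    """Raised when the caller asks for more entries than the configured cap."""


# Hypothetical stand-in for a maximum range size configured in sssd.conf.
MAX_RANGE = 30


def list_by_prefix(names, pattern, start, count, max_range=MAX_RANGE):
    """Return matches [start, start + count) for a 'prefix*' pattern.

    The caller must state the range explicitly; asking for more than
    max_range entries is an error instead of being silently truncated.
    """
    if not pattern.endswith("*") or "*" in pattern[:-1]:
        raise ValueError("only a single trailing '*' is supported")
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    if count > max_range:
        raise RangeTooLargeError(
            "requested %d entries, maximum is %d" % (count, max_range)
        )
    prefix = pattern[:-1]
    # Sort so that consecutive ranges are stable between calls.
    matches = sorted(name for name in names if name.startswith(prefix))
    return matches[start:start + count]
```

With this shape a caller fetches the next batch by advancing start, which
only works if the ordering is stable between calls; the sketch sorts the
matches for that reason.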
how about:
entries[] = lookup(string filter, unsigned total);
Filter can be:
- string without asterisk (then there will be an exact match search
online)
- string with asterisk or just asterisk (in this case function would
search for "total" number of results only)
if total is 0 then a configured maximum value from sssd.conf will be
used. Default 30 would probably be enough.
In this case * will go online but for only 30 results max.
It is similar to the -z (size limit) option of ldapsearch.
Does that make sense?
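
To make the proposed semantics concrete, here is a rough sketch of how
lookup(filter, total) might behave. It is an assumption-laden illustration,
not an actual SSSD interface: the entries list stands in for the online
directory, DEFAULT_TOTAL stands in for the value configured in sssd.conf,
and only a single trailing asterisk is handled.

```python
# Hypothetical stand-in for the maximum configured in sssd.conf.
DEFAULT_TOTAL = 30


def lookup(entries, filter_str, total=0, default_total=DEFAULT_TOTAL):
    """Exact match without '*', capped prefix match with a trailing '*'."""
    if "*" not in filter_str:
        # No asterisk: exact match search.
        return [entry for entry in entries if entry == filter_str]
    if filter_str.count("*") != 1 or not filter_str.endswith("*"):
        raise ValueError("only a single trailing '*' is supported")
    # total == 0 means "use the configured maximum".
    limit = total if total > 0 else default_total
    prefix = filter_str[:-1]
    result = []
    for entry in entries:
        if entry.startswith(prefix):
            result.append(entry)
            if len(result) == limit:
                break  # stop as soon as the cap is reached
    return result
```

For example, lookup(groups, "A*", 10) would return at most 10 groups whose
names start with "A", and lookup(groups, "*") would return at most
DEFAULT_TOTAL entries.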
How do you get the next 30?
Also the query may be slow no matter what you set if you make it at all
with '*'.
We can use paged searches with some servers, but not all of them support
this and some will be slow anyway.
Simo.
--
Simo Sorce * Red Hat, Inc * New York