[SSSD] Re: Design document: Enhanced NSS API

Thursday, 2 November 2017

On Thu, 2017-11-02 at 14:53 +0200, Alexander Bokovoy wrote:
...
 On to, 02 marras 2017, Simo Sorce wrote:
 > On Thu, 2017-11-02 at 13:14 +0100, Sumit Bose wrote:
 > > On Fri, Oct 27, 2017 at 08:43:28AM -0400, Simo Sorce wrote:
 > > > On Thu, 2017-10-26 at 22:14 +0200, Sumit Bose wrote:
 > > > > On Thu, Oct 26, 2017 at 02:43:29PM -0400, Simo Sorce wrote:
 > > > > > On Thu, 2017-10-26 at 12:16 +0200, Jakub Hrozek wrote:
 > > > > > > On Wed, Oct 25, 2017 at 05:39:21PM +0200, Sumit Bose
 > > > > > > wrote:
 > > > > > > > Hi,
 > > > > > > > 
 > > > > > > > please find below the design document for the
enhanced
 > > > > > > > NSS
 > > > > > > > API
 > > > > > > > which
 > > > > > > > makes e.g. the client side timeouts which where
 > > > > > > > recently
 > > > > > > > refactored
 > > > > > > > available to callers.
 > > > > > > > 
 > > > > > > > A more visual friendly version can be found at:
 > > > > > > >
https://pagure.io/fork/sbose/SSSD/docs/blob/07514ce5284
 > > > > > > > 5d47
 > > > > > > > fe6b
 > > > > > > > b327
 > > > > > > > f782a865bfa75628a/f/design_pages/enhanced_nss_api.rst
 > > > > > > > 
 > > > > > > > bye,
 > > > > > > > Sumit
 > > > > > > 
 > > > > > > LGTM!
 > > > > > > 
 > > > > > 
 > > > > > Looking at this I have some questions, if you are going to
 > > > > > create a
 > > > > > new
 > > > > > library and just need to set a timeout it seem it would be
 > > > > > a
 > > > > > much
 > > > > > better interface to use a context handle you allocate and
 > > > > > pass
 > > > > > into
 > > > > > each call, and then have getters setters to set timeouts or
 > > > > > any
 > > > > > other
 > > > > > flags that should influence the whole behavior. This will
 > > > > > allow
 > > > > > you
 > > > > > also to control how many concurrent connections you want to
 > > > > > have
 > > > > > against sssd, as each new context will create a  new socket
 > > > > > connection
 > > > > > and all.
 > > > > > 
 > > > > > In the original libnss_sss.so we could not do that because
 > > > > > the
 > > > > > glibc
 > > > > > interface does not offer any way to hold a context, but
 > > > > > there
 > > > > > is no
 > > > > > reason to continue along that line in a *new* API. And not
 > > > > > using
 > > > > > contexts in threaded applications is generally a bad idea,
 > > > > > as
 > > > > > you
 > > > > > end
 > > > > > up *requiring* te use of mutexes when that is really not
 > > > > > always
 > > > > > a
 > > > > > need
 > > > > > (separated threads can open separate connections and not
 > > > > > share
 > > > > > any
 > > > > > data
 > > > > > that require mutexes).
 > > > > 
 > > > > This sounds like a good 2.0 feature, are you interested in
 > > > > creating a
 > > > > more detailed design page for this?
 > > > 
 > > > Sure.
 > > > 
 > > > >  Currently my goal was to reuse as
 > > > > much or the current trusted code we already have.
 > > > > 
 > > > > > 
 > > > > > On the responder side I also do not see why new calls are
 > > > > > being
 > > > > > created. You clearly want a client-wide behavior, introduce
 > > > > > ONE
 > > > > > new
 > > > > > call that sets flags for the rest of the connection, and
 > > > > > just
 > > > > > reuse
 > > > > > the
 > > > > > usual commands otherwise.
 > > > > 
 > > > > The current flags, like invalidating a cached entry are per-
 > > > > request,
 > > > > only the single object address by the current request should
 > > > > be
 > > > > invalidate not all object which are requested on the same
 > > > > connection.
 > > > 
 > > > I would probably add a command to explicitly invalidate
 > > > individual
 > > > caches, this would avoid having special paths on every other
 > > > call,
 > > > resulting in cleaner code, at the cost of one more roundtrip
 > > > though, so
 > > > I guess it is a matter of figuring out what is the right
 > > > balance
 > > > here.
 > > > 
 > > > > > 
 > > > > > I do not understand what is the point of nss_truste_users
 > > > > > why a
 > > > > > force
 > > > > > reload is a privileged operation ?
 > > > > 
 > > > > Since it can force expensive operations on the backends which
 > > > > will
 > > > > hit servers I think not everybody should be allowed to do
 > > > > this.
 > > > 
 > > > You can already force this by requesting unexisting
 > > > users/groups, I
 > > > am
 > > > not convinced this necessarily needs to be a privileged
 > > > operation
 > > > as
 > > > there are already ways to cause work for SSSD.
 > > > I would rather drop it. If we really want to deal with
 > > > potential
 > > > abuse
 > > > we should introduce rate-limiting per uid, and basically slow
 > > > down
 > > > to a
 > > > halt abusing clients by giving weights and possibly quotas,
 > > > doling
 > > > out
 > > > privileges is cumbersome anyway and does not really prevent a
 > > > malicious
 > > > client to cause hard ATM.
 > > > 
 > > > IMHO nss_trusted_users gets a NACK as a concept and should be
 > > > dropped.
 > > 
 > > Of course I can remove it, but since removing it later is easier
 > > than
 > > adding it later I'd like to try to explain again why I think it
 > > would
 > > be useful.
 > > 
 > > You are right that it is already possible to send requests to the
 > > servers via SSSD. But as you said this are "only" searches for
 > > unexisting user and groups which should be handled by the indexes
 > > on
 > > the server quite efficiently and causes no disk-I/O on the
 > > client.
 > > Additionally we try to avoid accidental misuse of this with the
 > > negative cache.
 > 
 > You can also call out existing users, especially in large domains
 > if
 > you have groups big enough you can cause fast cache thrashing very
 > easily, this is already a big deal for the client. Client disk load
 > is
 > uninteresting because a process can simply write/fsync locally to
 > cause
 > disk I/O issues, and traffic towards the server is also
 > uninteresting
 > because a client can simply directly contact the server directly to
 > cause load.
 > 
 > So what we are left with is uniquely the fact a process might keep
 > the
 > sssd_be process busier and cause other processes on the system to
 > get
 > slower responses. But this is easy to deal with by simply
 > throttling
 > bad behaving users (and the admin kicking them out).
 > 
 > > The flags might trigger operations like looking up all groups a
 > > user
 > > is a member of or looking up a group with all members. While only
 > > the
 > > first might cause a more expensive operation on the server both
 > > might
 > > cause a lot of disk-I/O on the client.
 > > 
 > > Rate limitations would help to mitigate misuse as well. But a
 > > typical
 > > SSSD client would have no use for the flags so why allow it to
 > > use
 > > them?
 > 
 > They may have a need to insure a specific user/group is refreshed,
 > tools would be able to use this instead of having to be be root and
 > deleting the whole sssd cache just to refresh a user you know has
 > been
 > changed on the server. We wouldn't want people to abuse this of
 > course,
 > but it's not a bad idea.
 > 
 > > That's why I think nss_trusted_users is a good way to avoid
 > > accidental misuse in a similar way as the negative cache. But if
 > > you
 > > prefer I'll drop it.
 > 
 > I do not know that you can easily change it later, and I would
 > rather
 > use file permissions than explicit checks, that would mean exposing
 > a
 > "privileged" socket that only users that are part of a group of
 > "trusted" service can access, and this is clearly a lot more work,
 > which I am not advocating.
 > 
 > I am really torn on the need for this, it will make using this
 > feature
 > really cumbersome as you have to explicitly modify a configuration
 > file
 > and list specific users (groups ?), and this is to balance against
 > a
 > vague chance of a user causing a local DoS/slowdown but not a lot
 > more.
 > 
 > To me it sounds like a very big hammer for a very small fly.

 Right now the only user for this is dirsrv on IPA masters. If you
 throttle dirsrv plugins, you are denying SSSD clients from IPA
 clients
 from getting actual results. Throttling, thus, would be a wrong
 measure
 here.
  Note that throttling is a possible future measure if it turns out this
is needed, I am not advocating for implementing throttling now.

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

[SSSD] Re: Design document: Enhanced NSS API