On Wed, 19 Nov 2014 15:42:49 -0500
Stephen Gallagher <sgallagh(a)redhat.com> wrote:
On Thu, 2014-11-13 at 13:39 +0100, Lukas Slebodnik wrote:
> On (13/11/14 12:22), Jakub Hrozek wrote:
> >On Thu, Nov 13, 2014 at 11:17:15AM +0100, Lukas Slebodnik wrote:
> >> On (13/11/14 10:44), Jakub Hrozek wrote:
> >> >On Wed, Nov 12, 2014 at 08:04:46PM -0500, Simo Sorce wrote:
> >> >> On Wed, 12 Nov 2014 16:36:00 +0100
> >> >> Lukas Slebodnik <lslebodn(a)redhat.com> wrote:
> >> >>
> >> >> > On (12/11/14 10:00), Simo Sorce wrote:
> >> >> > >I would create a helper function to be called on return
> >> >> > >that transforms the error accordingly. This will allow
to
> >> >> > >write the code _and_ the comment once.
> >> >> > >
> >> >> > In this case, Stephan's patch is better
> >> >> >
https://bugzilla.redhat.com/attachment.cgi?id=788567
> >> >>
> >> >> Yes, this is a valid alternative.
> >> >>
> >> >> > >The comment should be changed to something like this in
> >> >> > >either case: /* When sssd is stopped return a safe error
> >> >> > >code as if sss was not configured at all in nsswitch.
This
> >> >> > >prevents bogus errors from causing issues in
applications,
> >> >> > >before sssd starts or if it fails to respond. */
> >> >> > >
> >> >> > >No need to mention that sss is by default in nsswitch,
as
> >> >> > >it is not in all distributions and it really is
> >> >> > >inconsequential, the same behaviour change hleps when
sss
> >> >> > >is not the default but is has been manually added and
sssd
> >> >> > >is stopped or not started yet (for example during boot).
> >> >> > nss-pam-ldapd has the same behaviour in the same situation.
> >> >> > Will we patch it as well? It's very likely we won't.
> >> >>
> >> >> Sorry, I do not see how that matters :)
> >> >>
> >> >> > The biggest problem is that sss is by default in nsswitch on
> >> >> > fedora/rhel>=7 due to glibc caching and problem with
GNOME,
> >> >> > a) sssd-client is installed by default on this platforms.
> >> >> > b) sssd need't be configured by default and in most
cases
> >> >> > won't be => sssd cannot run
> >> >> > c) glibc developers don't want to adjust the error
return
> >> >> > code in glibc
> >> >> >
> >> >> > As a result of this, we need to patch sssd.
> >> >> > I would say we should patch sssd just in downstream and
> >> >> > Stephan's patch works well. I tested it.
> >> >>
> >> >> Ok, then let's go with Stephen's patch.
> >> >>
> >> >> Simo.
> >> >>
> >> >
> >> >I don't personally mind whether we use Lukas' or Stephen's
> >> >patch but I would /strongly/ prefer the patch to be pushed
> >> >upstream.
> >> >
> >> I don't understant why we should push patch to upstream.
> >> glibc maintainer confirmed that sssd behaves corectly:
> >> "In all honesty, SSSD is *behaving* correctly. It returns:
> >> status==NSS_STATUS_UNAVAIL, and errno==ENOENT, to indicate "A
> >> necessary input file cannot be found."[1] I can only assume this
> >> is because it is true, the daemon is not running and the return
> >> status as indicated implies that. Such an error will be
> >> propagated up to the client, getgrent in this case, as a NULL
> >> return with errno set to ENOENT." (nss-pam-ldapd do the same
> >> with stopped nslcd)
> >>
> >> The problem is that behaviour is unacceptable with having sss in
> >> the nsswitch.conf by default (rhel>=7, fedora). I checked glibc
> >> upstream and other distributions (debian, ubuntu, opensuse) and
> >> they don't have sss in /etc/nsswitch.conf by default.
> >>
> >> The patch in upstream can hide problems with stopped sssd. Some
> >> user can rely on such behaviuour.
> >>
> >> Why we should patch sssd in upstream due to some downstream
> >> patch in different project?
> >
> >Because I think having sss in nsswitch.conf by default is the right
> >thing to do moving forward, especially when we start managing local
> >users. Handling this situation gracefully makes sense.
> >
> That's good point. On the other hand, we are no there yet.
> It's just future plan.
>
> >Let me turn the question around -- do you see a downside of
> >pushing the patch downstream?
> No. It will fix problem in downstream for users who do not use sssd.
>
> >Or do you consider it a hack, patching about glibc
> >bug? Then maybe we should raise a libc bug to skip over unavailable
> >sources by default..
> If I read bugzilla ticket(s) correctly they will not do that.
>
> I'm not sure wheter we should push patch to upstream as well.
The core of the upstream problem is a classic one: when you have
multiple operations going on and one of them "fails", how do you
report it?
In the case of glibc, all errors are suppressed except for the last
one on the list. So if SSSD wasn't the final entry on the passwd: (or
other) line, then its NSS_STATUS_UNAVAIL error would just be ignored
and replaced by the return code of the later plugin.
This (to me) seems like a fundamental bug in the way glibc handles
nsswitch. Unfortunately, we're dealing with a situation where we can't
fix the return values because of standards-compliance.
The hack I submitted was what I believed to be the best of a bad set
of choices: it works around the bug in glibc, but it's a bug that
pretty much has to remain there forever because of
standards-compliance.
That said, it's *probably* safe to include it upstream, even if it is
technically the wrong error code. It'll avoid other downstreams going
through this if they start shipping 'sss' by default.
I think we should stop thinking about the "default" case though.
This same kind of error would be reported even in a non "default" case
where sssd happens to be down (like during boot before sssd is
started), so I think it does apply in general.
Simo.
--
Simo Sorce * Red Hat, Inc * New York