getgrent.3: Add ENOENT to error list.

Message ID 541082E0.8050707@redhat.com
State Not applicable
Commit Message

Carlos O'Donell Sept. 10, 2014, 4:57 p.m. UTC
  On 09/10/2014 10:53 AM, Siddhesh Poyarekar wrote:
> On Wed, Sep 10, 2014 at 10:23:13AM -0400, Carlos O'Donell wrote:
>> It's possible to get ENOENT returned from getgrent
>> if the backend, for example say SSSD, isn't configured
>> or the daemon isn't running. The same can be said of any
>> of the NSS backend.
> 
> The daemon not running is internally a NSS_STATUS_TRYAGAIN +
> EAGAIN[1], i.e. that is what the sssd nss plugin should return to
> glibc.  glibc then should return that as a NOTFOUND, which for
> getgrent is a NULL return without errno set.  I don't see why ENOENT
> is necessary.

This is orthogonal to the discussion at hand.

At present glibc will return a NULL `struct group*' and errno set to
ENOENT if the NSS plugin returns NSS_STATUS_UNAVAIL and errno ENOENT
indicating it is incorrectly configured. This is a documented entry
in the glibc manual, and is presently how SSSD behaves (until it
gets fixed).

Whether we like it or not, there is a present-day distinction between
"permanently unavailable until an admin fixes it"
(NSS_STATUS_UNAVAIL, ENOENT) and "temporarily unavailable"
(NSS_STATUS_TRYAGAIN, EAGAIN). The former may be seen by the user, and
a program that is interested in that behaviour may find it useful to
act upon. I do not think glibc should hide NSS_STATUS_TRYAGAIN from
the user.
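
As a rough illustration of how a program might act on that distinction,
here is a minimal sketch that classifies the errno value left behind by
a NULL return from getgrent(). The enum and the function name
classify_getgrent_errno are hypothetical and exist in neither glibc nor
SSSD; the mapping only restates the (status, errno) pairs discussed
above.

```c
#include <errno.h>

/* Hypothetical classification for illustration only; these names are
 * not part of glibc or SSSD. */
enum grent_outcome {
    GRENT_END_OF_LIST,    /* NULL return with errno == 0 */
    GRENT_RETRY_LATER,    /* NULL return with errno == EAGAIN */
    GRENT_MISCONFIGURED,  /* NULL return with errno == ENOENT */
    GRENT_OTHER_ERROR     /* NULL return with any other errno */
};

/* Map the errno observed after getgrent() returned NULL to an outcome
 * the caller can act upon. */
enum grent_outcome classify_getgrent_errno(int err)
{
    switch (err) {
    case 0:
        return GRENT_END_OF_LIST;
    case EAGAIN:
        return GRENT_RETRY_LATER;
    case ENOENT:
        return GRENT_MISCONFIGURED;
    default:
        return GRENT_OTHER_ERROR;
    }
}
```

A caller would set errno to 0 before each getgrent() call and pass the
resulting errno to this helper only when NULL comes back.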

To be clear, ENOENT is necessary if you want to actually detect that
something is wrong with your system and take evasive action. Simply
getting back no results is not sufficient to take corrective action.
In the case of sss, however, the intent of the inactive plugin is to
operate as if it had no data. At least this is what I've been told by
those working on SSSD at Red Hat.

SSSD should *not* use status==NSS_STATUS_TRYAGAIN and errno==EAGAIN,
because that will simply result in EAGAIN being returned to userspace
from getgrent, which is again a deviation from the entire philosophy
behind SSSD wanting `sss` in nsswitch.conf. The point is to appear
as a transparent plugin that is enabled at a later time by starting
up the daemon.

For example, if you fix SSSD to use status==NSS_STATUS_TRYAGAIN
errno==EAGAIN instead, you still get the wrong behaviour from
this test case:


#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <grp.h>

int main(void)
{
    struct group *p_group;

    setgrent();
    while (1) {
        errno = 0;  /* so that NULL with errno == 0 means end of list */
        p_group = getgrent();
        if (p_group == NULL) {
            int saved_errno = errno;  /* perror() may change errno */

            if (saved_errno == 0)
                break;  /* end of groups */

            /* A real error occurred. */
            perror("getgrent");
            printf("getgrent error %d \n", saved_errno);
            endgrent();
            exit(-2);
        }
        printf("getgrent() OK group(%lu) = %s \n",
               (unsigned long) p_group->gr_gid, p_group->gr_name);
    }

    endgrent();  /* close the group database on the success path too */
    exit(0);
}

With SSSD using status==NSS_STATUS_TRYAGAIN errno==EAGAIN:

getgrent() OK group(0) = root 
getgrent() OK group(1) = bin 
getgrent() OK group(2) = daemon 
...
getgrent: Resource temporarily unavailable
getgrent error 11 

With SSSD using status==NSS_STATUS_UNAVAIL errno==ENOENT:

getgrent() OK group(0) = root 
getgrent() OK group(1) = bin 
getgrent() OK group(2) = daemon 
...
getgrent: No such file or directory
getgrent error 2 

With SSSD using status==NSS_STATUS_NOTFOUND errno==0:

getgrent() OK group(0) = root 
getgrent() OK group(1) = bin 
getgrent() OK group(2) = daemon 
getgrent() OK group(3) = sys 
...
getgrent() OK group(185) = wildfly

This run completes successfully, and it is the only way it should
work for an installed SSSD nss module.

e.g.
---

Please correct me if you think something I've said is wrong
or doesn't make sense.


Cheers,
Carlos.
  

Comments

Siddhesh Poyarekar Sept. 15, 2014, 1:04 a.m. UTC | #1
On Wed, Sep 10, 2014 at 12:57:04PM -0400, Carlos O'Donell wrote:
> At present glibc will return a NULL `struct group*' and errno set to
> ENOENT if the NSS plugin returns NSS_STATUS_UNAVAIL and errno ENOENT
> indicating it is incorrectly configured. This is a documented entry
> in the glibc manual, and is presently how SSSD behaves (until it
> gets fixed).

Yes, but the entry in the libc manual documents the interface between
the plugin and glibc, not the one between the plugin and the user or
between glibc and the user.

> Whether we like it or not there is a present day distinction between
> "permanently unavailable until an admin fixes it" (NSS_STATUS_UNAVAIL,ENOENT),
> "temporarily unavailable" (NSS_STATUS_TRYAGAIN,EAGAIN), and the former
> may be seen by the user, and may be useful to act upon by a program
> that is interested in that behaviour. I do not think glibc should hide
> NSS_STATUS_TRYAGAIN from the user.
> 
> To be clear ENOENT is necessary if you want to actually detect that
> something is wrong with your system and take evasive action. Simply
> getting back no results is not sufficient to take corrective action.
> In the case of sss however the intent of the inactive plugin is to
> operate as if it had no data. At least this is what I've been told by
> those working on SSSD at Red Hat.
> 
> SSSD should *not* use status==NSS_STATUS_TRYAGAIN and errno==EAGAIN
> because that will simply result in EAGAIN being returned to userspace
> from getgrent which is again a deviation from the entire philosophy
> behind SSSD wanting `sss` in nsswitch.conf. The point is to appear
> as a transparent plugin that is enabled at a later time by starting
> up the daemon.

This seems to me to be a case for the nss subsystem to clear errno, if
it does not do so already.  I'd read the errno list as the set of ways
it is allowed to fail extraordinarily, and a resource not being
available is currently not considered an extraordinary failure by
POSIX.  So in that context it is a bug in the nss subsystem.

My point is that we'll be deviating from the standard by supporting an
extra way to fail, and maybe we should get some kind of clarification
from the Austin Group before simply documenting it as the truth.

Siddhesh
  

Patch

diff -urN sssd-1.11.6/src/sss_client/nss_group.c sssd-1.11.6.mod/src/sss_client/nss_group.c
--- sssd-1.11.6/src/sss_client/nss_group.c	2014-06-03 10:31:33.000000000 -0400
+++ sssd-1.11.6.mod/src/sss_client/nss_group.c	2014-09-10 12:21:52.330685026 -0400
@@ -539,6 +539,11 @@ 
     if (nret != NSS_STATUS_SUCCESS) {
         errno = errnop;
     }
+    /* Always pretend we have no data.  */
+    if (nret == NSS_STATUS_UNAVAIL) {
+	nret = NSS_STATUS_NOTFOUND;
+	errno = 0;
+    }
 
     sss_nss_unlock();
     return nret;
@@ -639,6 +644,11 @@ 
     if (nret != NSS_STATUS_SUCCESS) {
         errno = errnop;
     }
+    /* Always pretend we have no data.  */
+    if (nret == NSS_STATUS_UNAVAIL) {
+	nret = NSS_STATUS_NOTFOUND;
+	errno = 0;
+    }
 
     sss_nss_unlock();
     return nret;
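
The mapping that both hunks add can be read in isolation as a small
function. The following is only a sketch of that logic, assuming glibc's
<nss.h> for enum nss_status. The helper name hide_unavail is
hypothetical, and unlike the patch it writes through an errno pointer
rather than the global errno so that it can be exercised on its own.

```c
#include <errno.h>
#include <nss.h>  /* enum nss_status, provided by glibc */

/* Hypothetical helper mirroring the patch: report "no data" instead of
 * "service unavailable".  The patch itself assigns to the global errno;
 * a pointer is used here purely for illustration. */
enum nss_status hide_unavail(enum nss_status nret, int *errnop)
{
    if (nret == NSS_STATUS_UNAVAIL) {
        *errnop = 0;                  /* pretend no error happened */
        return NSS_STATUS_NOTFOUND;   /* pretend we have no data */
    }
    return nret;                      /* every other status passes through */
}
```

With this mapping an (NSS_STATUS_UNAVAIL, ENOENT) result from the plugin
becomes (NSS_STATUS_NOTFOUND, 0), while NSS_STATUS_TRYAGAIN with EAGAIN
is left untouched.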