Fix incorrect double-checked locking in __nss_database_lookup. [BZ #20483]

Message ID 1471553358.14544.108.camel@localhost.localdomain
State New, archived
Headers

Commit Message

Torvald Riegel Aug. 18, 2016, 8:49 p.m. UTC
  The pointer that serves as a cache for lookups (thus allowing users to
only perform lookup once) needs to be accessed using atomics (to avoid
data races) and with the appropriate memory orders (to ensure that
lookup actually happens-before any uses of the lookup's effects).  It
might have been nicer to change the interface so that a pointer is
returned that does not need to be accessed atomically, but this would
change the ABI of nss; therefore, we keep the interface of
__nss_database_lookup unchanged but fix the synchronization in that
function and all callers.

While preparing the patch, I noticed two things that seemed odd in how
grp/initgroups.c uses __nss_database_lookup.  They are marked by
comments added to the code.  My guess is we should look at these in a
follow-up.

Tested on x86_64-linux.
 
	[BZ #20483]
	* nss/nsswitch.h (__nss_database_lookup): Add documentation.
	* nss/nsswitch.c: Include <atomic.h>
	(__nss_database_lookup): Fix synchronization.
	(nss_load_all_libraries): Adapt.
	* grp/initgroups.c (internal_getgrouplist): Adapt.
	* nss/XXX-lookup.c (DB_LOOKUP_FCT): Adapt.
	* nscd/aicache.c (addhstaiX): Adapt.
	* nscd/initgrcache.c (addinitgroupsX): Adapt.
	* nscd/netgroupcache.c (addgetnetgrentX): Adapt.
	* nis/nss_compat/compat-grp.c (ni_once): New.
	(init_nss_interface): Adapt.
	* nis/nss_compat/compat-initgroups.c (init_nss_interface,
	internal_setgrent): Adapt.
	* nis/nss_compat/compat-pwd.c (ni_once): New.
	(init_nss_interface): Adapt.
	* nis/nss_compat/compat-spwd.c (ni_once): New.
	(init_nss_interface): Adapt.
	* sysdeps/posix/getaddrinfo.c (gaih_inet): Adapt.
  

Comments

Florian Weimer Sept. 12, 2016, 6:55 p.m. UTC | #1
On 08/18/2016 10:49 PM, Torvald Riegel wrote:
>  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
>  	       void **fctp)
>  {
> -  if (DATABASE_NAME_SYMBOL == NULL
> -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> -				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
> +  /* Relaxed MO is fine because this is just about whether we have to perform
> +     the lookup; we will do another acquire-MO load next before assuming that
> +     the lookup has happened.  */
> +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
> +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> +				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
>      return -1;
>
> -  *ni = DATABASE_NAME_SYMBOL;
> +  /* Acquire MO as required by __nss_database_lookup.  */
> +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
>
>    return __nss_lookup (ni, fct_name, fct2_name, fctp);
>  }

I'm not sure if this is the double-checked locking pattern we want to 
use for new code.  GCC currently cannot merge the two loads (and maybe 
it never will do so because it is beneficial to treat the separate loads 
as an optimization hint).

I still think we would be better off if we centralize this particular 
code in a single function with an interface similar to 
__nss_database_lookup.  The manual inlining seems unnecessary.

Thanks,
Florian
  
Torvald Riegel Sept. 13, 2016, 8:43 a.m. UTC | #2
On Mon, 2016-09-12 at 20:55 +0200, Florian Weimer wrote:
> On 08/18/2016 10:49 PM, Torvald Riegel wrote:
> >  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
> >  	       void **fctp)
> >  {
> > -  if (DATABASE_NAME_SYMBOL == NULL
> > -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> > -				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
> > +  /* Relaxed MO is fine because this is just about whether we have to perform
> > +     the lookup; we will do another acquire-MO load next before assuming that
> > +     the lookup has happened.  */
> > +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
> > +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> > +				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
> >      return -1;
> >
> > -  *ni = DATABASE_NAME_SYMBOL;
> > +  /* Acquire MO as required by __nss_database_lookup.  */
> > +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
> >
> >    return __nss_lookup (ni, fct_name, fct2_name, fctp);
> >  }
> 
> I'm not sure if this is the double-checked locking pattern we want to 
> use for new code.  GCC currently cannot merge the two loads (and maybe 
> it never will do so because it is beneficial to treat the separate loads 
> as an optimization hint).

I don't think the (re)load matters performance-wise, given that
DATABASE_NAME_SYMBOL is initialized once, and so the access will most
likely hit in the cache.  What may matter were if we'd often returned -1
and would have an unnecessary acquire MO in this path, though OTOH
__nss_database_lookup will acquire the lock anyway.

Loading it into a temporary first, then doing the check, and then
issuing an acquire-MO fence would be another way to avoid the extra
load.

> I still think we would be better off if we centralize this particular 
> code in a single function with an interface similar to 
> __nss_database_lookup.  The manual inlining seems unnecessary.

Doing all of that in __nss_database_lookup might be cleaner, I agree.
But my focus for this patch was just a bug fix, not refactoring; I'd
like to leave it to others to decide how to clean-up this part of glibc.
I also wasn't quite sure whether __nss_database_lookup is part of the
ABI or not.
  
Florian Weimer Sept. 13, 2016, 9:17 a.m. UTC | #3
On 09/13/2016 10:43 AM, Torvald Riegel wrote:
> On Mon, 2016-09-12 at 20:55 +0200, Florian Weimer wrote:
>> On 08/18/2016 10:49 PM, Torvald Riegel wrote:
>>>  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
>>>  	       void **fctp)
>>>  {
>>> -  if (DATABASE_NAME_SYMBOL == NULL
>>> -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
>>> -				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
>>> +  /* Relaxed MO is fine because this is just about whether we have to perform
>>> +     the lookup; we will do another acquire-MO load next before assuming that
>>> +     the lookup has happened.  */
>>> +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
>>> +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
>>> +				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
>>>      return -1;
>>>
>>> -  *ni = DATABASE_NAME_SYMBOL;
>>> +  /* Acquire MO as required by __nss_database_lookup.  */
>>> +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
>>>
>>>    return __nss_lookup (ni, fct_name, fct2_name, fctp);
>>>  }
>>
>> I'm not sure if this is the double-checked locking pattern we want to
>> use for new code.  GCC currently cannot merge the two loads (and maybe
>> it never will do so because it is beneficial to treat the separate loads
>> as an optimization hint).
>
> I don't think the (re)load matters performance-wise, given that
> DATABASE_NAME_SYMBOL is initialized once, and so the access will most
> likely hit in the cache.  What may matter were if we'd often returned -1
> and would have an unnecessary acquire MO in this path, though OTOH
> __nss_database_lookup will acquire the lock anyway.

__nss_database_lookup fails only if there is a low-level failure (like a 
failed memory allocation).  As far as I can see, syntax errors in the 
configuration file are ignored.  You always get a positive result out of 
it, so it's only called once.

> Loading it into a temporary first, then doing the check, and then
> issuing an acquire-MO fence would be another way to avoid the extra
> load.

Couldn't we do an acquire load unconditionally?  The == NULL case is 
really not worth optimizing for, it only happens during startup.

>> I still think we would be better off if we centralize this particular
>> code in a single function with an interface similar to
>> __nss_database_lookup.  The manual inlining seems unnecessary.
>
> Doing all of that in __nss_database_lookup might be cleaner, I agree.
> But my focus for this patch was just a bug fix, not refactoring; I'd
> like to leave it to others to decide how to clean-up this part of glibc.
> I also wasn't quite sure whether __nss_database_lookup is part of the
> ABI or not.

I looked at the current implementation and I already see a *ni != NULL 
in __nss_database_lookup, so I think we can just move the double-checked 
locking into __nss_database_lookup and simplify the callers.  This would 
not result in an observable behavior change, so it is okay although 
__nss_database_lookup is part of the ABI.

And the refactoring would mostly be about deleting a bunch of code 
around calls of __nss_database_lookup.  The core changes in 
__nss_database_lookup (to move the release-MO store to the end, after 
all initialization) are needed either way.

Thanks,
Florian
  
Torvald Riegel Sept. 13, 2016, 11:05 a.m. UTC | #4
On Tue, 2016-09-13 at 11:17 +0200, Florian Weimer wrote:
> On 09/13/2016 10:43 AM, Torvald Riegel wrote:
> > On Mon, 2016-09-12 at 20:55 +0200, Florian Weimer wrote:
> >> On 08/18/2016 10:49 PM, Torvald Riegel wrote:
> >>>  DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
> >>>  	       void **fctp)
> >>>  {
> >>> -  if (DATABASE_NAME_SYMBOL == NULL
> >>> -      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> >>> -				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
> >>> +  /* Relaxed MO is fine because this is just about whether we have to perform
> >>> +     the lookup; we will do another acquire-MO load next before assuming that
> >>> +     the lookup has happened.  */
> >>> +  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
> >>> +      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
> >>> +				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
> >>>      return -1;
> >>>
> >>> -  *ni = DATABASE_NAME_SYMBOL;
> >>> +  /* Acquire MO as required by __nss_database_lookup.  */
> >>> +  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
> >>>
> >>>    return __nss_lookup (ni, fct_name, fct2_name, fctp);
> >>>  }
> >>
> >> I'm not sure if this is the double-checked locking pattern we want to
> >> use for new code.  GCC currently cannot merge the two loads (and maybe
> >> it never will do so because it is beneficial to treat the separate loads
> >> as an optimization hint).
> >
> > I don't think the (re)load matters performance-wise, given that
> > DATABASE_NAME_SYMBOL is initialized once, and so the access will most
> > likely hit in the cache.  What may matter were if we'd often returned -1
> > and would have an unnecessary acquire MO in this path, though OTOH
> > __nss_database_lookup will acquire the lock anyway.
> 
> __nss_database_lookup fails only if there is a low-level failure (like a 
> failed memory allocation).  As far as I can see, syntax errors in the 
> configuration file are ignored.  You always get a positive result out of 
> it, so it's only called once.
> 
> > Loading it into a temporary first, then doing the check, and then
> > issuing an acquire-MO fence would be another way to avoid the extra
> > load.
> 
> Couldn't we do an acquire load unconditionally?  The == NULL case is 
> really not worth optimizing for, it only happens during startup.

We could, but because __nss_database_lookup doesn't return the pointer,
we'd still have to do an atomic_load_relaxed (...) at the caller.  So,
not much of a gain it seems.

> >> I still think we would be better off if we centralize this particular
> >> code in a single function with an interface similar to
> >> __nss_database_lookup.  The manual inlining seems unnecessary.
> >
> > Doing all of that in __nss_database_lookup might be cleaner, I agree.
> > But my focus for this patch was just a bug fix, not refactoring; I'd
> > like to leave it to others to decide how to clean-up this part of glibc.
> > I also wasn't quite sure whether __nss_database_lookup is part of the
> > ABI or not.
> 
> I looked at the current implementation and I already see a *ni != NULL 
> in __nss_database_lookup, so I think we can just move the double-checked 
> locking into __nss_database_lookup and simplify the callers.

Same name, but that's a different ni :)  The ni in __nss_database_lookup
is the global variable;  the one in the callers is the private data the
callers use.

> This would 
> not result in an observable behavior change, so it is okay although 
> __nss_database_lookup is part of the ABI.

You can make this cleaner if __nss_database_lookup would return the
value from the exactly-once-initialization, and this is what I initially
did when preparing this patch.  But then I noticed that it looked to be
part of the ABI, and had to revert it.  (This was before PTO, so I'm
just paging this back in right now... ;) )

> And the refactoring would mostly be about deleting a bunch of code 
> around calls of __nss_database_lookup.  The core changes in 
> __nss_database_lookup (to move the release-MO store to the end, after 
> all initialization) are needed either way.

If we can assume that the callers are always in sync with the
__nss_database_lookup, then I'd agree -- but then this wouldn't be
really an ABI issue.  If the __nss_database_lookup implementation can be
older than the callers' changes, we need to be more careful.  I think I
looked into this and concluded that we couldn't improve the code much if
having to consider compatibility, but I don't remember the details of
why I thought that, sorry.

My suggestion would be to first apply this patch (if it's OK otherwise),
and then try to refactor.  We could put a comment into the first patch
highlighting that refactoring might be beneficial.
  

Patch

commit 14d627ef928f6a2a3e62e81a615a448fc813df35
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Aug 17 22:42:04 2016 +0200

    Fix incorrect double-checked locking in __nss_database_lookup.
    
    The pointer that serves as a cache for lookups (thus allowing users to only
    perform lookup once) needs to be accessed using atomics (to avoid data
    races) and with the appropriate memory orders (to ensure that lookup
    actually happens-before any uses of the lookup's effects).  It might have
    been nicer to change the interface so that a pointer is returned that does
    not need to be accessed atomically, but this would change the ABI of nss;
    therefore, we keep the interface of __nss_database_lookup unchanged but
    fix the synchronization in that function and all callers.
    
    	[BZ #20483]
    	* nss/nsswitch.h (__nss_database_lookup): Add documentation.
    	* nss/nsswitch.c: Include <atomic.h>
    	(__nss_database_lookup): Fix synchronization.
    	(nss_load_all_libraries): Adapt.
    	* grp/initgroups.c (internal_getgrouplist): Adapt.
    	* nss/XXX-lookup.c (DB_LOOKUP_FCT): Adapt.
    	* nscd/aicache.c (addhstaiX): Adapt.
    	* nscd/initgrcache.c (addinitgroupsX): Adapt.
    	* nscd/netgroupcache.c (addgetnetgrentX): Adapt.
    	* nis/nss_compat/compat-grp.c (ni_once): New.
    	(init_nss_interface): Adapt.
    	* nis/nss_compat/compat-initgroups.c (init_nss_interface,
    	internal_setgrent): Adapt.
    	* nis/nss_compat/compat-pwd.c (ni_once): New.
    	(init_nss_interface): Adapt.
    	* nis/nss_compat/compat-spwd.c (ni_once): New.
    	(init_nss_interface): Adapt.
    	* sysdeps/posix/getaddrinfo.c (gaih_inet): Adapt.

diff --git a/grp/initgroups.c b/grp/initgroups.c
index 3242aee..c9d7cfa 100644
--- a/grp/initgroups.c
+++ b/grp/initgroups.c
@@ -78,16 +78,25 @@  internal_getgrouplist (const char *user, gid_t group, long int *size,
   /* Start is one, because we have the first group as parameter.  */
   long int start = 1;
 
-  if (__nss_initgroups_database == NULL)
+  /* Relaxed MO is fine because this is just about whether we have to perform
+     the lookup; we will do another acquire-MO load next before making use
+     of the lookup's effects.  */
+  if (atomic_load_relaxed (&__nss_initgroups_database) == NULL)
     {
       if (__nss_database_lookup ("initgroups", NULL, "",
 				 &__nss_initgroups_database) < 0)
 	{
-	  if (__nss_group_database == NULL)
+	  if (atomic_load_relaxed (&__nss_group_database) == NULL)
 	    no_more = __nss_database_lookup ("group", NULL, "compat files",
 					     &__nss_group_database);
 
-	  __nss_initgroups_database = __nss_group_database;
+	  /* Forward the happens-before for the group database to the
+	     initgroups database.  */
+	  /* XXX This seems to expect that a concurrent initgroups lookup
+	     can never produce a different result because otherwise, we
+	     would overwrite this result; why can we assume this?  */
+	  atomic_store_release (&__nss_initgroups_database,
+	      atomic_load_acquire (&__nss_group_database));
 	}
       else
 	use_initgroups_entry = true;
@@ -96,9 +105,13 @@  internal_getgrouplist (const char *user, gid_t group, long int *size,
     /* __nss_initgroups_database might have been set through
        __nss_configure_lookup in which case use_initgroups_entry was
        not set here.  */
+    /* XXX This comment and branch seem to be outdated.
+       __nss_configure_lookup does not seem to access
+       __nss_initgroups_database.  */
     use_initgroups_entry = __nss_initgroups_database != __nss_group_database;
 
-  service_user *nip = __nss_initgroups_database;
+  /* Acquire MO as required by __nss_database_lookup.  */
+  service_user *nip = atomic_load_acquire (&__nss_initgroups_database);
   while (! no_more)
     {
       long int prev_start = start;
diff --git a/nis/nss_compat/compat-grp.c b/nis/nss_compat/compat-grp.c
index 736f1df..1d54286 100644
--- a/nis/nss_compat/compat-grp.c
+++ b/nis/nss_compat/compat-grp.c
@@ -28,6 +28,8 @@ 
 #include <libc-lock.h>
 #include <kernel-features.h>
 
+static service_user *ni_once;
+/* Must be accessed while having acquired LOCK.  */
 static service_user *ni;
 static enum nss_status (*nss_setgrent) (int stayopen);
 static enum nss_status (*nss_getgrnam_r) (const char *name,
@@ -92,8 +94,11 @@  static int in_blacklist (const char *, int, ent_t *);
 static void
 init_nss_interface (void)
 {
-  if (__nss_database_lookup ("group_compat", NULL, "nis", &ni) >= 0)
+  if (__nss_database_lookup ("group_compat", NULL, "nis", &ni_once) == 0)
     {
+      /* Relaxed MO is sufficient because the return value of zero allows us
+	 to assume that lookup happened-before.  */
+      ni = atomic_load_relaxed (&ni_once);
       nss_setgrent = __nss_lookup_function (ni, "setgrent");
       nss_getgrnam_r = __nss_lookup_function (ni, "getgrnam_r");
       nss_getgrgid_r = __nss_lookup_function (ni, "getgrgid_r");
diff --git a/nis/nss_compat/compat-initgroups.c b/nis/nss_compat/compat-initgroups.c
index e65b10f..4be3aef 100644
--- a/nis/nss_compat/compat-initgroups.c
+++ b/nis/nss_compat/compat-initgroups.c
@@ -95,23 +95,25 @@  extern int __compat_have_cloexec;
 static void blacklist_store_name (const char *, ent_t *);
 static int in_blacklist (const char *, int, ent_t *);
 
-/* Initialize the NSS interface/functions. The calling function must
-   hold the lock.  */
+/* Initialize the NSS interface/functions.  */
 static void
 init_nss_interface (void)
 {
+  /* We want to perform our initialization exactly once.  To simplify this, we
+     just put everything into a critical section and use relaxed MO loads for
+     consistency wrt the use of atomics in __nss_database_lookup.  */
   __libc_lock_lock (lock);
 
-  /* Retest.  */
-  if (ni == NULL
-      && __nss_database_lookup ("group_compat", NULL, "nis", &ni) >= 0)
+  if ((atomic_load_relaxed (&ni) == NULL)
+      && (__nss_database_lookup ("group_compat", NULL, "nis", &ni) == 0))
     {
-      nss_initgroups_dyn = __nss_lookup_function (ni, "initgroups_dyn");
-      nss_getgrnam_r = __nss_lookup_function (ni, "getgrnam_r");
-      nss_getgrgid_r = __nss_lookup_function (ni, "getgrgid_r");
-      nss_setgrent = __nss_lookup_function (ni, "setgrent");
-      nss_getgrent_r = __nss_lookup_function (ni, "getgrent_r");
-      nss_endgrent = __nss_lookup_function (ni, "endgrent");
+      service_user *ni2 = atomic_load_relaxed (&ni);
+      nss_initgroups_dyn = __nss_lookup_function (ni2, "initgroups_dyn");
+      nss_getgrnam_r = __nss_lookup_function (ni2, "getgrnam_r");
+      nss_getgrgid_r = __nss_lookup_function (ni2, "getgrgid_r");
+      nss_setgrent = __nss_lookup_function (ni2, "setgrent");
+      nss_getgrent_r = __nss_lookup_function (ni2, "getgrent_r");
+      nss_endgrent = __nss_lookup_function (ni2, "endgrent");
     }
 
   __libc_lock_unlock (lock);
@@ -124,8 +126,7 @@  internal_setgrent (ent_t *ent)
 
   ent->files = true;
 
-  if (ni == NULL)
-    init_nss_interface ();
+  init_nss_interface ();
 
   if (ent->blacklist.data != NULL)
     {
diff --git a/nis/nss_compat/compat-pwd.c b/nis/nss_compat/compat-pwd.c
index 16e9404..5266c80 100644
--- a/nis/nss_compat/compat-pwd.c
+++ b/nis/nss_compat/compat-pwd.c
@@ -32,6 +32,8 @@ 
 
 #include "netgroup.h"
 
+static service_user *ni_once;
+/* Must be accessed while having acquired LOCK.  */
 static service_user *ni;
 static enum nss_status (*nss_setpwent) (int stayopen);
 static enum nss_status (*nss_getpwnam_r) (const char *name,
@@ -102,8 +104,11 @@  static int in_blacklist (const char *, int, ent_t *);
 static void
 init_nss_interface (void)
 {
-  if (__nss_database_lookup ("passwd_compat", NULL, "nis", &ni) >= 0)
+  if (__nss_database_lookup ("passwd_compat", NULL, "nis", &ni_once) == 0)
     {
+      /* Relaxed MO is sufficient because the return value of zero allows us
+	 to assume that lookup happened-before.  */
+      ni = atomic_load_relaxed (&ni_once);
       nss_setpwent = __nss_lookup_function (ni, "setpwent");
       nss_getpwnam_r = __nss_lookup_function (ni, "getpwnam_r");
       nss_getpwuid_r = __nss_lookup_function (ni, "getpwuid_r");
diff --git a/nis/nss_compat/compat-spwd.c b/nis/nss_compat/compat-spwd.c
index 4db96ef..4937fd6 100644
--- a/nis/nss_compat/compat-spwd.c
+++ b/nis/nss_compat/compat-spwd.c
@@ -32,6 +32,8 @@ 
 
 #include "netgroup.h"
 
+static service_user *ni_once;
+/* Must be accessed while having acquired LOCK.  */
 static service_user *ni;
 static enum nss_status (*nss_setspent) (int stayopen);
 static enum nss_status (*nss_getspnam_r) (const char *name, struct spwd * sp,
@@ -99,9 +101,12 @@  static int in_blacklist (const char *, int, ent_t *);
 static void
 init_nss_interface (void)
 {
-  if (__nss_database_lookup ("shadow_compat", "passwd_compat",
-			     "nis", &ni) >= 0)
+  if (__nss_database_lookup ("shadow_compat", "passwd_compat", "nis",
+			     &ni_once) == 0)
     {
+      /* Relaxed MO is sufficient because the return value of zero allows us
+	 to assume that lookup happened-before.  */
+      ni = atomic_load_relaxed (&ni_once);
       nss_setspent = __nss_lookup_function (ni, "setspent");
       nss_getspnam_r = __nss_lookup_function (ni, "getspnam_r");
       nss_getspent_r = __nss_lookup_function (ni, "getspent_r");
diff --git a/nscd/aicache.c b/nscd/aicache.c
index 32c8f57..f3c8af4 100644
--- a/nscd/aicache.c
+++ b/nscd/aicache.c
@@ -92,13 +92,17 @@  addhstaiX (struct database_dyn *db, int fd, request_header *req,
   int rc4 = 0;
   int herrno = 0;
 
-  if (hosts_database == NULL)
+  /* Relaxed MO is fine because this is just about whether we have to perform
+     the lookup; we will do another acquire-MO load next before making use
+     of the lookup's effects.  */
+  if (atomic_load_relaxed (&hosts_database) == NULL)
     no_more = __nss_database_lookup ("hosts", NULL,
 				     "dns [!UNAVAIL=return] files",
 				     &hosts_database);
   else
     no_more = 0;
-  nip = hosts_database;
+  /* Acquire MO as required by __nss_database_lookup.  */
+  nip = atomic_load_acquire (&hosts_database);
 
   /* Initialize configurations.  */
   _res_hconf_init ();
diff --git a/nscd/initgrcache.c b/nscd/initgrcache.c
index c85a751..c0a965c 100644
--- a/nscd/initgrcache.c
+++ b/nscd/initgrcache.c
@@ -84,13 +84,17 @@  addinitgroupsX (struct database_dyn *db, int fd, request_header *req,
   service_user *nip;
   int no_more;
 
-  if (group_database == NULL)
+  /* Relaxed MO is fine because this is just about whether we have to perform
+     the lookup; we will do another acquire-MO load below before making use
+     of the lookup's effects.  */
+  if (atomic_load_relaxed (&group_database) == NULL)
     no_more = __nss_database_lookup ("group", NULL,
 				     "compat [NOTFOUND=return] files",
 				     &group_database);
   else
     no_more = 0;
-  nip = group_database;
+  /* Acquire MO as required by __nss_database_lookup.  */
+  nip = atomic_load_acquire (&group_database);
 
  /* We always use sysconf even if NGROUPS_MAX is defined.  That way, the
      limit can be raised in the kernel configuration without having to
diff --git a/nscd/netgroupcache.c b/nscd/netgroupcache.c
index 08e9022..249d685 100644
--- a/nscd/netgroupcache.c
+++ b/nscd/netgroupcache.c
@@ -140,8 +140,10 @@  addgetnetgrentX (struct database_dyn *db, int fd, request_header *req,
   struct name_list *first_needed
     = alloca (sizeof (struct name_list) + group_len);
 
-  if (netgroup_database == NULL
-      && __nss_database_lookup ("netgroup", NULL, NULL, &netgroup_database))
+  /* Acquire MO as required by __nss_database_lookup.  */
+  if ((atomic_load_acquire (&netgroup_database) == NULL)
+      && (__nss_database_lookup ("netgroup", NULL, NULL, &netgroup_database)
+	  != 0))
     {
       /* No such service.  */
       cacheable = do_notfound (db, fd, req, key, &dataset, &total, &timeout,
@@ -172,7 +174,8 @@  addgetnetgrentX (struct database_dyn *db, int fd, request_header *req,
 	void *ptr;
       } setfct;
 
-      service_user *nip = netgroup_database;
+      /* Acquire MO as required by __nss_database_lookup.  */
+      service_user *nip = atomic_load_acquire (&netgroup_database);
       int no_more = __nss_lookup (&nip, "setnetgrent", NULL, &setfct.ptr);
       while (!no_more)
 	{
diff --git a/nss/XXX-lookup.c b/nss/XXX-lookup.c
index 6119468..8ea1fa3 100644
--- a/nss/XXX-lookup.c
+++ b/nss/XXX-lookup.c
@@ -65,12 +65,16 @@  internal_function
 DB_LOOKUP_FCT (service_user **ni, const char *fct_name, const char *fct2_name,
 	       void **fctp)
 {
-  if (DATABASE_NAME_SYMBOL == NULL
-      && __nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
-				DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0)
+  /* Relaxed MO is fine because this is just about whether we have to perform
+     the lookup; we will do another acquire-MO load next before assuming that
+     the lookup has happened.  */
+  if ((atomic_load_relaxed (&(DATABASE_NAME_SYMBOL)) == NULL)
+      && (__nss_database_lookup (DATABASE_NAME_STRING, ALTERNATE_NAME_STRING,
+				 DEFAULT_CONFIG, &DATABASE_NAME_SYMBOL) < 0))
     return -1;
 
-  *ni = DATABASE_NAME_SYMBOL;
+  /* Acquire MO as required by __nss_database_lookup.  */
+  *ni = atomic_load_acquire (&(DATABASE_NAME_SYMBOL));
 
   return __nss_lookup (ni, fct_name, fct2_name, fctp);
 }
diff --git a/nss/nsswitch.c b/nss/nsswitch.c
index d770650..542c15f 100644
--- a/nss/nsswitch.c
+++ b/nss/nsswitch.c
@@ -26,6 +26,7 @@ 
 #include <stdio_ext.h>
 #include <stdlib.h>
 #include <string.h>
+#include <atomic.h>
 
 #include <aliases.h>
 #include <grp.h>
@@ -102,18 +103,26 @@  static void (*nscd_init_cb) (size_t, struct traced_file *);
 #endif
 
 
-/* -1 == database not found
-    0 == database entry pointer stored */
+/* See declaration.  */
 int
 __nss_database_lookup (const char *database, const char *alternate_name,
 		       const char *defconfig, service_user **ni)
 {
+  /* NI is a cache so we can avoid acquiring the lock if we have already
+     performed the full lookup.  It is similar to pthread_once
+     synchronization when considering the requirement on callers to access
+     NI with acquire MO atomics.  */
+
   /* Prevent multiple threads to change the service table.  */
   __libc_lock_lock (lock);
 
   /* Reconsider database variable in case some other thread called
-     `__nss_configure_lookup' while we waited for the lock.  */
-  if (*ni != NULL)
+     `__nss_configure_lookup' while we waited for the lock.
+     Acquire MO so that a return value of 0 allows the caller to assume that
+     lookup happened-before; we synchronize with the release MO store at the
+     end of this function.  */
+  service_user *ni_cur = atomic_load_acquire (ni);
+  if (ni_cur != NULL)
     {
       __libc_lock_unlock (lock);
       return 0;
@@ -134,14 +143,14 @@  __nss_database_lookup (const char *database, const char *alternate_name,
 	 only requested once and so this might not be critical.  */
       for (entry = service_table->entry; entry != NULL; entry = entry->next)
 	if (strcmp (database, entry->name) == 0)
-	  *ni = entry->service;
+	  ni_cur = entry->service;
 
-      if (*ni == NULL && alternate_name != NULL)
+      if (ni_cur == NULL && alternate_name != NULL)
 	/* We haven't found an entry so far.  Try to find it with the
 	   alternative name.  */
 	for (entry = service_table->entry; entry != NULL; entry = entry->next)
 	  if (strcmp (alternate_name, entry->name) == 0)
-	    *ni = entry->service;
+	    ni_cur = entry->service;
     }
 
   /* No configuration data is available, either because nsswitch.conf
@@ -149,11 +158,11 @@  __nss_database_lookup (const char *database, const char *alternate_name,
 
      DEFCONFIG specifies the default service list for this database,
      or null to use the most common default.  */
-  if (*ni == NULL)
+  if (ni_cur == NULL)
     {
-      *ni = nss_parse_service_list (defconfig
-				    ?: "nis [NOTFOUND=return] files");
-      if (*ni != NULL)
+      ni_cur = nss_parse_service_list (defconfig
+				       ?: "nis [NOTFOUND=return] files");
+      if (ni_cur != NULL)
 	{
 	  /* Record the memory we've just allocated in defconfig_entries list,
 	     so we can free it later.  */
@@ -165,16 +174,20 @@  __nss_database_lookup (const char *database, const char *alternate_name,
 	  if (entry != NULL)
 	    {
 	      entry->next = defconfig_entries;
-	      entry->service = *ni;
+	      entry->service = ni_cur;
 	      entry->name[0] = '\0';
 	      defconfig_entries = entry;
 	    }
 	}
     }
+  /* We have finished the lookup, so publish the result.  Release MO so that
+     we synchronize with consumers of this value, which ensures that the
+     lookup happens-before any use of its result.  */
+  atomic_store_release (ni, ni_cur);
 
   __libc_lock_unlock (lock);
 
-  return *ni != NULL ? 0 : -1;
+  return ni_cur != NULL ? 0 : -1;
 }
 libc_hidden_def (__nss_database_lookup)
 
@@ -826,14 +839,21 @@  nss_new_service (name_database *database, const char *name)
 static void
 nss_load_all_libraries (const char *service, const char *def)
 {
-  service_user *ni = NULL;
-
-  if (__nss_database_lookup (service, NULL, def, &ni) == 0)
-    while (ni != NULL)
-      {
-	nss_load_library (ni);
-	ni = ni->next;
-      }
+  service_user *ni_once = NULL;
+
+  if (__nss_database_lookup (service, NULL, def, &ni_once) == 0)
+    {
+      /* This isn't strictly necessary because we are the only thread using
+	 ni_once, but use atomics nonetheless for consistency.  Relaxed MO
+	 because a return value of zero allows us to assume that lookup
+	 happened-before.  */
+      service_user *ni = atomic_load_relaxed (&ni_once);
+      while (ni != NULL)
+	{
+	  nss_load_library (ni);
+	  ni = ni->next;
+	}
+    }
 }
 
 
diff --git a/nss/nsswitch.h b/nss/nsswitch.h
index 54c8b65..fea8925 100644
--- a/nss/nsswitch.h
+++ b/nss/nsswitch.h
@@ -122,7 +122,17 @@  extern bool __nss_database_custom[NSS_DBSIDX_max];
 /* Get the data structure representing the specified database.
    If there is no configuration for this database in the file,
    parse a service list from DEFCONFIG and use that.  More
-   than one function can use the database.  */
+   than one function can use the database.
+   NI serves as a cache for a pointer to the database, so that multiple
+   threads that want to use it concurrently only need to perform the full
+   lookup once.  It must be initialized to NULL.  To access NI safely, an
+   atomic load with acquire MO is required in the general case; when a
+   non-NULL value is returned from such a load, the caller can expect that
+   lookup happened before.  Only once will __nss_database_lookup store a
+   non-NULL value in a particular NI.
+   Returns 0 iff NI indicates that lookup has been performed already, in
+   which case the caller can assume that lookup happened before.
+   Returns -1 iff the database was not found.  */
 extern int __nss_database_lookup (const char *database,
 				  const char *alternative_name,
 				  const char *defconfig, service_user **ni);
diff --git a/sysdeps/posix/getaddrinfo.c b/sysdeps/posix/getaddrinfo.c
index 09fbc83..94410cb 100644
--- a/sysdeps/posix/getaddrinfo.c
+++ b/sysdeps/posix/getaddrinfo.c
@@ -807,13 +807,17 @@  gaih_inet (const char *name, const struct gaih_service *service,
 	    }
 #endif
 
-	  if (__nss_hosts_database == NULL)
+	  /* Relaxed MO is fine because this is just about whether we have to
+	     perform the lookup; we will do another acquire-MO load below
+	     before making use of the lookup's effects.  */
+	  if (atomic_load_relaxed (&__nss_hosts_database) == NULL)
 	    no_more = __nss_database_lookup ("hosts", NULL,
 					     "dns [!UNAVAIL=return] files",
 					     &__nss_hosts_database);
 	  else
 	    no_more = 0;
-	  nip = __nss_hosts_database;
+	  /* Acquire MO as required by __nss_database_lookup.  */
+	  nip = atomic_load_acquire (&__nss_hosts_database);
 
 	  /* Initialize configurations.  */
 	  _res_hconf_init ();