[v2] Fix double-checked locking in _res_hconf_reorder_addrs [BZ #19074]

Message ID 56152831.7020707@redhat.com
State Superseded
Headers

Commit Message

Florian Weimer Oct. 7, 2015, 2:12 p.m. UTC
  On 10/07/2015 12:28 PM, Torvald Riegel wrote:

> If you should write this page, check whether you really need the acquire
> MO in the load in the critical section, or whether relaxed MO is
> sufficient ;)

Acquire/release is needed because the access to the actual data needs to
be synchronized as well.

I'm struggling with the cppmem syntax to show that.  Is there a grammar
published anywhere?

>> It's probably best to document things that work and way, and stick to
>> that, instead of using ad-hoc schemes all over the place.
> 
> I agree that it's good to follow common patterns.  Nonetheless, you
> still need to say why the pattern applies, and make sure that all
> readers see the pattern and can check that the pattern is applied
> properly.  This can be brief -- but if people in the community feel like
> they need more details in the comments, it should be provided in most
> cases, I think.

It's not double-checked locking exactly, the lock can be taken multiple
times because the initialization might yield 0, in which case it is
retried (<= 0 is used instead of < 0).  I don't know if this
intentional, but I'm not going to change that.  It also means that I
can't use __libc_once.

>>  (I think the
>> code I showed is sufficiently close to the usual double-checked locking
>> idiom, except that the guarded section may run multiple times.)
> 
> Are you referring to that the critical section may be entered
> unnecessarily?  That is common for double-checked locking.  Or are you
> referring to something else?

This part:

  if (num_ifs <= 0)
    {
      struct ifreq *ifr, *cur_ifr;
      int sd, num, i;
…
      sd = __socket (AF_INET, SOCK_DGRAM, 0);
      if (sd < 0)
	return;
…
      __libc_lock_lock (lock);
…
      if (num_ifs <= 0)
	{

The standard idiom would create the socket *after* the final if
statement in that sequence above.  Nothing is gained by doing the socket
creation in parallel.

I'm attaching a new version of the patch with some additional comments.
 I also fixed one logic bug not present in the original code and relaxed
one of the loads.

However, I'm not sure if this is the right approach.  I wonder if it is
better to separate the concurrency aspects in a helper function (and
maybe simplify it at the time), and have the actual code use the
abstraction instead.

Florian
  

Comments

Torvald Riegel Oct. 8, 2015, 2:46 p.m. UTC | #1
On Wed, 2015-10-07 at 16:12 +0200, Florian Weimer wrote:
> On 10/07/2015 12:28 PM, Torvald Riegel wrote:
> 
> > If you should write this page, check whether you really need the acquire
> > MO in the load in the critical section, or whether relaxed MO is
> > sufficient ;)
> 
> Acquire/release is needed because the access to the actual data needs to
> be synchronized as well.

Yes, but there's the lock used for the critical section too.  The
critical sections are ordered.  So have a look at whether you can
actually not read from initialization done in the initializing critical
section if you read that the initialization flag has been set.

> I'm struggling with the cppmem syntax to show that.  Is there a grammar
> published anywhere?

Not as far as I know.  You could look at the sources of the tool
though :)

The slides of my Cauldron tutorial have a selection of what is accepted.
Modeling a lock order with a release/acquire store/load pair can be
helpful.
  
Torvald Riegel Oct. 8, 2015, 3:14 p.m. UTC | #2
And now with actual comments on the patch...

On Wed, 2015-10-07 at 16:12 +0200, Florian Weimer wrote:
> diff --git a/resolv/res_hconf.c b/resolv/res_hconf.c
> index 692d948..1497b05 100644
> --- a/resolv/res_hconf.c
> +++ b/resolv/res_hconf.c
> @@ -45,6 +45,7 @@
>  #include "ifreq.h"
>  #include "res_hconf.h"
>  #include <wchar.h>
> +#include <atomic.h>
>  
>  #if IS_IN (libc)
>  # define fgets_unlocked __fgets_unlocked
> @@ -369,148 +370,191 @@ libc_freeres_ptr (
>  static struct netaddr
>  {
>    int addrtype;
>    union
>    {
>      struct
>      {
>        u_int32_t	addr;
>        u_int32_t	mask;
>      } ipv4;
>    } u;
>  } *ifaddrs);
>  # endif
>  
>  /* Reorder addresses returned in a hostent such that the first address
>     is an address on the local subnet, if there is such an address.
>     Otherwise, nothing is changed.
>  
>     Note that this function currently only handles IPv4 addresses.  */
>  
>  void
>  _res_hconf_reorder_addrs (struct hostent *hp)
>  {
>  #if defined SIOCGIFCONF && defined SIOCGIFNETMASK
>    int i, j;
> -  /* Number of interfaces.  */
> +  /* Number of interfaces.  Also serves as a flag for the
> +     double-checked locking idiom.  */
>    static int num_ifs = -1;
> -  /* We need to protect the dynamic buffer handling.  */
> +  /* Local copy of num_ifs, for non-atomic access.  */
> +  int num_ifs_local;
> +  /* We need to protect the dynamic buffer handling.  The lock is only
> +     taken during initialization, afterwards, a positive num_ifs value
> +     indicates initialization.  */

Isn't initialization only finished if this is larger than zero?  I'd
also say "held" or "acquired" instead of "taken", and would use a
semicolon instead of the first comma (or start a new sentence).

>    __libc_lock_define_initialized (static, lock);
>  
>    /* Only reorder if we're supposed to.  */
>    if ((_res_hconf.flags & HCONF_FLAG_REORDER) == 0)
>      return;
>  
>    /* Can't deal with anything but IPv4 for now...  */
>    if (hp->h_addrtype != AF_INET)
>      return;
>  
> -  if (num_ifs <= 0)
> +  num_ifs_local = atomic_load_acquire (&num_ifs);

Please add a brief comment what this is supposed to synchronize with.
Or refer to double-checked locking idiom, if it's documented in the
wiki.

> +  if (num_ifs_local <= 0)
>      {
>        struct ifreq *ifr, *cur_ifr;
>        int sd, num, i;
>        /* Save errno.  */
>        int save = errno;
>  
>        /* Initialize interface table.  */
>  
>        /* The SIOCGIFNETMASK ioctl will only work on an AF_INET socket.  */
>        sd = __socket (AF_INET, SOCK_DGRAM, 0);
>        if (sd < 0)
>  	return;
>  
>        /* Get lock.  */
>        __libc_lock_lock (lock);
>  
> -      /* Recheck, somebody else might have done the work by now.  */
> -      if (num_ifs <= 0)
> +      /* Recheck, somebody else might have done the work by now.  A
> +	 regular load is sufficient because we have the lock, and
> +	 num_ifs is only updated under the lock.  */
> +      num_ifs_local = num_ifs;

Style-wise, I'd use a relaxed MO load here, so that all atomically
accessed data is always accessed atomically (with initialization as
exception).  This makes it easier to move to C11(-like) atomic data
types in the future, because while this plain access might work, it
would become a seq_cst MO access.
Not a critical issue though.

BTW, this is the acquire load I asked about, not the initial one.  I was
talking about this being a relaxed MO load, you have it as a nonatomic
one.

> +      if (num_ifs_local <= 0)
>  	{
> +	  /* This is the only block which writes to num_ifs.  It can
> +	     be executed several times (sequentially) if
> +	     initialization does not yield any interfaces, and num_ifs
> +	     remains zero.  However, once we stored a positive value
> +	     in num_ifs below, this block cannot be entered again due
> +	     to the condition above.  */

OK.  BTW, entering the critical section without doing anything effective
(ie, no initialization) doesn't change the double-checked locking
scheme.  Aside from forward progress issues such as deadlock, empty
critical sections are not observable; likewise for ones that don't
change state observable to other threads.
This would be clearer to see if you wouldn't modify num_ifs at all if
new_num_ifs==0, but this is just a clarity-of-code thing not a
correctness problem.

>  	  int new_num_ifs = 0;
>  
>  	  /* Get a list of interfaces.  */
>  	  __ifreq (&ifr, &num, sd);
>  	  if (!ifr)
>  	    goto cleanup;
>  
>  	  ifaddrs = malloc (num * sizeof (ifaddrs[0]));
>  	  if (!ifaddrs)
>  	    goto cleanup1;
>  
>  	  /* Copy usable interfaces in ifaddrs structure.  */
>  	  for (cur_ifr = ifr, i = 0; i < num;
>  	       cur_ifr = __if_nextreq (cur_ifr), ++i)
>  	    {
>  	      union
>  	      {
>  		struct sockaddr sa;
>  		struct sockaddr_in sin;
>  	      } ss;
>  
>  	      if (cur_ifr->ifr_addr.sa_family != AF_INET)
>  		continue;
>  
>  	      ifaddrs[new_num_ifs].addrtype = AF_INET;
>  	      ss.sa = cur_ifr->ifr_addr;
>  	      ifaddrs[new_num_ifs].u.ipv4.addr = ss.sin.sin_addr.s_addr;
>  
>  	      if (__ioctl (sd, SIOCGIFNETMASK, cur_ifr) < 0)
>  		continue;
>  
>  	      ss.sa = cur_ifr->ifr_netmask;
>  	      ifaddrs[new_num_ifs].u.ipv4.mask = ss.sin.sin_addr.s_addr;
>  
>  	      /* Now we're committed to this entry.  */
>  	      ++new_num_ifs;
>  	    }
>  	  /* Just keep enough memory to hold all the interfaces we want.  */
>  	  ifaddrs = realloc (ifaddrs, new_num_ifs * sizeof (ifaddrs[0]));
>  	  assert (ifaddrs != NULL);
>  
>  	cleanup1:
>  	  __if_freereq (ifr, num);
>  
>  	cleanup:
>  	  /* Release lock, preserve error value, and close socket.  */
>  	  errno = save;
>  
> -	  num_ifs = new_num_ifs;
> +	  /* Advertise successful initialization if new_num_ifs is
> +	     positive (and no updates to ifaddrs are permitted after
> +	     that).  Otherwise, num_ifs remains unchanged, at
> +	     zero.  */
> +	  atomic_store_release (&num_ifs, new_num_ifs);

See the acquire load above.  Please document the MO choice briefly,
unless we follow a pattern documented on the wiki.

> +	  /* Keep the local copy current, to save another load.  */
> +	  num_ifs_local = new_num_ifs;
>  	}
>  
>        __libc_lock_unlock (lock);
>  
>        __close (sd);
>      }
>  
> -  if (num_ifs == 0)
> +  /* num_ifs_local cannot be negative because the if statement above
> +     covered this case.  It can still be zero if we just performed
> +     initialization, but could not find any interfaces.  */
> +  if (num_ifs_local == 0)
>      return;
>  
> +  /* At this point, access to ifaddrs is free from data races:
> +
> +     (1) num_ifs is positive, so we know the initialization code has
> +         completed in some thread because the release-store to num_ifs
> +         is dependency-ordered before the acquire-load.  (If this
> +         thread performed the initialization, it is sequence-ordered
> +         before this point.)

Not dependency-ordered, which is for consume MO and such.

I'd say instead that our goal here is that initialization happens-before
use of the initialized data. Initialization is sequenced-before the
release store. Then we have three possible executions:
(1) the release MO store synchronizes-with the acquire MO load, and the
acquire MO load is sequenced before the use of the initialized data.
(2) we load the initialization flag while being in the critical section;
if we see that initialization is done, then all of the initializing
critical section has happened before us and our use of the data. (Note
that this is a somewhat informal description; not sure whether we need
more detail; maybe in a wiki page.)
(3) thread that initialized the data also uses it, so sequenced-before
is sufficient to ensure happens-before for this case.

> +
> +     (2) The initialization of ifaddrs (both of ifaddrs itself and the
> +         data pointed to it) is sequenced before the write to the
> +         update to new_num_ifs.
> +
> +     (3) Any potential write would have to happen in the

What do you mean by "potential write"?
  

Patch

2015-10-07  Florian Weimer  <fweimer@redhat.com>

	[BZ #19074]
	* resolv/res_hconf.c (_res_hconf_reorder_addrs): Use atomics to
	load and store num_ifs.

diff --git a/NEWS b/NEWS
index a0a91b5..805c7b9 100644
--- a/NEWS
+++ b/NEWS
@@ -49,7 +49,7 @@  Version 2.22
   18533, 18534, 18536, 18539, 18540, 18542, 18544, 18545, 18546, 18547,
   18549, 18553, 18557, 18558, 18569, 18583, 18585, 18586, 18592, 18593,
   18594, 18602, 18612, 18613, 18619, 18633, 18635, 18641, 18643, 18648,
-  18657, 18676, 18694, 18696, 18887.
+  18657, 18676, 18694, 18696, 18887, 19074.
 
 * Cache information can be queried via sysconf() function on s390 e.g. with
   _SC_LEVEL1_ICACHE_SIZE as argument.

diff --git a/resolv/res_hconf.c b/resolv/res_hconf.c
index 692d948..1497b05 100644
--- a/resolv/res_hconf.c
+++ b/resolv/res_hconf.c
@@ -45,6 +45,7 @@ 
 #include "ifreq.h"
 #include "res_hconf.h"
 #include <wchar.h>
+#include <atomic.h>
 
 #if IS_IN (libc)
 # define fgets_unlocked __fgets_unlocked
@@ -369,148 +370,191 @@  libc_freeres_ptr (
 static struct netaddr
 {
   int addrtype;
   union
   {
     struct
     {
       u_int32_t	addr;
       u_int32_t	mask;
     } ipv4;
   } u;
 } *ifaddrs);
 # endif
 
 /* Reorder addresses returned in a hostent such that the first address
    is an address on the local subnet, if there is such an address.
    Otherwise, nothing is changed.
 
    Note that this function currently only handles IPv4 addresses.  */
 
 void
 _res_hconf_reorder_addrs (struct hostent *hp)
 {
 #if defined SIOCGIFCONF && defined SIOCGIFNETMASK
   int i, j;
-  /* Number of interfaces.  */
+  /* Number of interfaces.  Also serves as a flag for the
+     double-checked locking idiom.  */
   static int num_ifs = -1;
-  /* We need to protect the dynamic buffer handling.  */
+  /* Local copy of num_ifs, for non-atomic access.  */
+  int num_ifs_local;
+  /* We need to protect the dynamic buffer handling.  The lock is only
+     taken during initialization, afterwards, a positive num_ifs value
+     indicates initialization.  */
   __libc_lock_define_initialized (static, lock);
 
   /* Only reorder if we're supposed to.  */
   if ((_res_hconf.flags & HCONF_FLAG_REORDER) == 0)
     return;
 
   /* Can't deal with anything but IPv4 for now...  */
   if (hp->h_addrtype != AF_INET)
     return;
 
-  if (num_ifs <= 0)
+  num_ifs_local = atomic_load_acquire (&num_ifs);
+  if (num_ifs_local <= 0)
     {
       struct ifreq *ifr, *cur_ifr;
       int sd, num, i;
       /* Save errno.  */
       int save = errno;
 
       /* Initialize interface table.  */
 
       /* The SIOCGIFNETMASK ioctl will only work on an AF_INET socket.  */
       sd = __socket (AF_INET, SOCK_DGRAM, 0);
       if (sd < 0)
 	return;
 
       /* Get lock.  */
       __libc_lock_lock (lock);
 
-      /* Recheck, somebody else might have done the work by now.  */
-      if (num_ifs <= 0)
+      /* Recheck, somebody else might have done the work by now.  A
+	 regular load is sufficient because we have the lock, and
+	 num_ifs is only updated under the lock.  */
+      num_ifs_local = num_ifs;
+      if (num_ifs_local <= 0)
 	{
+	  /* This is the only block which writes to num_ifs.  It can
+	     be executed several times (sequentially) if
+	     initialization does not yield any interfaces, and num_ifs
+	     remains zero.  However, once we stored a positive value
+	     in num_ifs below, this block cannot be entered again due
+	     to the condition above.  */
 	  int new_num_ifs = 0;
 
 	  /* Get a list of interfaces.  */
 	  __ifreq (&ifr, &num, sd);
 	  if (!ifr)
 	    goto cleanup;
 
 	  ifaddrs = malloc (num * sizeof (ifaddrs[0]));
 	  if (!ifaddrs)
 	    goto cleanup1;
 
 	  /* Copy usable interfaces in ifaddrs structure.  */
 	  for (cur_ifr = ifr, i = 0; i < num;
 	       cur_ifr = __if_nextreq (cur_ifr), ++i)
 	    {
 	      union
 	      {
 		struct sockaddr sa;
 		struct sockaddr_in sin;
 	      } ss;
 
 	      if (cur_ifr->ifr_addr.sa_family != AF_INET)
 		continue;
 
 	      ifaddrs[new_num_ifs].addrtype = AF_INET;
 	      ss.sa = cur_ifr->ifr_addr;
 	      ifaddrs[new_num_ifs].u.ipv4.addr = ss.sin.sin_addr.s_addr;
 
 	      if (__ioctl (sd, SIOCGIFNETMASK, cur_ifr) < 0)
 		continue;
 
 	      ss.sa = cur_ifr->ifr_netmask;
 	      ifaddrs[new_num_ifs].u.ipv4.mask = ss.sin.sin_addr.s_addr;
 
 	      /* Now we're committed to this entry.  */
 	      ++new_num_ifs;
 	    }
 	  /* Just keep enough memory to hold all the interfaces we want.  */
 	  ifaddrs = realloc (ifaddrs, new_num_ifs * sizeof (ifaddrs[0]));
 	  assert (ifaddrs != NULL);
 
 	cleanup1:
 	  __if_freereq (ifr, num);
 
 	cleanup:
 	  /* Release lock, preserve error value, and close socket.  */
 	  errno = save;
 
-	  num_ifs = new_num_ifs;
+	  /* Advertise successful initialization if new_num_ifs is
+	     positive (and no updates to ifaddrs are permitted after
+	     that).  Otherwise, num_ifs remains unchanged, at
+	     zero.  */
+	  atomic_store_release (&num_ifs, new_num_ifs);
+	  /* Keep the local copy current, to save another load.  */
+	  num_ifs_local = new_num_ifs;
 	}
 
       __libc_lock_unlock (lock);
 
       __close (sd);
     }
 
-  if (num_ifs == 0)
+  /* num_ifs_local cannot be negative because the if statement above
+     covered this case.  It can still be zero if we just performed
+     initialization, but could not find any interfaces.  */
+  if (num_ifs_local == 0)
     return;
 
+  /* At this point, access to ifaddrs is free from data races:
+
+     (1) num_ifs is positive, so we know the initialization code has
+         completed in some thread because the release-store to num_ifs
+         is dependency-ordered before the acquire-load.  (If this
+         thread performed the initialization, it is sequence-ordered
+         before this point.)
+
+     (2) The initialization of ifaddrs (both of ifaddrs itself and the
+         data pointed to it) is sequenced before the write to the
+         update to new_num_ifs.
+
+     (3) Any potential write would have to happen in the
+         initialization code, but no consistent execution can re-enter
+         it after num_ifs is positive.
+
+     Consequently, all potential writes to ifaddrs (and the data it
+     points to) happened before this point.  */
+
   /* Find an address for which we have a direct connection.  */
   for (i = 0; hp->h_addr_list[i]; ++i)
     {
       struct in_addr *haddr = (struct in_addr *) hp->h_addr_list[i];
 
-      for (j = 0; j < num_ifs; ++j)
+      for (j = 0; j < num_ifs_local; ++j)
 	{
 	  u_int32_t if_addr    = ifaddrs[j].u.ipv4.addr;
 	  u_int32_t if_netmask = ifaddrs[j].u.ipv4.mask;
 
 	  if (((haddr->s_addr ^ if_addr) & if_netmask) == 0)
 	    {
 	      void *tmp;
 
 	      tmp = hp->h_addr_list[i];
 	      hp->h_addr_list[i] = hp->h_addr_list[0];
 	      hp->h_addr_list[0] = tmp;
 	      return;
 	    }
 	}
     }
 #endif /* defined(SIOCGIFCONF) && ... */
 }
 
 
 /* If HOSTNAME has a postfix matching any of the trimdomains, trim away
    that postfix.  Notice that HOSTNAME is modified inplace.  Also, the
    original code applied all trimdomains in order, meaning that the
    same domainname could be trimmed multiple times.  I believe this
    was unintentional.  */
 void