Don't rely on having LP64 in semaphores if 64b atomic ops are available.

Message ID 1421951065.4572.47.camel@triegel.csb
State Dropped
Headers

Commit Message

Torvald Riegel Jan. 22, 2015, 6:24 p.m. UTC
  This attempts to fix an issue in the new semaphore implementation for
archs that provide 64b atomic ops but do not use an LP64 data model.
This affects MIPS N32 ABI and x32 too I guess.

Steve, can you test this patch please?  I'll do the same on x86_64-linux
in the meantime.

Also, do we already have any information on which data models we
support?  I reckon LP64 and ILP32?, or others as well?  I'd like to add
a comment to atomic.h, and if we have such information already, I'd just
point to it.


2015-01-22  Torvald Riegel  <triegel@redhat.com>

	* nptl/sem_post.c (__new_sem_post): Don't rely on an LP64 data model.
	* nptl/sem_waitcommon.c (__sem_wait_cleanup, __new_sem_wait_fast,
	__new_sem_wait_slow): Likewise.
	* sysdeps/nptl/internaltypes.h (struct new_sem): Likewise.
  

Comments

H.J. Lu Jan. 22, 2015, 6:29 p.m. UTC | #1
On Thu, Jan 22, 2015 at 10:24 AM, Torvald Riegel <triegel@redhat.com> wrote:
> This attempts to fix an issue in the new semaphore implementation for
> archs that provide 64b atomic ops but do not use an LP64 data model.
> This affects MIPS N32 ABI and x32 too I guess.
>
> Steve, can you test this patch please?  I'll do the same on x86_64-linux
> in the meantime.
>
> Also, do we already have any information on which data models we
> support?  I reckon LP64 and ILP32?, or others as well?  I'd like to add
> a comment to atomic.h, and if we have such information already, I'd just
> point to it.
>
>
> 2015-01-22  Torvald Riegel  <triegel@redhat.com>
>
>         * nptl/sem_post.c (__new_sem_post): Don't rely on an LP64 data model.
>         * nptl/sem_waitcommon.c (__sem_wait_cleanup, __new_sem_wait_fast,
>         __new_sem_wait_slow): Likewise.
>         * sysdeps/nptl/internaltypes.h (struct new_sem): Likewise.

Do we really need int64 for the "data" field internally?
  
Torvald Riegel Jan. 22, 2015, 6:35 p.m. UTC | #2
On Thu, 2015-01-22 at 10:29 -0800, H.J. Lu wrote:
> On Thu, Jan 22, 2015 at 10:24 AM, Torvald Riegel <triegel@redhat.com> wrote:
> > This attempts to fix an issue in the new semaphore implementation for
> > archs that provide 64b atomic ops but do not use an LP64 data model.
> > This affects MIPS N32 ABI and x32 too I guess.
> >
> > Steve, can you test this patch please?  I'll do the same on x86_64-linux
> > in the meantime.
> >
> > Also, do we already have any information on which data models we
> > support?  I reckon LP64 and ILP32?, or others as well?  I'd like to add
> > a comment to atomic.h, and if we have such information already, I'd just
> > point to it.
> >
> >
> > 2015-01-22  Torvald Riegel  <triegel@redhat.com>
> >
> >         * nptl/sem_post.c (__new_sem_post): Don't rely on an LP64 data model.
> >         * nptl/sem_waitcommon.c (__sem_wait_cleanup, __new_sem_wait_fast,
> >         __new_sem_wait_slow): Likewise.
> >         * sysdeps/nptl/internaltypes.h (struct new_sem): Likewise.
> 
> Do we really need int64 for the "data" field internally?

First, there's an option to use either 32b or 64b atomic operations.  If
the latter are available and not significantly more expensive than 32b
atomics, using 64b is more efficient.

If we want to use 64b atomics, we need to operate on 64b data,
eventually.  We could try to use adjacent uint32 as one uint64 and
access the uint32 separately where possible, but such mixing of
overlapping atomic ops isn't defined by C11, and we shouldn't use it.
If we don't, yet still try to use adjacent uint32 as underlying storage,
the code won't become simpler.

Thus, I think using uint64 as underlying type makes sense.  Do you see
any disadvantages?
  
Joseph Myers Jan. 22, 2015, 6:40 p.m. UTC | #3
On Thu, 22 Jan 2015, Torvald Riegel wrote:

> Also, do we already have any information on which data models we
> support?  I reckon LP64 and ILP32?, or others as well?  I'd like to add
> a comment to atomic.h, and if we have such information already, I'd just
> point to it.

I believe we can assume: 8-bit char (may be signed or unsigned), 16-bit 
short, 32-bit int, long either 32-bit or 64-bit and the same width as 
pointers, long long 64-bit.  But registers (and so syscall arguments) may 
be 64-bit even with 32-bit long; architectures may vary about the 
significance of high bits of function arguments and return values.

I could imagine there being a future glibc port where intmax_t is 128-bit 
(with __int128 being a proper integer type rather than an extended type 
that looks quite like an integer type but doesn't meet all the 
requirements), although that's certainly outside the set of ABIs supported 
by glibc at present (which expect intmax_t to be 64-bit like long long).

If code needs types of a given width (e.g. manipulating bits of a 
floating-point number), it is of course good practice for it to use 
<stdint.h> types rather than embedding assumptions that e.g. int means 
32-bit; similarly, use of intptr_t / uintptr_t when casting between 
pointers and integers is to be preferred to use of long; use size_t for 
in-memory sizes (although in fact all ports have size_t the same width as 
unsigned long); and so on.

The GNU Coding Standards specifically say no support is needed for size_t 
wider than long - that applies to all GNU software, not just glibc (which 
is portable to a more restricted range of systems).
  
Steve Ellcey Jan. 22, 2015, 6:57 p.m. UTC | #4
On Thu, 2015-01-22 at 19:24 +0100, Torvald Riegel wrote:
> This attempts to fix an issue in the new semaphore implementation for
> archs that provide 64b atomic ops but do not use an LP64 data model.
> This affects MIPS N32 ABI and x32 too I guess.
> 
> Steve, can you test this patch please?  I'll do the same on x86_64-linux
> in the meantime.

I was able to build glibc with this patch (all 3 MIPS ABI's).  I haven't
run any tests, but the builds all look OK.

Steve Ellcey
sellcey@imgtec.com
  
Torvald Riegel Jan. 22, 2015, 6:59 p.m. UTC | #5
On Thu, 2015-01-22 at 18:40 +0000, Joseph Myers wrote:
> On Thu, 22 Jan 2015, Torvald Riegel wrote:
> 
> > Also, do we already have any information on which data models we
> > support?  I reckon LP64 and ILP32?, or others as well?  I'd like to add
> > a comment to atomic.h, and if we have such information already, I'd just
> > point to it.
> 
> I believe we can assume: 8-bit char (may be signed or unsigned), 16-bit 
> short, 32-bit int, long either 32-bit or 64-bit and the same width as 
> pointers, long long 64-bit.

AFAICT, this leaves just LP64 and ILP32; ILP64, LLP64, and LP32 violate
at least one of your assumptions.

> But registers (and so syscall arguments) may 
> be 64-bit even with 32-bit long; architectures may vary about the 
> significance of high bits of function arguments and return values.
> 
> I could imagine there being a future glibc port where intmax_t is 128-bit 
> (with __int128 being a proper integer type rather than an extended type 
> that looks quite like an integer type but doesn't meet all the 
> requirements), although that's certainly outside the set of ABIs supported 
> by glibc at present (which expect intmax_t to be 64-bit like long long).
> 
> If code needs types of a given width (e.g. manipulating bits of a 
> floating-point number), it is of course good practice for it to use 
> <stdint.h> types rather than embedding assumptions that e.g. int means 
> 32-bit;

Looking at other glibc code, I had assumed that using uint32_t and
uint64_t wouldn't be current practice.  Are you saying that future
clean-up towards that (e.g., using uint64_t instead of "unsigned long
long" as in this patch) would be welcome?
  
Joseph Myers Jan. 22, 2015, 7:08 p.m. UTC | #6
On Thu, 22 Jan 2015, Torvald Riegel wrote:

> > If code needs types of a given width (e.g. manipulating bits of a 
> > floating-point number), it is of course good practice for it to use 
> > <stdint.h> types rather than embedding assumptions that e.g. int means 
> > 32-bit;
> 
> Looking at other glibc code, I had assumed that using uint32_t and
> uint64_t wouldn't be current practice.  Are you saying that future
> clean-up towards that (e.g., using uint64_t instead of "unsigned long
> long" as in this patch) would be welcome?

I'd welcome it, if 64-bit types are logically what the code wants; libm 
code at least makes plenty of use of such names.  (That doesn't 
necessarily mean we want UINT64_C etc. used for constants; we'd need to 
think about whether we want to be more verbose like that.)

For code maintained in glibc as opposed to imported from elsewhere, I 
think moving from u_intN_t type names to the ISO C uintN_t type names 
(with inclusion of <stdint.h>) is also a sensible cleanup - though in 
installed headers you'd need to watch out for other symbols from 
<stdint.h> that might not be appropriate for some other header to define.
  
Torvald Riegel Jan. 23, 2015, 5:56 p.m. UTC | #7
On Thu, 2015-01-22 at 10:57 -0800, Steve Ellcey wrote:
> On Thu, 2015-01-22 at 19:24 +0100, Torvald Riegel wrote:
> > This attempts to fix an issue in the new semaphore implementation for
> > archs that provide 64b atomic ops but do not use an LP64 data model.
> > This affects MIPS N32 ABI and x32 too I guess.
> > 
> > Steve, can you test this patch please?  I'll do the same on x86_64-linux
> > in the meantime.
> 
> I was able to build glibc with this patch (all 3 MIPS ABI's).  I haven't
> run any tests, but the builds all look OK.

I have tested this on x86_64-linux and no regressions.  Will commit this
early next week unless there are any objections.
  

Patch

commit e34f47a1050638255898e392df63c82052f97f48
Author: Torvald Riegel <triegel@redhat.com>
Date:   Thu Jan 22 19:18:59 2015 +0100

    Don't rely on having LP64 in semaphores if 64b atomic ops are available.

diff --git a/nptl/sem_post.c b/nptl/sem_post.c
index 9162e4c..cd9e666 100644
--- a/nptl/sem_post.c
+++ b/nptl/sem_post.c
@@ -65,7 +65,7 @@  __new_sem_post (sem_t *sem)
      added tokens before (the release sequence includes atomic RMW operations
      by other threads).  */
   /* TODO Use atomic_fetch_add to make it scale better than a CAS loop?  */
-  unsigned long int d = atomic_load_relaxed (&isem->data);
+  unsigned long long int d = atomic_load_relaxed (&isem->data);
   do
     {
       if ((d & SEM_VALUE_MASK) == SEM_VALUE_MAX)
diff --git a/nptl/sem_waitcommon.c b/nptl/sem_waitcommon.c
index 96848d7..12c77e6 100644
--- a/nptl/sem_waitcommon.c
+++ b/nptl/sem_waitcommon.c
@@ -187,7 +187,7 @@  __sem_wait_cleanup (void *arg)
 
 #if __HAVE_64B_ATOMICS
   /* Stop being registered as a waiter.  See below for MO.  */
-  atomic_fetch_add_relaxed (&sem->data, -(1UL << SEM_NWAITERS_SHIFT));
+  atomic_fetch_add_relaxed (&sem->data, -(1ULL << SEM_NWAITERS_SHIFT));
 #else
   __sem_wait_32_finish (sem);
 #endif
@@ -228,7 +228,7 @@  __new_sem_wait_fast (struct new_sem *sem, int definitive_result)
      and the failure path of the CAS.  If the weak CAS fails and we need a
      definitive result, retry.  */
 #if __HAVE_64B_ATOMICS
-  unsigned long d = atomic_load_relaxed (&sem->data);
+  unsigned long long d = atomic_load_relaxed (&sem->data);
   do
     {
       if ((d & SEM_VALUE_MASK) == 0)
@@ -263,8 +263,8 @@  __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 #if __HAVE_64B_ATOMICS
   /* Add a waiter.  Relaxed MO is sufficient because we can rely on the
      ordering provided by the RMW operations we use.  */
-  unsigned long d = atomic_fetch_add_relaxed (&sem->data,
-      1UL << SEM_NWAITERS_SHIFT);
+  unsigned long long d = atomic_fetch_add_relaxed (&sem->data,
+      1ULL << SEM_NWAITERS_SHIFT);
 
   pthread_cleanup_push (__sem_wait_cleanup, sem);
 
@@ -304,7 +304,7 @@  __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 	      err = -1;
 	      /* Stop being registered as a waiter.  */
 	      atomic_fetch_add_relaxed (&sem->data,
-		  -(1UL << SEM_NWAITERS_SHIFT));
+		  -(1ULL << SEM_NWAITERS_SHIFT));
 	      break;
 	    }
 	  /* Relaxed MO is sufficient; see below.  */
@@ -320,7 +320,7 @@  __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 	     up-to-date value; the futex_wait or the CAS perform the real
 	     work.  */
 	  if (atomic_compare_exchange_weak_acquire (&sem->data,
-	      &d, d - 1 - (1UL << SEM_NWAITERS_SHIFT)))
+	      &d, d - 1 - (1ULL << SEM_NWAITERS_SHIFT)))
 	    {
 	      err = 0;
 	      break;
diff --git a/sysdeps/nptl/internaltypes.h b/sysdeps/nptl/internaltypes.h
index 7c0d240..b28ef93 100644
--- a/sysdeps/nptl/internaltypes.h
+++ b/sysdeps/nptl/internaltypes.h
@@ -155,7 +155,7 @@  struct new_sem
 # endif
 # define SEM_NWAITERS_SHIFT 32
 # define SEM_VALUE_MASK (~(unsigned int)0)
-  unsigned long int data;
+  unsigned long long int data;
   int private;
   int pad;
 #else