From patchwork Wed Oct 29 21:39:00 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Torvald Riegel <triegel@redhat.com>
X-Patchwork-Id: 3489
Subject: [PATCH 1/4] Add arch-specific configuration for C11 atomics support.
From: Torvald Riegel <triegel@redhat.com>
To: GLIBC Devel <libc-alpha@sourceware.org>
In-Reply-To: <1414617613.10085.23.camel@triegel.csb>
References: <1414617613.10085.23.camel@triegel.csb>
Date: Wed, 29 Oct 2014 22:39:00 +0100
Message-ID: <1414618740.10085.35.camel@triegel.csb>
Mime-Version: 1.0

Patch 2/4 provides two ways to implement C11 atomics: either using the
__atomic* builtins provided by GCC 4.7 and later, or based on the
existing glibc-specific atomic operations.  This is a choice we best
make on a per-architecture basis, so I let each arch set
USE_ATOMIC_COMPILER_BUILTINS to true or false:
* True if the existing atomic ops use the __atomic* builtins anyway
(aarch64, mips for some GCC versions), or on x86_64 when building with
GCC 4.7 or later, where I have tested this for pthread_once at least
(and other users of the builtins, like GCC's libitm, have been tested
quite well).
* False otherwise, even if GCC may provide __atomic builtins on such an
architecture.  I think it is better to let the arch maintainer (or
anyone willing to test) make the decision to use the builtins.
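
To illustrate the choice, here is a minimal sketch of how a header could
dispatch on USE_ATOMIC_COMPILER_BUILTINS.  This is not the code from
patch 2/4: the example_* names are hypothetical, the default value of 1
is only there so the sketch builds standalone, and the fallback branch
uses GCC's older __sync_synchronize as a stand-in for the existing
glibc-specific atomic operations.

```c
/* Hypothetical stand-in for the per-arch setting; in this patch the
   value comes from the arch's bits/atomic.h.  */
#ifndef USE_ATOMIC_COMPILER_BUILTINS
# define USE_ATOMIC_COMPILER_BUILTINS 1
#endif

#if USE_ATOMIC_COMPILER_BUILTINS

/* Builtin-based path: relies on the __atomic* builtins (GCC 4.7+).  */
static inline int
example_load_acquire (int *mem)
{
  return __atomic_load_n (mem, __ATOMIC_ACQUIRE);
}

static inline void
example_store_release (int *mem, int val)
{
  __atomic_store_n (mem, val, __ATOMIC_RELEASE);
}

#else

/* Fallback path: a plain access combined with a full barrier.  This is
   stronger than necessary and only stands in for the existing
   arch-specific atomic operations.  */
static inline int
example_load_acquire (int *mem)
{
  int val = *(volatile int *) mem;
  __sync_synchronize ();
  return val;
}

static inline void
example_store_release (int *mem, int val)
{
  __sync_synchronize ();
  *(volatile int *) mem = val;
}

#endif
```

Callers use example_load_acquire and example_store_release in the same
way in both configurations; only the arch header decides which branch is
compiled.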

I have tested only on x86_64, but there with
USE_ATOMIC_COMPILER_BUILTINS set to both true and false.  I'd appreciate
further testing on other archs.
I believe that once we have some more testing, including on archs with
weaker HW memory models (e.g., ARM or powerpc), we can commit this patch
even if we haven't tested on each arch.  This is because most archs
don't use the builtins, and if they don't, then this patch relies only
on the existing atomics being correct.  IOW, every arch with a correct
implementation of the current atomics should continue to work.

Also, this sets __HAVE_64B_ATOMICS to 1 if the particular arch supports
64b atomic operations, and to 0 otherwise.  This is useful both for the
atomics implementation and for concurrent code that has different code
paths depending on whether 64b atomics are available.
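
As a rough sketch of such concurrent code (not taken from this patch
series): a 64b counter that uses a single atomic add when
__HAVE_64B_ATOMICS is 1 and falls back to a small spinlock built from
32b atomics otherwise.  The example_counter names are hypothetical, the
default of 1 assumes a 64-bit host such as x86_64, and the sketch uses
the __atomic* builtins directly, so it assumes GCC 4.7 or later.

```c
#include <stdint.h>

/* Hypothetical default so the sketch builds standalone; in this patch
   the value comes from the arch's bits/atomic.h.  */
#ifndef __HAVE_64B_ATOMICS
# define __HAVE_64B_ATOMICS 1
#endif

struct example_counter
{
  uint64_t value;
#if !__HAVE_64B_ATOMICS
  int lock;			/* 0 = free, 1 = held.  */
#endif
};

/* Add N to the counter and return the previous value.  */
static inline uint64_t
example_counter_add (struct example_counter *c, uint64_t n)
{
#if __HAVE_64B_ATOMICS
  /* One 64b atomic read-modify-write.  */
  return __atomic_fetch_add (&c->value, n, __ATOMIC_RELAXED);
#else
  /* No 64b atomics: serialize access with a 32b spinlock.  */
  while (__atomic_exchange_n (&c->lock, 1, __ATOMIC_ACQUIRE) != 0)
    ;
  uint64_t old = c->value;
  c->value = old + n;
  __atomic_store_n (&c->lock, 0, __ATOMIC_RELEASE);
  return old;
#endif
}
```

The spinlock path is only meant to show the shape of the two code paths;
real code would more likely use an existing lock.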

	* sysdeps/aarch64/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Define.
	* sysdeps/alpha/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/arm/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/i386/i486/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/ia64/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/m68k/coldfire/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/m68k/m680x0/m68020/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/microblaze/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/mips/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/powerpc/powerpc32/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/powerpc/powerpc64/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/s390/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/sparc/sparc32/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
	(__HAVE_64B_ATOMICS, USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/sparc/sparc64/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/tile/tilegx/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/tile/tilepro/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/unix/sysv/linux/hppa/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h
	(__HAVE_64B_ATOMICS, USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/unix/sysv/linux/sh/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.
	* sysdeps/x86_64/bits/atomic.h (__HAVE_64B_ATOMICS,
	USE_ATOMIC_COMPILER_BUILTINS): Likewise.

commit 93952165df3000cc7dd8056ff87021f29a17b7ff
Author: Torvald Riegel <triegel@redhat.com>
Date:   Sat Oct 18 01:02:59 2014 +0200

    Add arch-specific configuration for C11 atomics support.
    
    This sets __HAVE_64B_ATOMICS to 1 if the arch provides 64b atomic
    operations, and to 0 otherwise.  It also sets
    USE_ATOMIC_COMPILER_BUILTINS to true if the existing atomic ops use the
    __atomic* builtins (aarch64, mips partially) or if this has been
    tested (x86_64); otherwise, this is set to false so that C11 atomics will
    be based on the existing atomic operations.

diff --git a/sysdeps/aarch64/bits/atomic.h b/sysdeps/aarch64/bits/atomic.h
index 456e2ec..a8d3ae7 100644
--- a/sysdeps/aarch64/bits/atomic.h
+++ b/sysdeps/aarch64/bits/atomic.h
@@ -36,6 +36,8 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 1
 
 /* Compare and exchange.
    For all "bool" routines, we return FALSE if exchange succesful.  */
diff --git a/sysdeps/alpha/bits/atomic.h b/sysdeps/alpha/bits/atomic.h
index abbbc7c..e9275e9 100644
--- a/sysdeps/alpha/bits/atomic.h
+++ b/sysdeps/alpha/bits/atomic.h
@@ -42,6 +42,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #ifdef UP
 # define __MB		/* nothing */
diff --git a/sysdeps/arm/bits/atomic.h b/sysdeps/arm/bits/atomic.h
index 88cbe67..315b4cf 100644
--- a/sysdeps/arm/bits/atomic.h
+++ b/sysdeps/arm/bits/atomic.h
@@ -33,6 +33,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 void __arm_link_error (void);
 
 #ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
diff --git a/sysdeps/i386/i486/bits/atomic.h b/sysdeps/i386/i486/bits/atomic.h
index 7c432f9..6cd3d21 100644
--- a/sysdeps/i386/i486/bits/atomic.h
+++ b/sysdeps/i386/i486/bits/atomic.h
@@ -54,6 +54,9 @@ typedef uintmax_t uatomic_max_t;
 # endif
 #endif
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
   __sync_val_compare_and_swap (mem, oldval, newval)
diff --git a/sysdeps/ia64/bits/atomic.h b/sysdeps/ia64/bits/atomic.h
index 766cb4b..5e090b9 100644
--- a/sysdeps/ia64/bits/atomic.h
+++ b/sysdeps/ia64/bits/atomic.h
@@ -43,6 +43,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #define __arch_compare_and_exchange_bool_8_acq(mem, newval, oldval) \
   (abort (), 0)
diff --git a/sysdeps/m68k/coldfire/bits/atomic.h b/sysdeps/m68k/coldfire/bits/atomic.h
index ec0c59a..4851999 100644
--- a/sysdeps/m68k/coldfire/bits/atomic.h
+++ b/sysdeps/m68k/coldfire/bits/atomic.h
@@ -49,6 +49,10 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+/* If we have just non-atomic operations, we can as well make them wide.  */
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* The only basic operation needed is compare and exchange.  */
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
   ({ __typeof (mem) __gmemp = (mem);				      \
diff --git a/sysdeps/m68k/m680x0/m68020/bits/atomic.h b/sysdeps/m68k/m680x0/m68020/bits/atomic.h
index 0f081f1..395bac0 100644
--- a/sysdeps/m68k/m680x0/m68020/bits/atomic.h
+++ b/sysdeps/m68k/m680x0/m68020/bits/atomic.h
@@ -44,6 +44,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 #define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
   ({ __typeof (*(mem)) __ret;						      \
      __asm __volatile ("cas%.b %0,%2,%1"				      \
diff --git a/sysdeps/microblaze/bits/atomic.h b/sysdeps/microblaze/bits/atomic.h
index 77004a0..395162d 100644
--- a/sysdeps/microblaze/bits/atomic.h
+++ b/sysdeps/microblaze/bits/atomic.h
@@ -35,6 +35,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 /* Microblaze does not have byte and halfword forms of load and reserve and
    store conditional. So for microblaze we stub out the 8- and 16-bit forms.  */
diff --git a/sysdeps/mips/bits/atomic.h b/sysdeps/mips/bits/atomic.h
index 35b3786..091b199 100644
--- a/sysdeps/mips/bits/atomic.h
+++ b/sysdeps/mips/bits/atomic.h
@@ -44,6 +44,12 @@ typedef uintmax_t uatomic_max_t;
 #define MIPS_PUSH_MIPS2
 #endif
 
+#if _MIPS_SIM == _ABIO32
+#define __HAVE_64B_ATOMICS 0
+#else
+#define __HAVE_64B_ATOMICS 1
+#endif
+
 /* See the comments in <sys/asm.h> about the use of the sync instruction.  */
 #ifndef MIPS_SYNC
 # define MIPS_SYNC	sync
@@ -86,6 +92,8 @@ typedef uintmax_t uatomic_max_t;
    have no assembly alternative available and want to avoid the __sync_*
    builtins if at all possible.  */
 
+#define USE_ATOMIC_COMPILER_BUILTINS 1
+
 /* Compare and exchange.
    For all "bool" routines, we return FALSE if exchange succesful.  */
 
@@ -234,6 +242,8 @@ typedef uintmax_t uatomic_max_t;
 /* This implementation using inline assembly will be removed once glibc
    requires GCC 4.8 or later to build.  */
 
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* Compare and exchange.  For all of the "xxx" routines, we expect a
    "__prev" and a "__cmp" variable to be provided by the enclosing scope,
    in which values are returned.  */
diff --git a/sysdeps/powerpc/powerpc32/bits/atomic.h b/sysdeps/powerpc/powerpc32/bits/atomic.h
index 7613bdc..e356bcb 100644
--- a/sysdeps/powerpc/powerpc32/bits/atomic.h
+++ b/sysdeps/powerpc/powerpc32/bits/atomic.h
@@ -33,6 +33,9 @@
 # define MUTEX_HINT_REL
 #endif
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /*
  * The 32-bit exchange_bool is different on powerpc64 because the subf
  * does signed 64-bit arithmetic while the lwarx is 32-bit unsigned
diff --git a/sysdeps/powerpc/powerpc64/bits/atomic.h b/sysdeps/powerpc/powerpc64/bits/atomic.h
index 527fe7c..fc2d29f 100644
--- a/sysdeps/powerpc/powerpc64/bits/atomic.h
+++ b/sysdeps/powerpc/powerpc64/bits/atomic.h
@@ -33,6 +33,9 @@
 # define MUTEX_HINT_REL
 #endif
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* The 32-bit exchange_bool is different on powerpc64 because the subf
    does signed 64-bit arithmetic while the lwarx is 32-bit unsigned
    (a load word and zero (high 32) form) load.
diff --git a/sysdeps/s390/bits/atomic.h b/sysdeps/s390/bits/atomic.h
index 6824165..b809b5e 100644
--- a/sysdeps/s390/bits/atomic.h
+++ b/sysdeps/s390/bits/atomic.h
@@ -43,6 +43,8 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
   (abort (), (__typeof (*mem)) 0)
@@ -59,6 +61,7 @@ typedef uintmax_t uatomic_max_t;
      __archold; })
 
 #ifdef __s390x__
+# define __HAVE_64B_ATOMICS 1
 # define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \
   ({ __typeof (mem) __archmem = (mem);					      \
      __typeof (*mem) __archold = (oldval);				      \
@@ -67,6 +70,7 @@ typedef uintmax_t uatomic_max_t;
 		       : "d" ((long) (newval)), "m" (*__archmem) : "cc", "memory" );    \
      __archold; })
 #else
+# define __HAVE_64B_ATOMICS 0
 /* For 31 bit we do not really need 64-bit compare-and-exchange. We can
    implement them by use of the csd instruction. The straightforward
    implementation causes warnings so we skip the definition for now.  */
diff --git a/sysdeps/sparc/sparc32/bits/atomic.h b/sysdeps/sparc/sparc32/bits/atomic.h
index 39c2b37..1b4175d 100644
--- a/sysdeps/sparc/sparc32/bits/atomic.h
+++ b/sysdeps/sparc/sparc32/bits/atomic.h
@@ -47,6 +47,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 /* We have no compare and swap, just test and set.
    The following implementation contends on 64 global locks
diff --git a/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h b/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
index 4835019..8441de3 100644
--- a/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
+++ b/sysdeps/sparc/sparc32/sparcv9/bits/atomic.h
@@ -44,6 +44,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
   (abort (), (__typeof (*mem)) 0)
diff --git a/sysdeps/sparc/sparc64/bits/atomic.h b/sysdeps/sparc/sparc64/bits/atomic.h
index ad9dae1..ccb7319 100644
--- a/sysdeps/sparc/sparc64/bits/atomic.h
+++ b/sysdeps/sparc/sparc64/bits/atomic.h
@@ -44,6 +44,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 
 #define __arch_compare_and_exchange_val_8_acq(mem, newval, oldval) \
   (abort (), (__typeof (*mem)) 0)
diff --git a/sysdeps/tile/tilegx/bits/atomic.h b/sysdeps/tile/tilegx/bits/atomic.h
index ce12db0..9aa299f 100644
--- a/sysdeps/tile/tilegx/bits/atomic.h
+++ b/sysdeps/tile/tilegx/bits/atomic.h
@@ -21,6 +21,9 @@
 
 #include <arch/spr_def.h>
 
+#define __HAVE_64B_ATOMICS 1
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* Pick appropriate 8- or 4-byte instruction. */
 #define __atomic_update(mem, v, op)                                     \
   ((__typeof (*(mem))) (__typeof (*(mem) - *(mem)))                     \
diff --git a/sysdeps/tile/tilepro/bits/atomic.h b/sysdeps/tile/tilepro/bits/atomic.h
index cbbf64c..c3865be 100644
--- a/sysdeps/tile/tilepro/bits/atomic.h
+++ b/sysdeps/tile/tilepro/bits/atomic.h
@@ -21,6 +21,9 @@
 
 #include <asm/unistd.h>
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* 32-bit integer compare-and-exchange. */
 static __inline __attribute__ ((always_inline))
 int __atomic_cmpxchg_32 (volatile int *mem, int newval, int oldval)
diff --git a/sysdeps/unix/sysv/linux/hppa/bits/atomic.h b/sysdeps/unix/sysv/linux/hppa/bits/atomic.h
index e55e91b..b5cdfb6 100644
--- a/sysdeps/unix/sysv/linux/hppa/bits/atomic.h
+++ b/sysdeps/unix/sysv/linux/hppa/bits/atomic.h
@@ -44,6 +44,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* prev = *addr;
    if (prev == old)
      *addr = new;
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h b/sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h
index cd9bae3..a8d4a33 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/bits/atomic.h
@@ -36,6 +36,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* The only basic operation needed is compare and exchange.  */
 /* For ColdFire we'll have to trap into the kernel mode anyway,
    so trap from the library rather then from the kernel wrapper.  */
diff --git a/sysdeps/unix/sysv/linux/sh/bits/atomic.h b/sysdeps/unix/sysv/linux/sh/bits/atomic.h
index e819412..6508c33 100644
--- a/sysdeps/unix/sysv/linux/sh/bits/atomic.h
+++ b/sysdeps/unix/sysv/linux/sh/bits/atomic.h
@@ -44,6 +44,9 @@ typedef uintptr_t uatomicptr_t;
 typedef intmax_t atomic_max_t;
 typedef uintmax_t uatomic_max_t;
 
+#define __HAVE_64B_ATOMICS 0
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+
 /* SH kernel has implemented a gUSA ("g" User Space Atomicity) support
    for the user space atomicity. The atomicity macros use this scheme.
 
diff --git a/sysdeps/x86_64/bits/atomic.h b/sysdeps/x86_64/bits/atomic.h
index 25f73aa..7e67427 100644
--- a/sysdeps/x86_64/bits/atomic.h
+++ b/sysdeps/x86_64/bits/atomic.h
@@ -55,6 +55,12 @@ typedef uintmax_t uatomic_max_t;
 # endif
 #endif
 
+#define __HAVE_64B_ATOMICS 1
+#if __GNUC_PREREQ (4, 7)
+#define USE_ATOMIC_COMPILER_BUILTINS 1
+#else
+#define USE_ATOMIC_COMPILER_BUILTINS 0
+#endif
 
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
   __sync_val_compare_and_swap (mem, oldval, newval)