Fix atomic_full_barrier on x86 and x86_64.

Message ID 1416912710.1771.176.camel@triegel.csb
State Dropped

Commit Message

Torvald Riegel Nov. 25, 2014, 10:51 a.m. UTC
  On Wed, 2014-10-29 at 22:54 +0000, Joseph S. Myers wrote:
> On Wed, 29 Oct 2014, Torvald Riegel wrote:
> 
> > So, mfence seems to have been introduced with SSE2.  Should I try to
> > test for SSE2 specifically, or rather assume SSE2 support for i786?
> 
> I think the i786 directories should be removed; config.guess will never 
> return such a processor name for GNU/Linux at least (I don't know what it 
> returns on Hurd).  The comment in sysdeps/i386/i786/Implies suggests it 
> was for PII, but PII was still family 6 (and family 15 came after family 
> 6, I don't think there were any x86 processors with family numbers 7 to 
> 14).
> 
> So, anything conditional on SSE2 should test for __SSE2__.

How does this updated patch look?  The non-SSE full barrier is what,
AFAIU, GCC emits.
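
Editorial sketch, not part of the original message: one way to check the claim about GCC's output is to put the compiler builtin next to a hand-written fence and compare the generated assembly, for example with `gcc -m32 -march=i486 -O2 -S` against `gcc -m32 -msse2 -O2 -S`. The names fence_builtin, fence_asm, publish_then_check and fence_smoke_test are hypothetical and do not exist in glibc. The non-x86 fallback is an assumption added only so the file compiles elsewhere.

```c
#include <assert.h>

/* Hypothetical names throughout; none of these are glibc identifiers.  */

/* Let the compiler choose the instruction sequence for a sequentially
   consistent fence on the current target.  */
static inline void
fence_builtin (void)
{
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
}

/* Hand-written fence that mirrors the selection logic of the patch.  */
static inline void
fence_asm (void)
{
#if defined __SSE2__
  __asm__ __volatile__ ("mfence" ::: "memory");
#elif defined __i386__
  /* A locked read-modify-write on the stack acts as a full barrier
     and leaves the stack contents unchanged.  */
  __asm__ __volatile__ ("lock; orl $0, (%%esp)" ::: "memory");
#else
  /* Assumption: not x86, so defer to the builtin.  */
  __atomic_thread_fence (__ATOMIC_SEQ_CST);
#endif
}

static int flag_a, flag_b;

/* Store one flag, fence, then load another flag.  A store followed by
   a load of a different location is the pair that x86 may reorder
   without a full barrier.  Returns the loaded value.  */
int
publish_then_check (void)
{
  flag_a = 1;
  fence_asm ();
  return flag_b;
}

/* Single-threaded smoke test: the fences must execute without faulting
   and must not disturb ordinary data.  */
void
fence_smoke_test (void)
{
  fence_builtin ();
  assert (publish_then_check () == 0);
}
```

This only shows that both forms assemble and run. It does not demonstrate the ordering guarantee itself, which would need a multi-threaded litmus test.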
  

Comments

Torvald Riegel Dec. 8, 2014, 4:38 p.m. UTC | #1
Ping.

  
Torvald Riegel Dec. 15, 2014, 9:56 p.m. UTC | #2
Ping.

  
Mike Frysinger March 5, 2015, 6:19 p.m. UTC | #3
On 25 Nov 2014 11:51, Torvald Riegel wrote:
> --- a/sysdeps/i386/i486/bits/atomic.h
> +++ b/sysdeps/i386/i486/bits/atomic.h
> @@ -535,3 +535,12 @@ typedef uintmax_t uatomic_max_t;
>  #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
>  
>  #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
> +
> +#ifdef __SSE2__
> +# define atomic_full_barrier() __asm ("mfence" ::: "memory")
> +#else
> +# define atomic_full_barrier() \
> +    __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
> +#endif

so this will kick in only when glibc itself is compiled with -msse2/etc...
support.  then again, these barriers only get used by glibc internal code,
so i guess this is the best answer.  plus it only impacts x86, and it's not
like anyone really cares about that anymore ;).

lgtm
-mike
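
Editorial sketch, not part of the original message: because the choice is made by the preprocessor, the barrier that ends up in a build depends on the flags used to compile that translation unit. To my knowledge x86_64 compilers define __SSE2__ by default, while a 32-bit build typically needs -msse2 or a -march value that implies it. The predefined macros can be listed with `gcc -m32 -dM -E - </dev/null | grep SSE2`. The helpers below, full_barrier_variant and check_variant_is_known, are hypothetical names used only to show which branch of the #ifdef a given build takes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: reports which branch of the
   patch's #ifdef this translation unit would select.  */
const char *
full_barrier_variant (void)
{
#ifdef __SSE2__
  return "mfence";
#else
  return "lock orl";
#endif
}

/* Sanity check: the helper returns one of the two known variants.  */
void
check_variant_is_known (void)
{
  const char *v = full_barrier_variant ();
  assert (strcmp (v, "mfence") == 0 || strcmp (v, "lock orl") == 0);
}
```

Building this once with -msse2 and once without, on a 32-bit target, should flip the reported variant. That is the compile-time dependence the review comment points out.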
  

Patch

commit 055ecbc51899f9f2c560545b183d8cf01df3de94
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Oct 29 10:34:36 2014 +0100

    Fix atomic_full_barrier on x86 and x86_64.
    
    	[BZ #17403]
    	* sysdeps/x86_64/bits/atomic.h (atomic_full_barrier,
    	atomic_read_barrier, atomic_write_barrier): Define.
    	* sysdeps/i386/i486/bits/atomic.h (atomic_full_barrier,
    	atomic_read_barrier, atomic_write_barrier): Define.

diff --git a/sysdeps/i386/i486/bits/atomic.h b/sysdeps/i386/i486/bits/atomic.h
index 739d384..c77fe2e 100644
--- a/sysdeps/i386/i486/bits/atomic.h
+++ b/sysdeps/i386/i486/bits/atomic.h
@@ -535,3 +535,12 @@  typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#ifdef __SSE2__
+# define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#else
+# define atomic_full_barrier() \
+    __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
+#endif
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
diff --git a/sysdeps/x86_64/bits/atomic.h b/sysdeps/x86_64/bits/atomic.h
index 99dfb50..7e67427 100644
--- a/sysdeps/x86_64/bits/atomic.h
+++ b/sysdeps/x86_64/bits/atomic.h
@@ -472,3 +472,7 @@  typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")