Fix atomic_full_barrier on x86 and x86_64.
Commit Message
On Wed, 2014-10-29 at 22:54 +0000, Joseph S. Myers wrote:
> On Wed, 29 Oct 2014, Torvald Riegel wrote:
>
> > So, mfence seems to have been introduced with SSE2. Should I try to
> > test for SSE2 specifically, or rather assume SSE2 support for i786?
>
> I think the i786 directories should be removed; config.guess will never
> return such a processor name for GNU/Linux at least (I don't know what it
> returns on Hurd). The comment in sysdeps/i386/i786/Implies suggests it
> was for PII, but PII was still family 6 (and family 15 came after family
> 6, I don't think there were any x86 processors with family numbers 7 to
> 14).
>
> So, anything conditional on SSE2 should test for __SSE2__.
How does this updated patch look? The non-SSE full barrier is what,
AFAIU, GCC emits.
Comments
Ping.
On Tue, 2014-11-25 at 11:51 +0100, Torvald Riegel wrote:
> How does this updated patch look? The non-SSE full barrier is what,
> AFAIU, GCC emits.
>
Ping.
On Mon, 2014-12-08 at 17:38 +0100, Torvald Riegel wrote:
> Ping.
On 25 Nov 2014 11:51, Torvald Riegel wrote:
> --- a/sysdeps/i386/i486/bits/atomic.h
> +++ b/sysdeps/i386/i486/bits/atomic.h
> @@ -535,3 +535,12 @@ typedef uintmax_t uatomic_max_t;
> #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
>
> #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
> +
> +#ifdef __SSE2__
> +# define atomic_full_barrier() __asm ("mfence" ::: "memory")
> +#else
> +# define atomic_full_barrier() \
> + __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
> +#endif
so this will kick in only when glibc itself is compiled with -msse2/etc...
support. then again, these barriers only get used by glibc internal code,
so i guess this is the best answer. plus it only impacts x86, and it's not
like anyone really cares about that anymore ;).
lgtm
-mike
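The compile-time selection discussed in the review above can be tried outside glibc. The following is a minimal sketch, not the glibc code: every demo_/DEMO_ name is hypothetical, DEMO_LOCK_PREFIX is a local stand-in for glibc's LOCK_PREFIX, a GCC-compatible compiler is assumed, and the __atomic_thread_fence fallback for non-x86 targets is an addition so the file builds elsewhere.

```c
/* Hypothetical stand-in for glibc's internal LOCK_PREFIX.  */
#define DEMO_LOCK_PREFIX "lock;"

#if defined __x86_64__ || defined __SSE2__
/* mfence is available: x86_64 always, i386 only when built with SSE2.  */
# define demo_full_barrier() __asm__ __volatile__ ("mfence" ::: "memory")
# define DEMO_BARRIER_KIND "mfence"
#elif defined __i386__
/* No SSE2: a locked read-modify-write on the stack acts as a full barrier.  */
# define demo_full_barrier() \
  __asm__ __volatile__ (DEMO_LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
# define DEMO_BARRIER_KIND "lock orl"
#else
/* Assumption: not part of the patch, only here for non-x86 builds.  */
# define demo_full_barrier() __atomic_thread_fence (__ATOMIC_SEQ_CST)
# define DEMO_BARRIER_KIND "builtin fence"
#endif

/* Report which variant the preprocessor picked for this build.  */
const char *
demo_barrier_kind (void)
{
  return DEMO_BARRIER_KIND;
}

/* Store DATA, issue a full barrier, then store FLAG.  Returns the sum of
   both stored values so a caller can check that the stores happened.  */
int
demo_publish (int *data, int *flag)
{
  *data = 42;
  demo_full_barrier ();
  *flag = 1;
  return *data + *flag;
}
```

On a 32-bit x86 build, compiling this once with -msse2 and once without should switch demo_barrier_kind () between the two x86 variants, which is the same build-time dependency the review points out for glibc itself.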
commit 055ecbc51899f9f2c560545b183d8cf01df3de94
Author: Torvald Riegel <triegel@redhat.com>
Date: Wed Oct 29 10:34:36 2014 +0100
Fix atomic_full_barrier on x86 and x86_64.
[BZ #17403]
* sysdeps/x86_64/bits/atomic.h (atomic_full_barrier,
atomic_read_barrier, atomic_write_barrier): Define.
* sysdeps/i386/i486/bits/atomic.h (atomic_full_barrier,
atomic_read_barrier, atomic_write_barrier): Define.
--- a/sysdeps/i386/i486/bits/atomic.h
+++ b/sysdeps/i386/i486/bits/atomic.h
@@ -535,3 +535,12 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#ifdef __SSE2__
+# define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#else
+# define atomic_full_barrier() \
+ __asm __volatile (LOCK_PREFIX "orl $0, (%%esp)" ::: "memory")
+#endif
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
--- a/sysdeps/x86_64/bits/atomic.h
+++ b/sysdeps/x86_64/bits/atomic.h
@@ -472,3 +472,7 @@ typedef uintmax_t uatomic_max_t;
 #define atomic_or(mem, mask) __arch_or_body (LOCK_PREFIX, mem, mask)
 
 #define catomic_or(mem, mask) __arch_or_body (__arch_cprefix, mem, mask)
+
+#define atomic_full_barrier() __asm ("mfence" ::: "memory")
+#define atomic_read_barrier() __asm ("" ::: "memory")
+#define atomic_write_barrier() __asm ("" ::: "memory")
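
The read and write barriers in the patch are compiler-only barriers. On x86, ordinary loads are not reordered with other loads and ordinary stores are not reordered with other stores, so only the compiler has to be stopped from reordering them; a store followed by a load is the case that needs the full barrier. A minimal sketch of how such barriers are typically paired, assuming a GCC-compatible compiler: the demo_ names and the mailbox structure are hypothetical and not taken from glibc, and the non-x86 fallback is an addition.

```c
#if defined __i386__ || defined __x86_64__
/* Same form as the patch: stop compiler reordering, emit no instruction.  */
# define demo_read_barrier()  __asm__ __volatile__ ("" ::: "memory")
# define demo_write_barrier() __asm__ __volatile__ ("" ::: "memory")
#else
/* Assumption: weaker memory models need real fences here.  */
# define demo_read_barrier()  __atomic_thread_fence (__ATOMIC_ACQUIRE)
# define demo_write_barrier() __atomic_thread_fence (__ATOMIC_RELEASE)
#endif

/* Hypothetical one-slot mailbox used only for illustration.  */
struct demo_mailbox
{
  int payload;
  volatile int ready;
};

/* Producer side: write the payload, then publish it via the flag.
   The write barrier keeps the two stores in program order.  */
void
demo_send (struct demo_mailbox *m, int value)
{
  m->payload = value;
  demo_write_barrier ();
  m->ready = 1;
}

/* Consumer side: return 0 if nothing has been published yet.  Otherwise
   read the payload after the flag, store it in *OUT, and return 1.  */
int
demo_try_recv (struct demo_mailbox *m, int *out)
{
  if (!m->ready)
    return 0;
  demo_read_barrier ();
  *out = m->payload;
  return 1;
}
```

In real multi-threaded use the producer and consumer would run on different threads; the sketch only shows where each barrier sits relative to the payload and flag accesses.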