Message ID | 20170213102220.31ba7d56@kryten |
---|---|
State | Rejected |
Headers |
Received: (qmail 98207 invoked by alias); 12 Feb 2017 23:22:32 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 98186 invoked by uid 89); 12 Feb 2017 23:22:30 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.8 required=5.0 tests=AWL, BAYES_05, RCVD_IN_DNSWL_NONE, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=uintmax_t, sk:atomic, 2307, Hx-languages-length:1389 X-HELO: hr2.samba.org Date: Mon, 13 Feb 2017 10:22:20 +1100 From: Anton Blanchard <anton@samba.org> To: tuliom@linux.vnet.ibm.com, cseo@linux.vnet.ibm.com Cc: libc-alpha@sourceware.org Subject: [PATCH] powerpc: Use lwsync on 64bit Message-ID: <20170213102220.31ba7d56@kryten> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit |
Commit Message
Anton Blanchard
Feb. 12, 2017, 11:22 p.m. UTC
Either an isync or an lwsync can be used as an acquire barrier after a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the isync instruction has other side effects that we don't need, use lwsync. 2017-02-12 Anton Blanchard <anton@samba.org> * sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be overridden. * sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR
Comments
Anton Blanchard <anton@samba.org> writes: > Either an isync or an lwsync can be used as an acquire barrier after > a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the > isync instruction has other side effects that we don't need, use lwsync. > > 2017-02-12 Anton Blanchard <anton@samba.org> > > * sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be > overridden. > * sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR LGTM.
On 13/02/2017 11:42, Tulio Magno Quites Machado Filho wrote: > Anton Blanchard <anton@samba.org> writes: > >> Either an isync or an lwsync can be used as an acquire barrier after >> a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the >> isync instruction has other side effects that we don't need, use lwsync. >> >> 2017-02-12 Anton Blanchard <anton@samba.org> >> >> * sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be >> overridden. >> * sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR > > LGTM. > I do not think this is correct for all the primitives that use __ARCH_ACQ_INSTR. For instance __arch_atomic_exchange_32_acq is defined as #define __arch_atomic_exchange_32_acq(mem, value) \ ({ \ __typeof (*mem) __val; \ __asm __volatile ( \ "1: lwarx %0,0,%2" MUTEX_HINT_ACQ "\n" \ " stwcx. %3,0,%2\n" \ " bne- 1b\n" \ " " __ARCH_ACQ_INSTR \ : "=&r" (__val), "=m" (*mem) \ : "b" (mem), "r" (value), "m" (*mem) \ : "cr0", "memory"); \ __val; \ }) Which is analogous to C11: #include <stdatomic.h> #include <stdint.h> uint32_t foo (uint32_t *x) { return atomic_exchange_explicit (x, 0, memory_order_acquire); } And GCC 6.2.1 uses and 'isync' a memory barrier. It is also on par with [1] which defines acquire semantics to require 'isync' instructions. [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
On 2/12/17, 9:22 PM, "Anton Blanchard" <libc-alpha-owner@sourceware.org on behalf of anton@samba.org> wrote: Either an isync or an lwsync can be used as an acquire barrier after a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the isync instruction has other side effects that we don't need, use lwsync. OK -- Carlos Eduardo Seo Software Engineer - Linux on Power Toolchain cseo@linux.vnet.ibm.com
Adhemerval Zanella <adhemerval.zanella@linaro.org> writes: > I do not think this is correct for all the primitives that use > __ARCH_ACQ_INSTR. For instance __arch_atomic_exchange_32_acq is defined as > > #define __arch_atomic_exchange_32_acq(mem, value) \ > ({ \ > __typeof (*mem) __val; \ > __asm __volatile ( \ > "1: lwarx %0,0,%2" MUTEX_HINT_ACQ "\n" \ > " stwcx. %3,0,%2\n" \ > " bne- 1b\n" \ > " " __ARCH_ACQ_INSTR \ > : "=&r" (__val), "=m" (*mem) \ > : "b" (mem), "r" (value), "m" (*mem) \ > : "cr0", "memory"); \ > __val; \ > }) > > > Which is analogous to C11: > > #include <stdatomic.h> > #include <stdint.h> > > uint32_t foo (uint32_t *x) > { > return atomic_exchange_explicit (x, 0, memory_order_acquire); > } > > And GCC 6.2.1 uses and 'isync' a memory barrier. It is also on par with [1] > which defines acquire semantics to require 'isync' instructions. > > > [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html Good point. Per this document, it would also mess with the spinlock. Anton, are you trying to fix a more specific issue?
On Mon, 2017-02-13 at 12:23 -0200, Adhemerval Zanella wrote: > > On 13/02/2017 11:42, Tulio Magno Quites Machado Filho wrote: > > Anton Blanchard <anton@samba.org> writes: > > > >> Either an isync or an lwsync can be used as an acquire barrier after > >> a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the > >> isync instruction has other side effects that we don't need, use lwsync. > >> > >> 2017-02-12 Anton Blanchard <anton@samba.org> > >> > >> * sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be > >> overridden. > >> * sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR > > > > LGTM. > > > > I do not think this is correct for all the primitives that use > __ARCH_ACQ_INSTR. For instance __arch_atomic_exchange_32_acq is defined as > > #define __arch_atomic_exchange_32_acq(mem, value) \ > ({ \ > __typeof (*mem) __val; \ > __asm __volatile ( \ > "1: lwarx %0,0,%2" MUTEX_HINT_ACQ "\n" \ > " stwcx. %3,0,%2\n" \ > " bne- 1b\n" \ > " " __ARCH_ACQ_INSTR \ > : "=&r" (__val), "=m" (*mem) \ > : "b" (mem), "r" (value), "m" (*mem) \ > : "cr0", "memory"); \ > __val; \ > }) > > > Which is analogous to C11: > > #include <stdatomic.h> > #include <stdint.h> > > uint32_t foo (uint32_t *x) > { > return atomic_exchange_explicit (x, 0, memory_order_acquire); > } > > And GCC 6.2.1 uses and 'isync' a memory barrier. It is also on par with [1] > which defines acquire semantics to require 'isync' instructions. > > > [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html I don't know enough about the details of powerpc to comment in detail. However, I'd like to make a few general comments: * We should strive to be as close to C11 atomics as produced by the compiler. For example, if GCC issues an isync instead of lwsync, we should be fine doing the same. If you think GCC is generating inefficient code, please take the change to GCC first and reach consensus for a change there. We'll just use GCC's atomics eventually (though MUTEX_HINT_ACQ may be a reason for a special case), so applying any change in GCC is necessary anyway to make such a change effective. * Any changes in the atomics implementation should be reviewed carefully by people deeply familiar with the HW memory models and how they map to C11. If in doubt, take your time.
diff --git a/sysdeps/powerpc/atomic-machine.h b/sysdeps/powerpc/atomic-machine.h index 0a58203..31d67fa 100644 --- a/sysdeps/powerpc/atomic-machine.h +++ b/sysdeps/powerpc/atomic-machine.h @@ -57,7 +57,9 @@ typedef uintmax_t uatomic_max_t; # define __ARCH_ACQ_INSTR "" # define __ARCH_REL_INSTR "" #else -# define __ARCH_ACQ_INSTR "isync" +# ifndef __ARCH_ACQ_INSTR +# define __ARCH_ACQ_INSTR "isync" +# endif # ifndef __ARCH_REL_INSTR # define __ARCH_REL_INSTR "sync" # endif diff --git a/sysdeps/powerpc/powerpc64/atomic-machine.h b/sysdeps/powerpc/powerpc64/atomic-machine.h index 40c308e..76c586a 100644 --- a/sysdeps/powerpc/powerpc64/atomic-machine.h +++ b/sysdeps/powerpc/powerpc64/atomic-machine.h @@ -230,6 +230,7 @@ * "light weight" sync can also be used for the release barrier. */ #ifndef UP +# define __ARCH_ACQ_INSTR "lwsync" # define __ARCH_REL_INSTR "lwsync" #endif #define atomic_write_barrier() __asm ("lwsync" ::: "memory")