From patchwork Thu Jan 22 18:24:25 2015
X-Patchwork-Submitter: Torvald Riegel
X-Patchwork-Id: 4760
Subject: [PATCH] Don't rely on having LP64 in semaphores if 64b atomic ops
 are available.
From: Torvald Riegel
To: GLIBC Devel
Cc: Steve Ellcey, "Maciej W. Rozycki", "H.J. Lu"
Date: Thu, 22 Jan 2015 19:24:25 +0100
Message-ID: <1421951065.4572.47.camel@triegel.csb>

This attempts to fix an issue in the new semaphore implementation on
archs that provide 64b atomic ops but do not use an LP64 data model;
this affects the MIPS N32 ABI and, I guess, x32 as well.

Steve, can you please test this patch?  I'll do the same on
x86_64-linux in the meantime.

Also, do we already have any information on which data models we
support?  I reckon LP64 and ILP32, or are there others as well?  I'd
like to add a comment to atomic.h, and if we have such information
already, I'd just point to it.

2015-01-22  Torvald Riegel

	* nptl/sem_post.c (__new_sem_post): Don't rely on an LP64 data
	model.
	* nptl/sem_waitcommon.c (__sem_wait_cleanup, __new_sem_wait_fast,
	__new_sem_wait_slow): Likewise.
	* sysdeps/nptl/internaltypes.h (struct new_sem): Likewise.
commit e34f47a1050638255898e392df63c82052f97f48
Author: Torvald Riegel
Date:   Thu Jan 22 19:18:59 2015 +0100

    Don't rely on having LP64 in semaphores if 64b atomic ops are available.

diff --git a/nptl/sem_post.c b/nptl/sem_post.c
index 9162e4c..cd9e666 100644
--- a/nptl/sem_post.c
+++ b/nptl/sem_post.c
@@ -65,7 +65,7 @@ __new_sem_post (sem_t *sem)
      added tokens before (the release sequence includes atomic RMW operations
      by other threads).  */
   /* TODO Use atomic_fetch_add to make it scale better than a CAS loop?  */
-  unsigned long int d = atomic_load_relaxed (&isem->data);
+  unsigned long long int d = atomic_load_relaxed (&isem->data);
   do
     {
       if ((d & SEM_VALUE_MASK) == SEM_VALUE_MAX)
diff --git a/nptl/sem_waitcommon.c b/nptl/sem_waitcommon.c
index 96848d7..12c77e6 100644
--- a/nptl/sem_waitcommon.c
+++ b/nptl/sem_waitcommon.c
@@ -187,7 +187,7 @@ __sem_wait_cleanup (void *arg)
 
 #if __HAVE_64B_ATOMICS
   /* Stop being registered as a waiter.  See below for MO.  */
-  atomic_fetch_add_relaxed (&sem->data, -(1UL << SEM_NWAITERS_SHIFT));
+  atomic_fetch_add_relaxed (&sem->data, -(1ULL << SEM_NWAITERS_SHIFT));
 #else
   __sem_wait_32_finish (sem);
 #endif
@@ -228,7 +228,7 @@ __new_sem_wait_fast (struct new_sem *sem, int definitive_result)
      and the failure path of the CAS.  If the weak CAS fails and we need a
      definitive result, retry.  */
 #if __HAVE_64B_ATOMICS
-  unsigned long d = atomic_load_relaxed (&sem->data);
+  unsigned long long d = atomic_load_relaxed (&sem->data);
   do
     {
       if ((d & SEM_VALUE_MASK) == 0)
@@ -263,8 +263,8 @@ __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 #if __HAVE_64B_ATOMICS
   /* Add a waiter.  Relaxed MO is sufficient because we can rely on the
      ordering provided by the RMW operations we use.  */
-  unsigned long d = atomic_fetch_add_relaxed (&sem->data,
-					      1UL << SEM_NWAITERS_SHIFT);
+  unsigned long long d = atomic_fetch_add_relaxed (&sem->data,
+						   1ULL << SEM_NWAITERS_SHIFT);
 
   pthread_cleanup_push (__sem_wait_cleanup, sem);
 
@@ -304,7 +304,7 @@ __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 	      err = -1;
 	      /* Stop being registered as a waiter.  */
 	      atomic_fetch_add_relaxed (&sem->data,
-					-(1UL << SEM_NWAITERS_SHIFT));
+					-(1ULL << SEM_NWAITERS_SHIFT));
 	      break;
 	    }
 	  /* Relaxed MO is sufficient; see below.  */
@@ -320,7 +320,7 @@ __new_sem_wait_slow (struct new_sem *sem, const struct timespec *abstime)
 	     up-to-date value; the futex_wait or the CAS perform the real
 	     work.  */
 	  if (atomic_compare_exchange_weak_acquire (&sem->data,
-	      &d, d - 1 - (1UL << SEM_NWAITERS_SHIFT)))
+	      &d, d - 1 - (1ULL << SEM_NWAITERS_SHIFT)))
 	    {
 	      err = 0;
 	      break;
diff --git a/sysdeps/nptl/internaltypes.h b/sysdeps/nptl/internaltypes.h
index 7c0d240..b28ef93 100644
--- a/sysdeps/nptl/internaltypes.h
+++ b/sysdeps/nptl/internaltypes.h
@@ -155,7 +155,7 @@ struct new_sem
 # endif
 # define SEM_NWAITERS_SHIFT 32
 # define SEM_VALUE_MASK (~(unsigned int)0)
-  unsigned long int data;
+  unsigned long long int data;
   int private;
   int pad;
 #else