From patchwork Fri Jul 31 20:00:41 2015
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 7968
X-Patchwork-Delegate: tuliom@linux.vnet.ibm.com
Message-ID: <55BBD3E9.8040005@linaro.org>
Date: Fri, 31 Jul 2015 17:00:41 -0300
From: Adhemerval Zanella
To: Tulio Magno Quites Machado Filho
CC: libc-alpha@sourceware.org
Subject: Re: [PATCH][BZ 18743] PowerPC: Fix a race condition when eliding a lock
References: <1438274936-26493-1-git-send-email-tuliom@linux.vnet.ibm.com>
 <55BA703D.7010303@linaro.org> <874mkl3wtq.fsf@totoro.lan>
 <55BB76FA.5040703@linaro.org> <87zj2c1ij1.fsf@totoro.lan>
In-Reply-To: <87zj2c1ij1.fsf@totoro.lan>

On 31-07-2015 11:57, Tulio Magno Quites Machado Filho wrote:
> Adhemerval Zanella writes:
>
>> On 30-07-2015 23:06, Tulio Magno Quites Machado Filho wrote:
>>> Using pthread_rwlock_rdlock as an example, is_lock_free would be a
>>> preprocessor token with the following contents:
>>>
>>> 128       if (ELIDE_LOCK (rwlock->__data.__rwelision,
>>> 129                       rwlock->__data.__lock == 0
>>> 130                       && rwlock->__data.__writer == 0
>>> 131                       && rwlock->__data.__nr_readers == 0))
>>>
>>> And this is where the important memory access happens.
>>> If one of these fields is changed by another thread after this point,
>>> but before __builtin_tbegin, we're able to reproduce the first problem
>>> I mentioned previously.
>>
>> My understanding is these fields will be changed only in the lock-taken
>> path.  Let's say a thread t1 grabs the lock (either by a transaction or
>> by using a default lock) and modifies any 'rwlock->__data' fields after
>> the ELIDE_LOCK check and before the transaction begins on a second
>> thread t2.  My understanding of the issue you are seeing is that the
>> transaction will initiate and, since is_lock_free is 1, it won't abort
>> and will continue as if the lock had been taken in both threads.
>
> I agree with you.  But this won't reproduce the bug.  You have to change
> the order of events.
>
> Let's say t1 is the first one to read rwlock->__data fields and they're
> all 0.  t1 will set is_lock_free to 0.

I believe you mean is_lock_free will be set to '1' if all fields are '0'
(since the lock will not be held).

> Then t2 modifies rwlock->__data before t1 starts the transaction (as in:
> get a lock, instead of a transaction).
> After that, t1 starts the transaction.
> Here comes the problem: the memory barrier is created with
> rwlock->__data saying that someone took the lock, but is_lock_free is
> set to 0.  In this scenario, t1 will proceed with the transaction.
>
>> However my understanding is the transaction won't be able to commit
>> since it will have a conflict on 'rwlock->__data.__lock' or on any
>> memory accessed in the critical region.
>
> Yes, but with the order of events I mentioned, when t1 calls
> pthread_rwlock_unlock, it will run the following code:
>
> 38       if (ELIDE_UNLOCK (rwlock->__data.__writer == 0
> 39                         && rwlock->__data.__nr_readers == 0))
> 40         return 0;
>
> Where ELIDE_UNLOCK is:
>
> 89  static inline bool
> 90  __elide_unlock (int is_lock_free)
> 91  {
> 92    if (is_lock_free)
> 93      {
> 94        __builtin_tend (0);
> 95        return true;
> 96      }
> 97    return false;
> 98  }
> 99
> 100 # define ELIDE_UNLOCK(is_lock_free) \
> 101   __elide_unlock (is_lock_free)
>
> Remember that t1 started the transaction even though rwlock->__data said
> someone was holding the lock.
> So now, in __elide_unlock, is_lock_free will be 1.  __elide_unlock will
> return false and t1 will unlock a lock acquired by t2.

I see now a window where the pthread_rwlock_t internal structures might be
accessed concurrently and the transaction is never finished.  However I do
not think this is powerpc-specific, and thus I think it should be fixed in
the generic algorithm instead.  Something like this:

---

I did not fix the x86_64 code, which would require some adjustments.  I
checked on ppc64le and the nptl tests pass.  It would be good if you could
create a testcase to stress this issue (although I see that could be quite
hard, since it is very timing-dependent).

The problem with this approach is that it couples the elide macros with
the rdlock fields, and I think the idea would be to make this header more
generic for multiple elide algorithms.  Suggestions?

>
>> Are you saying that the transaction is not being aborted
>> in the critical region?
>
> No.
> I'm saying the memory access to rwlock->__data is critical and should
> happen inside the transaction.  Without this, we can't say the whole
> process is atomic.
>
>> If the compiler is doing memory reordering around HTM builtins it is
>> *clearly* a bug, since transaction begin/end acts as a full memory
>> barrier (sync).  You should also add a compiler barrier in the pthread
>> elision code as well.
>
> Agreed.  But I guess that's out of scope for this particular bug.
> I could prepare a separate patch with these changes if you agree with
> that.

diff --git a/nptl/pthread_rwlock_rdlock.c b/nptl/pthread_rwlock_rdlock.c
index eb7ac8d..3756bb6 100644
--- a/nptl/pthread_rwlock_rdlock.c
+++ b/nptl/pthread_rwlock_rdlock.c
@@ -126,9 +126,9 @@ __pthread_rwlock_rdlock (pthread_rwlock_t *rwlock)
   LIBC_PROBE (rdlock_entry, 1, rwlock);
 
   if (ELIDE_LOCK (rwlock->__data.__rwelision,
-		  rwlock->__data.__lock == 0
-		  && rwlock->__data.__writer == 0
-		  && rwlock->__data.__nr_readers == 0))
+		  rwlock->__data.__lock,
+		  rwlock->__data.__writer,
+		  rwlock->__data.__nr_readers))
     return 0;
 
   /* Make sure we are alone.  */
diff --git a/nptl/pthread_rwlock_tryrdlock.c b/nptl/pthread_rwlock_tryrdlock.c
index 256188a..bb199e4 100644
--- a/nptl/pthread_rwlock_tryrdlock.c
+++ b/nptl/pthread_rwlock_tryrdlock.c
@@ -33,9 +33,9 @@ __pthread_rwlock_tryrdlock (pthread_rwlock_t *rwlock)
     rwlock->__data.__shared == LLL_PRIVATE ?
     FUTEX_PRIVATE : FUTEX_SHARED;
 
   if (ELIDE_TRYLOCK (rwlock->__data.__rwelision,
-		     rwlock->__data.__lock == 0
-		     && rwlock->__data.__nr_readers == 0
-		     && rwlock->__data.__writer, 0))
+		     rwlock->__data.__lock,
+		     rwlock->__data.__writer,
+		     rwlock->__data.__nr_readers, 0))
     return 0;
 
   lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
diff --git a/nptl/pthread_rwlock_trywrlock.c b/nptl/pthread_rwlock_trywrlock.c
index 918a8f2..49c6757 100644
--- a/nptl/pthread_rwlock_trywrlock.c
+++ b/nptl/pthread_rwlock_trywrlock.c
@@ -28,9 +28,9 @@ __pthread_rwlock_trywrlock (pthread_rwlock_t *rwlock)
   int result = EBUSY;
 
   if (ELIDE_TRYLOCK (rwlock->__data.__rwelision,
-		     rwlock->__data.__lock == 0
-		     && rwlock->__data.__nr_readers == 0
-		     && rwlock->__data.__writer, 1))
+		     rwlock->__data.__lock,
+		     rwlock->__data.__writer,
+		     rwlock->__data.__nr_readers, 1))
     return 0;
 
   lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
diff --git a/nptl/pthread_rwlock_unlock.c b/nptl/pthread_rwlock_unlock.c
index bdd115d..0946d0c 100644
--- a/nptl/pthread_rwlock_unlock.c
+++ b/nptl/pthread_rwlock_unlock.c
@@ -35,8 +35,7 @@ __pthread_rwlock_unlock (pthread_rwlock_t *rwlock)
   LIBC_PROBE (rwlock_unlock, 1, rwlock);
 
-  if (ELIDE_UNLOCK (rwlock->__data.__writer == 0
-		    && rwlock->__data.__nr_readers == 0))
+  if (ELIDE_UNLOCK (rwlock->__data.__writer, rwlock->__data.__nr_readers))
     return 0;
 
   lll_lock (rwlock->__data.__lock, rwlock->__data.__shared);
diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c
index 60fa909..0161876 100644
--- a/nptl/pthread_rwlock_wrlock.c
+++ b/nptl/pthread_rwlock_wrlock.c
@@ -98,9 +98,9 @@ __pthread_rwlock_wrlock (pthread_rwlock_t *rwlock)
   LIBC_PROBE (wrlock_entry, 1, rwlock);
 
   if (ELIDE_LOCK (rwlock->__data.__rwelision,
-		  rwlock->__data.__lock == 0
-		  && rwlock->__data.__writer == 0
-		  && rwlock->__data.__nr_readers == 0))
+		  rwlock->__data.__lock,
+		  rwlock->__data.__writer,
+		  rwlock->__data.__nr_readers))
     return 0;
 
   /* Make sure we are alone.
  */
diff --git a/sysdeps/generic/elide.h b/sysdeps/generic/elide.h
index 80a3a03..64d9cce 100644
--- a/sysdeps/generic/elide.h
+++ b/sysdeps/generic/elide.h
@@ -15,11 +15,12 @@
    You should have received a copy of the GNU Lesser General Public
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
+
 #ifndef ELIDE_H
 #define ELIDE_H 1
 
-#define ELIDE_LOCK(adapt_count, is_lock_free) 0
-#define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0
-#define ELIDE_UNLOCK(is_lock_free) 0
+#define ELIDE_LOCK(adapt_count, lock, writer, readers) 0
+#define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0
+#define ELIDE_UNLOCK(writer, readers) 0
 
 #endif
diff --git a/sysdeps/powerpc/nptl/elide.h b/sysdeps/powerpc/nptl/elide.h
index 389f5a5..b3cc50a 100644
--- a/sysdeps/powerpc/nptl/elide.h
+++ b/sysdeps/powerpc/nptl/elide.h
@@ -27,7 +27,7 @@
    ADAPT_COUNT is a pointer to per-lock state variable.  */
 
 static inline bool
-__elide_lock (uint8_t *adapt_count, int is_lock_free)
+__elide_lock (uint8_t *adapt_count, int *lock, int *writer,
+	      unsigned int *readers)
 {
   if (*adapt_count > 0)
     {
@@ -39,7 +39,10 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free)
     {
       if (__builtin_tbegin (0))
	 {
-	   if (is_lock_free)
+	   /* The compiler barrier is required because some GCC versions
+	      might reorder the lock read before the transaction init
+	      builtin.  */
+	   asm volatile ("" ::: "memory");
+	   if ((*lock == 0) && (*writer == 0) && (*readers == 0))
	     return true;
	   /* Lock was busy.
	      */
	   __builtin_tabort (_ABORT_LOCK_BUSY);
@@ -66,30 +69,31 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free)
   return false;
 }
 
-# define ELIDE_LOCK(adapt_count, is_lock_free) \
-  __elide_lock (&(adapt_count), is_lock_free)
+# define ELIDE_LOCK(adapt_count, lock, writer, readers) \
+  __elide_lock (&(adapt_count), &(lock), &(writer), &(readers))
 
 static inline bool
-__elide_trylock (uint8_t *adapt_count, int is_lock_free, int write)
+__elide_trylock (uint8_t *adapt_count, int *lock, int *writer,
+		 unsigned int *readers, int write)
 {
   if (__elision_aconf.try_tbegin > 0)
     {
       if (write)
	 __builtin_tabort (_ABORT_NESTED_TRYLOCK);
-      return __elide_lock (adapt_count, is_lock_free);
+      return __elide_lock (adapt_count, lock, writer, readers);
     }
   return false;
 }
 
-# define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) \
-  __elide_trylock (&(adapt_count), is_lock_free, write)
+# define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) \
+  __elide_trylock (&(adapt_count), &(lock), &(writer), &(readers), write)
 
 static inline bool
-__elide_unlock (int is_lock_free)
+__elide_unlock (int *writer, unsigned int *readers)
 {
-  if (is_lock_free)
+  if ((*writer == 0) && (*readers == 0))
     {
       __builtin_tend (0);
       return true;
@@ -97,14 +101,14 @@ __elide_unlock (int is_lock_free)
   return false;
 }
 
-# define ELIDE_UNLOCK(is_lock_free) \
-  __elide_unlock (is_lock_free)
+# define ELIDE_UNLOCK(writer, readers) \
+  __elide_unlock (&(writer), &(readers))
 
 # else
 
-# define ELIDE_LOCK(adapt_count, is_lock_free) 0
-# define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0
-# define ELIDE_UNLOCK(is_lock_free) 0
+# define ELIDE_LOCK(adapt_count, lock, writer, readers) 0
+# define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0
+# define ELIDE_UNLOCK(writer, readers) 0
 
 #endif /* ENABLE_LOCK_ELISION */