From patchwork Tue Aug 25 16:47:32 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 8431 Received: (qmail 56759 invoked by alias); 25 Aug 2015 16:47:45 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 56748 invoked by uid 89); 25 Aug 2015 16:47:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-yk0-f173.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding; bh=ubhN9toFch06YRAugJa5f4jhEoLO7B3nC1RAxchsvnE=; b=eACNpiBR1PHrMYNrrpcDLY2JtMSzH7d1GwO64wMPYT3PAvP44rPvm4yvkcLgne38wm uJdBMOUyDmXMqYL4OIspd3t9ionZcsnJXp0kjS6WrUwLsqyryPAQn0/ootgdjjaAHFQZ /dGSesOeA59g5SDpalrnxVN1HcTDTfIfoShP7jXSI9CaemGGFJ0R4HRoF9LXz637gQOz docxZwHNjPqUX/jjytNuUt4bxok/z0Y/lSenjyMCKNPWihx5of87FiziWIJOGd1S2dKf Xv2gpg9Gm00bCb15pnQxTwHC9ESoCuzcP2rmSI3fiPG7IK6XXfwDeDe1Q/mVw/Piisoa N0Mw== X-Gm-Message-State: ALoCoQmWWDlTnFnZO2FTJn/TNq3hSxisZUPJzGBc9ZJ+lZ80VP6Hl9kmUA646Rem9o+vP8QrDY/2 X-Received: by 10.129.115.3 with SMTP id o3mr38409396ywc.43.1440521261208; Tue, 25 Aug 2015 09:47:41 -0700 (PDT) Subject: Re: [PATCH v2] PowerPC: Fix a race condition when eliding a lock To: Torvald Riegel , Carlos O'Donell References: <55BAEE77.9000501@redhat.com> <1440084054-32243-1-git-send-email-tuliom@linux.vnet.ibm.com> <55D628C2.3080305@redhat.com> <55D62FD3.8090109@linaro.org> <1440150735.6240.74.camel@localhost.localdomain> <55DBE899.2080908@redhat.com> <1440500727.27492.128.camel@localhost.localdomain> <55DC786C.8080108@redhat.com> <1440520639.27492.190.camel@localhost.localdomain> Cc: Tulio Magno Quites Machado Filho , libc-alpha@sourceware.org, munroesj@linux.vnet.ibm.com From: Adhemerval Zanella Message-ID: <55DC9C24.5090401@linaro.org> Date: Tue, 25 Aug 2015 13:47:32 -0300 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <1440520639.27492.190.camel@localhost.localdomain> On 25-08-2015 13:37, Torvald Riegel wrote: > On Tue, 2015-08-25 at 10:15 -0400, Carlos O'Donell wrote: >> As a concrete example the structure element that is accessed atomically is >> rwlock->__data.__lock. We access it atomically in lll_lock and also >> in the txnal region (is region the right word?). > > Txnal region is used by some, so I think this works. Just transaction > would work as well I think. > >> The access is part of: >> >> nptl/pthread_rwlock_wrlock.c >> >> 100 if (ELIDE_LOCK (rwlock->__data.__rwelision, >> 101 rwlock->__data.__lock == 0 >> 102 && rwlock->__data.__writer == 0 >> 103 && rwlock->__data.__nr_readers == 0)) >> 104 return 0; >> >> With the intent being for the expression to be evaluted inside of the >> transaction and thus abort if another thread has touched any of those >> fields. >> >> That entire expression expands into the usage of is_lock_free. Therefore >> shouldn't the change be at the caller site? > > That would be an option, or we should pass in a function or something. > >> e.g. >> >> 100 if (ELIDE_LOCK (rwlock->__data.__rwelision, >> 101 atomic_load_relaxed (rwlock->__data.__lock) == 0 >> 102 && rwlock->__data.__writer == 0 >> 103 && rwlock->__data.__nr_readers == 0)) >> 104 return 0; >> >> Since the powerpc elision backend doesn't know anything about the types >> that went into the evaluation of is_lock_free? >> >> If anything I think *we* have the onus to fix these cases of missing >> atomic_load_relaxed not IBM? > > Somebody should do it :) I hadn't thought too much about for whom it > would be most convenient to do. As I wrote in the review for Tulio's > patch, IMO passing in an expression into a macro that has to be > evaluated in there is pretty ugly and seems to be at least related to > the bug Tulio is fixing. Maybe we can pass in a function that is > inlined by the compiler. > I do agree that IBM just used what the Intel implementation did, and > thus didn't create that in the first place. > Indeed Intel was the first arch that enable Lock Elision and other arches just followed the implementation. And I also agree that the macro is not best approach and on initial iterations on this I proposed to instead of passing an function we can instead use the arguments address instead: diff --git a/nptl/pthread_rwlock_wrlock.c b/nptl/pthread_rwlock_wrlock.c index 60fa909..0161876 100644 --- a/nptl/pthread_rwlock_wrlock.c +++ b/nptl/pthread_rwlock_wrlock.c @@ -98,9 +98,9 @@ __pthread_rwlock_wrlock (pthread_rwlock_t *rwlock) LIBC_PROBE (wrlock_entry, 1, rwlock); if (ELIDE_LOCK (rwlock->__data.__rwelision, - rwlock->__data.__lock == 0 - && rwlock->__data.__writer == 0 - && rwlock->__data.__nr_readers == 0)) + rwlock->__data.__lock, + rwlock->__data.__writer, + rwlock->__data.__nr_readers)) diff --git a/sysdeps/generic/elide.h b/sysdeps/generic/elide.h index 80a3a03..64d9cce 100644 --- a/sysdeps/generic/elide.h +++ b/sysdeps/generic/elide.h @@ -15,11 +15,12 @@ You should have received a copy of the GNU Lesser General Public License along with the GNU C Library; if not, see . */ + #ifndef ELIDE_H #define ELIDE_H 1 -#define ELIDE_LOCK(adapt_count, is_lock_free) 0 -#define ELIDE_TRYLOCK(adapt_count, is_lock_free, write) 0 -#define ELIDE_UNLOCK(is_lock_free) 0 +#define ELIDE_LOCK(adapt_count, lock, writer, readers) 0 +#define ELIDE_TRYLOCK(adapt_count, lock, writer, readers, write) 0 +#define ELIDE_UNLOCK(writer, readers) 0 #endif diff --git a/sysdeps/powerpc/nptl/elide.h b/sysdeps/powerpc/nptl/elide.h index 389f5a5..b3cc50a 100644 --- a/sysdeps/powerpc/nptl/elide.h +++ b/sysdeps/powerpc/nptl/elide.h @@ -27,7 +27,7 @@ ADAPT_COUNT is a pointer to per-lock state variable. */ static inline bool -__elide_lock (uint8_t *adapt_count, int is_lock_free) +__elide_lock (uint8_t *adapt_count, int *lock, int *writer, unsigned int *readers) { if (*adapt_count > 0) { @@ -39,7 +39,10 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free) { if (__builtin_tbegin (0)) { - if (is_lock_free) + /* The compiler barrier is required because some GCC version might + reorder the lock read before the transaction init builtin. */ + asm volatile("" ::: "memory"); + if ((*lock == 0) && (*writer == 0) && (*readers == 0)) return true; /* Lock was busy. */ __builtin_tabort (_ABORT_LOCK_BUSY); @@ -66,30 +69,31 @@ __elide_lock (uint8_t *adapt_count, int is_lock_free) return false; } I do not know which is better, since it will tie the ELIDE_LOCK implementation with current internal pthread definitions.