From patchwork Wed Nov 10 18:41:51 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 47421
To: libc-alpha@sourceware.org
Subject: [PATCH v5 1/3] Reduce CAS in low level locks [BZ #28537]
Date: Wed, 10 Nov 2021 10:41:51 -0800
Message-Id: <20211110184153.2269857-2-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211110184153.2269857-1-hjl.tools@gmail.com>
References: <20211110184153.2269857-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
From: "H.J. Lu"
Cc: Florian Weimer, Andreas Schwab, "Paul A . Clarke", Arjan van de Ven

CAS instruction is expensive.
From the x86 CPU's point of view, getting a cache line for writing is
more expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap will grab the cache line in exclusive state
and cause excessive cache line bouncing.

1. Change low level locks to do an atomic load first and skip the CAS
if the comparison would fail, to reduce cache line bouncing on
contended locks.
2. In __lll_lock, replace atomic_compare_and_exchange_bool_acq with
atomic_compare_and_exchange_val_acq and pass the result down to
__lll_lock_wait and __lll_lock_wait_private to avoid the redundant load
there.
---
 nptl/lowlevellock.c         | 12 ++++++------
 sysdeps/nptl/lowlevellock.h | 33 +++++++++++++++++++++++----------
 2 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/nptl/lowlevellock.c b/nptl/lowlevellock.c
index 8c740529c4..d1965c01ca 100644
--- a/nptl/lowlevellock.c
+++ b/nptl/lowlevellock.c
@@ -22,30 +22,30 @@
 #include

 void
-__lll_lock_wait_private (int *futex)
+__lll_lock_wait_private (int *futex, int futex_value)
 {
-  if (atomic_load_relaxed (futex) == 2)
+  if (futex_value == 2)
     goto futex;

   while (atomic_exchange_acquire (futex, 2) != 0)
     {
     futex:
-      LIBC_PROBE (lll_lock_wait_private, 1, futex);
+      LIBC_PROBE (lll_lock_wait_private, 2, futex, futex_value);
       futex_wait ((unsigned int *) futex, 2, LLL_PRIVATE); /* Wait if *futex == 2.  */
     }
 }
 libc_hidden_def (__lll_lock_wait_private)

 void
-__lll_lock_wait (int *futex, int private)
+__lll_lock_wait (int *futex, int futex_value, int private)
 {
-  if (atomic_load_relaxed (futex) == 2)
+  if (futex_value == 2)
     goto futex;

   while (atomic_exchange_acquire (futex, 2) != 0)
     {
     futex:
-      LIBC_PROBE (lll_lock_wait, 1, futex);
+      LIBC_PROBE (lll_lock_wait, 2, futex, futex_value);
       futex_wait ((unsigned int *) futex, 2, private); /* Wait if *futex == 2.  */
     }
 }
diff --git a/sysdeps/nptl/lowlevellock.h b/sysdeps/nptl/lowlevellock.h
index 4d95114ed3..05260eb706 100644
--- a/sysdeps/nptl/lowlevellock.h
+++ b/sysdeps/nptl/lowlevellock.h
@@ -66,7 +66,12 @@
    0.  Otherwise leave lock unchanged and return non-zero to indicate that the
    lock was not acquired.  */
 #define __lll_trylock(lock)     \
-  __glibc_unlikely (atomic_compare_and_exchange_bool_acq ((lock), 1, 0))
+  (__extension__ ({                                                     \
+    __typeof (*(lock)) __lock_value = atomic_load_relaxed (lock);       \
+    (__lock_value != 0                                                  \
+     || __glibc_unlikely (atomic_compare_and_exchange_bool_acq ((lock), \
+                                                                1, 0)));\
+  }))
 #define lll_trylock(lock)       \
    __lll_trylock (&(lock))

@@ -74,11 +79,16 @@
    return 0.  Otherwise leave lock unchanged and return non-zero to indicate
    that the lock was not acquired.  */
 #define lll_cond_trylock(lock)  \
-  __glibc_unlikely (atomic_compare_and_exchange_bool_acq (&(lock), 2, 0))
-
-extern void __lll_lock_wait_private (int *futex);
+  (__extension__ ({                                                     \
+    __typeof (lock) __lock_value = atomic_load_relaxed (&(lock));       \
+    (__lock_value != 0                                                  \
+     || __glibc_unlikely (atomic_compare_and_exchange_bool_acq (&(lock),\
+                                                                2, 0)));\
+  }))
+
+extern void __lll_lock_wait_private (int *futex, int futex_value);
 libc_hidden_proto (__lll_lock_wait_private)
-extern void __lll_lock_wait (int *futex, int private);
+extern void __lll_lock_wait (int *futex, int futex_value, int private);
 libc_hidden_proto (__lll_lock_wait)

 /* This is an expression rather than a statement even though its value is
@@ -95,13 +105,15 @@ libc_hidden_proto (__lll_lock_wait)
   ((void)                                                               \
    ({                                                                   \
      int *__futex = (futex);                                            \
-     if (__glibc_unlikely                                               \
-         (atomic_compare_and_exchange_bool_acq (__futex, 1, 0)))        \
+     int __futex_value = atomic_load_relaxed (__futex);                 \
+     if (__futex_value != 0                                             \
+         || ((__futex_value = atomic_compare_and_exchange_val_acq       \
+              (__futex, 1, 0)) != 0))                                   \
        {                                                                \
          if (__builtin_constant_p (private) && (private) == LLL_PRIVATE) \
-           __lll_lock_wait_private (__futex);                           \
+           __lll_lock_wait_private (__futex, __futex_value);            \
          else                                                           \
-           __lll_lock_wait (__futex, private);                          \
+           __lll_lock_wait (__futex, __futex_value, private);           \
        }                                                                \
    }))
 #define lll_lock(futex, private)        \
@@ -120,7 +132,8 @@ libc_hidden_proto (__lll_lock_wait)
    ({                                                                   \
      int *__futex = (futex);                                            \
      if (__glibc_unlikely (atomic_exchange_acq (__futex, 2) != 0))      \
-       __lll_lock_wait (__futex, private);                              \
+       __lll_lock_wait (__futex, atomic_load_relaxed (__futex),         \
+                        private);                                       \
    }))
 #define lll_cond_lock(futex, private) __lll_cond_lock (&(futex), private)


From patchwork Wed Nov 10 18:41:52 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 47420
To: libc-alpha@sourceware.org
Subject: [PATCH v5 2/3] Reduce CAS in __pthread_mutex_lock_full [BZ #28537]
Date: Wed, 10 Nov 2021 10:41:52 -0800
Message-Id: <20211110184153.2269857-3-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211110184153.2269857-1-hjl.tools@gmail.com>
References: <20211110184153.2269857-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
From: "H.J. Lu"
Cc: Florian Weimer, Andreas Schwab, "Paul A . Clarke", Arjan van de Ven

Change __pthread_mutex_lock_full to do an atomic load first and skip
the CAS if the comparison would fail, to reduce cache line bouncing on
contended locks.
---
 nptl/pthread_mutex_lock.c | 38 +++++++++++++++++++++++++-------------
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index 2bd41767e0..d7e8efedd2 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -223,13 +223,13 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
               newval |= (oldval & FUTEX_WAITERS) | assume_other_futex_waiters;
 #endif

-              newval
-                = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
-                                                       newval, oldval);
-
-              if (newval != oldval)
+              int val = atomic_load_relaxed (&mutex->__data.__lock);
+              if (val != oldval
+                  || ((val = atomic_compare_and_exchange_val_acq
+                       (&mutex->__data.__lock, newval, oldval))
+                      != oldval))
                 {
-                  oldval = newval;
+                  oldval = val;
                   continue;
                 }

@@ -411,11 +411,15 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
 # ifdef NO_INCR
         newval |= FUTEX_WAITERS;
 # endif
+        oldval = atomic_load_relaxed (&mutex->__data.__lock);
+        if (oldval != 0)
+          goto locked_mutex;
         oldval = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
                                                       newval, 0);

         if (oldval != 0)
           {
+          locked_mutex:;
             /* The mutex is locked.  The kernel will now take care of
                everything.  */
             int private = (robust
@@ -554,6 +558,10 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
             ceilval = ceiling << PTHREAD_MUTEX_PRIO_CEILING_SHIFT;
             oldprio = ceiling;

+            oldval = atomic_load_relaxed (&mutex->__data.__lock);
+            if (oldval != ceilval)
+              goto ceilval_failed;
+
             oldval
               = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
 #ifdef NO_INCR
@@ -568,10 +576,13 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)

             do
               {
-                oldval
-                  = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
-                                                         ceilval | 2,
-                                                         ceilval | 1);
+                oldval = atomic_load_relaxed (&mutex->__data.__lock);
+              ceilval_failed:
+                if (oldval == (ceilval | 1))
+                  oldval
+                    = atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
+                                                           ceilval | 2,
+                                                           ceilval | 1);

                 if ((oldval & PTHREAD_MUTEX_PRIO_CEILING_MASK) != ceilval)
                   break;
@@ -581,9 +592,10 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
                                 ceilval | 2,
                                 PTHREAD_MUTEX_PSHARED (mutex));
               }
-            while (atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
-                                                        ceilval | 2, ceilval)
-                   != ceilval);
+            while (atomic_load_relaxed (&mutex->__data.__lock) != ceilval
+                   || (atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,
+                                                            ceilval | 2, ceilval)
+                       != ceilval));
           }
         while ((oldval & PTHREAD_MUTEX_PRIO_CEILING_MASK) != ceilval);


From patchwork Wed Nov 10 18:41:53 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 47422
To: libc-alpha@sourceware.org
Subject: [PATCH v5 3/3] Optimize CAS in __pthread_mutex_lock_full [BZ #28537]
Date: Wed, 10 Nov 2021 10:41:53 -0800
Message-Id: <20211110184153.2269857-4-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.33.1
In-Reply-To: <20211110184153.2269857-1-hjl.tools@gmail.com>
References: <20211110184153.2269857-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
From: "H.J. Lu"
Cc: Florian Weimer, Andreas Schwab, "Paul A . Clarke", Arjan van de Ven

1. Do an atomic load first and skip the CAS if the comparison would
fail, to reduce cache line bouncing on contended locks.
2. Replace atomic_compare_and_exchange_bool_acq with
atomic_compare_and_exchange_val_acq to avoid the extra load.
---
 nptl/pthread_mutex_lock.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index d7e8efedd2..24ff1772cd 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -297,12 +297,13 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
              meantime.  */
           if ((oldval & FUTEX_WAITERS) == 0)
             {
-              if (atomic_compare_and_exchange_bool_acq (&mutex->__data.__lock,
-                                                        oldval | FUTEX_WAITERS,
-                                                        oldval)
-                  != 0)
+              int val = atomic_load_relaxed (&mutex->__data.__lock);
+              if (val != oldval
+                  || ((val = atomic_compare_and_exchange_val_acq
+                       (&mutex->__data.__lock, oldval | FUTEX_WAITERS,
+                        oldval)) != oldval))
                 {
-                  oldval = mutex->__data.__lock;
+                  oldval = val;
                   continue;
                 }
               oldval |= FUTEX_WAITERS;
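
For readers unfamiliar with the pattern this series applies, the two
ideas (load before CAS, and a value-returning CAS so no second load is
needed) can be sketched in portable C11.  This is only an illustration
under stated assumptions: it uses <stdatomic.h> rather than glibc's
internal atomic macros, and the names demo_lock, demo_trylock,
demo_try_acquire_val and demo_unlock are hypothetical and do not exist
in glibc.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word, NOT the glibc type: 0 = unlocked,
   1 = locked, 2 = locked with possible waiters.  */
typedef struct { atomic_int word; } demo_lock;

/* Try to take the lock; return true on success.  */
static bool
demo_trylock (demo_lock *l)
{
  /* Read first.  A relaxed load does not need the cache line in
     exclusive state, so a contended lock is not bounced around.  */
  if (atomic_load_explicit (&l->word, memory_order_relaxed) != 0)
    return false;               /* The CAS would fail; skip it.  */

  int expected = 0;
  return atomic_compare_exchange_strong_explicit (&l->word, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

/* Like demo_trylock, but return the value observed in the lock word:
   0 means the lock was acquired, anything else is the value that made
   the attempt fail, so the caller needs no extra load.  */
static int
demo_try_acquire_val (demo_lock *l)
{
  int seen = atomic_load_explicit (&l->word, memory_order_relaxed);
  if (seen != 0)
    return seen;

  int expected = 0;
  atomic_compare_exchange_strong_explicit (&l->word, &expected, 1,
                                           memory_order_acquire,
                                           memory_order_relaxed);
  /* On failure C11 stores the observed value into EXPECTED; on
     success EXPECTED is left at 0.  */
  return expected;
}

static void
demo_unlock (demo_lock *l)
{
  /* Unlocking a lock that is not held is a caller bug.  */
  assert (atomic_load_explicit (&l->word, memory_order_relaxed) != 0);
  atomic_store_explicit (&l->word, 0, memory_order_release);
}
```

A caller that must block on failure can hand the value returned by
demo_try_acquire_val straight to its slow path, which is the role
futex_value plays in __lll_lock_wait in patch 1/3.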