From patchwork Wed Nov  3 19:11:45 2021
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 47021
To: libc-alpha@sourceware.org
Subject: [PATCH v2] x86: Optimize atomic_compare_and_exchange_[val|bool]_acq [BZ #28537]
Date: Wed, 3 Nov 2021 12:11:45 -0700
Message-Id: <20211103191145.1398847-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
From: "H.J. Lu"

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.  See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

A full compare-and-swap grabs the cache line in exclusive state and causes
excessive cache-line bouncing.  Load the current memory value first (the
load itself is atomic); if it does not match the expected old value, return
immediately instead of issuing a CAS that is bound to fail.  This avoids
requesting the cache line for writing and reduces cache-line bouncing on
contended locks.

This fixes BZ #28537.
---
 sysdeps/x86/atomic-machine.h | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/atomic-machine.h b/sysdeps/x86/atomic-machine.h
index 2692d94a92..7342f5abdf 100644
--- a/sysdeps/x86/atomic-machine.h
+++ b/sysdeps/x86/atomic-machine.h
@@ -73,9 +73,18 @@ typedef uintmax_t uatomic_max_t;
 #define ATOMIC_EXCHANGE_USES_CAS	0
 
 #define atomic_compare_and_exchange_val_acq(mem, newval, oldval) \
-  __sync_val_compare_and_swap (mem, oldval, newval)
+  ({ __typeof (*(mem)) oldmem = *(mem), ret;                          \
+     ret = (oldmem == (oldval)                                        \
+            ? __sync_val_compare_and_swap (mem, oldval, newval)       \
+            : oldmem);                                                \
+     ret; })
 #define atomic_compare_and_exchange_bool_acq(mem, newval, oldval) \
-  (! __sync_bool_compare_and_swap (mem, oldval, newval))
+  ({ __typeof (*(mem)) oldmem = *(mem);                               \
+     int ret;                                                         \
+     ret = (oldmem == (oldval)                                        \
+            ? !__sync_bool_compare_and_swap (mem, oldval, newval)     \
+            : 1);                                                     \
+     ret; })
 
 #define __arch_c_compare_and_exchange_val_8_acq(mem, newval, oldval) \
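
For readers who want to experiment with the idea outside of glibc, below is a
minimal standalone sketch (not part of the patch) of the same load-before-CAS
pattern using C11 atomics.  The try_lock_cas/try_lock_load_cas helpers are
hypothetical names used only to contrast an unconditional CAS with one that is
guarded by a plain load.

/* Sketch only: contrasts an unconditional CAS with a load-guarded CAS.  */
#include <stdatomic.h>
#include <stdbool.h>

/* Unconditional CAS: always requests the cache line in exclusive state,
   even when the lock is already held and the CAS is bound to fail.  */
static bool
try_lock_cas (atomic_int *lock)
{
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (lock, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

/* Load-guarded CAS: read the lock word with a plain (shared) load first
   and only attempt the CAS when it could succeed, so waiters on a
   contended lock keep the cache line in shared state.  */
static bool
try_lock_load_cas (atomic_int *lock)
{
  if (atomic_load_explicit (lock, memory_order_relaxed) != 0)
    return false;   /* CAS would fail; skip the exclusive request.  */
  int expected = 0;
  return atomic_compare_exchange_strong_explicit (lock, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed);
}

Under contention, threads spinning in try_lock_load_cas read a shared copy of
the line and only attempt the exclusive upgrade once the lock is observed
free, which is the effect the patch aims for on contended glibc locks.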