From patchwork Fri Jan 10 21:02:37 2025
X-Patchwork-Submitter: Jonathan Wakely
X-Patchwork-Id: 104531
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [PATCH] libstdc++: Create new base class of std::barrier for non-dependent code
Date: Fri, 10 Jan 2025 21:02:37 +0000
Message-ID: <20250110210247.828338-1-jwakely@redhat.com>
In-Reply-To: <20250110124917.750787-1-jwakely@redhat.com>
References: <20250110124917.750787-1-jwakely@redhat.com>

This moves all non-dependent state and logic for std::barrier into a
new non-template base class, to avoid template bloat. This would permit
moving the _M_arrive function into the library instead of the header.

libstdc++-v3/ChangeLog:

	* include/std/barrier (__tree_barrier_base): New class.
	(__tree_barrier): Move non-dependent code into __tree_barrier_base
	and derive from it.
---
Tested x86_64-linux.
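As context for reviewers, the refactoring pattern applied below can be
sketched in isolation. This is an illustrative toy, not part of the
patch: the names _CounterBase and _Counter are hypothetical. The idea is
that members not depending on the template parameter move to a plain
base class, so their code is compiled once rather than once per
instantiation of the completion-function type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Non-template base: holds all state and logic that is identical for
// every instantiation. Because _M_decrement is not a template, its
// definition could even move out of the header into the library, as the
// patch suggests for _M_arrive.
class _CounterBase
{
protected:
  std::ptrdiff_t _M_count;

  explicit constexpr _CounterBase(std::ptrdiff_t __n) : _M_count(__n) { }

  bool _M_decrement() { return --_M_count == 0; }
};

template<typename _CompletionF>
class _Counter : public _CounterBase
{
  [[no_unique_address]] _CompletionF _M_completion;

public:
  constexpr _Counter(std::ptrdiff_t __n, _CompletionF __f)
  : _CounterBase(__n), _M_completion(std::move(__f)) { }

  // Only the part that actually uses _CompletionF stays in the template.
  void _M_arrive() { if (_M_decrement()) _M_completion(); }
};
```

Only _M_arrive's thin wrapper is stamped out per completion type; the
shared counting logic is emitted once.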
 libstdc++-v3/include/std/barrier | 176 +++++++++++++++++--------------
 1 file changed, 94 insertions(+), 82 deletions(-)

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index 9c1de411f9ce..56270c99e056 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -81,77 +81,102 @@ It looks different from literature pseudocode for two main reasons:
   enum class __barrier_phase_t : unsigned char { };
 
-  template<typename _CompletionF>
-    class __tree_barrier
+  struct __tree_barrier_base
+  {
+    static constexpr ptrdiff_t
+    max() noexcept
+    { return __PTRDIFF_MAX__ - 1; }
+
+  protected:
+    using __atomic_phase_ref_t = std::__atomic_ref<__barrier_phase_t>;
+    using __atomic_phase_const_ref_t = std::__atomic_ref<const __barrier_phase_t>;
+    static constexpr auto __phase_alignment =
+      __atomic_phase_ref_t::required_alignment;
+
+    using __tickets_t = std::array<__barrier_phase_t, 64>;
+    struct alignas(64) /* naturally-align the heap state */ __state_t
     {
-      using __atomic_phase_ref_t = std::__atomic_ref<__barrier_phase_t>;
-      using __atomic_phase_const_ref_t = std::__atomic_ref<const __barrier_phase_t>;
-      static constexpr auto __phase_alignment =
-        __atomic_phase_ref_t::required_alignment;
+      alignas(__phase_alignment) __tickets_t __tickets;
+    };
 
-      using __tickets_t = std::array<__barrier_phase_t, 64>;
-      struct alignas(64) /* naturally-align the heap state */ __state_t
-      {
-        alignas(__phase_alignment) __tickets_t __tickets;
-      };
+    ptrdiff_t _M_expected;
+    __atomic_base<__state_t*> _M_state{nullptr};
+    __atomic_base<ptrdiff_t> _M_expected_adjustment{0};
+    alignas(__phase_alignment) __barrier_phase_t _M_phase{};
 
-      ptrdiff_t _M_expected;
-      __atomic_base<__state_t*> _M_state{nullptr};
-      __atomic_base<ptrdiff_t> _M_expected_adjustment{0};
+    explicit constexpr
+    __tree_barrier_base(ptrdiff_t __expected)
+    : _M_expected(__expected)
+    {
+      __glibcxx_assert(__expected >= 0 && __expected <= max());
+
+      if (!std::is_constant_evaluated())
+        _M_state.store(_M_alloc_state().release(), memory_order_release);
+    }
+
+    unique_ptr<__state_t[]>
+    _M_alloc_state()
+    {
+      size_t const __count = (_M_expected + 1) >> 1;
+      return std::make_unique<__state_t[]>(__count);
+    }
+
+    bool
+    _M_arrive(__barrier_phase_t __old_phase, size_t __current)
+    {
+      const auto __old_phase_val = static_cast<unsigned char>(__old_phase);
+      const auto __half_step =
+        static_cast<__barrier_phase_t>(__old_phase_val + 1);
+      const auto __full_step =
+        static_cast<__barrier_phase_t>(__old_phase_val + 2);
+
+      size_t __current_expected = _M_expected;
+      __current %= ((_M_expected + 1) >> 1);
+
+      __state_t* const __state = _M_state.load(memory_order_relaxed);
+
+      for (int __round = 0; ; ++__round)
+        {
+          if (__current_expected <= 1)
+            return true;
+          size_t const __end_node = ((__current_expected + 1) >> 1),
+                       __last_node = __end_node - 1;
+          for ( ; ; ++__current)
+            {
+              if (__current == __end_node)
+                __current = 0;
+              auto __expect = __old_phase;
+              __atomic_phase_ref_t __phase(__state[__current]
+                                           .__tickets[__round]);
+              if (__current == __last_node && (__current_expected & 1))
+                {
+                  if (__phase.compare_exchange_strong(__expect, __full_step,
+                                                      memory_order_acq_rel))
+                    break;     // I'm 1 in 1, go to next __round
+                }
+              else if (__phase.compare_exchange_strong(__expect, __half_step,
+                                                       memory_order_acq_rel))
+                {
+                  return false; // I'm 1 in 2, done with arrival
+                }
+              else if (__expect == __half_step)
+                {
+                  if (__phase.compare_exchange_strong(__expect, __full_step,
+                                                      memory_order_acq_rel))
+                    break;    // I'm 2 in 2, go to next __round
+                }
+            }
+          __current_expected = __last_node + 1;
+          __current >>= 1;
+        }
+    }
+  };
+
+  template<typename _CompletionF>
+  class __tree_barrier : public __tree_barrier_base
+  {
       [[no_unique_address]] _CompletionF _M_completion;
-      alignas(__phase_alignment) __barrier_phase_t _M_phase{};
-
-      bool
-      _M_arrive(__barrier_phase_t __old_phase, size_t __current)
-      {
-        const auto __old_phase_val = static_cast<unsigned char>(__old_phase);
-        const auto __half_step =
-          static_cast<__barrier_phase_t>(__old_phase_val + 1);
-        const auto __full_step =
-          static_cast<__barrier_phase_t>(__old_phase_val + 2);
-
-        size_t __current_expected = _M_expected;
-        __current %= ((_M_expected + 1) >> 1);
-
-        __state_t* const __state = _M_state.load(memory_order_relaxed);
-
-        for (int __round = 0; ; ++__round)
-          {
-            if (__current_expected <= 1)
-              return true;
-            size_t const __end_node = ((__current_expected + 1) >> 1),
-                         __last_node = __end_node - 1;
-            for ( ; ; ++__current)
-              {
-                if (__current == __end_node)
-                  __current = 0;
-                auto __expect = __old_phase;
-                __atomic_phase_ref_t __phase(__state[__current]
-                                             .__tickets[__round]);
-                if (__current == __last_node && (__current_expected & 1))
-                  {
-                    if (__phase.compare_exchange_strong(__expect, __full_step,
-                                                        memory_order_acq_rel))
-                      break;     // I'm 1 in 1, go to next __round
-                  }
-                else if (__phase.compare_exchange_strong(__expect, __half_step,
-                                                         memory_order_acq_rel))
-                  {
-                    return false; // I'm 1 in 2, done with arrival
-                  }
-                else if (__expect == __half_step)
-                  {
-                    if (__phase.compare_exchange_strong(__expect, __full_step,
-                                                        memory_order_acq_rel))
-                      break;    // I'm 2 in 2, go to next __round
-                  }
-              }
-            __current_expected = __last_node + 1;
-            __current >>= 1;
-          }
-      }
-
       // _GLIBCXX_RESOLVE_LIB_DEFECTS
       // 3898. Possibly unintended preconditions for completion functions
       void
       _M_invoke_completion() noexcept { _M_completion(); }
@@ -159,22 +184,10 @@ It looks different from literature pseudocode for two main reasons:
     public:
       using arrival_token = __barrier_phase_t;
 
-      static constexpr ptrdiff_t
-      max() noexcept
-      { return __PTRDIFF_MAX__ - 1; }
-
       constexpr
       __tree_barrier(ptrdiff_t __expected, _CompletionF __completion)
-      : _M_expected(__expected), _M_completion(std::move(__completion))
-      {
-        __glibcxx_assert(__expected >= 0 && __expected <= max());
-
-        if (!std::is_constant_evaluated())
-          {
-            size_t const __count = (_M_expected + 1) >> 1;
-            _M_state.store(new __state_t[__count], memory_order_release);
-          }
-      }
+      : __tree_barrier_base(__expected), _M_completion(std::move(__completion))
+      { }
 
       [[nodiscard]] arrival_token
       arrive(ptrdiff_t __update)
@@ -191,8 +204,7 @@ It looks different from literature pseudocode for two main reasons:
           if (__cur == 0 && !_M_state.load(memory_order_relaxed))
             [[unlikely]]
             {
-              size_t const __count = (_M_expected + 1) >> 1;
-              auto __p = make_unique<__state_t[]>(__count);
+              auto __p = _M_alloc_state();
               __state_t* __val = nullptr;
               if (_M_state.compare_exchange_strong(__val, __p.get(),
                                                    memory_order_seq_cst,