From patchwork Fri Jan 7 13:41:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Weimer X-Patchwork-Id: 49692 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 831DB3857C7E for ; Fri, 7 Jan 2022 13:42:21 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 831DB3857C7E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1641562941; bh=8WK7o67MZjq7KDG4HPwaOq+vfdUZxtqWu2UNU1cmI/c=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=XyApfIxhV4b4Ov4DdgZmvjY60tgWm/MkjxnrdXhN5hbitJUUg5a6ElbXkt+EjGVq9 q38D1onU2rFpm82M2QZio5Uz73bTWF5tLLnqh0JEaMKodfowerDEpwwkqQpnIeGSm9 GX7a+33Xm6fyt671tcJW9ffUZqHWrxKdux/Ei208= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 24302385802B for ; Fri, 7 Jan 2022 13:42:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 24302385802B Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-221-7Z78IDvtPeGyMjMoFwLpYA-1; Fri, 07 Jan 2022 08:41:56 -0500 X-MC-Unique: 7Z78IDvtPeGyMjMoFwLpYA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AC11610168C3; Fri, 7 Jan 2022 13:41:55 +0000 (UTC) Received: from oldenburg.str.redhat.com (unknown [10.39.192.102]) by smtp.corp.redhat.com (Postfix) with ESMTPS id EBDF71079F34; Fri, 7 Jan 2022 13:41:54 +0000 (UTC) To: libc-alpha@sourceware.org Subject: [PATCH] elf: Simplify software TM implementation in _dl_find_object Date: Fri, 07 Jan 2022 14:41:53 +0100 Message-ID: <87zgo73boe.fsf@oldenburg.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Florian Weimer via Libc-alpha From: Florian Weimer Reply-To: Florian Weimer Cc: Szabolcs Nagy Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org Sender: "Libc-alpha" With the current set of fences, the version update at the start of the TM write operation is redundant. Also use relaxed MO stores during the dlclose update, and skip any version changes there. Suggested-by: Szabolcs Nagy --- Initial testing looks good, but I'll keep the aarch64 and powerpc64le tests running in a loop for a while. elf/dl-find_object.c | 50 ++++++++++++++++---------------------------------- 1 file changed, 16 insertions(+), 34 deletions(-) diff --git a/elf/dl-find_object.c b/elf/dl-find_object.c index 0eb8607edd..7b906c8f1d 100644 --- a/elf/dl-find_object.c +++ b/elf/dl-find_object.c @@ -123,9 +123,9 @@ struct dlfo_mappings_segment /* To achieve async-signal-safety, two copies of the data structure are used, so that a signal handler can still use this data even if - dlopen or dlclose modify the other copy. The the MSB in - _dlfo_loaded_mappings_version determines which array element is the - currently active region. */ + dlopen or dlclose modify the other copy. The the least significant + bit in _dlfo_loaded_mappings_version determines which array element + is the currently active region. */ static struct dlfo_mappings_segment *_dlfo_loaded_mappings[2]; /* Returns the number of actually used elements in all segments @@ -248,7 +248,7 @@ _dlfo_read_start_version (void) { /* Acquire MO load synchronizes with the fences at the beginning and end of the TM update region in _dlfo_mappings_begin_update, - _dlfo_mappings_end_update, _dlfo_mappings_end_update_no_switch. */ + _dlfo_mappings_end_update. */ return __atomic_wide_counter_load_acquire (&_dlfo_loaded_mappings_version); } @@ -261,16 +261,13 @@ _dlfo_read_version_locked (void) } /* Update the version to reflect that an update is happening. This - does not change the bit that controls the active segment chain. - Returns the index of the currently active segment chain. */ -static inline unsigned int + does not change the bit that controls the active segment chain. */ +static inline void _dlfo_mappings_begin_update (void) { - /* The store synchronizes with loads in _dlfo_read_start_version + /* The fence synchronizes with loads in _dlfo_read_start_version (also called from _dlfo_read_success). */ atomic_thread_fence_release (); - return __atomic_wide_counter_fetch_add_relaxed - (&_dlfo_loaded_mappings_version, 2); } /* Installs the just-updated version as the active version. */ @@ -282,15 +279,6 @@ _dlfo_mappings_end_update (void) atomic_thread_fence_release (); __atomic_wide_counter_fetch_add_relaxed (&_dlfo_loaded_mappings_version, 1); } -/* Completes an in-place update without switching versions. */ -static inline void -_dlfo_mappings_end_update_no_switch (void) -{ - /* The store synchronizes with loads in _dlfo_read_start_version - (also called from _dlfo_read_success). */ - atomic_thread_fence_release (); - __atomic_wide_counter_fetch_add_relaxed (&_dlfo_loaded_mappings_version, 2); -} /* Return true if the read was successful, given the start version. */ @@ -302,10 +290,11 @@ _dlfo_read_success (uint64_t start_version) the TM region are not ordered past the version check below. */ atomic_thread_fence_acquire (); - /* Synchronizes with stores in _dlfo_mappings_begin_update, - _dlfo_mappings_end_update, _dlfo_mappings_end_update_no_switch. - It is important that all stores from the last update have been - visible, and stores from the next updates are not. + /* Synchronizes with the fences in _dlfo_mappings_begin_update, + _dlfo_mappings_end_update. It is important that all stores from + the last update have become visible, and stores from the next + update to this version are not before the version number is + updated. Unlike with seqlocks, there is no check for odd versions here because we have read the unmodified copy (confirmed to be @@ -839,17 +828,10 @@ _dl_find_object_dlclose (struct link_map *map) issues around __libc_freeres. */ return; - /* The update happens in-place, but given that we do not use - atomic accesses on the read side, update the version around - the update to trigger re-validation in concurrent - readers. */ - _dlfo_mappings_begin_update (); - - /* Mark as closed. */ - obj->map_end = obj->map_start; - obj->map = NULL; - - _dlfo_mappings_end_update_no_switch (); + /* Mark as closed. This does not change the overall data + structure, so no TM cycle is needed. */ + atomic_store_relaxed (&obj->map_end, obj->map_start); + atomic_store_relaxed (&obj->map, NULL); return; } }