From patchwork Tue Jan 3 21:06:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 62675
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Cc: Noah Goldstein
Subject: [PATCH v2] x86: Check minimum/maximum of non_temporal_threshold [BZ #29953]
Date: Tue, 3 Jan 2023 13:06:48 -0800
Message-Id: <20230103210648.2569652-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.39.0

The minimum non_temporal_threshold is 0x4040. non_temporal_threshold may
be set to less than the minimum value when the shared cache size isn't
available (e.g., in an emulator) or by the tunable.
Add checks for minimum and maximum of non_temporal_threshold.

This fixes BZ #29953.
---
 sysdeps/x86/dl-cacheinfo.h | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index e9f3382108..637b5a022d 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -861,6 +861,18 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
      share of the cache, it has a substantial risk of negatively
      impacting the performance of other threads running on the chip. */
   unsigned long int non_temporal_threshold = shared * 3 / 4;
+  /* SIZE_MAX >> 4 because memmove-vec-unaligned-erms right-shifts the value of
+     'x86_non_temporal_threshold' by `LOG_4X_MEMCPY_THRESH` (4) and it is best
+     if that operation cannot overflow. Minimum of 0x4040 (16448) because the
+     L(large_memset_4x) loops need 64-byte to cache align and enough space for
+     at least 1 iteration of 4x PAGE_SIZE unrolled loop. Both values are
+     reflected in the manual. */
+  unsigned long int maximum_non_temporal_threshold = SIZE_MAX >> 4;
+  unsigned long int minimum_non_temporal_threshold = 0x4040;
+  if (non_temporal_threshold < minimum_non_temporal_threshold)
+    non_temporal_threshold = minimum_non_temporal_threshold;
+  else if (non_temporal_threshold > maximum_non_temporal_threshold)
+    non_temporal_threshold = maximum_non_temporal_threshold;
 
 #if HAVE_TUNABLES
   /* NB: The REP MOVSB threshold must be greater than VEC_SIZE * 8. */
@@ -915,8 +927,8 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
     shared = tunable_size;
 
   tunable_size = TUNABLE_GET (x86_non_temporal_threshold, long int, NULL);
-  /* NB: Ignore the default value 0. */
-  if (tunable_size != 0)
+  if (tunable_size > minimum_non_temporal_threshold
+      && tunable_size <= maximum_non_temporal_threshold)
     non_temporal_threshold = tunable_size;
 
   tunable_size = TUNABLE_GET (x86_rep_movsb_threshold, long int, NULL);
@@ -931,14 +943,9 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
 
   TUNABLE_SET_WITH_BOUNDS (x86_data_cache_size, data, 0, SIZE_MAX);
   TUNABLE_SET_WITH_BOUNDS (x86_shared_cache_size, shared, 0, SIZE_MAX);
-  /* SIZE_MAX >> 4 because memmove-vec-unaligned-erms right-shifts the value of
-     'x86_non_temporal_threshold' by `LOG_4X_MEMCPY_THRESH` (4) and it is best
-     if that operation cannot overflow. Minimum of 0x4040 (16448) because the
-     L(large_memset_4x) loops need 64-byte to cache align and enough space for
-     at least 1 iteration of 4x PAGE_SIZE unrolled loop. Both values are
-     reflected in the manual. */
   TUNABLE_SET_WITH_BOUNDS (x86_non_temporal_threshold, non_temporal_threshold,
-                           0x4040, SIZE_MAX >> 4);
+                           minimum_non_temporal_threshold,
+                           maximum_non_temporal_threshold);
   TUNABLE_SET_WITH_BOUNDS (x86_rep_movsb_threshold, rep_movsb_threshold,
                            minimum_rep_movsb_threshold, SIZE_MAX);
   TUNABLE_SET_WITH_BOUNDS (x86_rep_stosb_threshold, rep_stosb_threshold, 1,