From patchwork Fri Apr 30 18:24:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 43199
To: libc-alpha@sourceware.org
Subject: [PATCH] x86: Set rep_movsb_threshold to 2112 on processors with FSRM
Date: Fri, 30 Apr 2021 11:24:42 -0700
Message-Id: <20210430182442.3612464-1-hjl.tools@gmail.com>
List-Id: Libc-alpha mailing list
From: "H.J. Lu"

The glibc memcpy benchmark on Intel Core i7-1065G7 (Ice Lake) showed
that REP MOVSB became faster after 2112 bytes:

                                   Vector + REP MOVSB    REP MOVSB
length=2112, align1=0, align2=0:         24.20             24.40
length=2112, align1=1, align2=0:         26.07             23.13
length=2112, align1=0, align2=1:         27.18             28.13
length=2112, align1=1, align2=1:         26.23             25.16
length=2176, align1=0, align2=0:         23.18             22.52
length=2176, align1=2, align2=0:         25.45             22.52
length=2176, align1=0, align2=2:         27.14             27.82
length=2176, align1=2, align2=2:         22.73             25.56
length=2240, align1=0, align2=0:         24.62             24.25
length=2240, align1=3, align2=0:         29.77             27.15
length=2240, align1=0, align2=3:         35.55             29.93
length=2240, align1=3, align2=3:         34.49             25.15
length=2304, align1=0, align2=0:         34.75             26.64
length=2304, align1=4, align2=0:         32.09             22.63
length=2304, align1=0, align2=4:         28.43             31.24

Use REP MOVSB for data size > 2112 bytes in memcpy on processors with
fast short REP MOVSB (FSRM).

	* sysdeps/x86/dl-cacheinfo.h (dl_init_cacheinfo): Set
	rep_movsb_threshold to 2112 on processors with fast short REP
	MOVSB (FSRM).
---
 sysdeps/x86/dl-cacheinfo.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index d9944250fc..3f04fb5019 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -871,7 +871,10 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   if (CPU_FEATURE_USABLE_P (cpu_features, AVX512F)
       && !CPU_FEATURE_PREFERRED_P (cpu_features, Prefer_No_AVX512))
     {
-      rep_movsb_threshold = 2048 * (64 / 16);
+      if (CPU_FEATURE_USABLE_P (cpu_features, FSRM))
+        rep_movsb_threshold = 2112;
+      else
+        rep_movsb_threshold = 2048 * (64 / 16);
 #if HAVE_TUNABLES
       minimum_rep_movsb_threshold = 64 * 8;
 #endif
@@ -879,7 +882,10 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
   else if (CPU_FEATURE_PREFERRED_P (cpu_features,
                                     AVX_Fast_Unaligned_Load))
     {
-      rep_movsb_threshold = 2048 * (32 / 16);
+      if (CPU_FEATURE_USABLE_P (cpu_features, FSRM))
+        rep_movsb_threshold = 2112;
+      else
+        rep_movsb_threshold = 2048 * (32 / 16);
 #if HAVE_TUNABLES
       minimum_rep_movsb_threshold = 32 * 8;
 #endif
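
For readers who want the selection logic outside of glibc, the two
branches touched by the diff can be sketched as a standalone function.
This is only an illustration: struct cpu_flags and
select_rep_movsb_threshold are hypothetical names, not glibc
interfaces, and branches that the diff does not show are reported as 0
rather than guessed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the feature checks in the patch; this is
   not glibc's struct cpu_features.  */
struct cpu_flags
{
  bool avx512f_usable;     /* AVX512F usable and Prefer_No_AVX512 not set.  */
  bool avx_fast_unaligned; /* AVX_Fast_Unaligned_Load preferred.  */
  bool fsrm;               /* Fast short REP MOVSB usable.  */
};

/* Return the REP MOVSB threshold for the two branches in the diff.
   Returns 0 for configurations the diff does not cover.  */
static unsigned long
select_rep_movsb_threshold (const struct cpu_flags *f)
{
  unsigned long vec_size;

  if (f->avx512f_usable)
    vec_size = 64;
  else if (f->avx_fast_unaligned)
    vec_size = 32;
  else
    return 0; /* Branch not shown in the patch.  */

  /* With FSRM the threshold no longer scales with the vector size.  */
  if (f->fsrm)
    return 2112;
  return 2048 * (vec_size / 16);
}
```

Under these assumptions, an AVX512 configuration drops from 8192 to
2112 when FSRM is usable, and an AVX configuration drops from 4096 to
2112.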