[0/1] Optimizing memcpy for AMD Zen architecture.
Message ID | 20201022045005.17371-1-sajan.karumanchi@amd.com |
---|---|
Headers | From: sajan.karumanchi@amd.com To: libc-alpha@sourceware.org, carlos@redhat.com, fweimer@redhat.com Cc: Sajan Karumanchi <sajan.karumanchi@amd.com>, premachandra.mallappa@amd.com Subject: [PATCH 0/1] Optimizing memcpy for AMD Zen architecture. Date: Thu, 22 Oct 2020 10:20:04 +0530 Message-Id: <20201022045005.17371-1-sajan.karumanchi@amd.com> X-Mailer: git-send-email 2.17.1 List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> |
Message
Karumanchi, Sajan
Oct. 22, 2020, 4:50 a.m. UTC
From: Sajan Karumanchi <sajan.karumanchi@amd.com>
This patch modifies the shareable cache size '__x86_shared_cache_size',
which is a factor in computing the non-temporal threshold parameter
'__x86_shared_non_temporal_threshold', to optimize memcpy for the AMD
Zen architecture.
In the existing implementation, the shareable cache is computed as 'L3
per thread, L2 per core'.
Recomputing this shareable cache as 'L3 per CCX' (Core Complex) has
brought performance gains of ~44% for memory sizes greater than 16MB.
Both the patch I posted earlier, 'Tuning NT Threshold parameter for AMD
machines'
https://sourceware.org/pipermail/libc-alpha/2020-August/117080.html
and the recent patch committed by Patrick McGehearty, 'Reversing
calculation of __x86_shared_non_temporal_threshold', show regressions
on AMD Zen machines for memory sizes of 1MB to 8MB in the large bench
variant results.
This patch addresses that regression on AMD Zen machines.
The chart linked below compares the performance of the 'Master' branch
and the 'AMD' patch against the 2.32 stable release.
https://i.imgur.com/0ZJAwes.png
Summary: On the master branch we see a regression for memory sizes below
8MB, with a performance drop of up to 99%, whereas the AMD patch shows
performance gains for 16MB and above with no regressions.
Note: The benchmarking was done by isolating all the CPU cores in a CCX,
configuring them to fixed frequency mode and routing the IRQs to other
CPU cores.
The large bench tests were then run pinned to one of the isolated
cores for 1000 iterations, and the reported performance is the average
over these iterations.
Sajan Karumanchi (1):
x86: Optimizing memcpy for AMD Zen architecture.
sysdeps/x86/cacheinfo.h | 31 +++++++++++++++++++++++++------
1 file changed, 25 insertions(+), 6 deletions(-)