Message ID | 20201022045005.17371-2-sajan.karumanchi@amd.com |
---|---|
State | Committed |
Headers |
From: sajan.karumanchi@amd.com
To: libc-alpha@sourceware.org, carlos@redhat.com, fweimer@redhat.com
Cc: Sajan Karumanchi <sajan.karumanchi@amd.com>, premachandra.mallappa@amd.com
Subject: [PATCH 1/1] x86: Optimizing memcpy for AMD Zen architecture.
Date: Thu, 22 Oct 2020 10:20:05 +0530
Message-Id: <20201022045005.17371-2-sajan.karumanchi@amd.com>
In-Reply-To: <20201022045005.17371-1-sajan.karumanchi@amd.com>
References: <20201022045005.17371-1-sajan.karumanchi@amd.com>
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> |
Series |
Optimizing memcpy for AMD Zen architecture.
|
|
Commit Message
Karumanchi, Sajan
Oct. 22, 2020, 4:50 a.m. UTC
From: Sajan Karumanchi <sajan.karumanchi@amd.com>

Modifying the shareable cache '__x86_shared_cache_size', which is a
factor in computing the non-temporal threshold parameter
'__x86_shared_non_temporal_threshold' to optimize memcpy for AMD Zen
architectures.
In the existing implementation, the shareable cache is computed as 'L3
per thread, L2 per core'. Recomputing this shareable cache as 'L3 per
CCX(Core-Complex)' has brought in performance gains.
As per the large bench variant results, this patch also addresses the
regression problem on AMD Zen architectures.

Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
Signed-off-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
Signed-off-by: Sajan Karumanchi <sajan.karumanchi@amd.com>
---
 sysdeps/x86/cacheinfo.h | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)
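The arithmetic described in the commit message can be sketched as a few pure C functions. This is only an illustration under stated assumptions: the function names are hypothetical and do not exist in glibc, and the cache sizes and thread counts are passed in as parameters rather than read from CPUID. Only the bit extraction and the divide/add/multiply steps are taken from the patch.

```c
/* Hypothetical helper: decode EAX of CPUID leaf 0x8000001D, subleaf 3.
   Bits 25:14 hold the number of logical processors sharing the cache,
   minus one, which is why the patch adds 1.  */
static unsigned int
threads_sharing_cache (unsigned int eax)
{
  return ((eax >> 14) & 0xfff) + 1;
}

/* Hypothetical model of the pre-patch estimate: the L3 share of one
   thread plus the L2 of its core ('L3 per thread, L2 per core').  */
static long int
shared_before (long int l3, long int l2, unsigned int threads)
{
  long int shared = l3;
  if (threads > 0)
    shared /= threads;
  return shared + l2;
}

/* Hypothetical model of the patched estimate for family >= 0x17: the
   per-thread L3 share scaled by the threads in one CCX ('L3 per CCX').  */
static long int
shared_after (long int l3, unsigned int threads,
              unsigned int threads_per_ccx)
{
  long int shared = l3;
  if (threads > 0)
    shared /= threads;
  return shared * threads_per_ccx;
}
```

With made-up inputs of a 64 MiB L3, a 512 KiB L2, 32 logical threads and an EAX field value of 7 (8 threads per CCX), `shared_before` yields 2621440 bytes and `shared_after` yields 16777216 bytes. These figures are illustrative only and are not measurements from any real part.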
Comments
* sajan karumanchi:

> diff --git a/sysdeps/x86/cacheinfo.h b/sysdeps/x86/cacheinfo.h
> index 7f342fdc23..d6d6877702 100644
> --- a/sysdeps/x86/cacheinfo.h
> +++ b/sysdeps/x86/cacheinfo.h
> @@ -303,6 +303,8 @@ init_cacheinfo (void)
>        data = handle_amd (_SC_LEVEL1_DCACHE_SIZE);
>        long int core = handle_amd (_SC_LEVEL2_CACHE_SIZE);
>        shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
> +      unsigned int eax;
> +      unsigned int threads_per_ccx = 0;
>
>        /* Get maximum extended function. */
>        __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
> @@ -320,7 +322,7 @@ init_cacheinfo (void)
>                threads = 1 << ((ecx >> 12) & 0x0f);
>              }
>
> -          if (threads == 0)
> +          if (threads == 0 || cpu_features->basic.family >= 0x17)
>              {
>                /* If APIC ID width is not available, use logical
>                   processor count. */
> @@ -335,13 +337,27 @@ init_cacheinfo (void)
>            if (threads > 0)
>              shared /= threads;
>
> -          /* Account for exclusive L2 and L3 caches. */
> -          shared += core;
> -        }
> +          /* Get shared cache per ccx for Zen architectures */
> +          if (cpu_features->basic.family >= 0x17)
> +            {
> +              /* Get number of threads share the L3 cache in CCX */
> +              __cpuid_count(0x8000001D, 0x3, eax, ebx, ecx, edx);
> +              threads_per_ccx = ((eax >> 14) & 0xfff) + 1;
> +              shared = shared * threads_per_ccx;
> +            }
> +          else
> +            {
> +              /* Account for exclusive L2 and L3 caches. */
> +              shared += core;
> +            }
> +        }
>      }

Although not visible in the patch, these changes are properly guarded by
an arch_kind_amd check, as expected.

Beyond that, I can't comment on the substance of the patch, but I'd like
to request the following style changes:

* Move the definitions of eax and threads_per_ccx closer to their usage
  site.  Initialize threads_per_ccx directly with its final value.
  (The separate variable is nice for documentation purposes.)

* Add a space after __cpuid_count (to follow GNU style).

* Add ".  " (period and two spaces) at the end of all new comments.

* Remove Signed-off-by.  glibc does not use DCO
  <https://developercertificate.org/>.  I assume this patch is covered
  by AMD's copyright assignment instead.

I can make these changes for you and push this, or you can post a new
patch.

Thanks,
Florian
On Wed, Oct 21, 2020 at 9:51 PM <sajan.karumanchi@amd.com> wrote:
>
> From: Sajan Karumanchi <sajan.karumanchi@amd.com>
>
> Modifying the shareable cache '__x86_shared_cache_size', which is a
> factor in computing the non-temporal threshold parameter
> '__x86_shared_non_temporal_threshold' to optimize memcpy for AMD Zen
> architectures.
> In the existing implementation, the shareable cache is computed as 'L3
> per thread, L2 per core'. Recomputing this shareable cache as 'L3 per
> CCX(Core-Complex)' has brought in performance gains.
> As per the large bench variant results, this patch also addresses the
> regression problem on AMD Zen architectures.
>
> Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
> Signed-off-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
> Signed-off-by: Sajan Karumanchi <sajan.karumanchi@amd.com>
> ---
>  sysdeps/x86/cacheinfo.h | 31 +++++++++++++++++++++++++------
>  1 file changed, 25 insertions(+), 6 deletions(-)
>
> diff --git a/sysdeps/x86/cacheinfo.h b/sysdeps/x86/cacheinfo.h
> index 7f342fdc23..d6d6877702 100644
> --- a/sysdeps/x86/cacheinfo.h
> +++ b/sysdeps/x86/cacheinfo.h
> @@ -303,6 +303,8 @@ init_cacheinfo (void)
>        data = handle_amd (_SC_LEVEL1_DCACHE_SIZE);
>        long int core = handle_amd (_SC_LEVEL2_CACHE_SIZE);
>        shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
> +      unsigned int eax;
> +      unsigned int threads_per_ccx = 0;
>
>        /* Get maximum extended function. */
>        __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
> @@ -320,7 +322,7 @@ init_cacheinfo (void)
>                threads = 1 << ((ecx >> 12) & 0x0f);
>              }
>
> -          if (threads == 0)
> +          if (threads == 0 || cpu_features->basic.family >= 0x17)
>              {
>                /* If APIC ID width is not available, use logical
>                   processor count. */
> @@ -335,13 +337,27 @@ init_cacheinfo (void)
>            if (threads > 0)
>              shared /= threads;
>
> -          /* Account for exclusive L2 and L3 caches. */
> -          shared += core;
> -        }
> +          /* Get shared cache per ccx for Zen architectures */
> +          if (cpu_features->basic.family >= 0x17)
> +            {
> +              /* Get number of threads share the L3 cache in CCX */
> +              __cpuid_count(0x8000001D, 0x3, eax, ebx, ecx, edx);
> +              threads_per_ccx = ((eax >> 14) & 0xfff) + 1;
> +              shared = shared * threads_per_ccx;
> +            }
> +          else
> +            {
> +              /* Account for exclusive L2 and L3 caches. */
> +              shared += core;
> +            }
> +        }
>      }
>
>    if (cpu_features->data_cache_size != 0)
> -    data = cpu_features->data_cache_size;
> +    {
> +      if (data == 0 || cpu_features->basic.kind != arch_kind_amd)
> +        data = cpu_features->data_cache_size;
> +    }

This looks wrong.  cpu_features->data_cache_size is set by
GLIBC tunables:

 -- Tunable: glibc.cpu.x86_data_cache_size
     The ‘glibc.cpu.x86_data_cache_size’ tunable allows the user to set
     data cache size in bytes for use in memory and string routines.

     This tunable is specific to i386 and x86-64.

Why is it ignored?

>    if (data > 0)
>      {
> @@ -354,7 +370,10 @@ init_cacheinfo (void)
>      }
>
>    if (cpu_features->shared_cache_size != 0)
> -    shared = cpu_features->shared_cache_size;
> +    {
> +      if (shared == 0 || cpu_features->basic.kind != arch_kind_amd)
> +        shared = cpu_features->shared_cache_size;
> +    }

This looks wrong.  cpu_features->shared_cache_size is set by
GLIBC tunables:

 -- Tunable: glibc.cpu.x86_shared_cache_size
     The ‘glibc.cpu.x86_shared_cache_size’ tunable allows the user to
     set shared cache size in bytes for use in memory and string
     routines.

Why is it ignored?

>    if (shared > 0)
>      {
> --
> 2.17.1
>

I think they should be reverted.
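The objection above can be illustrated with a small model of the two selection rules. This is a sketch under assumptions: the function names and the `is_amd` flag are hypothetical stand-ins for the `cpu_features` fields in the diff, and the sizes are arbitrary. It only mirrors the conditions visible in the quoted hunks.

```c
#include <stdbool.h>

/* Model of the code before the patch: a non-zero tunable value always
   replaces the detected cache size.  */
static long int
pick_size_tunable_wins (long int detected, long int tunable)
{
  return tunable != 0 ? tunable : detected;
}

/* Model of the disputed hunks: on AMD the tunable is applied only when
   detection produced zero, so a user-supplied value is otherwise
   silently dropped.  */
static long int
pick_size_disputed (long int detected, long int tunable, bool is_amd)
{
  if (tunable != 0 && (detected == 0 || !is_amd))
    return tunable;
  return detected;
}
```

In this model, a detected size of 32768 with a tunable of 65536 gives 65536 under the first rule, but 32768 under the second rule when `is_amd` is true, which is the ignored-tunable case the review points out.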
* H. J. Lu:

>>    if (cpu_features->data_cache_size != 0)
>> -    data = cpu_features->data_cache_size;
>> +    {
>> +      if (data == 0 || cpu_features->basic.kind != arch_kind_amd)
>> +        data = cpu_features->data_cache_size;
>> +    }
>
> This looks wrong.  cpu_features->data_cache_size is set by
> GLIBC tunables:
>
>  -- Tunable: glibc.cpu.x86_data_cache_size
>      The ‘glibc.cpu.x86_data_cache_size’ tunable allows the user to set
>      data cache size in bytes for use in memory and string routines.
>
>      This tunable is specific to i386 and x86-64.
>
> Why is it ignored?

So we should revert those two hunks only?

Thanks,
Florian
On Wed, Oct 28, 2020 at 7:29 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> >>    if (cpu_features->data_cache_size != 0)
> >> -    data = cpu_features->data_cache_size;
> >> +    {
> >> +      if (data == 0 || cpu_features->basic.kind != arch_kind_amd)
> >> +        data = cpu_features->data_cache_size;
> >> +    }
> >
> > This looks wrong.  cpu_features->data_cache_size is set by
> > GLIBC tunables:
> >
> >  -- Tunable: glibc.cpu.x86_data_cache_size
> >      The ‘glibc.cpu.x86_data_cache_size’ tunable allows the user to set
> >      data cache size in bytes for use in memory and string routines.
> >
> >      This tunable is specific to i386 and x86-64.
> >
> > Why is it ignored?
>
> So we should revert those two hunks only?
>

Yes.
[AMD Public Use]

Hi H. J. Lu,

Thanks for pointing it out, and sorry for my ignorance of the glibc tunables.
I overlooked the tunables part, as it was once reviewed by you for another patch which did not make it through.
Patch: https://sourceware.org/pipermail/libc-alpha/2020-August/117081.html
Review: https://sourceware.org/pipermail/libc-alpha/2020-September/117385.html

Thanks & Regards,
Sajan K.

-----Original Message-----
From: H.J. Lu <hjl.tools@gmail.com>
Sent: Wednesday, October 28, 2020 8:14 PM
To: Florian Weimer <fweimer@redhat.com>
Cc: Karumanchi, Sajan <Sajan.Karumanchi@amd.com>; GNU C Library <libc-alpha@sourceware.org>; Carlos O'Donell <carlos@redhat.com>; Mallappa, Premachandra <Premachandra.Mallappa@amd.com>
Subject: Re: [PATCH 1/1] x86: Optimizing memcpy for AMD Zen architecture.

[CAUTION: External Email]

On Wed, Oct 28, 2020 at 7:29 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> >>    if (cpu_features->data_cache_size != 0)
> >> -    data = cpu_features->data_cache_size;
> >> +    {
> >> +      if (data == 0 || cpu_features->basic.kind != arch_kind_amd)
> >> +        data = cpu_features->data_cache_size;
> >> +    }
> >
> > This looks wrong.  cpu_features->data_cache_size is set by GLIBC
> > tunables:
> >
> >  -- Tunable: glibc.cpu.x86_data_cache_size
> >      The ‘glibc.cpu.x86_data_cache_size’ tunable allows the user to set
> >      data cache size in bytes for use in memory and string routines.
> >
> >      This tunable is specific to i386 and x86-64.
> >
> > Why is it ignored?
>
> So we should revert those two hunks only?
>

Yes.

--
H.J.
diff --git a/sysdeps/x86/cacheinfo.h b/sysdeps/x86/cacheinfo.h
index 7f342fdc23..d6d6877702 100644
--- a/sysdeps/x86/cacheinfo.h
+++ b/sysdeps/x86/cacheinfo.h
@@ -303,6 +303,8 @@ init_cacheinfo (void)
       data = handle_amd (_SC_LEVEL1_DCACHE_SIZE);
       long int core = handle_amd (_SC_LEVEL2_CACHE_SIZE);
       shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
+      unsigned int eax;
+      unsigned int threads_per_ccx = 0;
 
       /* Get maximum extended function. */
       __cpuid (0x80000000, max_cpuid_ex, ebx, ecx, edx);
@@ -320,7 +322,7 @@ init_cacheinfo (void)
               threads = 1 << ((ecx >> 12) & 0x0f);
             }
 
-          if (threads == 0)
+          if (threads == 0 || cpu_features->basic.family >= 0x17)
             {
               /* If APIC ID width is not available, use logical
                  processor count. */
@@ -335,13 +337,27 @@ init_cacheinfo (void)
           if (threads > 0)
             shared /= threads;
 
-          /* Account for exclusive L2 and L3 caches. */
-          shared += core;
-        }
+          /* Get shared cache per ccx for Zen architectures */
+          if (cpu_features->basic.family >= 0x17)
+            {
+              /* Get number of threads share the L3 cache in CCX */
+              __cpuid_count(0x8000001D, 0x3, eax, ebx, ecx, edx);
+              threads_per_ccx = ((eax >> 14) & 0xfff) + 1;
+              shared = shared * threads_per_ccx;
+            }
+          else
+            {
+              /* Account for exclusive L2 and L3 caches. */
+              shared += core;
+            }
+        }
     }
 
   if (cpu_features->data_cache_size != 0)
-    data = cpu_features->data_cache_size;
+    {
+      if (data == 0 || cpu_features->basic.kind != arch_kind_amd)
+        data = cpu_features->data_cache_size;
+    }
 
   if (data > 0)
     {
@@ -354,7 +370,10 @@ init_cacheinfo (void)
     }
 
   if (cpu_features->shared_cache_size != 0)
-    shared = cpu_features->shared_cache_size;
+    {
+      if (shared == 0 || cpu_features->basic.kind != arch_kind_amd)
+        shared = cpu_features->shared_cache_size;
+    }
 
   if (shared > 0)
     {