From patchwork Tue Aug 1 15:20:55 2023
X-Patchwork-Submitter: develop--- via Libc-alpha
X-Patchwork-Id: 73432
From: Sajan Karumanchi
To: fweimer@redhat.com
Cc: sajan.karumanchi@gmail.com, libc-alpha@sourceware.org,
 premachandra.mallappa@amd.com, carlos@redhat.com, Sajan Karumanchi
Subject: [PATCH v2] x86: Fix for cache computation on AMD legacy cpus.
Date: Tue, 1 Aug 2023 15:20:55 +0000
Message-Id: <20230801152055.1230013-2-sajan.karumanchi@amd.com>
In-Reply-To: <20230801152055.1230013-1-sajan.karumanchi@amd.com>
References: <20230602131907.1375745-1-sajan.karumanchi@amd.com>
 <20230801152055.1230013-1-sajan.karumanchi@amd.com>
List-Id: Libc-alpha mailing list

On some legacy AMD CPUs and under some hypervisors, CPUID leaf
'0x8000_001D' returns zero, which results in zeroed-out computed
cache values. This patch reintroduces the old way of cache
computation as a fail-safe option to handle these cases.
It also sets the 'level4_cache_size' value through handle_amd().

Reviewed-by: Premachandra Mallappa
Tested-by: Florian Weimer

diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index cd4d0351ae..285773039f 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -315,40 +315,206 @@ handle_amd (int name)
 {
   unsigned int eax;
   unsigned int ebx;
-  unsigned int ecx;
+  unsigned int ecx = 0;
   unsigned int edx;
-  unsigned int count = 0x1;
+  unsigned int max_cpuid = 0;
+  unsigned int fn = 0;
 
   /* No level 4 cache (yet). */
   if (name > _SC_LEVEL3_CACHE_LINESIZE)
     return 0;
 
-  if (name >= _SC_LEVEL3_CACHE_SIZE)
-    count = 0x3;
-  else if (name >= _SC_LEVEL2_CACHE_SIZE)
-    count = 0x2;
-  else if (name >= _SC_LEVEL1_DCACHE_SIZE)
-    count = 0x0;
+  __cpuid (0x80000000, max_cpuid, ebx, ecx, edx);
+
+  if (max_cpuid >= 0x8000001D)
+    /* Use __cpuid__ '0x8000_001D' to compute cache details. */
+    {
+      unsigned int count = 0x1;
+
+      if (name >= _SC_LEVEL3_CACHE_SIZE)
+        count = 0x3;
+      else if (name >= _SC_LEVEL2_CACHE_SIZE)
+        count = 0x2;
+      else if (name >= _SC_LEVEL1_DCACHE_SIZE)
+        count = 0x0;
+
+      __cpuid_count (0x8000001D, count, eax, ebx, ecx, edx);
+
+      if (ecx != 0)
+        {
+          switch (name)
+            {
+            case _SC_LEVEL1_ICACHE_ASSOC:
+            case _SC_LEVEL1_DCACHE_ASSOC:
+            case _SC_LEVEL2_CACHE_ASSOC:
+            case _SC_LEVEL3_CACHE_ASSOC:
+              return ((ebx >> 22) & 0x3ff) + 1;
+            case _SC_LEVEL1_ICACHE_LINESIZE:
+            case _SC_LEVEL1_DCACHE_LINESIZE:
+            case _SC_LEVEL2_CACHE_LINESIZE:
+            case _SC_LEVEL3_CACHE_LINESIZE:
+              return (ebx & 0xfff) + 1;
+            case _SC_LEVEL1_ICACHE_SIZE:
+            case _SC_LEVEL1_DCACHE_SIZE:
+            case _SC_LEVEL2_CACHE_SIZE:
+            case _SC_LEVEL3_CACHE_SIZE:
+              return (((ebx >> 22) & 0x3ff) + 1) * ((ebx & 0xfff) + 1) * (ecx + 1);
+            default:
+              __builtin_unreachable ();
+            }
+          return -1;
+        }
+    }
+
+  /* Legacy cache computation for CPUs prior to Bulldozer family.
+     This is also a fail-safe mechanism for some hypervisors that
+     accidentally configure __cpuid__ '0x8000_001D' to Zero. */
 
-  __cpuid_count (0x8000001D, count, eax, ebx, ecx, edx);
+  fn = 0x80000005 + (name >= _SC_LEVEL2_CACHE_SIZE);
+
+  if (max_cpuid < fn)
+    return 0;
+
+  __cpuid (fn, eax, ebx, ecx, edx);
+
+  if (name < _SC_LEVEL1_DCACHE_SIZE)
+    {
+      name += _SC_LEVEL1_DCACHE_SIZE - _SC_LEVEL1_ICACHE_SIZE;
+      ecx = edx;
+    }
 
   switch (name)
     {
-    case _SC_LEVEL1_ICACHE_ASSOC:
-    case _SC_LEVEL1_DCACHE_ASSOC:
-    case _SC_LEVEL2_CACHE_ASSOC:
+    case _SC_LEVEL1_DCACHE_SIZE:
+      return (ecx >> 14) & 0x3fc00;
+
+    case _SC_LEVEL1_DCACHE_ASSOC:
+      ecx >>= 16;
+      if ((ecx & 0xff) == 0xff)
+        {
+          /* Fully associative. */
+          return (ecx << 2) & 0x3fc00;
+        }
+      return ecx & 0xff;
+
+    case _SC_LEVEL1_DCACHE_LINESIZE:
+      return ecx & 0xff;
+
+    case _SC_LEVEL2_CACHE_SIZE:
+      return (ecx & 0xf000) == 0 ? 0 : (ecx >> 6) & 0x3fffc00;
+
+    case _SC_LEVEL2_CACHE_ASSOC:
+      switch ((ecx >> 12) & 0xf)
+        {
+        case 0:
+        case 1:
+        case 2:
+        case 4:
+          return (ecx >> 12) & 0xf;
+        case 6:
+          return 8;
+        case 8:
+          return 16;
+        case 10:
+          return 32;
+        case 11:
+          return 48;
+        case 12:
+          return 64;
+        case 13:
+          return 96;
+        case 14:
+          return 128;
+        case 15:
+          return ((ecx >> 6) & 0x3fffc00) / (ecx & 0xff);
+        default:
+          return 0;
+        }
+
+    case _SC_LEVEL2_CACHE_LINESIZE:
+      return (ecx & 0xf000) == 0 ? 0 : ecx & 0xff;
+
+    case _SC_LEVEL3_CACHE_SIZE:
+      {
+        long int total_l3_cache = 0, l3_cache_per_thread = 0;
+        unsigned int threads = 0;
+        const struct cpu_features *cpu_features;
+
+        if ((edx & 0xf000) == 0)
+          return 0;
+
+        total_l3_cache = (edx & 0x3ffc0000) << 1;
+        cpu_features = __get_cpu_features ();
+
+        /* Figure out the number of logical threads that share L3. */
+        if (max_cpuid >= 0x80000008)
+          {
+            /* Get width of APIC ID. */
+            __cpuid (0x80000008, eax, ebx, ecx, edx);
+            threads = (ecx & 0xff) + 1;
+          }
+
+        if (threads == 0)
+          {
+            /* If APIC ID width is not available, use logical
+               processor count. */
+            __cpuid (0x00000001, eax, ebx, ecx, edx);
+            if ((edx & (1 << 28)) != 0)
+              threads = (ebx >> 16) & 0xff;
+          }
+
+        /* Cap usage of highest cache level to the number of
+           supported threads. */
+        if (threads > 0)
+          l3_cache_per_thread = total_l3_cache/threads;
+
+        /* Get shared cache per ccx for Zen architectures. */
+        if (cpu_features->basic.family >= 0x17)
+          {
+            long int l3_cache_per_ccx = 0;
+            /* Get number of threads share the L3 cache in CCX. */
+            __cpuid_count (0x8000001D, 0x3, eax, ebx, ecx, edx);
+            unsigned int threads_per_ccx = ((eax >> 14) & 0xfff) + 1;
+            l3_cache_per_ccx = l3_cache_per_thread * threads_per_ccx;
+            return l3_cache_per_ccx;
+          }
+        else
+          {
+            return l3_cache_per_thread;
+          }
+      }
+
     case _SC_LEVEL3_CACHE_ASSOC:
-      return ecx ? ((ebx >> 22) & 0x3ff) + 1 : 0;
-    case _SC_LEVEL1_ICACHE_LINESIZE:
-    case _SC_LEVEL1_DCACHE_LINESIZE:
-    case _SC_LEVEL2_CACHE_LINESIZE:
+      switch ((edx >> 12) & 0xf)
+        {
+        case 0:
+        case 1:
+        case 2:
+        case 4:
+          return (edx >> 12) & 0xf;
+        case 6:
+          return 8;
+        case 8:
+          return 16;
+        case 10:
+          return 32;
+        case 11:
+          return 48;
+        case 12:
+          return 64;
+        case 13:
+          return 96;
+        case 14:
+          return 128;
+        case 15:
+          return ((edx & 0x3ffc0000) << 1) / (edx & 0xff);
+        default:
+          return 0;
+        }
+
     case _SC_LEVEL3_CACHE_LINESIZE:
-      return ecx ? (ebx & 0xfff) + 1 : 0;
-    case _SC_LEVEL1_ICACHE_SIZE:
-    case _SC_LEVEL1_DCACHE_SIZE:
-    case _SC_LEVEL2_CACHE_SIZE:
-    case _SC_LEVEL3_CACHE_SIZE:
-      return ecx ? (((ebx >> 22) & 0x3ff) + 1) * ((ebx & 0xfff) + 1) * (ecx + 1): 0;
+      return (edx & 0xf000) == 0 ? 0 : edx & 0xff;
+
     default:
       __builtin_unreachable ();
     }
@@ -703,7 +869,6 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
       data = handle_amd (_SC_LEVEL1_DCACHE_SIZE);
       core = handle_amd (_SC_LEVEL2_CACHE_SIZE);
       shared = handle_amd (_SC_LEVEL3_CACHE_SIZE);
-      shared_per_thread = shared;
 
       level1_icache_size = handle_amd (_SC_LEVEL1_ICACHE_SIZE);
       level1_icache_linesize = handle_amd (_SC_LEVEL1_ICACHE_LINESIZE);
@@ -716,13 +881,20 @@ dl_init_cacheinfo (struct cpu_features *cpu_features)
       level3_cache_size = shared;
       level3_cache_assoc = handle_amd (_SC_LEVEL3_CACHE_ASSOC);
       level3_cache_linesize = handle_amd (_SC_LEVEL3_CACHE_LINESIZE);
+      level4_cache_size = handle_amd (_SC_LEVEL4_CACHE_SIZE);
 
       if (shared <= 0)
-        /* No shared L3 cache. All we have is the L2 cache. */
-        shared = core;
+        {
+          /* No shared L3 cache. All we have is the L2 cache. */
+          shared = core;
+        }
+      else if (cpu_features->basic.family < 0x17)
+        {
+          /* Account for exclusive L2 and L3 caches. */
+          shared += core;
+        }
 
-      if (shared_per_thread <= 0)
-        shared_per_thread = shared;
+      shared_per_thread = shared;
     }
 
   cpu_features->level1_icache_size = level1_icache_size;
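
Illustrative note, not part of the patch: the register arithmetic used in
handle_amd() can be checked in isolation with a small sketch like the one
below. It assumes the same field layout the patch relies on (ways in
EBX[31:22], line size in EBX[11:0], and sets in ECX for leaf 0x8000001D;
size in KB in ECX[31:24] for leaf 0x80000005 and in ECX[31:16] for leaf
0x80000006). The helper names are hypothetical and do not exist in glibc.
The helpers take raw register values, so they can be exercised without
executing CPUID.

```c
#include <stdint.h>

/* Hypothetical helpers (not glibc API) mirroring the arithmetic in the
   patch.  Inputs are raw register values as returned by CPUID.  */

/* Leaf 0x8000001D: associativity is EBX[31:22] + 1.  */
static uint32_t
leaf1d_ways (uint32_t ebx)
{
  return ((ebx >> 22) & 0x3ff) + 1;
}

/* Leaf 0x8000001D: line size in bytes is EBX[11:0] + 1.  */
static uint32_t
leaf1d_line_size (uint32_t ebx)
{
  return (ebx & 0xfff) + 1;
}

/* Leaf 0x8000001D: size = ways * line size * (ECX + 1), as in the patch.
   Returns 0 when ECX is zero, which is the condition the patch treats
   as "leaf not populated, fall back to the legacy leaves".  */
static uint64_t
leaf1d_cache_size (uint32_t ebx, uint32_t ecx)
{
  if (ecx == 0)
    return 0;
  return (uint64_t) leaf1d_ways (ebx) * leaf1d_line_size (ebx)
         * ((uint64_t) ecx + 1);
}

/* Legacy leaf 0x80000005: L1D size in KB sits in ECX[31:24]; shifting
   right by 14 and masking converts it to bytes in one step.  */
static uint32_t
legacy_l1d_size (uint32_t ecx)
{
  return (ecx >> 14) & 0x3fc00;
}

/* Legacy leaf 0x80000006: L2 size in KB sits in ECX[31:16]; an
   associativity field (ECX[15:12]) of zero means the cache is absent.  */
static uint32_t
legacy_l2_size (uint32_t ecx)
{
  return (ecx & 0xf000) == 0 ? 0 : (ecx >> 6) & 0x3fffc00;
}
```

For example, an 8-way cache with 64-byte lines and 64 sets is encoded as
EBX = (7 << 22) | 63 and ECX = 63, and leaf1d_cache_size() yields 32768
bytes for it. With ECX = 0 the helper yields 0, which corresponds to the
case where handle_amd() falls through to the legacy path.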