From patchwork Wed Sep 20 20:44:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 76465
From: Noah Goldstein
To: libc-alpha@sourceware.org
Cc: goldstein.w.n@gmail.com, hjl.tools@gmail.com, carlos@systemhalted.org
Subject: x86: Add support for AVX10 preset and vec size in cpu-features
Date: Wed, 20 Sep 2023 15:44:50 -0500
Message-Id: <20230920204451.1086900-1-goldstein.w.n@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230824184851.3204976-1-goldstein.w.n@gmail.com>
References: <20230824184851.3204976-1-goldstein.w.n@gmail.com>

This commit adds support for the new AVX10 CPU features:
https://cdrdv2-public.intel.com/784267/355989-intel-avx10-spec.pdf

We add checks for:
- `AVX10`: Check if AVX10 is present.
- `AVX10_{X,Y,Z}MM`: Check if a given vec class has AVX10 support.

`make check` passes and cpuid output was checked against GNR/DMR on an
emulator.
---
 manual/platform.texi               | 12 ++++++++++++
 sysdeps/x86/bits/platform/x86.h    | 14 ++++++++++++--
 sysdeps/x86/cpu-features.c         | 25 +++++++++++++++++++++++++
 sysdeps/x86/include/cpu-features.h | 27 ++++++++++++++++++++++++++-
 sysdeps/x86/tst-get-cpu-features.c |  8 ++++++++
 5 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/manual/platform.texi b/manual/platform.texi
index 2a2d557067..478b6fdcdf 100644
--- a/manual/platform.texi
+++ b/manual/platform.texi
@@ -222,6 +222,18 @@ Leaf (EAX = 23H).
 @item
 @code{AVX} -- The AVX instruction extensions.
 
+@item
+@code{AVX10} -- The AVX10 instruction extensions.
+
+@item
+@code{AVX10_XMM} -- Whether AVX10 includes xmm registers.
+
+@item
+@code{AVX10_YMM} -- Whether AVX10 includes ymm registers.
+
+@item
+@code{AVX10_ZMM} -- Whether AVX10 includes zmm registers.
+
 @item
 @code{AVX2} -- The AVX2 instruction extensions.
 
diff --git a/sysdeps/x86/bits/platform/x86.h b/sysdeps/x86/bits/platform/x86.h
index 88ca071aa7..1e23d53ba2 100644
--- a/sysdeps/x86/bits/platform/x86.h
+++ b/sysdeps/x86/bits/platform/x86.h
@@ -30,7 +30,8 @@ enum
   CPUID_INDEX_80000008,
   CPUID_INDEX_7_ECX_1,
   CPUID_INDEX_19,
-  CPUID_INDEX_14_ECX_0
+  CPUID_INDEX_14_ECX_0,
+  CPUID_INDEX_24_ECX_0
 };
 
 struct cpuid_feature
@@ -312,6 +313,7 @@ enum
   x86_cpu_AVX_NE_CONVERT = x86_cpu_index_7_ecx_1_edx + 5,
   x86_cpu_AMX_COMPLEX = x86_cpu_index_7_ecx_1_edx + 8,
   x86_cpu_PREFETCHI = x86_cpu_index_7_ecx_1_edx + 14,
+  x86_cpu_AVX10 = x86_cpu_index_7_ecx_1_edx + 19,
   x86_cpu_APX_F = x86_cpu_index_7_ecx_1_edx + 21,
 
   x86_cpu_index_19_ebx
@@ -325,5 +327,13 @@ enum
     = (CPUID_INDEX_14_ECX_0 * 8 * 4 * sizeof (unsigned int)
        + cpuid_register_index_ebx * 8 * sizeof (unsigned int)),
 
-  x86_cpu_PTWRITE = x86_cpu_index_14_ecx_0_ebx + 4
+  x86_cpu_PTWRITE = x86_cpu_index_14_ecx_0_ebx + 4,
+
+  x86_cpu_index_24_ecx_0_ebx
+    = (CPUID_INDEX_24_ECX_0 * 8 * 4 * sizeof (unsigned int)
+       + cpuid_register_index_ebx * 8 * sizeof (unsigned int)),
+
+  x86_cpu_AVX10_XMM = x86_cpu_index_24_ecx_0_ebx + 16,
+  x86_cpu_AVX10_YMM = x86_cpu_index_24_ecx_0_ebx + 17,
+  x86_cpu_AVX10_ZMM = x86_cpu_index_24_ecx_0_ebx + 18,
 };
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index badf088874..0bf923d48b 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -115,11 +115,18 @@ update_active (struct cpu_features *cpu_features)
   CPU_FEATURE_SET_ACTIVE (cpu_features, SHSTK);
 #endif
 
+  enum
+  {
+    os_xmm = 1,
+    os_ymm = 2,
+    os_zmm = 4
+  } os_vector_size = os_xmm;
   /* Can we call xgetbv? */
   if (CPU_FEATURES_CPU_P (cpu_features, OSXSAVE))
     {
       unsigned int xcrlow;
       unsigned int xcrhigh;
+      CPU_FEATURE_SET_ACTIVE (cpu_features, AVX10);
       asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
       /* Is YMM and XMM state usable? */
       if ((xcrlow & (bit_YMM_state | bit_XMM_state))
@@ -128,6 +135,7 @@ update_active (struct cpu_features *cpu_features)
           /* Determine if AVX is usable. */
           if (CPU_FEATURES_CPU_P (cpu_features, AVX))
             {
+              os_vector_size |= os_ymm;
               CPU_FEATURE_SET (cpu_features, AVX);
               /* The following features depend on AVX being usable. */
               /* Determine if AVX2 is usable. */
@@ -166,6 +174,7 @@ update_active (struct cpu_features *cpu_features)
                          | bit_ZMM16_31_state))
               == (bit_Opmask_state | bit_ZMM0_15_state | bit_ZMM16_31_state))
             {
+              os_vector_size |= os_zmm;
               /* Determine if AVX512F is usable. */
               if (CPU_FEATURES_CPU_P (cpu_features, AVX512F))
                 {
@@ -210,6 +219,22 @@ update_active (struct cpu_features *cpu_features)
             }
         }
 
+      if (CPU_FEATURES_CPU_P (cpu_features, AVX10)
+          && cpu_features->basic.max_cpuid >= 0x24)
+        {
+          __cpuid_count (
+              0x24, 0, cpu_features->features[CPUID_INDEX_24_ECX_0].cpuid.eax,
+              cpu_features->features[CPUID_INDEX_24_ECX_0].cpuid.ebx,
+              cpu_features->features[CPUID_INDEX_24_ECX_0].cpuid.ecx,
+              cpu_features->features[CPUID_INDEX_24_ECX_0].cpuid.edx);
+          if (os_vector_size & os_xmm)
+            CPU_FEATURE_SET_ACTIVE (cpu_features, AVX10_XMM);
+          if (os_vector_size & os_ymm)
+            CPU_FEATURE_SET_ACTIVE (cpu_features, AVX10_YMM);
+          if (os_vector_size & os_zmm)
+            CPU_FEATURE_SET_ACTIVE (cpu_features, AVX10_ZMM);
+        }
+
       /* Are XTILECFG and XTILEDATA states usable? */
       if ((xcrlow & (bit_XTILECFG_state | bit_XTILEDATA_state))
           == (bit_XTILECFG_state | bit_XTILEDATA_state))
diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
index eb30d342a6..2d7427a6c0 100644
--- a/sysdeps/x86/include/cpu-features.h
+++ b/sysdeps/x86/include/cpu-features.h
@@ -29,7 +29,7 @@
 
 enum
 {
-  CPUID_INDEX_MAX = CPUID_INDEX_14_ECX_0 + 1
+  CPUID_INDEX_MAX = CPUID_INDEX_24_ECX_0 + 1
 };
 
 enum
@@ -319,6 +319,7 @@ enum
 #define bit_cpu_AVX_NE_CONVERT (1u << 5)
 #define bit_cpu_AMX_COMPLEX (1u << 8)
 #define bit_cpu_PREFETCHI (1u << 14)
+#define bit_cpu_AVX10 (1u << 19)
 #define bit_cpu_APX_F (1u << 21)
 
 /* CPUID_INDEX_19. */
@@ -332,6 +333,13 @@ enum
 /* EBX. */
 #define bit_cpu_PTWRITE (1u << 4)
 
+/* CPUID_INDEX_24_ECX_0. */
+
+/* EBX. */
+#define bit_cpu_AVX10_XMM (1u << 16)
+#define bit_cpu_AVX10_YMM (1u << 17)
+#define bit_cpu_AVX10_ZMM (1u << 18)
+
 /* CPUID_INDEX_1. */
 
 /* ECX. */
@@ -563,6 +571,7 @@ enum
 #define index_cpu_AVX_NE_CONVERT CPUID_INDEX_7_ECX_1
 #define index_cpu_AMX_COMPLEX CPUID_INDEX_7_ECX_1
 #define index_cpu_PREFETCHI CPUID_INDEX_7_ECX_1
+#define index_cpu_AVX10 CPUID_INDEX_7_ECX_1
 #define index_cpu_APX_F CPUID_INDEX_7_ECX_1
 
 /* CPUID_INDEX_19. */
@@ -576,6 +585,13 @@ enum
 /* EBX. */
 #define index_cpu_PTWRITE CPUID_INDEX_14_ECX_0
 
+/* CPUID_INDEX_24_ECX_0. */
+
+/* EBX. */
+#define index_cpu_AVX10_XMM CPUID_INDEX_24_ECX_0
+#define index_cpu_AVX10_YMM CPUID_INDEX_24_ECX_0
+#define index_cpu_AVX10_ZMM CPUID_INDEX_24_ECX_0
+
 /* CPUID_INDEX_1. */
 
 /* ECX. */
@@ -809,6 +825,7 @@ enum
 #define reg_AVX_NE_CONVERT edx
 #define reg_AMX_COMPLEX edx
 #define reg_PREFETCHI edx
+#define reg_AVX10 edx
 #define reg_APX_F edx
 
 /* CPUID_INDEX_19. */
@@ -822,6 +839,14 @@ enum
 /* EBX. */
 #define reg_PTWRITE ebx
 
+/* CPUID_INDEX_24_ECX_0. */
+
+/* EBX. */
+#define reg_AVX10_XMM ebx
+#define reg_AVX10_YMM ebx
+#define reg_AVX10_ZMM ebx
+
+
 /* PREFERRED_FEATURE_INDEX_1. First define the bitindex values
    sequentially, then define the bit_arch* and index_arch_* lookup
    constants. */
diff --git a/sysdeps/x86/tst-get-cpu-features.c b/sysdeps/x86/tst-get-cpu-features.c
index b27fa7324a..44edd18df2 100644
--- a/sysdeps/x86/tst-get-cpu-features.c
+++ b/sysdeps/x86/tst-get-cpu-features.c
@@ -219,6 +219,7 @@ do_test (void)
   CHECK_CPU_FEATURE_PRESENT (AVX_NE_CONVERT);
   CHECK_CPU_FEATURE_PRESENT (AMX_COMPLEX);
   CHECK_CPU_FEATURE_PRESENT (PREFETCHI);
+  CHECK_CPU_FEATURE_PRESENT (AVX10);
   CHECK_CPU_FEATURE_PRESENT (APX_F);
   CHECK_CPU_FEATURE_PRESENT (AESKLE);
   CHECK_CPU_FEATURE_PRESENT (WIDE_KL);
@@ -391,11 +392,18 @@ do_test (void)
   CHECK_CPU_FEATURE_ACTIVE (AVX_NE_CONVERT);
   CHECK_CPU_FEATURE_ACTIVE (AMX_COMPLEX);
   CHECK_CPU_FEATURE_ACTIVE (PREFETCHI);
+  CHECK_CPU_FEATURE_ACTIVE (AVX10);
   CHECK_CPU_FEATURE_ACTIVE (APX_F);
   CHECK_CPU_FEATURE_ACTIVE (AESKLE);
   CHECK_CPU_FEATURE_ACTIVE (WIDE_KL);
   CHECK_CPU_FEATURE_ACTIVE (PTWRITE);
+  if (CPU_FEATURE_ACTIVE (AVX10))
+    {
+      CHECK_CPU_FEATURE_ACTIVE (AVX10_XMM);
+      CHECK_CPU_FEATURE_ACTIVE (AVX10_YMM);
+      CHECK_CPU_FEATURE_ACTIVE (AVX10_ZMM);
+    }
 
   return 0;
 }