[v1] ppc64le: Add optimized __memcmpeq for POWER10
Commit Message
Hi
This patch adds an optimized __memcmpeq implementation for POWER10.
>8>8>8>8>8>8>8>8>8>8
From 7c0eb2fbd5a850dfabfdb77af6bda4b5bc8ada98 Mon Sep 17 00:00:00 2001
From: Sachin Monga <smonga@linux.ibm.com>
Date: Mon, 25 May 2026 00:58:04 -0500
Subject: [PATCH v1] ppc64le: Add optimized __memcmpeq for POWER10
__memcmpeq (added in glibc 2.35) was previously an alias to memcmp on
POWER10 via strong_alias. However, in the multiarch IFUNC path, this
caused __memcmpeq to resolve to the generic C memcmp.c implementation
rather than the optimized POWER10 memcmp.S, leaving a significant
performance gap.
Unlike memcmp, __memcmpeq only needs to return zero or nonzero with
no requirement on the sign or magnitude for unequal inputs, allowing
a simpler and faster implementation.
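
For readers unfamiliar with the contract, here is a minimal C model of
what "zero or nonzero" permits. It is an illustration only, not code from
this patch; the name memcmpeq_model is hypothetical. It ORs together the
XOR of each byte pair and never tracks which buffer sorts first:

```c
#include <stddef.h>

/* Hypothetical reference model, not glibc code: returns 0 when the two
   buffers match over n bytes and a nonzero value otherwise.  */
int
memcmpeq_model (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;
  unsigned char diff = 0;

  /* Accumulate the XOR of every byte pair.  Unlike memcmp, neither the
     position nor the sign of the first difference is needed.  */
  for (size_t i = 0; i < n; i++)
    diff |= a[i] ^ b[i];

  return diff != 0;
}
```

A caller may only test the result against zero, e.g.
"if (memcmpeq_model (p, q, len) != 0)"; using it to order buffers would
be a misuse of the interface.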
Performance on POWER10:
1) __memcmpeq (generic) -> __memcmpeq_power10
   The primary motivation: __memcmpeq was resolving to the generic C
   implementation in the multiarch path.
   - Small data (< 8B to < 512B) : ~52% - 82% improvement.
   - Bulk (< 16MB to < 256MB) : ~25% - 32% improvement.
   - Large (1GB) : ~33% improvement.
2) memcmp_power10 (optimized .S) -> __memcmpeq_power10
   Comparing the dedicated __memcmpeq against the optimized memcmp
   it previously aliased to.
   - Small data (< 8B to < 256B) : No improvement observed.
     Real-world workloads predominantly operate on larger buffers.
   - >= 512B : ~9% improvement.
   - 16MB - 128MB : ~25% - 32% improvement.
   - 256MB : ~3% improvement.
   - Large (1GB) : On par.
---
This patch is regression tested.
sysdeps/powerpc/powerpc64/le/power10/memcmp.S | 2 -
.../powerpc/powerpc64/le/power10/memcmpeq.S | 156 ++++++++++++++++++
sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
.../powerpc64/multiarch/ifunc-impl-list.c | 20 +++
.../powerpc64/multiarch/memcmp-ppc64.c | 7 +-
.../powerpc64/multiarch/memcmpeq-power10.S | 28 ++++
.../powerpc/powerpc64/multiarch/memcmpeq.c | 57 +++++++
7 files changed, 266 insertions(+), 6 deletions(-)
create mode 100644 sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S
create mode 100644 sysdeps/powerpc/powerpc64/multiarch/memcmpeq-power10.S
create mode 100644 sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c
Comments
On 25/05/26 05:09, Sachin Monga wrote:
> Hi
>
> This patch adds an optimized __memcmpeq implementation for POWER10.
>
>>8>8>8>8>8>8>8>8>8>8
>
> From 7c0eb2fbd5a850dfabfdb77af6bda4b5bc8ada98 Mon Sep 17 00:00:00 2001
> From: Sachin Monga <smonga@linux.ibm.com>
> Date: Mon, 25 May 2026 00:58:04 -0500
> Subject: [PATCH v1] ppc64le: Add optimized __memcmpeq for POWER10
>
> __memcmpeq (added in glibc 2.35) was previously an alias to memcmp on
> POWER10 via strong_alias. However, in the multiarch IFUNC path, this
> caused __memcmpeq to resolve to the generic C memcmp.c implementation
> rather than the optimized POWER10 memcmp.S, leaving a significant
> performance gap.
Is it really worth duplicating most of the memcmp implementation for this
specific optimization? A simpler solution would be to add an ifunc variant
that returns the memcmp variants already in place.
>
> Unlike memcmp, __memcmpeq only needs to return zero or nonzero with
> no requirement on the sign or magnitude for unequal inputs, allowing
> a simpler and faster implementation.
>
> Performance on POWER10 :
>
> 1) __memcmpeq (generic) -> __memcmpeq_power10
> The primary motivation - __memcmpeq was resolving to generic C
> in the multiarch path.
>
> - Small data (< 8B to < 512B) : ~52% - 82% improvement.
> - Bulk (< 16MB to < 256MB) : ~25% - 32% improvement.
> - Large (1GB) : ~33% improvement
>
> 2) memcmp_power10 (optimized .S) -> __memcmpeq_power10:
> Comparing dedicated __memcmpeq against the optimized memcmp
> it previously aliased to.
>
> - Small data (< 8B to < 256B) : No improvement observed.
> Real-world workloads predominantly operate on larger buffers
> - >= 512B : ~9% improvement.
> - 16MB - 128MB : ~25% - 32% improvement.
> - 256MB : ~3% improvement.
> - Large (1GB) : On par.
> ---
> This patch is reg tested.
>
> sysdeps/powerpc/powerpc64/le/power10/memcmp.S | 2 -
> .../powerpc/powerpc64/le/power10/memcmpeq.S | 156 ++++++++++++++++++
> sysdeps/powerpc/powerpc64/multiarch/Makefile | 2 +-
> .../powerpc64/multiarch/ifunc-impl-list.c | 20 +++
> .../powerpc64/multiarch/memcmp-ppc64.c | 7 +-
> .../powerpc64/multiarch/memcmpeq-power10.S | 28 ++++
> .../powerpc/powerpc64/multiarch/memcmpeq.c | 57 +++++++
> 7 files changed, 266 insertions(+), 6 deletions(-)
> create mode 100644 sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S
> create mode 100644 sysdeps/powerpc/powerpc64/multiarch/memcmpeq-power10.S
> create mode 100644 sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c
>
> diff --git a/sysdeps/powerpc/powerpc64/le/power10/memcmp.S b/sysdeps/powerpc/powerpc64/le/power10/memcmp.S
> index 46a74dea4d..8915676e1b 100644
> --- a/sysdeps/powerpc/powerpc64/le/power10/memcmp.S
> +++ b/sysdeps/powerpc/powerpc64/le/power10/memcmp.S
> @@ -161,5 +161,3 @@ L(tail8):
> END (MEMCMP)
> libc_hidden_builtin_def (memcmp)
> weak_alias (memcmp, bcmp)
> -strong_alias (memcmp, __memcmpeq)
> -libc_hidden_def (__memcmpeq)
> diff --git a/sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S b/sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S
> new file mode 100644
> index 0000000000..4a1a4ad3ce
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S
> @@ -0,0 +1,156 @@
> +/* Optimized __memcmpeq implementation for POWER10.
> + Copyright (C) 2026 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +#include <sysdep.h>
> +
> +#define COMPARE_32(vr1,vr2,offset)\
> + lxvp 32+vr1,offset(r3); \
> + lxvp 32+vr2,offset(r4); \
> + vxor v4,vr1,vr2; \
> + vxor v5,vr1+1,vr2+1; \
> + vor v19,v19,v4; \
> + vor v19,v19,v5;
> +
> +/* int [r3] __memcmpeq (const char *s1 [r3], const char *s2 [r4],
> + size_t size [r5])
> + Returns 0 if equal, 1 if not equal (no lexicographic comparison) */
> +
> +#ifndef MEMCMPEQ
> +# define MEMCMPEQ __memcmpeq
> +#endif
> + .machine power10
> +ENTRY_TOCLESS (MEMCMPEQ, 4)
> + CALL_MCOUNT 3
> +
> + /* Fast path: size == 0 */
> + cmpdi cr7,r5,0
> + beq cr7,L(finish)
> +
> + /* Fast path: same pointer */
> + cmpd cr7,r3,r4
> + beq cr7,L(finish)
> +
> + cmpldi cr6,r5,64
> + bgt cr6,L(loop_head)
> +
> +/* Compare 64 bytes. This section is used for lengths <= 64 and for the last
> + bytes for larger lengths. */
> +L(last_compare):
> + li r8,16
> +
> + sldi r9,r5,56
> + sldi r8,r8,56
> + addi r6,r3,16
> + addi r7,r4,16
> +
> + /* Align up to 16 bytes. */
> + lxvl 32+v0,r3,r9
> + lxvl 32+v2,r4,r9
> +
> + /* Branch to not_equal if any bytes differ (CR6 set by vcmpneb.).
> + Branch to finish if no bytes remain (CR0.LT set when r9 went
> + negative after sub.). */
> + sub. r9,r9,r8
> + vcmpneb. v4,v0,v2
> + bne cr6,L(not_equal)
> + bt 4*cr0+lt,L(finish)
> +
> + addi r3,r3,32
> + addi r4,r4,32
> +
> + lxvl 32+v1,r6,r9
> + lxvl 32+v3,r7,r9
> + sub. r9,r9,r8
> + vcmpneb. v5,v1,v3
> + bne cr6,L(not_equal)
> + bt 4*cr0+lt,L(finish)
> +
> + addi r6,r3,16
> + addi r7,r4,16
> +
> + lxvl 32+v6,r3,r9
> + lxvl 32+v8,r4,r9
> + sub. r9,r9,r8
> + vcmpneb. v4,v6,v8
> + bne cr6,L(not_equal)
> + bt 4*cr0+lt,L(finish)
> +
> + lxvl 32+v7,r6,r9
> + lxvl 32+v9,r7,r9
> + vcmpneb. v5,v7,v9
> + bne cr6,L(not_equal)
> +
> +L(finish):
> + /* The contents are equal. */
> + li r3,0
> + blr
> +
> +L(not_equal):
> + li r3,1
> + blr
> +
> +L(loop_head):
> + /* Calculate how many loops to run. */
> + srdi. r8,r5,7
> + beq L(loop_tail)
> + mtctr r8
> +
> + vxor v18,v18,v18
> + vxor v19,v19,v19
> + .p2align 5
> +L(loop_128):
> + COMPARE_32(v0,v2,0)
> + COMPARE_32(v6,v8,32)
> + COMPARE_32(v10,v12,64)
> + COMPARE_32(v14,v16,96)
> +
> + vcmpneb. v17,v19,v18
> + bne cr6,L(not_equal)
> +
> + addi r3,r3,128
> + addi r4,r4,128
> + bdnz L(loop_128)
> +
> + /* Account loop comparisons. */
> + clrldi. r5,r5,57
> + beq L(finish)
> +
> +/* Compares 64 bytes if length is still bigger than 64 bytes. */
> + .p2align 5
> +L(loop_tail):
> + /* Initialize accumulator for tail */
> + vxor v18,v18,v18
> + vxor v19,v19,v19
> +
> + cmpldi r5,64
> + ble L(last_compare)
> +
> + COMPARE_32(v0,v2,0)
> + COMPARE_32(v6,v8,32)
> +
> + vcmpneb. v17,v19,v18
> + bne cr6,L(not_equal)
> +
> + addi r3,r3,64
> + addi r4,r4,64
> + subi r5,r5,64
> + b L(last_compare)
> +
> +END (MEMCMPEQ)
> +
> +libc_hidden_def (MEMCMPEQ)
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/Makefile b/sysdeps/powerpc/powerpc64/multiarch/Makefile
> index c9178223a8..164aac9dca 100644
> --- a/sysdeps/powerpc/powerpc64/multiarch/Makefile
> +++ b/sysdeps/powerpc/powerpc64/multiarch/Makefile
> @@ -30,7 +30,7 @@ sysdep_routines += memcpy-power8-cached memcpy-power7 memcpy-a2 memcpy-power6 \
> strncase-power8
>
> ifneq (,$(filter %le,$(config-machine)))
> -sysdep_routines += memcmp-power10 memcpy-power10 memmove-power10 memset-power10 \
> +sysdep_routines += memcmp-power10 memcpy-power10 memmove-power10 memset-power10 memcmpeq-power10 \
> rawmemchr-power9 rawmemchr-power10 \
> strcmp-power9 strcmp-power10 strncmp-power9 strncmp-power10 \
> strcpy-power9 strcat-power10 stpcpy-power9 \
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> index 1458b4575d..b346381a35 100644
> --- a/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> +++ b/sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
> @@ -218,6 +218,26 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array,
> __memcmp_power4)
> IFUNC_IMPL_ADD (array, i, memcmp, 1, __memcmp_ppc))
>
> + /* Support sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c.
> + * Pre-POWER10 variants reuse __memcmp_* since memcmp's return value
> + * satisfies __memcmpeq's zero/non-zero contract. */
> +
> + IFUNC_IMPL (i, name, __memcmpeq,
> +#ifdef __LITTLE_ENDIAN__
> + IFUNC_IMPL_ADD (array, i, __memcmpeq,
> + hwcap2 & PPC_FEATURE2_ARCH_3_1
> + && hwcap & PPC_FEATURE_HAS_VSX,
> + __memcmpeq_power10)
> +#endif
> + IFUNC_IMPL_ADD (array, i, __memcmpeq, hwcap2 & PPC_FEATURE2_ARCH_2_07
> + && hwcap & PPC_FEATURE_HAS_ALTIVEC,
> + __memcmp_power8)
> + IFUNC_IMPL_ADD (array, i, __memcmpeq, hwcap & PPC_FEATURE_ARCH_2_06,
> + __memcmp_power7)
> + IFUNC_IMPL_ADD (array, i, __memcmpeq, hwcap & PPC_FEATURE_POWER4,
> + __memcmp_power4)
> + IFUNC_IMPL_ADD (array, i, __memcmpeq, 1, __memcmp_ppc))
> +
> /* Support sysdeps/powerpc/powerpc64/multiarch/mempcpy.c. */
> IFUNC_IMPL (i, name, mempcpy,
> IFUNC_IMPL_ADD (array, i, mempcpy,
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/memcmp-ppc64.c b/sysdeps/powerpc/powerpc64/multiarch/memcmp-ppc64.c
> index ef69cfe8da..f885d3fb55 100644
> --- a/sysdeps/powerpc/powerpc64/multiarch/memcmp-ppc64.c
> +++ b/sysdeps/powerpc/powerpc64/multiarch/memcmp-ppc64.c
> @@ -22,14 +22,15 @@
> #define weak_alias(name, aliasname) \
> extern __typeof (__memcmp_ppc) aliasname \
> __attribute__ ((weak, alias ("__memcmp_ppc")));
> +/* __memcmpeq is now owned by the memcmpeq IFUNC selector (memcmpeq.os) */
> #undef strong_alias
> -#define strong_alias(name, aliasname) \
> - extern __typeof (__memcmp_ppc) aliasname \
> - __attribute__ ((alias ("__memcmp_ppc")));
> +#define strong_alias(name, aliasname)
> #if IS_IN (libc) && defined(SHARED)
> # undef libc_hidden_builtin_def
> # define libc_hidden_builtin_def(name) \
> __hidden_ver1(__memcmp_ppc, __GI_memcmp, __memcmp_ppc);
> +# undef libc_hidden_def
> +# define libc_hidden_def(name)
> #endif
>
> extern __typeof (memcmp) __memcmp_ppc attribute_hidden;
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/memcmpeq-power10.S b/sysdeps/powerpc/powerpc64/multiarch/memcmpeq-power10.S
> new file mode 100644
> index 0000000000..ee4b433712
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/multiarch/memcmpeq-power10.S
> @@ -0,0 +1,28 @@
> +/* Wrapper for POWER10 __memcmpeq implementation.
> + Copyright (C) 2026 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +#define MEMCMPEQ __memcmpeq_power10
> +
> +#undef libc_hidden_builtin_def
> +#define libc_hidden_builtin_def(name)
> +#undef libc_hidden_def
> +#define libc_hidden_def(name)
> +#undef weak_alias
> +#define weak_alias(name, alias)
> +
> +#include <sysdeps/powerpc/powerpc64/le/power10/memcmpeq.S>
> diff --git a/sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c b/sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c
> new file mode 100644
> index 0000000000..3f1266a2e8
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/multiarch/memcmpeq.c
> @@ -0,0 +1,57 @@
> +/* Multiple versions of memcmpeq. PowerPC64 version.
> + Copyright (C) 2026 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +/* Define multiple versions only for definition in libc. */
> +#if IS_IN (libc)
> +# define __memcmpeq __redirect___memcmpeq
> +# include <string.h>
> +# include <shlib-compat.h>
> +# include "init-arch.h"
> +
> +/* Reuse the existing optimized memcmp variants for pre-POWER10 hardware
> + * as memcmp is a superset */
> +extern __typeof (memcmp) __memcmp_ppc attribute_hidden;
> +extern __typeof (memcmp) __memcmp_power4 attribute_hidden;
> +extern __typeof (memcmp) __memcmp_power7 attribute_hidden;
> +extern __typeof (memcmp) __memcmp_power8 attribute_hidden;
> +extern __typeof (__memcmpeq) __memcmpeq_power10 attribute_hidden;
> +# undef __memcmpeq
> +
> +/* Avoid DWARF definition DIE on ifunc symbol so that GDB can handle
> + ifunc symbol properly. */
> +libc_ifunc_redirected (__redirect___memcmpeq, __memcmpeq,
> +#ifdef __LITTLE_ENDIAN__
> + (hwcap2 & PPC_FEATURE2_ARCH_3_1
> + && hwcap & PPC_FEATURE_HAS_VSX)
> + ? __memcmpeq_power10 :
> +#endif
> + (hwcap2 & PPC_FEATURE2_ARCH_2_07
> + && hwcap & PPC_FEATURE_HAS_ALTIVEC)
> + ? __memcmp_power8 :
> + (hwcap & PPC_FEATURE_ARCH_2_06)
> + ? __memcmp_power7
> + : (hwcap & PPC_FEATURE_POWER4)
> + ? __memcmp_power4
> + : __memcmp_ppc);
> +# ifdef SHARED
> +__hidden_ver1 (__memcmpeq, __GI___memcmpeq, __redirect___memcmpeq)
> + __attribute__ ((visibility ("hidden"))) __attribute_copy__ (__memcmpeq);
> +# endif
> +#else
> +#include <string/memcmp.c>
> +#endif
Thanks for the review, Adhemerval.
> Is it really worth to duplicate most of the memcmp implementation for this
> specific optimization? A simpler solution would add a ifunc variant that
> returns the already in place memcmp variants.
One clarification on scope: the IFUNC selector aliases to the existing
|__memcmp_*| variants for everything except POWER10.
Power8, power7, power4, and ppc reuse the in-place memcmp
implementations exactly as you're describing. The only dedicated file in
the patch is |memcmpeq-power10.S|.
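
To make the selection order concrete, here is a simplified C model of
that dispatch. It is a sketch only: the FEAT_* bits, the *_model function
names and the plain function-pointer return are all hypothetical
stand-ins, whereas the real selector uses libc_ifunc_redirected with the
hwcap/hwcap2 masks shown in the patch, and has more tiers than this.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits standing in for the hwcap/hwcap2 tests.  */
enum { FEAT_POWER8 = 1 << 0, FEAT_POWER10 = 1 << 1 };

typedef int (*memcmpeq_fn) (const void *, const void *, size_t);

/* Stand-ins for the real variants.  The two memcmp-based ones model the
   reused __memcmp_* implementations; a memcmp result is always a valid
   __memcmpeq result because it is zero exactly when the buffers match.  */
int
memcmp_generic_model (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n);
}

int
memcmp_power8_model (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n);
}

/* Models the dedicated variant, which only reports equal or not equal.  */
int
memcmpeq_power10_model (const void *a, const void *b, size_t n)
{
  return memcmp (a, b, n) != 0;
}

/* Pick the most specific variant the (hypothetical) feature set allows.  */
memcmpeq_fn
select_memcmpeq_model (unsigned int features)
{
  if (features & FEAT_POWER10)
    return memcmpeq_power10_model;
  if (features & FEAT_POWER8)
    return memcmp_power8_model;
  return memcmp_generic_model;
}
```

Only the first branch needs a new implementation; every other branch
returns a function that already exists for memcmp.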
POWER10 has its own implementation because of the numbers in section 2 of
the commit message, which compare the dedicated |__memcmpeq_power10|
against the same selector aliased to |__memcmp_power10|:
  >= 512B       : ~9%
  16MB - 128MB  : ~25% - 32%
  256MB         : ~3%
  1GB           : on par
16MB-128MB is the customer workload range; that is the band that
motivates the dedicated implementation.
Precedent: x86_64 takes the same approach —
|sysdeps/x86_64/multiarch/memcmpeq-{sse2,avx2,evex}.S| are dedicated
rather than aliased to |__memcmp_*|.
Regards,
Sachin.
On 26/05/26 06:34, Sachin Monga wrote:
> Thanks for the review Adhemerval.
>
>> Is it really worth to duplicate most of the memcmp implementation for this
>> specific optimization? A simpler solution would add a ifunc variant that
>> returns the already in place memcmp variants.
> One clarification on scope: the IFUNC selector aliases to the existing |__memcmp_*| variants for everything except POWER10.
>
> Power8, power7, power4, and ppc reuse the in-place memcmp implementations exactly as you're describing. The only dedicated file in the patch is |memcmpeq-power10.S|.
>
> POWER10 has its own because of the numbers in section 2 of the commit message — Dedicated |__memcmpeq_power10| vs the same selector aliased to |__memcmp_power10|:
>
> |≥ 512B : ~9% 16MB – 128MB : ~25% – 32% 256MB : ~3% 1GB : on par|
>
> 16MB–128MB is the customer workload range — that's the band that motivates the dedicated implementation.
>
> Precedent: x86_64 takes the same approach — |sysdeps/x86_64/multiarch/memcmpeq-{sse2,avx2,evex}.S| are dedicated rather than aliased to |__memcmp_*|.
Right, I was not expecting that the COMPARE_32 change would yield that much
difference.
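
For reference, the idea behind COMPARE_32 can be modelled in portable C
roughly as below. This is a sketch under assumptions, not the patch's
code: memcmpeq_blocks_model is a hypothetical name, 64-bit words stand in
for the paired vector loads, and the tail is handled byte by byte instead
of with length-limited loads. The point it illustrates is that
differences are XORed and ORed into one accumulator, so there is a single
test and branch per 128-byte block rather than one per load.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the accumulate-then-test loop.  Returns 0 if
   the buffers are equal over n bytes and 1 otherwise.  */
int
memcmpeq_blocks_model (const void *s1, const void *s2, size_t n)
{
  const unsigned char *a = s1;
  const unsigned char *b = s2;

  /* Main loop: n / 128 iterations, as computed by "srdi. r8,r5,7".  */
  for (size_t blocks = n >> 7; blocks != 0; blocks--)
    {
      uint64_t acc = 0;

      /* OR the XOR of each word pair into the accumulator; no branch
         is taken inside the block.  */
      for (size_t off = 0; off < 128; off += 8)
        {
          uint64_t x, y;
          memcpy (&x, a + off, sizeof x);
          memcpy (&y, b + off, sizeof y);
          acc |= x ^ y;
        }

      /* One test per 128 bytes.  */
      if (acc != 0)
        return 1;

      a += 128;
      b += 128;
    }

  /* Tail: the remaining n % 128 bytes, as left by "clrldi. r5,r5,57".  */
  unsigned char tail = 0;
  for (size_t i = 0, rem = n & 127; i < rem; i++)
    tail |= a[i] ^ b[i];

  return tail != 0;
}
```

An ordering memcmp cannot defer the test this way without extra work,
since it must still locate the first differing byte to produce a signed
result.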