aarch64: Optimized memcmp for medium to large sizes
Commit Message
On Tuesday 06 March 2018 10:47 PM, Szabolcs Nagy wrote:
> this broke the build for me:
>
> /B/elf/librtld.os: In function `memcmp':
> /S/string/../sysdeps/aarch64/memcmp.S:78: undefined reference to `.Lloop8'
> collect2: error: ld returned 1 exit status
> make[2]: *** [/B/elf/ld.so] Error 1
> make[2]: Leaving directory `/S/elf'
Sorry, I took the lazy way out and failed to smoke test the loop8 name
fixup and missed one instance. I've pushed this obvious fix after
actually building it this time.
Siddhesh
From 4e54d918630ea53e29dd70d3bdffcb00d29ed3d4 Mon Sep 17 00:00:00 2001
From: Siddhesh Poyarekar <siddhesh@sourceware.org>
Date: Tue, 6 Mar 2018 22:56:35 +0530
Subject: [PATCH] aarch64: Fix branch target to loop16
I goofed up when changing the loop8 name to loop16 and missed out on
the branch instance. Fixed and actually build tested this time.
* sysdeps/aarch64/memcmp.S (more16): Fix branch target loop16.
---
ChangeLog | 2 ++
sysdeps/aarch64/memcmp.S | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
@@ -1,5 +1,7 @@
2018-03-06 Siddhesh Poyarekar <siddhesh@sourceware.org>
+ * sysdeps/aarch64/memcmp.S (more16): Fix loop16 branch target.
+
* sysdeps/aarch64/memcmp.S: Widen comparison to 16 bytes at a
time.
@@ -75,7 +75,7 @@ L(more16):
/* We overlap loads between 0-32 bytes at either side of SRC1 when we
try to align, so limit it only to strings larger than 128 bytes. */
cmp limit, 96
- b.ls L(loop8)
+ b.ls L(loop16)
/* Align src1 and adjust src2 with bytes not yet done. */
and tmp1, src1, 15
--
2.14.3
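
A stale reference like the L(loop8) branch fixed above can be caught before linking with a quick scan of the source. The following is only a rough sketch, not part of the glibc tree: the helper name undefined_labels and the sample excerpt are hypothetical, and it assumes every local label is both defined and referenced through the L(name) macro in the same file. Labels mentioned in comments or generated by other macros would need extra handling.

```python
import re

# L(name) is the local-label macro seen in the hunk above.
# A definition is L(name) followed by a colon at the start of a line.
LABEL_DEF = re.compile(r"^\s*L\((\w+)\):", re.MULTILINE)
# Any occurrence of L(name), whether a definition or a branch target.
LABEL_USE = re.compile(r"\bL\((\w+)\)")


def undefined_labels(asm_source):
    """Return the set of names used as L(name) but never defined as L(name):"""
    defined = set(LABEL_DEF.findall(asm_source))
    used = set(LABEL_USE.findall(asm_source))
    return used - defined


# Hypothetical excerpt modelled on the pre-fix state of the hunk above.
sample = """
L(more16):
	cmp	limit, 96
	b.ls	L(loop8)
L(loop16):
	ret
"""

print(sorted(undefined_labels(sample)))
```

Run against the pre-fix excerpt, the scan reports loop8 as the only label that is referenced but not defined. Once the branch target is changed to L(loop16), the result is empty.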