[1/4] arm: Fix armv7 neon memchr on ARM mode
Commit Message
The current optimized armv7 NEON memchr wrongly uses NO_THUMB to
conditionalize Thumb instruction usage. The flag is meant to be
defined before sysdep.h is included, to indicate that the assembly
must be built in ARM mode; it does not indicate whether Thumb is
enabled. This patch fixes it by using the GCC-provided
'__thumb__' macro instead.
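
As a minimal sketch of the distinction (not part of the patch), the
predefined '__thumb__' macro can be tested directly by the preprocessor,
without relying on any project-specific flag. The helper name
built_for_thumb is hypothetical and used only for illustration:

```c
/* Illustration only: built_for_thumb is a hypothetical helper.
   __thumb__ is predefined by the compiler when generating Thumb code
   (e.g. with -mthumb) and left undefined otherwise (e.g. with -marm
   or on a non-ARM target).  */
static int
built_for_thumb (void)
{
#ifdef __thumb__
  return 1;   /* Thumb instruction set selected.  */
#else
  return 0;   /* ARM mode, or not an ARM target at all.  */
#endif
}
```

Because the check depends only on the compiler's own state, it gives the
same answer regardless of which headers were included beforehand.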
Also, even with the implementation fixed to not use Thumb instructions,
it was clearly not properly checked in ARM mode: the carry flag is
overwritten by the preceding 'cmp synd, #0', and thus the
'cmp cntin, #0' / 'bhi .Ltail' pair won't branch correctly if the loop
finishes with 'cntin' being negative (indicating that some bytes still
need to be checked).
This patch also fixes it by checking the carry flag from the previous
loop iteration directly (in ARM mode it will run both '.Lmasklast' and
'.Ltail' even if no byte is found in the last loop iteration).
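
The flag behaviour described above can be sketched with a small C model
of how ARM subtraction sets the C and Z flags and how the HI condition
reads them. This is a simplified illustration, not code from the patch;
the names struct flags, sub_flags and cond_hi are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the two flags that HI depends on.  */
struct flags
{
  bool c;   /* Carry: set when the subtraction does not borrow.  */
  bool z;   /* Zero: set when the result is zero.  */
};

/* Flags produced by 'subs rd, a, #b' or 'cmp a, #b' (both compute a - b).  */
static struct flags
sub_flags (uint32_t a, uint32_t b)
{
  struct flags f;
  f.c = a >= b;                     /* No borrow, as an unsigned compare.  */
  f.z = (uint32_t) (a - b) == 0;
  return f;
}

/* The HI condition used by 'bhi': C set and Z clear.  */
static bool
cond_hi (struct flags f)
{
  return f.c && !f.z;
}
```

In this model, sub_flags (27, 32) (a 'subs' that leaves the counter
negative) does not satisfy HI, whereas sub_flags ((uint32_t) -5, 0)
(a later 'cmp' of that negative counter against zero) does, so a 'bhi'
placed after an intervening 'cmp' takes a different decision from one
that uses the flags of the 'subs'.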
Checked on arm-linux-gnueabihf (in both -marm and -mthumb modes).
[BZ #23031]
* sysdeps/arm/armv7/multiarch/memchr_neon.S (memchr): Fix tail check
on ARM mode.
(NO_THUMB): Check __thumb__ instead.
---
ChangeLog | 7 +++++++
sysdeps/arm/armv7/multiarch/memchr_neon.S | 9 +++------
2 files changed, 10 insertions(+), 6 deletions(-)
@@ -68,7 +68,7 @@
* allows to identify exactly which byte has matched.
*/
-#ifndef NO_THUMB
+#ifdef __thumb__
.thumb_func
#else
.arm
@@ -132,7 +132,7 @@ ENTRY(memchr)
/* The first block can also be the last */
bls .Lmasklast
/* Have we found something already? */
-#ifndef NO_THUMB
+#ifdef __thumb__
cbnz synd, .Ltail
#else
cmp synd, #0
@@ -176,14 +176,11 @@ ENTRY(memchr)
vpadd.i8 vdata0_0, vdata0_0, vdata1_0
vpadd.i8 vdata0_0, vdata0_0, vdata0_0
vmov synd, vdata0_0[0]
-#ifndef NO_THUMB
+#ifdef __thumb__
cbz synd, .Lnotfound
bhi .Ltail /* Uses the condition code from
subs cntin, cntin, #32 above. */
#else
- cmp synd, #0
- beq .Lnotfound
- cmp cntin, #0
bhi .Ltail
#endif