ARM: Fix performance issue in strcpy

Message ID 000501cfb245$2248d600$66da8200$@com
State Committed

Commit Message

Wilco Dijkstra Aug. 7, 2014, 1:40 p.m. UTC
  Hi,

This patch fixes a performance bug in strcpy. The code dealing with unaligned copies uses mvns to
detect whether a register is 0. This is incorrect: mvns sets the zero flag only if the value is -1
(all bits set), not 0. As a result the code always does a byte-by-byte copy for the full string
rather than doing the word-based copy for the misaligned cases. Fixing this more than doubles
performance.
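
To make the flag behaviour concrete, here is a minimal C sketch that models only the Z flag produced by the two instructions. The helper names (z_after_mvns, z_after_cmp_zero, bne_taken) are hypothetical and exist only for illustration; they are not part of glibc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Z flag after "mvns rX, rX": set when the bitwise NOT of the value is 0,
   i.e. only when the value is 0xFFFFFFFF (-1).  Hypothetical helper. */
static bool z_after_mvns(uint32_t value)
{
    return (uint32_t)~value == 0;
}

/* Z flag after "cmp rX, #0": set when the value itself is 0.
   Hypothetical helper. */
static bool z_after_cmp_zero(uint32_t value)
{
    return value == 0;
}

/* A following "bne" is taken when Z is clear. */
static bool bne_taken(bool z_flag)
{
    return !z_flag;
}
```

With a register value of 0 (no end of string found), bne_taken(z_after_mvns(0)) is true, so the old code branches to the byte loop, while bne_taken(z_after_cmp_zero(0)) is false and the word-based path continues.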

OK for commit?

ChangeLog:
2014-08-07  Wilco Dijkstra <wdijkstr@arm.com>

	* sysdeps/arm/armv6/strcpy.S (strcpy):
	Fix performance issue in misaligned cases.
---
 sysdeps/arm/armv6/strcpy.S |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
  

Comments

Joseph Myers Aug. 7, 2014, 2:05 p.m. UTC | #1
On Thu, 7 Aug 2014, Wilco Dijkstra wrote:

> Hi,
> 
> This patch fixes a performance bug in strcpy. The code dealing with 
> unaligned copies uses mvns to detect whether a register is 0. This is 
> incorrect - the zero flag is only set if the value is -1. As a result 
> the code always does a byte-by-byte copy for the full string rather than 
> doing the word-based copy for the misaligned cases. Fixing this more 
> than doubles performance.
> 
> OK for commit?

OK if this has passed a full glibc testsuite run for at least one 
configuration using this code.
  

Patch

diff --git a/sysdeps/arm/armv6/strcpy.S b/sysdeps/arm/armv6/strcpy.S
index 833a83c..67bd9d8 100644
--- a/sysdeps/arm/armv6/strcpy.S
+++ b/sysdeps/arm/armv6/strcpy.S
@@ -159,7 +159,7 @@  ENTRY (strcpy)
 	@ Prologue to unaligned loop.  Seed shifted non-zero bytes.
 	uqsub8	r4, r7, r2		@ Find EOS
 	uqsub8	r5, r7, r3
-	mvns	r4, r4			@ EOS in first word?
+	cmp	r4, #0			@ EOS in first word?
 	it	ne
 	subne	r1, r1, #8
 	bne	.Lbyte_loop
@@ -179,7 +179,7 @@  ENTRY (strcpy)
 	@ Rotated unaligned copy loop.  The tail of the prologue is
 	@ shared with the loop itself.
 	.balign 8
-1:	mvns	r5, r5			@ EOS in second word?
+1:	cmp	r5, #0			@ EOS in second word?
 	bne	4f
 	@ Combine first and second words
 	orr	r2, r2, r3, lsh_gt #(\unalign*8)
@@ -194,7 +194,7 @@  ENTRY (strcpy)
 	sfi_pld	r1, #128
 	uqsub8	r5, r7, r3
 	sfi_pld	r0, #128
-	mvns	r4, r4			@ EOS in first word?
+	cmp	r4, #0			@ EOS in first word?
 	bne	3f
 	@ Combine the leftover and the first word
 	orr	r6, r6, r2, lsh_gt #(\unalign*8)
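
For readers unfamiliar with the "Find EOS" step in the hunks above, the following C sketch models the UQSUB8 instruction (per-byte unsigned saturating subtraction). It assumes r7 holds 0x01010101, which is not visible in this diff, and the function names are hypothetical. Under that assumption each result byte is 1 where the source byte is zero and 0 otherwise, so the result is nonzero exactly when the word contains the terminator, and it can never be 0xFFFFFFFF, which is why the mvns test always took the bne.

```c
#include <stdint.h>

/* Model of ARMv6 UQSUB8: for each of the four byte lanes, compute a - b
   as an unsigned value, saturating at 0 instead of wrapping. */
static uint32_t uqsub8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint32_t x = (a >> (lane * 8)) & 0xFFu;
        uint32_t y = (b >> (lane * 8)) & 0xFFu;
        uint32_t diff = (x > y) ? (x - y) : 0;   /* saturate at 0 */
        result |= diff << (lane * 8);
    }
    return result;
}

/* Assumption: r7 == 0x01010101 (not shown in the diff).  Each lane is then
   1 - byte, saturated: 1 if the byte is 0, else 0.  Returns nonzero when
   the word contains an end-of-string byte. */
static int word_has_eos(uint32_t word)
{
    return uqsub8(0x01010101u, word) != 0;
}
```

The corrected "cmp r4, #0" followed by "bne" corresponds to branching when word_has_eos() is nonzero.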