ARM: Fix performance issue in strcpy

Message ID

000501cfb245$2248d600$66da8200$@com

State

Committed

Headers

Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
From: "Wilco Dijkstra" <wdijkstr@arm.com>
To: <libc-alpha@sourceware.org>
Subject: [PATCH] ARM: Fix performance issue in strcpy
Date: Thu, 7 Aug 2014 14:40:20 +0100
Message-ID: <000501cfb245$2248d600$66da8200$@com>
MIME-Version: 1.0
Content-Type: multipart/mixed;
	boundary="----=_NextPart_000_0006_01CFB24D.840D3E00"

Commit Message

Wilco Dijkstra Aug. 7, 2014, 1:40 p.m. UTC

  Hi,

This patch fixes a performance bug in strcpy. The code dealing with unaligned copies uses mvns to
detect whether a register is 0. This is incorrect - the zero flag is only set if the value is -1. As
a result the code always does a byte-by-byte copy for the full string rather than doing the
word-based copy for the misaligned cases. Fixing this more than doubles performance.

OK for commit?

ChangeLog:
2014-08-07  Wilco Dijkstra <wdijkstr@arm.com>

	* sysdeps/arm/armv6/strcpy.S (strcpy):
	Fix performance issue in misaligned cases.
---
 sysdeps/arm/armv6/strcpy.S |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
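
The flag behaviour described in the commit message can be modelled in C. This is only an illustrative sketch, not code from the patch: the helper names are hypothetical, and it assumes r7 holds 0x01010101 (the excerpt does not show how r7 is set up), so that uqsub8 yields a non-zero byte exactly where the loaded word contains a NUL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of UQSUB8: per-byte unsigned saturating subtract, a - b. */
static uint32_t uqsub8(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t x = (a >> (8 * i)) & 0xff;
        uint32_t y = (b >> (8 * i)) & 0xff;
        uint32_t d = x > y ? x - y : 0;   /* saturate at 0 */
        result |= d << (8 * i);
    }
    return result;
}

/* Z flag after "mvns rd, rm": set only when ~rm == 0, i.e. rm == -1. */
static bool z_after_mvns(uint32_t rm)
{
    return (uint32_t)~rm == 0;
}

/* Z flag after "cmp rn, #0": set only when rn == 0. */
static bool z_after_cmp_zero(uint32_t rn)
{
    return rn == 0;
}

/* "bne" is taken when Z is clear.  ASSUMPTION: r7 == 0x01010101. */
static bool old_code_takes_byte_path(uint32_t word)
{
    return !z_after_mvns(uqsub8(0x01010101u, word));
}

static bool new_code_takes_byte_path(uint32_t word)
{
    return !z_after_cmp_zero(uqsub8(0x01010101u, word));
}

/* Small self-check: "abcd" has no NUL, "abc\0" does (little-endian). */
static void self_check(void)
{
    assert(old_code_takes_byte_path(0x64636261u));   /* wrongly taken  */
    assert(!new_code_takes_byte_path(0x64636261u));  /* word copy path */
    assert(new_code_takes_byte_path(0x00636261u));   /* NUL detected   */
}
```

Under that assumption each byte of the uqsub8 result is either 0 or 1, so the result can never be -1 and the mvns form always leaves Z clear. That matches the report that the branch to the byte-by-byte path was always taken.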

Comments

Joseph Myers Aug. 7, 2014, 2:05 p.m. UTC | #1

On Thu, 7 Aug 2014, Wilco Dijkstra wrote:

> Hi,
> 
> This patch fixes a performance bug in strcpy. The code dealing with 
> unaligned copies uses mvns to detect whether a register is 0. This is 
> incorrect - the zero flag is only set if the value is -1. As a result 
> the code always does a byte-by-byte copy for the full string rather than 
> doing the word-based copy for the misaligned cases. Fixing this more 
> than doubles performance.
> 
> OK for commit?

OK if this has passed a full glibc testsuite run for at least one 
configuration using this code.

diff --git a/sysdeps/arm/armv6/strcpy.S b/sysdeps/arm/armv6/strcpy.S
index 833a83c..67bd9d8 100644
--- a/sysdeps/arm/armv6/strcpy.S
+++ b/sysdeps/arm/armv6/strcpy.S
@@ -159,7 +159,7 @@  ENTRY (strcpy)
 	@ Prologue to unaligned loop.  Seed shifted non-zero bytes.
 	uqsub8	r4, r7, r2		@ Find EOS
 	uqsub8	r5, r7, r3
-	mvns	r4, r4			@ EOS in first word?
+	cmp	r4, #0			@ EOS in first word?
 	it	ne
 	subne	r1, r1, #8
 	bne	.Lbyte_loop
@@ -179,7 +179,7 @@  ENTRY (strcpy)
 	@ Rotated unaligned copy loop.  The tail of the prologue is
 	@ shared with the loop itself.
 	.balign 8
-1:	mvns	r5, r5			@ EOS in second word?
+1:	cmp	r5, #0			@ EOS in second word?
 	bne	4f
 	@ Combine first and second words
 	orr	r2, r2, r3, lsh_gt #(\unalign*8)
@@ -194,7 +194,7 @@  ENTRY (strcpy)
 	sfi_pld	r1, #128
 	uqsub8	r5, r7, r3
 	sfi_pld	r0, #128
-	mvns	r4, r4			@ EOS in first word?
+	cmp	r4, #0			@ EOS in first word?
 	bne	3f
 	@ Combine the leftover and the first word
 	orr	r6, r6, r2, lsh_gt #(\unalign*8)
