powerpc: strcasecmp/strncasecmp optmization for power8 [BZ 20327]

Message ID 577BA36D.5060601@linux.vnet.ibm.com
State Committed
Headers

Commit Message

Rajalakshmi S July 5, 2016, 12:09 p.m. UTC
  On 07/04/2016 07:46 PM, Florian Weimer wrote:
> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>
>> Committed as c8376f3e07602aaef9cb843bb73cb5f2b860634a
>> after removing those lines.
>
> This appears to have caused bug 20327.
>
> Could you have a look?
>
> Thanks,
> Florian
>
>
I have fixed it in the attached patch.
  

Comments

Florian Weimer July 5, 2016, 12:15 p.m. UTC | #1
On 07/05/2016 02:09 PM, Rajalakshmi Srinivasaraghavan wrote:

> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>>
>>> Committed as c8376f3e07602aaef9cb843bb73cb5f2b860634a
>>> after removing those lines.
>>
>> This appears to have caused bug 20327.
>>
>> Could you have a look?
>>
>> Thanks,
>> Florian
>>
>>
> I have fixed it in the attached patch.

I can confirm this addresses the issue we saw.  Feel free to push if you 
feel confident about this change. :)

Thanks,
Florian
  
Tulio Magno Quites Machado Filho July 5, 2016, 2:01 p.m. UTC | #2
Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:

> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
> Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs

Could you replace POWER8 by powerpc, please?

> If the input values are unaligned and if there are null characters in the
> memory before the starting address of the input values, strcasecmp
> gives incorrect return code. Fixed it by adding mask the bits that
> are not part of the string.
>
> Tested on ppc64 and ppc64le.

Despite this being a bug fix, I believe we need the approval from Adhemerval
before integrating it during the freeze window.

> 	[BZ #20327]
> 	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
> 	are not part of the string.

This is a very important case.  Can we improve the current testcase to
validate this scenario too?

> ---
>  sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> index 63f6217..d6a4df2 100644
> --- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> +++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
> @@ -44,7 +44,9 @@
>  #ifdef __LITTLE_ENDIAN__
>  #define GET16BYTES(reg1, reg2, reg3) \
>  	lvx	reg1, 0, reg2; \
> -	vcmpequb.	v8, v0, reg1; \
> +	vspltisb	v8, -1; \
> +	vperm	v8, v8, reg1, reg3; \
> +	vcmpequb.	v8, v0, v8; \
>  	beq	cr6, 1f; \
>  	vspltisb	v9, 0; \
>  	b	2f; \
> @@ -57,7 +59,9 @@
>  #else
>  #define GET16BYTES(reg1, reg2, reg3) \
>  	lvx	reg1, 0, reg2; \
> -	vcmpequb.	v8, v0, reg1; \
> +	vspltisb	 v8, -1; \
> +	vperm	v8, reg1, v8,  reg3; \
> +	vcmpequb.	v8, v0, v8; \
>  	beq	cr6, 1f; \
>  	vspltisb	v9, 0; \
>  	b	2f; \

Although this code is simple, I believe this macro is missing more comments.

I suggest to explain the following:
 - How does this macro use reg1, reg2, reg3 and v8?
 - Why is it setting v9 to 0?
  
Tulio Magno Quites Machado Filho July 5, 2016, 2:05 p.m. UTC | #3
Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> writes:

> Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:
>
>> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>> 	[BZ #20327]
>> 	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
>> 	are not part of the string.
>
> This is a very important case.  Can we improve the current testcase to
> validate this scenario too?

Ooops.  I just read in the bug report that Florian is already working on this,
so feel free to ignore this.
  
Adhemerval Zanella Netto July 5, 2016, 2:32 p.m. UTC | #4
On 05/07/2016 11:01, Tulio Magno Quites Machado Filho wrote:
> Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:
> 
>> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>> Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs
> 
> Could you replace POWER8 by powerpc, please?
> 
>> If the input values are unaligned and if there are null characters in the
>> memory before the starting address of the input values, strcasecmp
>> gives incorrect return code. Fixed it by adding mask the bits that
>> are not part of the string.
>>
>> Tested on ppc64 and ppc64le.
> 
> Despite this being a bug fix, I believe we need the approval from Adhemerval
> before integrating it during the freeze window.
> 

Bugfixes are ok for current phase and I was sorting out the release blockers
yesterday.  I will send a message about it in a couple of hours.
  
Florian Weimer July 5, 2016, 3:34 p.m. UTC | #5
On 07/05/2016 04:05 PM, Tulio Magno Quites Machado Filho wrote:
> Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> writes:
>
>> Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:
>>
>>> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>>>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>>> 	[BZ #20327]
>>> 	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
>>> 	are not part of the string.
>>
>> This is a very important case.  Can we improve the current testcase to
>> validate this scenario too?
>
> Ooops.  I just read in the bug report that Florian is already working on this,
> so feel free to ignore this.

Yeah, the proposed patch is here:

   <https://sourceware.org/ml/libc-alpha/2016-07/msg00126.html>

Thanks,
Florian
  
Rajalakshmi S July 5, 2016, 4:10 p.m. UTC | #6
On 07/05/2016 07:31 PM, Tulio Magno Quites Machado Filho wrote:
> Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com> writes:
>
>> On 07/04/2016 07:46 PM, Florian Weimer wrote:
>>> On 06/14/2016 11:45 AM, Rajalakshmi Srinivasaraghavan wrote:
>> Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs
> Could you replace POWER8 by powerpc, please?
>
>> If the input values are unaligned and if there are null characters in the
>> memory before the starting address of the input values, strcasecmp
>> gives incorrect return code. Fixed it by adding mask the bits that
>> are not part of the string.
>>
>> Tested on ppc64 and ppc64le.
> Despite this being a bug fix, I believe we need the approval from Adhemerval
> before integrating it during the freeze window.
>
>> 	[BZ #20327]
>> 	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
>> 	are not part of the string.
> This is a very important case.  Can we improve the current testcase to
> validate this scenario too?
>
>> ---
>>   sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
>> index 63f6217..d6a4df2 100644
>> --- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
>> +++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
>> @@ -44,7 +44,9 @@
>>   #ifdef __LITTLE_ENDIAN__
>>   #define GET16BYTES(reg1, reg2, reg3) \
>>   	lvx	reg1, 0, reg2; \
>> -	vcmpequb.	v8, v0, reg1; \
>> +	vspltisb	v8, -1; \
>> +	vperm	v8, v8, reg1, reg3; \
>> +	vcmpequb.	v8, v0, v8; \
>>   	beq	cr6, 1f; \
>>   	vspltisb	v9, 0; \
>>   	b	2f; \
>> @@ -57,7 +59,9 @@
>>   #else
>>   #define GET16BYTES(reg1, reg2, reg3) \
>>   	lvx	reg1, 0, reg2; \
>> -	vcmpequb.	v8, v0, reg1; \
>> +	vspltisb	 v8, -1; \
>> +	vperm	v8, reg1, v8,  reg3; \
>> +	vcmpequb.	v8, v0, v8; \
>>   	beq	cr6, 1f; \
>>   	vspltisb	v9, 0; \
>>   	b	2f; \
> Although this code is simple, I believe this macro is missing more comments.
>
> I suggest to explain the following:
>   - How does this macro use reg1, reg2, reg3 and v8?
>   - Why is it setting v9 to 0?
>
Added comments and Committed it as
30e4cc5413f72c2c728a544389da0c48500d9904
  

Patch

From 42c2aeecdb371b7544b81cb09cfcb12c2f873c72 Mon Sep 17 00:00:00 2001
From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
Date: Tue, 5 Jul 2016 04:41:51 -0400
Subject: [PATCH] POWER8: Fix return code of strcasecmp for unaligned inputs

If the input values are unaligned and if there are null characters in the
memory before the starting address of the input values, strcasecmp
gives incorrect return code. Fixed it by adding mask the bits that
are not part of the string.

Tested on ppc64 and ppc64le.

	[BZ #20327]
	* sysdeps/powerpc/powerpc64/power8/strcasecmp.S: Mask bits that
	are not part of the string.
---
 sysdeps/powerpc/powerpc64/power8/strcasecmp.S | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
index 63f6217..d6a4df2 100644
--- a/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
+++ b/sysdeps/powerpc/powerpc64/power8/strcasecmp.S
@@ -44,7 +44,9 @@ 
 #ifdef __LITTLE_ENDIAN__
 #define GET16BYTES(reg1, reg2, reg3) \
 	lvx	reg1, 0, reg2; \
-	vcmpequb.	v8, v0, reg1; \
+	vspltisb	v8, -1; \
+	vperm	v8, v8, reg1, reg3; \
+	vcmpequb.	v8, v0, v8; \
 	beq	cr6, 1f; \
 	vspltisb	v9, 0; \
 	b	2f; \
@@ -57,7 +59,9 @@ 
 #else
 #define GET16BYTES(reg1, reg2, reg3) \
 	lvx	reg1, 0, reg2; \
-	vcmpequb.	v8, v0, reg1; \
+	vspltisb	 v8, -1; \
+	vperm	v8, reg1, v8,  reg3; \
+	vcmpequb.	v8, v0, v8; \
 	beq	cr6, 1f; \
 	vspltisb	v9, 0; \
 	b	2f; \
-- 
1.8.3.1