s390: Use long branches across object boundaries (jgh instead of jh)

Message ID 875yt19q0w.fsf@oldenburg.str.redhat.com
State Committed
Commit 98966749f2b418825ff2ea496a0ee89fe63d2cc8
Series s390: Use long branches across object boundaries (jgh instead of jh)

Checks

Context                Check    Description
dj/TryBot-apply_patch  success  Patch applied to master at the time it was sent
dj/TryBot-32bit        success  Build for i686

Commit Message

Florian Weimer Nov. 9, 2021, 5:50 p.m. UTC
Depending on the layout chosen by the linker, the 16-bit displacement
of the jh instruction is insufficient to reach the target label.

Analysis of the linker failure was carried out by Nick Clifton.

Tested on a z13 and z15, s390x-linux-gnu only.

---
 sysdeps/s390/memmem-arch13.S | 2 +-
 sysdeps/s390/strstr-arch13.S | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
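
The reach problem described in the commit message can be sketched with a little arithmetic. This is an illustration only, not part of the patch: it assumes the relative-branch immediate is a signed count of halfwords (2-byte units) added to the address of the branch instruction, and the addresses and the 1 MiB distance are hypothetical values chosen for the example.

```python
# Reach of a PC-relative branch whose immediate is a signed count of
# halfwords (2-byte units), as used by BRC (jh) and BRCL (jgh).

def branch_reach_bytes(immediate_bits):
    """Return (lowest, highest) byte offset encodable in the immediate."""
    lowest = -(1 << (immediate_bits - 1)) * 2
    highest = ((1 << (immediate_bits - 1)) - 1) * 2
    return lowest, highest


def reaches(immediate_bits, branch_addr, target_addr):
    """True if target_addr is encodable relative to branch_addr."""
    offset = target_addr - branch_addr
    lowest, highest = branch_reach_bytes(immediate_bits)
    # Instructions are halfword aligned, so the offset must be even.
    return offset % 2 == 0 and lowest <= offset <= highest


JH_BITS = 16   # jh  is BRC:  16-bit immediate
JGH_BITS = 32  # jgh is BRCL: 32-bit immediate

# Hypothetical layout: the linker places the target 1 MiB after the branch.
HYPOTHETICAL_BRANCH_ADDR = 0x1000
HYPOTHETICAL_TARGET_ADDR = HYPOTHETICAL_BRANCH_ADDR + (1 << 20)

print("jh  reach:", branch_reach_bytes(JH_BITS))
print("jgh reach:", branch_reach_bytes(JGH_BITS))
print("jh  reaches target:",
      reaches(JH_BITS, HYPOTHETICAL_BRANCH_ADDR, HYPOTHETICAL_TARGET_ADDR))
print("jgh reaches target:",
      reaches(JGH_BITS, HYPOTHETICAL_BRANCH_ADDR, HYPOTHETICAL_TARGET_ADDR))
```

Under these assumptions jh covers roughly 64 KiB in either direction, which is easy to exceed once the target lives in a different object file, while jgh covers roughly 4 GiB in either direction.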
  

Comments

Carlos O'Donell Nov. 9, 2021, 7:06 p.m. UTC | #1
On 11/9/21 12:50, Florian Weimer wrote:
> Depending on the layout chosen by the linker, the 16-bit displacement
> of the jh instruction is insufficient to reach the target label.
> 
> Analysis of the linker failure was carried out by Nick Clifton.
> 
> Tested on a z13 and z15, s390x-linux-gnu only.

Looks correct to me. Converting from BRC to BRCL doubles the available offset bits.
I tested assembling a few variants and they look good to me.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> ---
>  sysdeps/s390/memmem-arch13.S | 2 +-
>  sysdeps/s390/strstr-arch13.S | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/sysdeps/s390/memmem-arch13.S b/sysdeps/s390/memmem-arch13.S
> index c5c8d8c97e..58df8cdb14 100644
> --- a/sysdeps/s390/memmem-arch13.S
> +++ b/sysdeps/s390/memmem-arch13.S
> @@ -41,7 +41,7 @@ ENTRY(MEMMEM_ARCH13)
>  #  error The arch13 variant of memmem needs the z13 variant of memmem!
>  # endif
>  	clgfi	%r5,9
> -	jh	MEMMEM_Z13

OK. jh is BRC (branch relative on condition, A7-M-4-RI) with a 16-bit offset.

> +	jgh	MEMMEM_Z13

OK. jgh is BRCL (branch relative on condition long, C0-M-4-RI) with a 32-bit offset.

>  
>  	aghik	%r0,%r5,-1		/* vll needs highest index.  */
>  	bc	4,0(%r14)		/* cc==1: return if needle-len == 0.  */
> diff --git a/sysdeps/s390/strstr-arch13.S b/sysdeps/s390/strstr-arch13.S
> index c7183e627c..222a6de91a 100644
> --- a/sysdeps/s390/strstr-arch13.S
> +++ b/sysdeps/s390/strstr-arch13.S
> @@ -49,7 +49,7 @@ ENTRY(STRSTR_ARCH13)
>  #  error The arch13 variant of strstr needs the z13 variant of strstr!
>  # endif
>  	clgfi	%r4,9
> -	jh	STRSTR_Z13
> +	jgh	STRSTR_Z13

Likewise.

>  
>  	/* In case of a partial match, the vstrs instruction returns the index
>  	   of the partial match in a vector-register.  Then we have to
>
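
The two encodings noted in the review above can be sketched as byte strings. This is a rough illustration rather than an assembler: the helper names are made up for the example, the byte offsets are hypothetical, and it assumes the layout the review describes (opcode byte 0xA7 or 0xC0, then the 4-bit mask, then the opcode nibble 4, then a big-endian signed halfword count).

```python
import struct

CC_HIGH_MASK = 0x2  # mask bit for condition code 2, the "high" in jh/jgh


def _to_halfwords(byte_offset):
    """Convert a byte offset to the halfword count stored in the immediate."""
    if byte_offset % 2:
        raise ValueError("branch targets must be halfword aligned")
    return byte_offset // 2


def encode_brc(mask, byte_offset):
    """BRC (jh when mask == 2): 4 bytes, signed 16-bit immediate."""
    # struct.pack raises struct.error if the value does not fit in 16 bits.
    return struct.pack(">BBh", 0xA7, (mask << 4) | 0x4,
                       _to_halfwords(byte_offset))


def encode_brcl(mask, byte_offset):
    """BRCL (jgh when mask == 2): 6 bytes, signed 32-bit immediate."""
    return struct.pack(">BBi", 0xC0, (mask << 4) | 0x4,
                       _to_halfwords(byte_offset))


# Hypothetical nearby target: both forms can encode it.
print("jh  +0x100:", encode_brc(CC_HIGH_MASK, 0x100).hex())
print("jgh +0x100:", encode_brcl(CC_HIGH_MASK, 0x100).hex())

# Hypothetical distant target: only the long form can encode it.
try:
    encode_brc(CC_HIGH_MASK, 0x20000)
except struct.error as exc:
    print("jh  +0x20000: does not fit:", exc)
print("jgh +0x20000:", encode_brcl(CC_HIGH_MASK, 0x20000).hex())
```

The out-of-range case here is detected by `struct.pack`; in the real toolchain the equivalent check happens when the linker resolves the relocation, which is where the failure mentioned in the commit message showed up.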
  
Stefan Liebler Nov. 10, 2021, 1:57 p.m. UTC | #2

This patch is okay. Thanks for catching this.
I've also looked at the wcsmbs implementations, such as
sysdeps/s390/wcscpy-vx.S, which jump to the C implementation as a fallback.
Those also use jg, which is brcl with all bits set in the condition mask.

Thanks,
Stefan
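
The remark about the condition mask can be illustrated with a small model. It is a sketch with made-up helper names, assuming the 4-bit mask selects condition codes 0 to 3 from the leftmost bit to the rightmost, so that jgh corresponds to mask 2 and jg to mask 15.

```python
# Model of the 4-bit condition mask used by BRC/BRCL: the leftmost mask
# bit selects condition code 0, the rightmost selects condition code 3.

def mask_bit(cc):
    """Return the mask bit that selects condition code cc (0..3)."""
    if cc not in (0, 1, 2, 3):
        raise ValueError("condition code must be 0..3")
    return 0x8 >> cc


def branch_taken(mask, cc):
    """True if a branch with this mask is taken for condition code cc."""
    return bool(mask & mask_bit(cc))


JGH_MASK = mask_bit(2)  # after a compare, cc 2 means "first operand high"
JG_MASK = 0xF           # all bits set: taken for every condition code

for name, mask in (("jgh", JGH_MASK), ("jg", JG_MASK)):
    taken_on = [cc for cc in range(4) if branch_taken(mask, cc)]
    print(name, format(mask, "04b"), "taken on cc", taken_on)
```

Read this way, the `clgfi %r5,9` followed by `jgh MEMMEM_Z13` in the patch branches to the z13 variant only when the compared length is above 9, whereas jg branches unconditionally.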
  

Patch

diff --git a/sysdeps/s390/memmem-arch13.S b/sysdeps/s390/memmem-arch13.S
index c5c8d8c97e..58df8cdb14 100644
--- a/sysdeps/s390/memmem-arch13.S
+++ b/sysdeps/s390/memmem-arch13.S
@@ -41,7 +41,7 @@  ENTRY(MEMMEM_ARCH13)
 #  error The arch13 variant of memmem needs the z13 variant of memmem!
 # endif
 	clgfi	%r5,9
-	jh	MEMMEM_Z13
+	jgh	MEMMEM_Z13
 
 	aghik	%r0,%r5,-1		/* vll needs highest index.  */
 	bc	4,0(%r14)		/* cc==1: return if needle-len == 0.  */
diff --git a/sysdeps/s390/strstr-arch13.S b/sysdeps/s390/strstr-arch13.S
index c7183e627c..222a6de91a 100644
--- a/sysdeps/s390/strstr-arch13.S
+++ b/sysdeps/s390/strstr-arch13.S
@@ -49,7 +49,7 @@  ENTRY(STRSTR_ARCH13)
 #  error The arch13 variant of strstr needs the z13 variant of strstr!
 # endif
 	clgfi	%r4,9
-	jh	STRSTR_Z13
+	jgh	STRSTR_Z13
 
 	/* In case of a partial match, the vstrs instruction returns the index
 	   of the partial match in a vector-register.  Then we have to