[8/8] s390: Store SFrame CFA offset adjusted and scaled down

Message ID 20250402161203.2935633-9-jremus@linux.ibm.com
State Superseded
Headers
Series s390: Support to generate .sframe in assembler and linker |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Test passed

Commit Message

Jens Remus April 2, 2025, 4:12 p.m. UTC
  In SFrame V2 the size of the one to three offsets following a SFrame FDE
can be either signed 8-bit, 16-bit, or 32-bit integer, which the largest
offset determining their size:
  1. CFA offset from CFA base register
  2. RA (stack save slot) offset from CFA, usually -48 on s390x if saved
  3. FP (stack save slot) offset from CFA, usually -72 on s390x if saved
The FP and RA offsets from CFA, when FP/RA saved on the stack, usually
have fixed values that fit into signed 8-bit SFrame offsets.  Likewise
the DWARF register numbers on s390x of general registers (GR; 0-15) and
floating-point registers (FPR; 16-31), when FP/RA saved in registers.
With that the CFA offset from CFA base register has the greatest impact
on the signed SFrame offset size.

The s390x ELF ABI defines the stack pointer (SP) to be 8-byte aligned
[1] and the CFA as SP at call site + 160 [2].  The CFA offset from CFA
base register is therefore always a multiple of 8.

On S390 store the SFrame CFA offset from CFA base register scaled down
by the s390x-specific CFA alignment factor of 8, in addition to the
adjustment by the s390x-specific CFA adjustment of -160, to further
improve the use of signed 8-bit SFrame offsets.  This is similar to the
DWARF data alignment factor getting factored out from certain offsets
stored in DWARF CFI.

[1]: s390x ELF ABI, sections "Register Roles" and "Stack Frame
     Allocation", https://github.com/IBM/s390x-abi/releases
[2]: s390x ELF ABI, commit 4e38ad9c8a88 ("Document the CFA"),
     https://github.com/IBM/s390x-abi/commit/4e38ad9c8a88

include/
	* sframe.h (SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR): Define
	s390x-specific CFA offset alignment factor.
	(SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE,
	SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE): Scale down/up by
	SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR.

libsframe/
	* doc/sframe-spec.texi (S390,
	SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR): Document S390-specific
	CFA offset alignment factor.

Signed-off-by: Jens Remus <jremus@linux.ibm.com>
---

Notes (jremus):
    A test build of Glibc tag 2.41 on s390x libc.so shows an additional ~9%
    reduction in .sframe section size due to storing the SFrame CFA offsets
    scaled down by 8 in addition to the adjustment by -160.  In total
    adjusting and scaling down reduces the .sframe size by ~17%.  The
    overall number of offsets larger than 8-bit is effectively reduced down
    to ~2.4%.
    
    Statistics for libc.so - base (no CFA offset adjustment nor scaling):
    
      .sframe size:  169,749 bytes
    
      VALUE        TOTAL      MIN        MAX        AVG
      FDEs:        3652       -          -          -
      FREs/FDE:    15236      1          20         4
      Offsets/FDE: 29792      1          38         8
         8-bit:    0          0          0          0
        16-bit:    29792      1          38         8
        32-bit:    0          0          0          0
      Offsets/FRE: 29792      1          3          1
         8-bit:    -          0          0          0
        16-bit:    -          1          3          1
        32-bit:    -          0          0          0
      O_Padd/FDE:  342        -          -          0
         8-bit:    0
        16-bit:    342
        32-bit:    0
    
    Statistics for libc.so - CFA offset adjustment (no scaling):
    
      .sframe size: 154,757 bytes
    
      VALUE        TOTAL      MIN        MAX        AVG
      FDEs:        3654       -          -          -
      FREs/FDE:    15238      1          20         4
      Offsets/FDE: 29794      1          38         8
         8-bit:    14992      1          38         4
        16-bit:    14802      0          0          4
        32-bit:    0          0          0          0
      Offsets/FRE: 29794      2          6          1
         8-bit:    -          1          3          0
        16-bit:    -          1          3          0
        32-bit:    -          0          0          0
      O_Padd/FDE:  342        -          -          0
         8-bit:    283
        16-bit:    59
        32-bit:    0
    
    Statistics for libc.so - CFA offset adjustment and scaling:
    
      .sframe size: 140,657 bytes
    
      VALUE        TOTAL      MIN        MAX        AVG
      FDEs:        3654       -          -          -
      FREs/FDE:    15238      1          20         4
      Offsets/FDE: 29794      1          38         8
         8-bit:    29092      1          38         7
        16-bit:    702        0          0          0
        32-bit:    0          0          0          0
      Offsets/FRE: 29794      3          6          1
         8-bit:    -          1          3          1
        16-bit:    -          2          3          0
        32-bit:    -          0          0          0
      O_Padd/FDE:  342        -          -          0
         8-bit:    342
        16-bit:    0
        32-bit:    0

 include/sframe.h               | 11 ++++++++---
 libsframe/doc/sframe-spec.texi | 13 ++++++++-----
 2 files changed, 16 insertions(+), 8 deletions(-)
  

Comments

Indu Bhagat April 17, 2025, 6:53 p.m. UTC | #1
On 4/2/25 9:12 AM, Jens Remus wrote:
> In SFrame V2 the size of the one to three offsets following a SFrame FDE

"following an SFrame FRE"

Also, I would avoid saying "of the one to three offsets".  The format by 
itself does not limit the number of offsets, though for all current 
practical usages, we see 1-3 offsets.  It is, of course, not recommended 
to use the format to encode large number of offsets.

So it can be confusing to see that in git logs.

> can be either signed 8-bit, 16-bit, or 32-bit integer, which the largest
> offset determining their size:
>    1. CFA offset from CFA base register
>    2. RA (stack save slot) offset from CFA, usually -48 on s390x if saved
>    3. FP (stack save slot) offset from CFA, usually -72 on s390x if saved
> The FP and RA offsets from CFA, when FP/RA saved on the stack, usually
> have fixed values that fit into signed 8-bit SFrame offsets.  Likewise
> the DWARF register numbers on s390x of general registers (GR; 0-15) and
> floating-point registers (FPR; 16-31), when FP/RA saved in registers.
> With that the CFA offset from CFA base register has the greatest impact
> on the signed SFrame offset size.
> 
> The s390x ELF ABI defines the stack pointer (SP) to be 8-byte aligned
> [1] and the CFA as SP at call site + 160 [2].  The CFA offset from CFA
> base register is therefore always a multiple of 8.
> 
> On S390 store the SFrame CFA offset from CFA base register scaled down
> by the s390x-specific CFA alignment factor of 8, in addition to the
> adjustment by the s390x-specific CFA adjustment of -160, to further
> improve the use of signed 8-bit SFrame offsets.  This is similar to the
> DWARF data alignment factor getting factored out from certain offsets
> stored in DWARF CFI.
> 
> [1]: s390x ELF ABI, sections "Register Roles" and "Stack Frame
>       Allocation", https://github.com/IBM/s390x-abi/releases
> [2]: s390x ELF ABI, commit 4e38ad9c8a88 ("Document the CFA"),
>       https://github.com/IBM/s390x-abi/commit/4e38ad9c8a88
> 
> include/
> 	* sframe.h (SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR): Define
> 	s390x-specific CFA offset alignment factor.
> 	(SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE,
> 	SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE): Scale down/up by
> 	SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR.
> 
> libsframe/
> 	* doc/sframe-spec.texi (S390,
> 	SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR): Document S390-specific
> 	CFA offset alignment factor.
> 
> Signed-off-by: Jens Remus <jremus@linux.ibm.com>
> ---
> 
> Notes (jremus):
>      A test build of Glibc tag 2.41 on s390x libc.so shows an additional ~9%
>      reduction in .sframe section size due to storing the SFrame CFA offsets
>      scaled down by 8 in addition to the adjustment by -160.  In total
>      adjusting and scaling down reduces the .sframe size by ~17%.  The
>      overall number of offsets larger than 8-bit is effectively reduced down
>      to ~2.4%.
>      
>      Statistics for libc.so - base (no CFA offset adjustment nor scaling):
>      
>        .sframe size:  169,749 bytes
>      
>        VALUE        TOTAL      MIN        MAX        AVG
>        FDEs:        3652       -          -          -
>        FREs/FDE:    15236      1          20         4
>        Offsets/FDE: 29792      1          38         8
>           8-bit:    0          0          0          0
>          16-bit:    29792      1          38         8
>          32-bit:    0          0          0          0
>        Offsets/FRE: 29792      1          3          1
>           8-bit:    -          0          0          0
>          16-bit:    -          1          3          1
>          32-bit:    -          0          0          0
>        O_Padd/FDE:  342        -          -          0
>           8-bit:    0
>          16-bit:    342
>          32-bit:    0
>      
>      Statistics for libc.so - CFA offset adjustment (no scaling):
>      
>        .sframe size: 154,757 bytes
>      
>        VALUE        TOTAL      MIN        MAX        AVG
>        FDEs:        3654       -          -          -
>        FREs/FDE:    15238      1          20         4
>        Offsets/FDE: 29794      1          38         8
>           8-bit:    14992      1          38         4
>          16-bit:    14802      0          0          4
>          32-bit:    0          0          0          0
>        Offsets/FRE: 29794      2          6          1
>           8-bit:    -          1          3          0
>          16-bit:    -          1          3          0
>          32-bit:    -          0          0          0
>        O_Padd/FDE:  342        -          -          0
>           8-bit:    283
>          16-bit:    59
>          32-bit:    0
>      
>      Statistics for libc.so - CFA offset adjustment and scaling:
>      
>        .sframe size: 140,657 bytes
>      
>        VALUE        TOTAL      MIN        MAX        AVG
>        FDEs:        3654       -          -          -
>        FREs/FDE:    15238      1          20         4
>        Offsets/FDE: 29794      1          38         8
>           8-bit:    29092      1          38         7
>          16-bit:    702        0          0          0
>          32-bit:    0          0          0          0
>        Offsets/FRE: 29794      3          6          1
>           8-bit:    -          1          3          1
>          16-bit:    -          2          3          0
>          32-bit:    -          0          0          0
>        O_Padd/FDE:  342        -          -          0
>           8-bit:    342
>          16-bit:    0
>          32-bit:    0
> 
>   include/sframe.h               | 11 ++++++++---
>   libsframe/doc/sframe-spec.texi | 13 ++++++++-----
>   2 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/include/sframe.h b/include/sframe.h
> index 325bc6b8b5dc..06474970ccc4 100644
> --- a/include/sframe.h
> +++ b/include/sframe.h
> @@ -365,12 +365,17 @@ typedef struct sframe_frame_row_entry_addr4
>     (1ULL << ((SFRAME_FRE_TYPE_ADDR4 * 2) * 8))
>   
>   /* On S390, the CFA offset from CFA base register is by definition a minimum
> -   of 160.  Store it adjusted by -160 to enable use of 8-bit SFrame offsets.  */
> +   of 160.  Store it adjusted by -160 to enable use of 8-bit SFrame offsets.
> +   Additionally scale by an alignment factor of 8, as the SP and thus CFA
> +   offset on S390 is always 8-byte aligned.  */
>   #define SFRAME_S390_CFA_OFFSET_ADJUSTMENT	SFRAME_S390_SP_VAL_OFFSET
> +#define SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR	8
>   #define SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE(offset) \
> -  ((offset) + SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
> +  (((offset) + SFRAME_S390_CFA_OFFSET_ADJUSTMENT) \
> +   / SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR)
>   #define SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset) \
> -  ((offset) - SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
> +  (((offset) * SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR) \
> +   - SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
>   
>   /* On S390, the CFA is defined as SP at call site + 160.  Therefore the
>      SP value offset from CFA is -160.  */
> diff --git a/libsframe/doc/sframe-spec.texi b/libsframe/doc/sframe-spec.texi
> index f9a117d55cc8..d080f02e7bc5 100644
> --- a/libsframe/doc/sframe-spec.texi
> +++ b/libsframe/doc/sframe-spec.texi
> @@ -838,12 +838,15 @@ Hence, in summary:
>   
>   Irrespective of the ABI, the first stack offset is always used to locate the
>   CFA.  On S390 the value of the offset is stored adjusted by the S390-specific
> -@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} to enable the use of signed 8-bit
> -offsets on S390.
> +@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} and scaled down by the S390-specific
> +@code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR}, to enable and improve the use
> +of signed 8-bit offsets on S390.
>   S390-specific helpers @code{SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE} and
> -@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform and undo the
> -adjustment.  The CFA offset can therefore be interpreted as:
> -CFA = @code{BASE_REG} + offset1 - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
> +@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform or undo the
> +adjustment and scaling.  The CFA offset can therefore be interpreted as:
> +CFA = @code{BASE_REG}
> +    + (offset1 * @code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR})
> +    - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
>   or
>   CFA = @code{BASE_REG} + @code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset1)}.
>   The identification of the @code{BASE_REG} is done by using the

Was it considered to apply the scaling down for all stack offsets 
uniformly ?  It looks to me that the benefit of that will be simply 
uniformity, and not so much size benefits.  But I am curious to know 
your opinion (or any experimentation) on this.
  
Jens Remus April 22, 2025, 2:33 p.m. UTC | #2
Hello Indu,

thank you for the review feedback!

On 17.04.2025 20:53, Indu Bhagat wrote:
> On 4/2/25 9:12 AM, Jens Remus wrote:
>> In SFrame V2 the size of the one to three offsets following a SFrame FDE
> 
> "following an SFrame FRE"
> 
> Also, I would avoid saying "of the one to three offsets".  The format
> by itself does not limit the number of offsets, though for all current
> practical usages, we see 1-3 offsets.  It is, of course, not
> recommended to use the format to encode large number of offsets.
> 
> So it can be confusing to see that in git logs.

Sure.  Same as for the preceding commit.

>> can be either signed 8-bit, 16-bit, or 32-bit integer, which the largest
>> offset determining their size:
>>    1. CFA offset from CFA base register
>>    2. RA (stack save slot) offset from CFA, usually -48 on s390x if saved
>>    3. FP (stack save slot) offset from CFA, usually -72 on s390x if saved
>> The FP and RA offsets from CFA, when FP/RA saved on the stack, usually
>> have fixed values that fit into signed 8-bit SFrame offsets.  Likewise
>> the DWARF register numbers on s390x of general registers (GR; 0-15) and
>> floating-point registers (FPR; 16-31), when FP/RA saved in registers.
>> With that the CFA offset from CFA base register has the greatest impact
>> on the signed SFrame offset size.
>>
>> The s390x ELF ABI defines the stack pointer (SP) to be 8-byte aligned
>> [1] and the CFA as SP at call site + 160 [2].  The CFA offset from CFA
>> base register is therefore always a multiple of 8.
>>
>> On S390 store the SFrame CFA offset from CFA base register scaled down
>> by the s390x-specific CFA alignment factor of 8, in addition to the
>> adjustment by the s390x-specific CFA adjustment of -160, to further
>> improve the use of signed 8-bit SFrame offsets.  This is similar to the
>> DWARF data alignment factor getting factored out from certain offsets
>> stored in DWARF CFI.
>>
>> [1]: s390x ELF ABI, sections "Register Roles" and "Stack Frame
>>       Allocation", https://github.com/IBM/s390x-abi/releases
>> [2]: s390x ELF ABI, commit 4e38ad9c8a88 ("Document the CFA"),
>>       https://github.com/IBM/s390x-abi/commit/4e38ad9c8a88

>> diff --git a/libsframe/doc/sframe-spec.texi b/libsframe/doc/sframe-spec.texi

>> @@ -838,12 +838,15 @@ Hence, in summary:
>>   Irrespective of the ABI, the first stack offset is always used to locate the
>>   CFA.  On S390 the value of the offset is stored adjusted by the S390-specific
>> -@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} to enable the use of signed 8-bit
>> -offsets on S390.
>> +@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} and scaled down by the S390-specific
>> +@code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR}, to enable and improve the use
>> +of signed 8-bit offsets on S390.
>>   S390-specific helpers @code{SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE} and
>> -@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform and undo the
>> -adjustment.  The CFA offset can therefore be interpreted as:
>> -CFA = @code{BASE_REG} + offset1 - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
>> +@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform or undo the
>> +adjustment and scaling.  The CFA offset can therefore be interpreted as:
>> +CFA = @code{BASE_REG}
>> +    + (offset1 * @code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR})
>> +    - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
>>   or
>>   CFA = @code{BASE_REG} + @code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset1)}.
>>   The identification of the @code{BASE_REG} is done by using the
> 
> Was it considered to apply the scaling down for all stack offsets
> uniformly ?  It looks to me that the benefit of that will be simply
> uniformity, and not so much size benefits.  But I am curious to know
> your opinion (or any experimentation) on this.

On s390x GCC and Clang adhere to the "Register save area usage example"
shown in the s390x ELF ABI.  That is the "preferred" FP register r11 is
always saved at offset -72 from CFA and the RA register r14 at offset
-48, if saved on the stack.

Both -72 and -48 fit into a signed 8-bit SFrame offset.  Therefore I did
not see any benefit to apply a scaling as I did for the CFA offset from
CFA base register.

Furthermore the encoding of DWARF register numbers as offsets (see patch
"[PATCH 4/8] s390: Represent FP/RA saved in register in SFrame") limits
the possibility to apply such scaling.  As the LSB is used as indicator
it could only be done >>2 instead of >>3.

If SFrame V3 would represent register save slot offsets and register
numbers in a different way, then it may make sense to consider to apply
such as an arch-dependent scaling similar to the DWARF data alignment
factor uniformly to all offsets (e.g. CFA offset, RA offset, FP offset).

Regards,
Jens
  
Indu Bhagat May 5, 2025, 4:56 a.m. UTC | #3
On 4/22/25 7:33 AM, Jens Remus wrote:
> Hello Indu,
> 
> thank you for the review feedback!
> 
> On 17.04.2025 20:53, Indu Bhagat wrote:
>> On 4/2/25 9:12 AM, Jens Remus wrote:
>>> In SFrame V2 the size of the one to three offsets following a SFrame FDE
>>
>> "following an SFrame FRE"
>>
>> Also, I would avoid saying "of the one to three offsets".  The format
>> by itself does not limit the number of offsets, though for all current
>> practical usages, we see 1-3 offsets.  It is, of course, not
>> recommended to use the format to encode large number of offsets.
>>
>> So it can be confusing to see that in git logs.
> 
> Sure.  Same as for the preceding commit.
> 
>>> can be either signed 8-bit, 16-bit, or 32-bit integer, which the largest
>>> offset determining their size:
>>>    1. CFA offset from CFA base register
>>>    2. RA (stack save slot) offset from CFA, usually -48 on s390x if 
>>> saved
>>>    3. FP (stack save slot) offset from CFA, usually -72 on s390x if 
>>> saved
>>> The FP and RA offsets from CFA, when FP/RA saved on the stack, usually
>>> have fixed values that fit into signed 8-bit SFrame offsets.  Likewise
>>> the DWARF register numbers on s390x of general registers (GR; 0-15) and
>>> floating-point registers (FPR; 16-31), when FP/RA saved in registers.
>>> With that the CFA offset from CFA base register has the greatest impact
>>> on the signed SFrame offset size.
>>>
>>> The s390x ELF ABI defines the stack pointer (SP) to be 8-byte aligned
>>> [1] and the CFA as SP at call site + 160 [2].  The CFA offset from CFA
>>> base register is therefore always a multiple of 8.
>>>
>>> On S390 store the SFrame CFA offset from CFA base register scaled down
>>> by the s390x-specific CFA alignment factor of 8, in addition to the
>>> adjustment by the s390x-specific CFA adjustment of -160, to further
>>> improve the use of signed 8-bit SFrame offsets.  This is similar to the
>>> DWARF data alignment factor getting factored out from certain offsets
>>> stored in DWARF CFI.
>>>
>>> [1]: s390x ELF ABI, sections "Register Roles" and "Stack Frame
>>>       Allocation", https://github.com/IBM/s390x-abi/releases
>>> [2]: s390x ELF ABI, commit 4e38ad9c8a88 ("Document the CFA"),
>>>       https://github.com/IBM/s390x-abi/commit/4e38ad9c8a88
> 
>>> diff --git a/libsframe/doc/sframe-spec.texi b/libsframe/doc/sframe- 
>>> spec.texi
> 
>>> @@ -838,12 +838,15 @@ Hence, in summary:
>>>   Irrespective of the ABI, the first stack offset is always used to 
>>> locate the
>>>   CFA.  On S390 the value of the offset is stored adjusted by the 
>>> S390-specific
>>> -@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} to enable the use of signed 
>>> 8-bit
>>> -offsets on S390.
>>> +@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} and scaled down by the 
>>> S390-specific
>>> +@code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR}, to enable and 
>>> improve the use
>>> +of signed 8-bit offsets on S390.
>>>   S390-specific helpers @code{SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE} and
>>> -@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform 
>>> and undo the
>>> -adjustment.  The CFA offset can therefore be interpreted as:
>>> -CFA = @code{BASE_REG} + offset1 - 
>>> @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
>>> +@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform 
>>> or undo the
>>> +adjustment and scaling.  The CFA offset can therefore be interpreted 
>>> as:
>>> +CFA = @code{BASE_REG}
>>> +    + (offset1 * @code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR})
>>> +    - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
>>>   or
>>>   CFA = @code{BASE_REG} + 
>>> @code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset1)}.
>>>   The identification of the @code{BASE_REG} is done by using the
>>
>> Was it considered to apply the scaling down for all stack offsets
>> uniformly ?  It looks to me that the benefit of that will be simply
>> uniformity, and not so much size benefits.  But I am curious to know
>> your opinion (or any experimentation) on this.
> 
> On s390x GCC and Clang adhere to the "Register save area usage example"
> shown in the s390x ELF ABI.  That is the "preferred" FP register r11 is
> always saved at offset -72 from CFA and the RA register r14 at offset
> -48, if saved on the stack.
> 
> Both -72 and -48 fit into a signed 8-bit SFrame offset.  Therefore I did
> not see any benefit to apply a scaling as I did for the CFA offset from
> CFA base register.
> 
> Furthermore the encoding of DWARF register numbers as offsets (see patch
> "[PATCH 4/8] s390: Represent FP/RA saved in register in SFrame") limits
> the possibility to apply such scaling.  As the LSB is used as indicator
> it could only be done >>2 instead of >>3.
> 
> If SFrame V3 would represent register save slot offsets and register
> numbers in a different way, then it may make sense to consider to apply
> such as an arch-dependent scaling similar to the DWARF data alignment
> factor uniformly to all offsets (e.g. CFA offset, RA offset, FP offset).
> 

Makes sense.
  

Patch

diff --git a/include/sframe.h b/include/sframe.h
index 325bc6b8b5dc..06474970ccc4 100644
--- a/include/sframe.h
+++ b/include/sframe.h
@@ -365,12 +365,17 @@  typedef struct sframe_frame_row_entry_addr4
   (1ULL << ((SFRAME_FRE_TYPE_ADDR4 * 2) * 8))
 
 /* On S390, the CFA offset from CFA base register is by definition a minimum
-   of 160.  Store it adjusted by -160 to enable use of 8-bit SFrame offsets.  */
+   of 160.  Store it adjusted by -160 to enable use of 8-bit SFrame offsets.
+   Additionally scale by an alignment factor of 8, as the SP and thus CFA
+   offset on S390 is always 8-byte aligned.  */
 #define SFRAME_S390_CFA_OFFSET_ADJUSTMENT	SFRAME_S390_SP_VAL_OFFSET
+#define SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR	8
 #define SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE(offset) \
-  ((offset) + SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
+  (((offset) + SFRAME_S390_CFA_OFFSET_ADJUSTMENT) \
+   / SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR)
 #define SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset) \
-  ((offset) - SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
+  (((offset) * SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR) \
+   - SFRAME_S390_CFA_OFFSET_ADJUSTMENT)
 
 /* On S390, the CFA is defined as SP at call site + 160.  Therefore the
    SP value offset from CFA is -160.  */
diff --git a/libsframe/doc/sframe-spec.texi b/libsframe/doc/sframe-spec.texi
index f9a117d55cc8..d080f02e7bc5 100644
--- a/libsframe/doc/sframe-spec.texi
+++ b/libsframe/doc/sframe-spec.texi
@@ -838,12 +838,15 @@  Hence, in summary:
 
 Irrespective of the ABI, the first stack offset is always used to locate the
 CFA.  On S390 the value of the offset is stored adjusted by the S390-specific
-@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} to enable the use of signed 8-bit
-offsets on S390.
+@code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT} and scaled down by the S390-specific
+@code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR}, to enable and improve the use
+of signed 8-bit offsets on S390.
 S390-specific helpers @code{SFRAME_V2_FRE_S390_CFA_OFFSET_ENCODE} and
-@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform and undo the
-adjustment.  The CFA offset can therefore be interpreted as:
-CFA = @code{BASE_REG} + offset1 - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
+@code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE} are provided to perform or undo the
+adjustment and scaling.  The CFA offset can therefore be interpreted as:
+CFA = @code{BASE_REG}
+    + (offset1 * @code{SFRAME_S390_CFA_OFFSET_ALIGNMENT_FACTOR})
+    - @code{SFRAME_S390_CFA_OFFSET_ADJUSTMENT}
 or
 CFA = @code{BASE_REG} + @code{SFRAME_V2_FRE_S390_CFA_OFFSET_DECODE(offset1)}.
 The identification of the @code{BASE_REG} is done by using the