gas/ELF: Assume merge entity size of 1 if it's missing

Message ID 20250103032831.622617-1-thiago.bauermann@linaro.org
State New
Headers
Series gas/ELF: Assume merge entity size of 1 if it's missing |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Test passed

Commit Message

Thiago Jung Bauermann Jan. 3, 2025, 3:28 a.m. UTC
  commit d5cbf916be4a ("gas/ELF: also reject merge entity size being zero")
made gas stricter when an entity size isn't specified (and thus assumed to
be zero).

Unfortunately this mistake can happen in 32-bit Arm assembly, where the @
character is the start of a comment. E.g.:

  .section .rodata.str, "aMS", @progbits, 1

The aforementioned commit actually pointed out a binutils testcase with
this problem, as well as a bug in GCC's configure script.

Therefore, change the parser to accept the specific case of an absent
entsize with "M" and assume 1, as suggested by Alan Modra.

With this change, a version of GCC's configure script with the bug
recognizes support for section merging:

  configure:27013: checking assembler for section merging support
  configure:27022: /path/to/bin/as   --fatal-warnings -o conftest.o conftest.s >&5
  conftest.s: Assembler messages:
  conftest.s:1: entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1.
  configure:27025: $? = 0
  configure:27036: result: yes

	PR gas/32491
	* gas/config/obj-elf.c (obj_elf_section): Assume entsize of 1 if
	not specified.
	* gas/read.c (get_absolute_expression_kind): New function.
	* gas/read.h (get_absolute_expression_kind): Declare.
---
 gas/config/obj-elf.c | 15 +++++++++++----
 gas/read.c           | 21 +++++++++++++++++++++
 gas/read.h           |  1 +
 3 files changed, 33 insertions(+), 4 deletions(-)

Also, note that I posted a patch to fix GCC's configure script here:

https://inbox.sourceware.org/gcc-patches/20241227214756.1059146-1-thiago.bauermann@linaro.org/


base-commit: f087e2e3f6dece21c5cb6049b4531bf762a3068f
  

Comments

Jan Beulich Jan. 6, 2025, 8:31 a.m. UTC | #1
On 03.01.2025 04:28, Thiago Jung Bauermann wrote:
> commit d5cbf916be4a ("gas/ELF: also reject merge entity size being zero")
> made gas stricter when an entity size isn't specified (and thus assumed to
> be zero).
> 
> Unfortunately this mistake can happen in 32-bit Arm assembly, where the @
> character is the start of a comment. E.g.:
> 
>   .section .rodata.str, "aMS", @progbits, 1

And what if it's

   .section .rodata.str, "aMS", @progbits, 4

? You just can't ...

> The aforementioned commit actually pointed out a binutils testcase with
> this problem, as well as a bug in GCC's configure script.
> 
> Therefore, change the parser to accept the specific case of an absent
> entsize with "M" and assume 1, as suggested by Alan Modra.

... make such an assumption (but see below).

> With this change, a version of GCC's configure script with the bug
> recognizes support for section merging:
> 
>   configure:27013: checking assembler for section merging support
>   configure:27022: /path/to/bin/as   --fatal-warnings -o conftest.o conftest.s >&5
>   conftest.s: Assembler messages:
>   conftest.s:1: entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1.
>   configure:27025: $? = 0
>   configure:27036: result: yes

Nor do I agree on this aspect. I'm sure --fatal-warnings is passed here
for a reason. Yet you make the diagnostics ...

> --- a/gas/config/obj-elf.c
> +++ b/gas/config/obj-elf.c
> @@ -1318,15 +1318,22 @@ obj_elf_section (int push)
>  	  if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0
>  	      && *input_line_pointer == ',')
>  	    {
> +	      operatorT kind;
> +
>  	      ++input_line_pointer;
>  	      SKIP_WHITESPACE ();
>  	      if (inherit && *input_line_pointer == ','
>  		  && (bfd_section_flags (now_seg)
>  		      & (SEC_MERGE | SEC_STRINGS)) != 0)
>  		goto fetch_entsize;
> -	      entsize = get_absolute_expression ();
> +	      entsize = get_absolute_expression_kind (&kind);
>  	      SKIP_WHITESPACE ();
> -	      if (entsize <= 0)
> +	      if (kind == O_absent)
> +		{
> +		  as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
> +		  entsize = 1;
> +		}
> +	      else if (entsize <= 0)
>  		{
>  		  as_warn (_("invalid merge / string entity size"));
>  		  attr &= ~(SHF_MERGE | SHF_STRINGS);
> @@ -1342,8 +1349,8 @@ obj_elf_section (int push)
>  	    }
>  	  else if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0)
>  	    {
> -	      as_warn (_("entity size for SHF_MERGE / SHF_STRINGS not specified"));
> -	      attr &= ~(SHF_MERGE | SHF_STRINGS);
> +	      as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
> +	      entsize = 1;
>  	    }

... not be warnings anymore. With them being warnings, the defaulting may then
be acceptable (albeit still somewhat questionable).

> --- a/gas/read.c
> +++ b/gas/read.c
> @@ -45,6 +45,7 @@
>  #include "ginsn.h"
>  
>  #include <limits.h>
> +#include <stdbool.h>
>  
>  #ifndef TC_START_LABEL
>  #define TC_START_LABEL(STR, NUL_CHAR, NEXT_CHAR) (NEXT_CHAR == ':')
> @@ -569,6 +570,26 @@ get_absolute_expression (void)
>    return get_absolute_expr (&exp);
>  }
>  
> +/* Return value of absolute expression starting at INPUT_LINE_POINTER and
> +   set KIND to either O_constant, O_absent or O_illegal depending on
> +   whether the expression was constant, absent or something else
> +   entirely.  */
> +
> +offsetT
> +get_absolute_expression_kind (operatorT *kind)
> +{
> +  expressionS exp;
> +  offsetT value;
> +
> +  value = get_absolute_expr (&exp);
> +  if (exp.X_op == O_absent || exp.X_op == O_constant)
> +      *kind = exp.X_op;
> +  else
> +      *kind = O_illegal;
> +
> +  return value;
> +}

Personally I question such oddly named, overly special purpose helpers.
Callers needing details on the expression simply should be using the
respective underlying functions.

Jan
  
Thiago Jung Bauermann Jan. 8, 2025, 2:33 a.m. UTC | #2
References: <20250103032831.622617-1-thiago.bauermann@linaro.org>
	<6866522c-e094-42cc-af7d-b673deb36ee8@suse.com>
User-Agent: mu4e 1.12.8; emacs 29.4
Hello Jan,

Thanks for the review.

Jan Beulich <jbeulich@suse.com> writes:

> On 03.01.2025 04:28, Thiago Jung Bauermann wrote:
>> commit d5cbf916be4a ("gas/ELF: also reject merge entity size being zero")
>> made gas stricter when an entity size isn't specified (and thus assumed to
>> be zero).
>>
>> Unfortunately this mistake can happen in 32-bit Arm assembly, where the @
>> character is the start of a comment. E.g.:
>>
>>   .section .rodata.str, "aMS", @progbits, 1
>
> And what if it's
>
>    .section .rodata.str, "aMS", @progbits, 4
>
> ? You just can't ...
>
>> The aforementioned commit actually pointed out a binutils testcase with
>> this problem, as well as a bug in GCC's configure script.
>>
>> Therefore, change the parser to accept the specific case of an absent
>> entsize with "M" and assume 1, as suggested by Alan Modra.
>
> ... make such an assumption (but see below).

That's a good point.

>> With this change, a version of GCC's configure script with the bug
>> recognizes support for section merging:
>>
>>   configure:27013: checking assembler for section merging support
>>   configure:27022: /path/to/bin/as --fatal-warnings -o conftest.o conftest.s >&5
>>   conftest.s: Assembler messages:
>>   conftest.s:1: entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1.
>>   configure:27025: $? = 0
>>   configure:27036: result: yes
>
> Nor do I agree on this aspect. I'm sure --fatal-warnings is passed here
> for a reason. Yet you make the diagnostics ...
>
>> --- a/gas/config/obj-elf.c
>> +++ b/gas/config/obj-elf.c
>> @@ -1318,15 +1318,22 @@ obj_elf_section (int push)
>>  	  if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0
>>  	      && *input_line_pointer == ',')
>>  	    {
>> +	      operatorT kind;
>> +
>>  	      ++input_line_pointer;
>>  	      SKIP_WHITESPACE ();
>>  	      if (inherit && *input_line_pointer == ','
>>  		  && (bfd_section_flags (now_seg)
>>  		      & (SEC_MERGE | SEC_STRINGS)) != 0)
>>  		goto fetch_entsize;
>> -	      entsize = get_absolute_expression ();
>> +	      entsize = get_absolute_expression_kind (&kind);
>>  	      SKIP_WHITESPACE ();
>> -	      if (entsize <= 0)
>> +	      if (kind == O_absent)
>> +		{
>> +		  as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
>> +		  entsize = 1;
>> +		}
>> +	      else if (entsize <= 0)
>>  		{
>>  		  as_warn (_("invalid merge / string entity size"));
>>  		  attr &= ~(SHF_MERGE | SHF_STRINGS);
>> @@ -1342,8 +1349,8 @@ obj_elf_section (int push)
>>  	    }
>>  	  else if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0)
>>  	    {
>> -	      as_warn (_("entity size for SHF_MERGE / SHF_STRINGS not specified"));
>> -	      attr &= ~(SHF_MERGE | SHF_STRINGS);
>> +	      as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
>> +	      entsize = 1;
>>  	    }
>
> ... not be warnings anymore. With them being warnings, the defaulting may then
> be acceptable (albeit still somewhat questionable).

This is also a good point, though if this diagnostic is still a warning then
the patch loses most of its value, which is to keep existing behaviour of
previously released versions of GCC when built with current binutils.

Another alternative would be to backport the GCC configure.ac fix
to currently supported GCC branches.

>> --- a/gas/read.c
>> +++ b/gas/read.c
>> @@ -45,6 +45,7 @@
>>  #include "ginsn.h"
>>
>>  #include <limits.h>
>> +#include <stdbool.h>
>>
>>  #ifndef TC_START_LABEL
>>  #define TC_START_LABEL(STR, NUL_CHAR, NEXT_CHAR) (NEXT_CHAR == ':')

BTW, this #include is spurious. It's from a previous version of the code.

>> @@ -569,6 +570,26 @@ get_absolute_expression (void)
>>    return get_absolute_expr (&exp);
>>  }
>>
>> +/* Return value of absolute expression starting at INPUT_LINE_POINTER and
>> +   set KIND to either O_constant, O_absent or O_illegal depending on
>> +   whether the expression was constant, absent or something else
>> +   entirely.  */
>> +
>> +offsetT
>> +get_absolute_expression_kind (operatorT *kind)
>> +{
>> +  expressionS exp;
>> +  offsetT value;
>> +
>> +  value = get_absolute_expr (&exp);
>> +  if (exp.X_op == O_absent || exp.X_op == O_constant)
>> +      *kind = exp.X_op;
>> +  else
>> +      *kind = O_illegal;
>> +
>> +  return value;
>> +}
>
> Personally I question such oddly named, overly special purpose helpers.
> Callers needing details on the expression simply should be using the
> respective underlying functions.

I did it this way because get_absolute_expr is a static function. Would
it be acceptable if the patch exported it instead?

--
Thiago
Date: Tue, 07 Jan 2025 23:33:11 -0300
  
Jan Beulich Jan. 8, 2025, 7:12 a.m. UTC | #3
On 08.01.2025 03:33, Thiago Jung Bauermann wrote:
> Jan Beulich <jbeulich@suse.com> writes:
>> On 03.01.2025 04:28, Thiago Jung Bauermann wrote:
>>> @@ -569,6 +570,26 @@ get_absolute_expression (void)
>>>    return get_absolute_expr (&exp);
>>>  }
>>>
>>> +/* Return value of absolute expression starting at INPUT_LINE_POINTER and
>>> +   set KIND to either O_constant, O_absent or O_illegal depending on
>>> +   whether the expression was constant, absent or something else
>>> +   entirely.  */
>>> +
>>> +offsetT
>>> +get_absolute_expression_kind (operatorT *kind)
>>> +{
>>> +  expressionS exp;
>>> +  offsetT value;
>>> +
>>> +  value = get_absolute_expr (&exp);
>>> +  if (exp.X_op == O_absent || exp.X_op == O_constant)
>>> +      *kind = exp.X_op;
>>> +  else
>>> +      *kind = O_illegal;
>>> +
>>> +  return value;
>>> +}
>>
>> Personally I question such oddly named, overly special purpose helpers.
>> Callers needing details on the expression simply should be using the
>> respective underlying functions.
> 
> I did it this way because get_absolute_expr is a static function. Would
> it be acceptable if the patch exported it instead?

I suppose that would be okay, yes. Yet how about calling
expression_and_evaluate() directly, and going from there?

Jan
  

Patch

diff --git a/gas/config/obj-elf.c b/gas/config/obj-elf.c
index c4af01808b6a..d5b558210919 100644
--- a/gas/config/obj-elf.c
+++ b/gas/config/obj-elf.c
@@ -1318,15 +1318,22 @@  obj_elf_section (int push)
 	  if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0
 	      && *input_line_pointer == ',')
 	    {
+	      operatorT kind;
+
 	      ++input_line_pointer;
 	      SKIP_WHITESPACE ();
 	      if (inherit && *input_line_pointer == ','
 		  && (bfd_section_flags (now_seg)
 		      & (SEC_MERGE | SEC_STRINGS)) != 0)
 		goto fetch_entsize;
-	      entsize = get_absolute_expression ();
+	      entsize = get_absolute_expression_kind (&kind);
 	      SKIP_WHITESPACE ();
-	      if (entsize <= 0)
+	      if (kind == O_absent)
+		{
+		  as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
+		  entsize = 1;
+		}
+	      else if (entsize <= 0)
 		{
 		  as_warn (_("invalid merge / string entity size"));
 		  attr &= ~(SHF_MERGE | SHF_STRINGS);
@@ -1342,8 +1349,8 @@  obj_elf_section (int push)
 	    }
 	  else if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0)
 	    {
-	      as_warn (_("entity size for SHF_MERGE / SHF_STRINGS not specified"));
-	      attr &= ~(SHF_MERGE | SHF_STRINGS);
+	      as_tsktsk (_("entity size for SHF_MERGE / SHF_STRINGS not specified. Assuming 1."));
+	      entsize = 1;
 	    }
 
 	  if ((attr & (SHF_MERGE | SHF_STRINGS)) != 0 && type == SHT_NOBITS)
diff --git a/gas/read.c b/gas/read.c
index 6d0d4b5e31a1..9468277bc609 100644
--- a/gas/read.c
+++ b/gas/read.c
@@ -45,6 +45,7 @@ 
 #include "ginsn.h"
 
 #include <limits.h>
+#include <stdbool.h>
 
 #ifndef TC_START_LABEL
 #define TC_START_LABEL(STR, NUL_CHAR, NEXT_CHAR) (NEXT_CHAR == ':')
@@ -569,6 +570,26 @@  get_absolute_expression (void)
   return get_absolute_expr (&exp);
 }
 
+/* Return value of absolute expression starting at INPUT_LINE_POINTER and
+   set KIND to either O_constant, O_absent or O_illegal depending on
+   whether the expression was constant, absent or something else
+   entirely.  */
+
+offsetT
+get_absolute_expression_kind (operatorT *kind)
+{
+  expressionS exp;
+  offsetT value;
+
+  value = get_absolute_expr (&exp);
+  if (exp.X_op == O_absent || exp.X_op == O_constant)
+      *kind = exp.X_op;
+  else
+      *kind = O_illegal;
+
+  return value;
+}
+
 static int pop_override_ok;
 static const char *pop_table_name;
 
diff --git a/gas/read.h b/gas/read.h
index d11daf13034f..bc236523a89e 100644
--- a/gas/read.h
+++ b/gas/read.h
@@ -117,6 +117,7 @@  extern char *demand_copy_string (int *lenP);
 extern char *demand_copy_C_string (int *len_pointer);
 extern char get_absolute_expression_and_terminator (long *val_pointer);
 extern offsetT get_absolute_expression (void);
+extern offsetT get_absolute_expression_kind (operatorT *kind);
 extern unsigned int next_char_of_string (void);
 extern void s_mri_sect (char *);
 extern char *mri_comment_field (char *);