[1/3] gas: sanitize FB- and dollar-label uses

Message ID b07c5a52-5c10-407a-a948-cdafb47c9a03@suse.com
State New
Headers
Series gas: FB and dollar labels |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm success Testing passed

Commit Message

Jan Beulich March 22, 2024, 9:40 a.m. UTC
  I don't view it as sensible to be more lax when it comes to references
to (uses of) such labels compared to their definition: The latter has
been limited to decimal numerics, while the former permitted any radix.
Beyond that leading zeroes on such labels aren't helpful either. Imo
labels and their use sites would better match literally, to avoid
confusion.

As it turns out, one z80 testcase actually had such an odd use of labels
where definition and use don't match in spelling. That testcase is being
adjusted accordingly.

While there also adjust a comment on a local variable in
integer_constant().
---
Further to this, such label definitions are bounded by INT_MAX, whereas
label uses can go even into the 64-bit range with BFD64. But perhaps
that can be viewed as a cosmetic issue, as it only affects what error is
raised on such label uses.

The latest seeing the z80 testsuite instance of such a mismatch, I
wonder whether there's a need for an option to allow people to restore
original behavior. (As an aside, in that testcase I can't really see the
purpose of the 2nd 600$ label.)

Also, despite documentation saying so, -L does not cause any symbols to
be emitted afaict. Question is whether documentation or implementation
is wrong.
  

Comments

Nick Clifton March 26, 2024, 9:58 a.m. UTC | #1
Hi Jan,

   I just one, very minor comment:

> -  int small;			/* True if fits in 32 bits.  */
> +  int small;		/* True if fits in 32 bits (64 bits with BFD64).  */

   I think that this should be:

      bool small;       /* True if fits in 32 bits (64 bits with BFD64).  */

   It irks me when types are used incorrectly, and given that we have
   a boolean type, I feel that it should be used wherever it make sense.

   This is obviously not critical, so please feel free to ignore the
   suggestion if you do not agree with it.

Cheers
   Nick
  
Jan Beulich March 26, 2024, 10:08 a.m. UTC | #2
Nick,

On 26.03.2024 10:58, Nick Clifton wrote:
>    I just one, very minor comment:
> 
>> -  int small;			/* True if fits in 32 bits.  */
>> +  int small;		/* True if fits in 32 bits (64 bits with BFD64).  */
> 
>    I think that this should be:
> 
>       bool small;       /* True if fits in 32 bits (64 bits with BFD64).  */
> 
>    It irks me when types are used incorrectly, and given that we have
>    a boolean type, I feel that it should be used wherever it make sense.
> 
>    This is obviously not critical, so please feel free to ignore the
>    suggestion if you do not agree with it.

actually I'm glad you mention it. I was meaning to switch to bool, but
then I wasn't sure whether such would be liked as a "side effect" of
another change. You indicating (as to my reading) that it would, I'll
be happy to make type adjustments here and elsewhere (there are a lot
more places where in particular "bool" is meant when "int" is used),
when touching stuff anyway.

Jan
  
Nick Clifton March 26, 2024, 11:15 a.m. UTC | #3
Hi Jan,

>>> -  int small;			/* True if fits in 32 bits.  */
>>> +  int small;		/* True if fits in 32 bits (64 bits with BFD64).  */
>>
>>     I think that this should be:
>>
>>        bool small;       /* True if fits in 32 bits (64 bits with BFD64).  */

> actually I'm glad you mention it. I was meaning to switch to bool, but
> then I wasn't sure whether such would be liked as a "side effect" of
> another change. You indicating (as to my reading) that it would, 

Right.  As I see it, you fixing both the comment and the type for that
field, and both of these things are reasonable part of the patch.

> I'll
> be happy to make type adjustments here and elsewhere (there are a lot
> more places where in particular "bool" is meant when "int" is used),
> when touching stuff anyway.

Please do.

I have it in my head to take a week and just go through all of the binutils
code base, correcting types like this.  All I need is a free week... :-)

Cheers
   Nick
  

Patch

--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,9 @@ 
 -*- text -*-
 
+* References to FB and dollar labels, when supported, are no longer permitted
+  in a radix other than 10.  (Note that definitions of such labels were already
+  thus restricted, except that leading zeroes were permitted.)
+
 * The base register operand in D(X,B) and D(L,B) may be explicitly omitted
   in assembly on s390. It can now be coded as D(X,) or D(L,) instead of D(X,0)
   D(X,%r0), D(L,0), and D(L,%r0).
--- a/gas/expr.c
+++ b/gas/expr.c
@@ -284,7 +284,7 @@  integer_constant (int radix, expressionS
   char *name;		/* Points to name of symbol.  */
   symbolS *symbolP;	/* Points to symbol.  */
 
-  int small;			/* True if fits in 32 bits.  */
+  int small;		/* True if fits in 32 bits (64 bits with BFD64).  */
 
   /* May be bignum, or may fit in 32 bits.  */
   /* Most numbers fit into 32 bits, and we want this case to be fast.
@@ -549,10 +549,12 @@  integer_constant (int radix, expressionS
 #ifndef tc_allow_L_suffix
 #define tc_allow_L_suffix 1
 #endif
+  bool l_seen = !tc_allow_L_suffix;
   /* PR 20732: Look for, and ignore, a L or LL suffix to the number.  */
   if (tc_allow_L_suffix && (c == 'L' || c == 'l'))
     {
       c = * input_line_pointer++;
+      l_seen = true;
       if (c == 'L' || c == 'l')
 	c = *input_line_pointer++;
       if (!u_seen && (c == 'U' || c == 'u'))
@@ -561,13 +563,14 @@  integer_constant (int radix, expressionS
 
   if (small)
     {
-      /* Here with number, in correct radix. c is the next char.
-	 Note that unlike un*x, we allow "011f" "0x9f" to both mean
-	 the same as the (conventional) "9f".
-	 This is simply easier than checking for strict canonical
-	 form.  Syntax sux!  */
+      /* Here with number, in correct radix. c is the next char.  */
+      bool maybe_label = suffix == NULL
+			 && (!tc_allow_U_suffix || !u_seen)
+			 && (!tc_allow_L_suffix || !l_seen)
+			 && (radix == 10 ||
+			     (radix == 8 && input_line_pointer == start + 1));
 
-      if (LOCAL_LABELS_FB && c == 'b')
+      if (LOCAL_LABELS_FB && c == 'b' && maybe_label)
 	{
 	  /* Backward ref to local label.
 	     Because it is backward, expect it to be defined.  */
@@ -593,7 +596,7 @@  integer_constant (int radix, expressionS
 
 	  expressionP->X_add_number = 0;
 	}			/* case 'b' */
-      else if (LOCAL_LABELS_FB && c == 'f')
+      else if (LOCAL_LABELS_FB && c == 'f' && maybe_label)
 	{
 	  /* Forward reference.  Expect symbol to be undefined or
 	     unknown.  undefined: seen it before.  unknown: never seen
@@ -609,7 +612,7 @@  integer_constant (int radix, expressionS
 	  expressionP->X_add_symbol = symbolP;
 	  expressionP->X_add_number = 0;
 	}			/* case 'f' */
-      else if (LOCAL_LABELS_DOLLAR && c == '$')
+      else if (LOCAL_LABELS_DOLLAR && c == '$' && maybe_label)
 	{
 	  /* If the dollar label is *currently* defined, then this is just
 	     another reference to it.  If it is not *currently* defined,
--- a/gas/read.c
+++ b/gas/read.c
@@ -1269,6 +1269,10 @@  read_a_source_file (const char *name)
 	      while (ISDIGIT (*input_line_pointer))
 		{
 		  const long digit = *input_line_pointer - '0';
+
+		  /* Don't accept labels which look like octal numbers.  */
+		  if (temp == 0)
+		    break;
 		  if (temp > (INT_MAX - digit) / 10)
 		    {
 		      as_bad (_("local label too large near %s"), backup);
--- a/gas/testsuite/gas/z80/sdcc.s
+++ b/gas/testsuite/gas/z80/sdcc.s
@@ -13,7 +13,7 @@  valueadr = 0x1234
 _start::
 ;comment
 	ld      hl, #4+0
-00000$:
+0$:
 	adc	a, a
 	adc	a, b
 	adc	a, c
@@ -29,7 +29,7 @@  _start::
 	adc	a, (hl)
 	adc	a, 5 (ix)
 	adc	a, -2 (iy)
-00100$:
+100$:
 	add	a, a
 	add	a, b
 	add	a, c
@@ -45,7 +45,7 @@  _start::
 	add	a, (hl)
 	add	a, 5 (ix)
 	add	a, -2 (iy)
-00200$:
+200$:
 	and	a, a
 	and	a, b
 	and	a, c
@@ -61,7 +61,7 @@  _start::
 	and	a, (hl)
 	and	a, 5 (ix)
 	and	a, -2 (iy)
-00300$:
+300$:
 	cp	a, a
 	cp	a, b
 	cp	a, c
@@ -77,7 +77,7 @@  _start::
 	cp	a, (hl)
 	cp	a, 5 (ix)
 	cp	a, -2 (iy)
-00400$:
+400$:
 	or	a, a
 	or	a, b
 	or	a, c
@@ -93,7 +93,7 @@  _start::
 	or	a, (hl)
 	or	a, 5 (ix)
 	or	a, -2 (iy)
-00500$:
+500$:
 	sbc	a, a
 	sbc	a, b
 	sbc	a, c
@@ -109,7 +109,7 @@  _start::
 	sbc	a, (hl)
 	sbc	a, 5 (ix)
 	sbc	a, -2 (iy)
-00600$:
+600$:
 	sub	a, a
 	sub	a, b
 	sub	a, c
@@ -125,7 +125,7 @@  _start::
 	sub	a, (hl)
 	sub	a, 5 (ix)
 	sub	a, -2 (iy)
-00700$:
+700$:
 	xor	a, a
 	xor	a, b
 	xor	a, c
@@ -152,10 +152,10 @@  _start::
 _func:
 	ld	hl,0
 	ld	(hl),#<function
-00100$:
+100$:
 	inc	hl
 	ld	(hl),#>function
-00600$:
+600$:
 	jr	100$
 _finish::
 	ld	a, 2 (iy)