[v2,32/65] MMIX: use is_whitespace()

Message ID a78761f1-6017-47b1-9d40-fd1d2f189042@suse.com
State New
Headers
Series gas: whitespace handling |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64 fail Patch failed to apply

Commit Message

Jan Beulich Jan. 27, 2025, 4:28 p.m. UTC
  Convert open-coded checks as well as ISSPACE() uses. At the same time
use is_end_of_stmt() instead of open-coded checks; do the conversion
even when not adjacent to code being modified anyway to cover all cases
where the is_end_of_line[] index was wrongly cast from plain char (which
can be signed) to unsigned int.
---
v2: New.
  

Comments

Hans-Peter Nilsson Jan. 31, 2025, 6:18 a.m. UTC | #1
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.

Sorry, it looks like I'm going to have to finish looking at this 
late in week-end.

brgds, H-P
  
Hans-Peter Nilsson Jan. 31, 2025, 6:33 a.m. UTC | #2
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.

Where is the "wrongly cast" to which you refer?

I see no more "wrongly casts" before the patch than after, 
considering that is_whitespace is doing such a cast.  But, I 
could easily have missed a spot.

brgds, H-P
  
Jan Beulich Jan. 31, 2025, 6:54 a.m. UTC | #3
On 31.01.2025 07:33, Hans-Peter Nilsson wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>> even when not adjacent to code being modified anyway to cover all cases
>> where the is_end_of_line[] index was wrongly cast from plain char (which
>> can be signed) to unsigned int.
> 
> Where is the "wrongly cast" to which you refer?
> 
> I see no more "wrongly casts" before the patch than after, 
> considering that is_whitespace is doing such a cast.  But, I 
> could easily have missed a spot.

Take this example:

      while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
	s++;

If plain char is signed, negative values will convert to huge positive
unsigned int ones. As opposed to when casting to unsigned char (as all
the IS*() and is*() helper macros effectively[1] do, not just
is_end_of_stmt()).

Jan

[1] safe-ctype.h actually ANDs by 0xff, apparently in an attempt to
deal with CHAR_BITS != 8 cases. That's not really correct though imo;
that case needs dealing with by growing the array dimensions, to
cover all values representable in a char.

Jan
  
Hans-Peter Nilsson Jan. 31, 2025, 7:05 a.m. UTC | #4
On Fri, 31 Jan 2025, Jan Beulich wrote:
> On 31.01.2025 07:33, Hans-Peter Nilsson wrote:
> > On Mon, 27 Jan 2025, Jan Beulich wrote:
> >> Convert open-coded checks as well as ISSPACE() uses. At the same time
> >> use is_end_of_stmt() instead of open-coded checks; do the conversion
> >> even when not adjacent to code being modified anyway to cover all cases
> >> where the is_end_of_line[] index was wrongly cast from plain char (which
> >> can be signed) to unsigned int.
> > 
> > Where is the "wrongly cast" to which you refer?
> > 
> > I see no more "wrongly casts" before the patch than after, 
> > considering that is_whitespace is doing such a cast.  But, I 
> > could easily have missed a spot.
> 
> Take this example:
> 
>       while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])

Oh boy, all those "unsigned int" should have been "unsigned 
char".  Doh!  Thanks.

brgds, H-P
  
Hans-Peter Nilsson Feb. 3, 2025, 6:54 a.m. UTC | #5
On Mon, 27 Jan 2025, Jan Beulich wrote:
> Convert open-coded checks as well as ISSPACE() uses. At the same time
> use is_end_of_stmt() instead of open-coded checks; do the conversion
> even when not adjacent to code being modified anyway to cover all cases
> where the is_end_of_line[] index was wrongly cast from plain char (which
> can be signed) to unsigned int.
> ---
> v2: New.
> 
> --- a/gas/config/tc-mmix.c
> +++ b/gas/config/tc-mmix.c

Sorry, there are a few hunks that are wrong, like missing a test 
for ';' and other changes that may work but doesn't logically 
make sense to me.

Most of the stuff like s/ISSPACE/is_whitespace/ is obviously ok, 
as is of course fixing those horrible (unsigned int) typos.

I plan to post a replacement patch (5 hunks differ), but I need 
to look at it again with fresh eyes so I'll pause it for a day 
or two.

brgds, H-P
  
Jan Beulich Feb. 3, 2025, 7:30 a.m. UTC | #6
On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
> On Mon, 27 Jan 2025, Jan Beulich wrote:
>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>> even when not adjacent to code being modified anyway to cover all cases
>> where the is_end_of_line[] index was wrongly cast from plain char (which
>> can be signed) to unsigned int.
>> ---
>> v2: New.
>>
>> --- a/gas/config/tc-mmix.c
>> +++ b/gas/config/tc-mmix.c
> 
> Sorry, there are a few hunks that are wrong, like missing a test 
> for ';' and other changes that may work but doesn't logically 
> make sense to me.

Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
but that's intentional (to stop its open-coding).

> Most of the stuff like s/ISSPACE/is_whitespace/ is obviously ok, 
> as is of course fixing those horrible (unsigned int) typos.
> 
> I plan to post a replacement patch (5 hunks differ), but I need 
> to look at it again with fresh eyes so I'll pause it for a day 
> or two.

Well, okay, I'll leave the patch out then for the time being, as I was
meaning to commit the series today or tomorrow. Nevertheless I continue
to think the patch is correct.

Jan
  
Hans-Peter Nilsson Feb. 5, 2025, 6:22 a.m. UTC | #7
On Mon, 3 Feb 2025, Jan Beulich wrote:

> On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
> > On Mon, 27 Jan 2025, Jan Beulich wrote:
> >> Convert open-coded checks as well as ISSPACE() uses. At the same time
> >> use is_end_of_stmt() instead of open-coded checks; do the conversion
> >> even when not adjacent to code being modified anyway to cover all cases
> >> where the is_end_of_line[] index was wrongly cast from plain char (which
> >> can be signed) to unsigned int.
> >> ---
> >> v2: New.
> >>
> >> --- a/gas/config/tc-mmix.c
> >> +++ b/gas/config/tc-mmix.c
> > 
> > Sorry, there are a few hunks that are wrong, like missing a test 
> > for ';' and other changes that may work but doesn't logically 
> > make sense to me.
> 
> Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
> but that's intentional (to stop its open-coding).

Oh right, I'd looked at the end_of_line[] definition and forgot 
about MMIX adding ';' by means of line_separator_chars[]. In the 
end, all code-changes were ok though with the caveat of now 
supporting just tab and space.

Though bundling all that into saying "stop open-coding" was too 
vague.  You were simply doing too much and still too little in 
the same patch, making review harder than necessary.

Too much: "replacing open-coding" and removing redundant 
checking while changing those whitespace tests; the IS... calls 
to is_... and fixing erroneous casts.

Unfortunately there isn't a 1:1 equivalences, so the input 
language is now different.  Now with is_whitespace it accepts 
only ' ' and '\t' where previously ISSPACE covered more 
whitespace characters in some places.  Not sure there are any 
assembly language programmers using e.g. \f instead of \t, but 
some of their code now no longer assembles for some targets.  
Example: replace the space after the first 1H in 1hjmp1b.s with 
a form-feed and run the related tests.  But, pragmatically, I'll 
ok that change.  Nobody in their right mind should use other 
than \t and ' ' as assembly-code field separators.  They had it 
coming. 1/2 :-)

Too little: not explicitly mentioning what "open-coding" you 
intentionally changed (as opposed to accidental edits).
The commit message could have helped some of this, but it 
said too little to tell the implication of the changes 
and for me as a maintainer to feel ok with all the changes.

BTW, please use "git format" for patches sent to the list.  
It'd help local testing and may simplify (programming of) 
pre-commit CI autotesters.

What follows is what I committed on your behalf.  It's your 
patch (so naming you as the author) but with a more verbose 
commit message plus a comment above a construct that I had to 
step through with gdb to see why there had to be an 
is_whitespace() || is_end_of_stmt() or else a testcase would 
fail: a case where ISSPACE covered '\n'.

brgds, H-P
----------------------------
From: Jan Beulich <jbeulich@suse.com>
Subject: [PATCH] gas MMIX: Use more of is_... framework like is_whitespace and
 is_end_of_stmt

Convert uses of ISSPACE() and testing for specific characters into
calls to is_whitespace and is_end_of_stmt.  While doing that, also
remove some redundant tests, like ';' together with is_end_of_line[]
and is_whitespace and !is_end_of_line.

Note the invalid casts being fixed as part of moving to is_... macros;
there were (unsigned int) where there should have been (unsigned char)
applied on char as index to is_end_of_line[].

Beware that the input language changes slightly: some constructs with
whitespace characters other than space and TAB are now invalid.
---
 gas/config/tc-mmix.c | 70 +++++++++++++++++++-------------------------
 1 file changed, 30 insertions(+), 40 deletions(-)

diff --git a/gas/config/tc-mmix.c b/gas/config/tc-mmix.c
index a43774d755c0..3715790348cf 100644
--- a/gas/config/tc-mmix.c
+++ b/gas/config/tc-mmix.c
@@ -454,11 +454,11 @@ get_operands (int max_operands, char *s, expressionS *exp)
   while (nextchar == ',')
     {
       /* Skip leading whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       /* Check to see if we have any operands left to parse */
-      if (*p == 0 || *p == '\n' || *p == '\r')
+      if (is_end_of_stmt (*p))
 	{
 	  break;
 	}
@@ -489,7 +489,7 @@ get_operands (int max_operands, char *s, expressionS *exp)
       p = input_line_pointer;
 
       /* Skip leading whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
       nextchar = *p++;
     }
@@ -545,7 +545,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
   int regno;
 
   /* Skip leading whitespace */
-  while (*p == ' ' || *p == '\t')
+  while (is_whitespace (*p))
     p++;
 
   input_line_pointer = p;
@@ -565,7 +565,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
       p = input_line_pointer;
 
       /* Skip whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       if (*p == ',')
@@ -573,7 +573,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
 	  p++;
 
 	  /* Skip whitespace */
-	  while (*p == ' ' || *p == '\t')
+	  while (is_whitespace (*p))
 	    p++;
 	  sregp = p;
 	  input_line_pointer = sregp;
@@ -594,7 +594,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
       p = input_line_pointer;
 
       /* Skip whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       if (*p == ',')
@@ -602,7 +602,7 @@ get_putget_operands (struct mmix_opcode *insn, char *operands,
 	  p++;
 
 	  /* Skip whitespace */
-	  while (*p == ' ' || *p == '\t')
+	  while (is_whitespace (*p))
 	    p++;
 
 	  input_line_pointer = p;
@@ -829,7 +829,7 @@ md_assemble (char *str)
        ++operands)
     ;
 
-  if (ISSPACE (*operands))
+  if (is_whitespace (*operands))
     {
       modified_char = *operands;
       *operands++ = '\0';
@@ -2924,7 +2924,11 @@ mmix_handle_mmixal (void)
   /* If we're on a line with a label, check if it's a mmixal fb-label.
      Save an indicator and skip the label; it must be set only after all
      fb-labels of expressions are evaluated.  */
-  if (ISDIGIT (s[0]) && s[1] == 'H' && ISSPACE (s[2]))
+  if (ISDIGIT (s[0]) && s[1] == 'H'
+      /* A lone "1H" on a line is valid: we'll then see is_end_of_stmt()
+	 being true for the following character (likely a '\n' but '\n'
+	 doesn't count as is_whitespace).  */
+      && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
     {
       current_fb_label = s[0] - '0';
 
@@ -2935,12 +2939,12 @@ mmix_handle_mmixal (void)
       s += 2;
       input_line_pointer = s;
 
-      while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
+      while (is_whitespace (*s))
 	s++;
 
       /* For errors emitted here, the book-keeping is off by one; the
 	 caller is about to bump the counters.  Adjust the error messages.  */
-      if (is_end_of_line[(unsigned int) *s])
+      if (is_end_of_stmt (*s))
 	{
 	  unsigned int line;
 	  const char * name = as_where (&line);
@@ -2973,7 +2977,7 @@ mmix_handle_mmixal (void)
       return;
     }
 
-  if (*s == 0 || is_end_of_line[(unsigned int) *s])
+  if (is_end_of_stmt (*s))
     /* We avoid handling empty lines here.  */
     return;
 
@@ -2986,9 +2990,7 @@ mmix_handle_mmixal (void)
 
   /* Find the start of the instruction or pseudo following the label,
      if there is one.  */
-  for (insn = s;
-       *insn && ISSPACE (*insn) && ! is_end_of_line[(unsigned int) *insn];
-       insn++)
+  for (insn = s; is_whitespace (*insn); insn++)
     /* Empty */
     ;
 
@@ -3003,7 +3005,7 @@ mmix_handle_mmixal (void)
 	      instruction or MMIXAL-pseudo (getting its alignment).  Thus
 	      is acts like a "normal" :-ended label.  Ditto if it's
 	      followed by a non-MMIXAL pseudo.  */
-	   && !is_end_of_line[(unsigned int) *insn]
+	   && !is_end_of_stmt (*insn)
 	   && *insn != '.')
     {
       /* For labels that don't end in ":", we save it so we can later give
@@ -3038,7 +3040,7 @@ mmix_handle_mmixal (void)
   while (*s)
     {
       c = *s++;
-      if (is_end_of_line[(unsigned int) c])
+      if (is_end_of_stmt (c))
 	break;
       if (c == MAGIC_FB_BACKWARD_CHAR || c == MAGIC_FB_FORWARD_CHAR)
 	as_bad (_("invalid characters in input"));
@@ -3048,34 +3050,24 @@ mmix_handle_mmixal (void)
   s = insn;
 
   /* Skip the insn.  */
-  while (*s
-	 && ! ISSPACE (*s)
-	 && *s != ';'
-	 && ! is_end_of_line[(unsigned int) *s])
+  while (! is_whitespace (*s) && ! is_end_of_stmt (*s))
     s++;
 
   /* Skip the spaces after the insn.  */
-  while (*s
-	 && ISSPACE (*s)
-	 && *s != ';'
-	 && ! is_end_of_line[(unsigned int) *s])
+  while (is_whitespace (*s))
     s++;
 
   /* Skip the operands.  While doing this, replace [0-9][BF] with
      (MAGIC_FB_BACKWARD_CHAR|MAGIC_FB_FORWARD_CHAR)[0-9].  */
-  while ((c = *s) != 0
-	 && ! ISSPACE (c)
-	 && c != ';'
-	 && ! is_end_of_line[(unsigned int) c])
+  while (! is_whitespace (c = *s) && ! is_end_of_stmt (c))
     {
       if (c == '"')
 	{
 	  s++;
 
 	  /* FIXME: Test-case for semi-colon in string.  */
-	  while (*s
-		 && *s != '"'
-		 && (! is_end_of_line[(unsigned int) *s] || *s == ';'))
+	  while (*s != '"'
+		 && (! is_end_of_stmt (*s) || *s == ';'))
 	    s++;
 
 	  if (*s == '"')
@@ -3101,10 +3093,7 @@ mmix_handle_mmixal (void)
     }
 
   /* Skip any spaces after the operands.  */
-  while (*s
-	 && ISSPACE (*s)
-	 && *s != ';'
-	 && !is_end_of_line[(unsigned int) *s])
+  while (is_whitespace (*s))
     s++;
 
   /* If we're now looking at a semi-colon, then it's an end-of-line
@@ -3114,7 +3103,8 @@ mmix_handle_mmixal (void)
   /* Make IS into an EQU by replacing it with "= ".  Only match upper-case
      though; let lower-case be a syntax error.  */
   s = insn;
-  if (s[0] == 'I' && s[1] == 'S' && ISSPACE (s[2]))
+  if (s[0] == 'I' && s[1] == 'S'
+      && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
     {
       *s = '=';
       s[1] = ' ';
@@ -3163,7 +3153,7 @@ mmix_handle_mmixal (void)
   else if (s[0] == 'G'
 	   && s[1] == 'R'
 	   && startswith (s, "GREG")
-	   && (ISSPACE (s[4]) || is_end_of_line[(unsigned char) s[4]]))
+	   && (is_whitespace (s[4]) || is_end_of_stmt (s[4])))
     {
       input_line_pointer = s + 4;
 
@@ -4258,7 +4248,7 @@ mmix_cons (int nbytes)
 
   SKIP_WHITESPACE ();
 
-  if (is_end_of_line[(unsigned int) *input_line_pointer])
+  if (is_end_of_stmt (*input_line_pointer))
     {
       /* Default to zero if the expression was absent.  */
  
Jan Beulich Feb. 5, 2025, 7:14 a.m. UTC | #8
On 05.02.2025 07:22, Hans-Peter Nilsson wrote:
> On Mon, 3 Feb 2025, Jan Beulich wrote:
>> On 03.02.2025 07:54, Hans-Peter Nilsson wrote:
>>> On Mon, 27 Jan 2025, Jan Beulich wrote:
>>>> Convert open-coded checks as well as ISSPACE() uses. At the same time
>>>> use is_end_of_stmt() instead of open-coded checks; do the conversion
>>>> even when not adjacent to code being modified anyway to cover all cases
>>>> where the is_end_of_line[] index was wrongly cast from plain char (which
>>>> can be signed) to unsigned int.
>>>> ---
>>>> v2: New.
>>>>
>>>> --- a/gas/config/tc-mmix.c
>>>> +++ b/gas/config/tc-mmix.c
>>>
>>> Sorry, there are a few hunks that are wrong, like missing a test 
>>> for ';' and other changes that may work but doesn't logically 
>>> make sense to me.
>>
>> Missing ';' checks? The patch replaces a few with is_end_of_stmt(), yes,
>> but that's intentional (to stop its open-coding).
> 
> Oh right, I'd looked at the end_of_line[] definition and forgot 
> about MMIX adding ';' by means of line_separator_chars[]. In the 
> end, all code-changes were ok though with the caveat of now 
> supporting just tab and space.
> 
> Though bundling all that into saying "stop open-coding" was too 
> vague.  You were simply doing too much and still too little in 
> the same patch, making review harder than necessary.

Such mixing is pretty common in binutils changes. Other projects
I work on are more strict in this regard; I'm trying to find a
balance, but I don#t always succeed.

> Too much: "replacing open-coding" and removing redundant 
> checking while changing those whitespace tests; the IS... calls 
> to is_... and fixing erroneous casts.
> 
> Unfortunately there isn't a 1:1 equivalences, so the input 
> language is now different.  Now with is_whitespace it accepts 
> only ' ' and '\t' where previously ISSPACE covered more 
> whitespace characters in some places.  Not sure there are any 
> assembly language programmers using e.g. \f instead of \t, but 
> some of their code now no longer assembles for some targets.  
> Example: replace the space after the first 1H in 1hjmp1b.s with 
> a form-feed and run the related tests.  But, pragmatically, I'll 
> ok that change.  Nobody in their right mind should use other 
> than \t and ' ' as assembly-code field separators.  They had it 
> coming. 1/2 :-)

Well, I'm aware - see the two respective remarks in the cover
letter. ISSPACE() was problematic anyway, for including \r and
\n as well. ISBLANK() may have been slightly better. In any
event - we now have control over what we want to consider white
space. We can add \f and/or \v, or any others. My goal is though
that for no character is_whitespace() and is_end_of_stmt() both
yield true.

Jan
  

Patch

--- a/gas/config/tc-mmix.c
+++ b/gas/config/tc-mmix.c
@@ -454,11 +454,11 @@  get_operands (int max_operands, char *s,
   while (nextchar == ',')
     {
       /* Skip leading whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       /* Check to see if we have any operands left to parse */
-      if (*p == 0 || *p == '\n' || *p == '\r')
+      if (is_end_of_stmt (*p))
 	{
 	  break;
 	}
@@ -489,7 +489,7 @@  get_operands (int max_operands, char *s,
       p = input_line_pointer;
 
       /* Skip leading whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
       nextchar = *p++;
     }
@@ -545,7 +545,7 @@  get_putget_operands (struct mmix_opcode
   int regno;
 
   /* Skip leading whitespace */
-  while (*p == ' ' || *p == '\t')
+  while (is_whitespace (*p))
     p++;
 
   input_line_pointer = p;
@@ -565,7 +565,7 @@  get_putget_operands (struct mmix_opcode
       p = input_line_pointer;
 
       /* Skip whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       if (*p == ',')
@@ -573,7 +573,7 @@  get_putget_operands (struct mmix_opcode
 	  p++;
 
 	  /* Skip whitespace */
-	  while (*p == ' ' || *p == '\t')
+	  while (is_whitespace (*p))
 	    p++;
 	  sregp = p;
 	  input_line_pointer = sregp;
@@ -594,7 +594,7 @@  get_putget_operands (struct mmix_opcode
       p = input_line_pointer;
 
       /* Skip whitespace */
-      while (*p == ' ' || *p == '\t')
+      while (is_whitespace (*p))
 	p++;
 
       if (*p == ',')
@@ -602,7 +602,7 @@  get_putget_operands (struct mmix_opcode
 	  p++;
 
 	  /* Skip whitespace */
-	  while (*p == ' ' || *p == '\t')
+	  while (is_whitespace (*p))
 	    p++;
 
 	  input_line_pointer = p;
@@ -829,7 +829,7 @@  md_assemble (char *str)
        ++operands)
     ;
 
-  if (ISSPACE (*operands))
+  if (is_whitespace (*operands))
     {
       modified_char = *operands;
       *operands++ = '\0';
@@ -2924,7 +2924,8 @@  mmix_handle_mmixal (void)
   /* If we're on a line with a label, check if it's a mmixal fb-label.
      Save an indicator and skip the label; it must be set only after all
      fb-labels of expressions are evaluated.  */
-  if (ISDIGIT (s[0]) && s[1] == 'H' && ISSPACE (s[2]))
+  if (ISDIGIT (s[0]) && s[1] == 'H'
+      && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
     {
       current_fb_label = s[0] - '0';
 
@@ -2935,12 +2936,12 @@  mmix_handle_mmixal (void)
       s += 2;
       input_line_pointer = s;
 
-      while (*s && ISSPACE (*s) && ! is_end_of_line[(unsigned int) *s])
+      while (is_whitespace (*s))
 	s++;
 
       /* For errors emitted here, the book-keeping is off by one; the
 	 caller is about to bump the counters.  Adjust the error messages.  */
-      if (is_end_of_line[(unsigned int) *s])
+      if (is_end_of_stmt (*s))
 	{
 	  unsigned int line;
 	  const char * name = as_where (&line);
@@ -2973,7 +2974,7 @@  mmix_handle_mmixal (void)
       return;
     }
 
-  if (*s == 0 || is_end_of_line[(unsigned int) *s])
+  if (is_end_of_stmt (*s))
     /* We avoid handling empty lines here.  */
     return;
 
@@ -2986,9 +2987,7 @@  mmix_handle_mmixal (void)
 
   /* Find the start of the instruction or pseudo following the label,
      if there is one.  */
-  for (insn = s;
-       *insn && ISSPACE (*insn) && ! is_end_of_line[(unsigned int) *insn];
-       insn++)
+  for (insn = s; is_whitespace (*insn); insn++)
     /* Empty */
     ;
 
@@ -3003,7 +3002,7 @@  mmix_handle_mmixal (void)
 	      instruction or MMIXAL-pseudo (getting its alignment).  Thus
 	      is acts like a "normal" :-ended label.  Ditto if it's
 	      followed by a non-MMIXAL pseudo.  */
-	   && !is_end_of_line[(unsigned int) *insn]
+	   && !is_end_of_stmt (*insn)
 	   && *insn != '.')
     {
       /* For labels that don't end in ":", we save it so we can later give
@@ -3038,7 +3037,7 @@  mmix_handle_mmixal (void)
   while (*s)
     {
       c = *s++;
-      if (is_end_of_line[(unsigned int) c])
+      if (is_end_of_stmt (c))
 	break;
       if (c == MAGIC_FB_BACKWARD_CHAR || c == MAGIC_FB_FORWARD_CHAR)
 	as_bad (_("invalid characters in input"));
@@ -3048,34 +3047,24 @@  mmix_handle_mmixal (void)
   s = insn;
 
   /* Skip the insn.  */
-  while (*s
-	 && ! ISSPACE (*s)
-	 && *s != ';'
-	 && ! is_end_of_line[(unsigned int) *s])
+  while (! is_whitespace (*s) && ! is_end_of_stmt (*s))
     s++;
 
   /* Skip the spaces after the insn.  */
-  while (*s
-	 && ISSPACE (*s)
-	 && *s != ';'
-	 && ! is_end_of_line[(unsigned int) *s])
+  while (is_whitespace (*s))
     s++;
 
   /* Skip the operands.  While doing this, replace [0-9][BF] with
      (MAGIC_FB_BACKWARD_CHAR|MAGIC_FB_FORWARD_CHAR)[0-9].  */
-  while ((c = *s) != 0
-	 && ! ISSPACE (c)
-	 && c != ';'
-	 && ! is_end_of_line[(unsigned int) c])
+  while (! is_whitespace (c = *s) && ! is_end_of_stmt (c))
     {
       if (c == '"')
 	{
 	  s++;
 
 	  /* FIXME: Test-case for semi-colon in string.  */
-	  while (*s
-		 && *s != '"'
-		 && (! is_end_of_line[(unsigned int) *s] || *s == ';'))
+	  while (*s != '"'
+		 && (! is_end_of_stmt (*s) || *s == ';'))
 	    s++;
 
 	  if (*s == '"')
@@ -3101,10 +3090,7 @@  mmix_handle_mmixal (void)
     }
 
   /* Skip any spaces after the operands.  */
-  while (*s
-	 && ISSPACE (*s)
-	 && *s != ';'
-	 && !is_end_of_line[(unsigned int) *s])
+  while (is_whitespace (*s))
     s++;
 
   /* If we're now looking at a semi-colon, then it's an end-of-line
@@ -3114,7 +3100,8 @@  mmix_handle_mmixal (void)
   /* Make IS into an EQU by replacing it with "= ".  Only match upper-case
      though; let lower-case be a syntax error.  */
   s = insn;
-  if (s[0] == 'I' && s[1] == 'S' && ISSPACE (s[2]))
+  if (s[0] == 'I' && s[1] == 'S'
+      && (is_whitespace (s[2]) || is_end_of_stmt (s[2])))
     {
       *s = '=';
       s[1] = ' ';
@@ -3163,7 +3150,7 @@  mmix_handle_mmixal (void)
   else if (s[0] == 'G'
 	   && s[1] == 'R'
 	   && startswith (s, "GREG")
-	   && (ISSPACE (s[4]) || is_end_of_line[(unsigned char) s[4]]))
+	   && (is_whitespace (s[4]) || is_end_of_stmt (s[4])))
     {
       input_line_pointer = s + 4;
 
@@ -4258,7 +4245,7 @@  mmix_cons (int nbytes)
 
   SKIP_WHITESPACE ();
 
-  if (is_end_of_line[(unsigned int) *input_line_pointer])
+  if (is_end_of_stmt (*input_line_pointer))
     {
       /* Default to zero if the expression was absent.  */