libcpp, v2: Add -Wtrailing-whitespace= warning

Message ID ZuyKZkf7i/qV3e42@tucnak
State New
Headers
Series libcpp, v2: Add -Wtrailing-whitespace= warning |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed

Commit Message

Jakub Jelinek Sept. 19, 2024, 8:32 p.m. UTC
  On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> On Wed, Sep 18, 2024 at 7:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Wed, Sep 18, 2024 at 06:17:58PM +0100, Richard Sandiford wrote:
> > > +1  I'd much rather learn about this kind of error before the code reaches
> > > a review tool :)
> > >
> > > >From a quick check, it doesn't look like Clang has this, so there is no
> > > existing name to follow.
> >
> > I was considering also -Wtrailing-whitespace, but
> > 1) git diff really warns just about trailing spaces/tabs, not form feeds or
> > vertical tabs
> > 2) gcc source contains tons of spots with form feed in it (though,
> > I think pretty much always as the sole character on a line).
> > And not really sure how people use vertical tabs in the source if at all.
> > Perhaps form feed could be not warned if at end of line if it isn't the sole
> > character on a line...
> 
> Generally I like diagnosing this early.  For the above I'd say
> -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and
> tabs - for
> -Wtrailiing-whitespace) would be nice.  As for naming possibly follow the
> is{space,blank,cntrl} character classifications?  If those are a good
> fit, that is.

Here is a patch which currently allows blank (' ' '\t') and space (' ' '\t'
'\f' '\v'), cntrl not yet added, not anything non-ASCII, but in theory could
be added later (though, non-ASCII would be just for inside of comments,
say non-breaking space etc. in the source is otherwise an error).

Bootstrapped/regtested on x86_64-linux and i686-linux.

2024-09-19  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* include/cpplib.h (struct cpp_options): Add
	cpp_warn_trailing_whitespace member.
	(enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
	* internal.h (struct _cpp_line_note): Document 'W' line note.
	* lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
	except for trailing whitespace after backslash.  Formatting fix.
	(_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
	Formatting fixes.
	(lex_raw_string): Clear type on 'W' notes.
gcc/
	* doc/invoke.texi (Wtrailing-whitespace): Document.
gcc/c-family/
	* c.opt (Wtrailing-whitespace=): New option.
	(Wtrailing-whitespace): New alias.
gcc/testsuite/
	* c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
	* c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
	* c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
	* c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
	* c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.



	Jakub
  

Comments

Eric Gallager Sept. 20, 2024, 1:23 a.m. UTC | #1
On Thu, Sep 19, 2024 at 4:35 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> > On Wed, Sep 18, 2024 at 7:33 PM Jakub Jelinek <jakub@redhat.com> wrote:
> > >
> > > On Wed, Sep 18, 2024 at 06:17:58PM +0100, Richard Sandiford wrote:
> > > > +1  I'd much rather learn about this kind of error before the code reaches
> > > > a review tool :)
> > > >
> > > > >From a quick check, it doesn't look like Clang has this, so there is no
> > > > existing name to follow.
> > >
> > > I was considering also -Wtrailing-whitespace, but
> > > 1) git diff really warns just about trailing spaces/tabs, not form feeds or
> > > vertical tabs
> > > 2) gcc source contains tons of spots with form feed in it (though,
> > > I think pretty much always as the sole character on a line).
> > > And not really sure how people use vertical tabs in the source if at all.
> > > Perhaps form feed could be not warned if at end of line if it isn't the sole
> > > character on a line...
> >
> > Generally I like diagnosing this early.  For the above I'd say
> > -Wtrailing-whitespace=
> > with a set of things to diagnose (and a sane default - just spaces and
> > tabs - for
> > -Wtrailiing-whitespace) would be nice.  As for naming possibly follow the
> > is{space,blank,cntrl} character classifications?  If those are a good
> > fit, that is.
>
> Here is a patch which currently allows blank (' ' '\t') and space (' ' '\t'
> '\f' '\v'), cntrl not yet added, not anything non-ASCII, but in theory could
> be added later (though, non-ASCII would be just for inside of comments,
> say non-breaking space etc. in the source is otherwise an error).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.
>

I think this is getting too complex now; I preferred the simpler version...

> 2024-09-19  Jakub Jelinek  <jakub@redhat.com>
>
> libcpp/
>         * include/cpplib.h (struct cpp_options): Add
>         cpp_warn_trailing_whitespace member.
>         (enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
>         * internal.h (struct _cpp_line_note): Document 'W' line note.
>         * lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
>         except for trailing whitespace after backslash.  Formatting fix.
>         (_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
>         Formatting fixes.
>         (lex_raw_string): Clear type on 'W' notes.
> gcc/
>         * doc/invoke.texi (Wtrailing-whitespace): Document.
> gcc/c-family/
>         * c.opt (Wtrailing-whitespace=): New option.
>         (Wtrailing-whitespace): New alias.
> gcc/testsuite/
>         * c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
>         * c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
>         * c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
>         * c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
>         * c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.
>
> --- libcpp/include/cpplib.h.jj  2024-09-13 16:09:32.690455174 +0200
> +++ libcpp/include/cpplib.h     2024-09-19 16:59:09.674903649 +0200
> @@ -594,6 +594,9 @@ struct cpp_options
>    /* True if -finput-charset= option has been used explicitly.  */
>    bool cpp_input_charset_explicit;
>
> +  /* -Wtrailing-whitespace= value.  */
> +  unsigned char cpp_warn_trailing_whitespace;
> +
>    /* Dependency generation.  */
>    struct
>    {
> @@ -709,7 +712,8 @@ enum cpp_warning_reason {
>    CPP_W_EXPANSION_TO_DEFINED,
>    CPP_W_BIDIRECTIONAL,
>    CPP_W_INVALID_UTF8,
> -  CPP_W_UNICODE
> +  CPP_W_UNICODE,
> +  CPP_W_TRAILING_WHITESPACE
>  };
>
>  /* Callback for header lookup for HEADER, which is the name of a
> --- libcpp/internal.h.jj        2024-09-18 09:45:36.832570227 +0200
> +++ libcpp/internal.h   2024-09-19 16:54:56.610321817 +0200
> @@ -318,8 +318,8 @@ struct _cpp_line_note
>
>    /* Type of note.  The 9 'from' trigraph characters represent those
>       trigraphs, '\\' an escaped newline, ' ' an escaped newline with
> -     intervening space, 0 represents a note that has already been handled,
> -     and anything else is invalid.  */
> +     intervening space, 'W' trailing whitespace, 0 represents a note that
> +     has already been handled, and anything else is invalid.  */
>    unsigned int type;
>  };
>
> --- libcpp/lex.cc.jj    2024-09-13 16:09:32.720454758 +0200
> +++ libcpp/lex.cc       2024-09-19 16:58:37.434339128 +0200
> @@ -928,7 +928,7 @@ _cpp_clean_line (cpp_reader *pfile)
>               if (p == buffer->next_line || p[-1] != '\\')
>                 break;
>
> -             add_line_note (buffer, p - 1, p != d ? ' ': '\\');
> +             add_line_note (buffer, p - 1, p != d ? ' ' : '\\');
>               d = p - 2;
>               buffer->next_line = p - 1;
>             }
> @@ -943,6 +943,20 @@ _cpp_clean_line (cpp_reader *pfile)
>                 }
>             }
>         }
> +     done:
> +      if (d > buffer->next_line
> +         && CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
> +       switch (CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
> +         {
> +         case 1:
> +           if (ISBLANK (d[-1]))
> +             add_line_note (buffer, d - 1, 'W');
> +           break;
> +         case 2:
> +           if (IS_NVSPACE (d[-1]) && d[-1])
> +             add_line_note (buffer, d - 1, 'W');
> +           break;
> +         }
>      }
>    else
>      {
> @@ -955,7 +969,6 @@ _cpp_clean_line (cpp_reader *pfile)
>         s++;
>      }
>
> - done:
>    *d = '\n';
>    /* A sentinel note that should never be processed.  */
>    add_line_note (buffer, d + 1, '\n');
> @@ -1013,13 +1026,23 @@ _cpp_process_line_notes (cpp_reader *pfi
>
>        if (note->type == '\\' || note->type == ' ')
>         {
> -         if (note->type == ' ' && !in_comment)
> -           cpp_error_with_line (pfile, CPP_DL_WARNING, pfile->line_table->highest_line, col,
> -                                "backslash and newline separated by space");
> +         if (note->type == ' ')
> +           {
> +             if (!in_comment)
> +               cpp_error_with_line (pfile, CPP_DL_WARNING,
> +                                    pfile->line_table->highest_line, col,
> +                                    "backslash and newline separated by "
> +                                    "space");
> +             else if (CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
> +               cpp_warning_with_line (pfile, CPP_W_TRAILING_WHITESPACE,
> +                                      pfile->line_table->highest_line, col,
> +                                      "trailing whitespace");
> +           }
>
>           if (buffer->next_line > buffer->rlimit)
>             {
> -             cpp_error_with_line (pfile, CPP_DL_PEDWARN, pfile->line_table->highest_line, col,
> +             cpp_error_with_line (pfile, CPP_DL_PEDWARN,
> +                                  pfile->line_table->highest_line, col,
>                                    "backslash-newline at end of file");
>               /* Prevent "no newline at end of file" warning.  */
>               buffer->next_line = buffer->rlimit;
> @@ -1040,15 +1063,16 @@ _cpp_process_line_notes (cpp_reader *pfi
>                                        note->type,
>                                        (int) _cpp_trigraph_map[note->type]);
>               else
> -               {
> -                 cpp_warning_with_line
> -                   (pfile, CPP_W_TRIGRAPHS,
> -                     pfile->line_table->highest_line, col,
> -                    "trigraph ??%c ignored, use -trigraphs to enable",
> -                    note->type);
> -               }
> +               cpp_warning_with_line (pfile, CPP_W_TRIGRAPHS,
> +                                      pfile->line_table->highest_line, col,
> +                                      "trigraph ??%c ignored, use -trigraphs "
> +                                      "to enable", note->type);
>             }
>         }
> +      else if (note->type == 'W')
> +       cpp_warning_with_line (pfile, CPP_W_TRAILING_WHITESPACE,
> +                              pfile->line_table->highest_line, col,
> +                              "trailing whitespace");
>        else if (note->type == 0)
>         /* Already processed in lex_raw_string.  */;
>        else
> @@ -2539,6 +2563,12 @@ lex_raw_string (cpp_reader *pfile, cpp_t
>             note->type = 0;
>             note++;
>             break;
> +
> +         case 'W':
> +           /* Don't warn about trailing whitespace in raw string literals.  */
> +           note->type = 0;
> +           note++;
> +           break;
>
>           default:
>             gcc_checking_assert (_cpp_trigraph_map[note->type]);
> --- gcc/doc/invoke.texi.jj      2024-09-12 18:15:20.458626277 +0200
> +++ gcc/doc/invoke.texi 2024-09-19 17:18:14.730458180 +0200
> @@ -413,7 +413,8 @@ Objective-C and Objective-C++ Dialects}.
>  -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]}
>  -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum
>  -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand
> --Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs
> +-Wsystem-headers  -Wtautological-compare  -Wtrailing-whitespace
> +-Wtrailing-whitespace=@var{kind}  -Wtrampolines  -Wtrigraphs
>  -Wtrivial-auto-var-init  -Wno-tsan  -Wtype-limits  -Wundef
>  -Wuninitialized  -Wunknown-pragmas
>  -Wunsuffixed-float-constants
> @@ -8996,6 +8997,21 @@ will always be false.
>
>  This warning is enabled by @option{-Wall}.
>
> +@opindex Wtrailing-whitespace
> +@opindex Wno-trailing-whitespace
> +@opindex Wtrailing-whitespace=
> +@item -Wtrailing-whitespace
> +@itemx -Wtrailing-whitespace=@var{kind}
> +Warn about trailing whitespace at the end of lines, including inside of
> +comments, but excluding trailing whitespace in raw string literals.
> +@code{-Wtrailing-whitespace} is equivalent to
> +@code{-Wtrailing-whitespace=blank} and warns just about trailing space and
> +horizontal tab characters.  @code{-Wtrailing-whitespace=space} warns about
> +those or trailing form feed or vertical tab characters.
> +@code{-Wno-trailing-whitespace} or @code{-Wtrailing-whitespace=none}
> +disables the warning, which is the default.
> +This is a coding style warning.
> +
>  @opindex Wtrampolines
>  @opindex Wno-trampolines
>  @item -Wtrampolines
> --- gcc/c-family/c.opt.jj       2024-09-12 18:15:20.415626861 +0200
> +++ gcc/c-family/c.opt  2024-09-19 17:10:50.463448288 +0200
> @@ -1450,6 +1450,26 @@ Wtraditional-conversion
>  C ObjC Var(warn_traditional_conversion) Warning
>  Warn of prototypes causing type conversions different from what would happen in the absence of prototype.
>
> +Enum
> +Name(warn_trailing_whitespace_kind) Type(int) UnknownError(argument %qs to %<-Wtrailing-whitespace=%> not recognized)
> +
> +EnumValue
> +Enum(warn_trailing_whitespace_kind) String(none) Value(0)
> +
> +EnumValue
> +Enum(warn_trailing_whitespace_kind) String(blank) Value(1)
> +
> +EnumValue
> +Enum(warn_trailing_whitespace_kind) String(space) Value(2)
> +
> +Wtrailing-whitespace=
> +C ObjC C++ ObjC++ CPP(cpp_warn_trailing_whitespace) CppReason(CPP_W_TRAILING_WHITESPACE) Enum(warn_trailing_whitespace_kind) Joined RejectNegative Var(warn_trailing_whitespace) Init(0) Warning
> +Warn about trailing whitespace on lines except when in raw string literals.
> +
> +Wtrailing-whitespace
> +C ObjC C++ ObjC++ Alias(Wtrailing-whitespace=,blank,none) Warning
> +Warn about trailing whitespace on lines except when in raw string literals.   Equivalent to Wtrailing-whitespace=blank when enabled or Wtrailing-whitespace=none when disabled.
> +
>  Wtrigraphs
>  C ObjC C++ ObjC++ CPP(warn_trigraphs) CppReason(CPP_W_TRIGRAPHS) Var(cpp_warn_trigraphs) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
>  Warn if trigraphs are encountered that might affect the meaning of the program.
> --- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-1.c.jj  2024-09-18 14:44:22.712636656 +0200
> +++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-1.c     2024-09-19 17:40:36.972655199 +0200
> @@ -0,0 +1,37 @@
> +/* { dg-do compile { target { c || c++11 } } } */
> +/* { dg-options "-Wtrailing-whitespace" } */
> +
> +int i;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int j;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int \
> +  k \
> +  ;
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
> +
> +
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +const char *p = R"*|*(
> +
> +
> +.
> +)*|*";
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +// This is a comment with trailing whitespace
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
> +
> \ No newline at end of file
> --- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-2.c.jj  2024-09-19 17:31:57.169567809 +0200
> +++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-2.c     2024-09-19 17:40:43.709563785 +0200
> @@ -0,0 +1,37 @@
> +/* { dg-do compile { target { c || c++11 } } } */
> +/* { dg-options "-Wtrailing-whitespace=blank" } */
> +
> +int i;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int j;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int \
> +  k \
> +  ;
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
> +
> +
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +const char *p = R"*|*(
> +
> +
> +.
> +)*|*";
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +// This is a comment with trailing whitespace
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
> +
> \ No newline at end of file
> --- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-3.c.jj  2024-09-19 17:32:20.467260617 +0200
> +++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-3.c     2024-09-19 17:40:50.284474568 +0200
> @@ -0,0 +1,43 @@
> +/* { dg-do compile { target { c || c++11 } } } */
> +/* { dg-options "-Wtrailing-whitespace=space" } */
> +
> +int i;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int j;
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +int \
> +  k \
> +  ;
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +const char *p = R"*|*(
> +
> +
> +.
> +)*|*";
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +// This is a comment with trailing whitespace
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
> +// This is a comment with trailing whitespace
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
> +/* This is a comment with trailing whitespace
> +*/
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
> +/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
> +
> \ No newline at end of file
> --- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-4.c.jj  2024-09-19 17:33:02.531705971 +0200
> +++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-4.c     2024-09-19 17:41:21.131056009 +0200
> @@ -0,0 +1,28 @@
> +/* { dg-do compile { target { c || c++11 } } } */
> +/* { dg-options "-Wtrailing-whitespace=none" } */
> +
> +int i;
> +int j;
> +int \
> +  k \
> +  ;
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +
> +
> +
> +
> +
> +
> +const char *p = R"*|*(
> +
> +
> +.
> +)*|*";
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +
> \ No newline at end of file
> --- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-5.c.jj  2024-09-19 17:33:45.427140375 +0200
> +++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-5.c     2024-09-19 17:41:28.352958013 +0200
> @@ -0,0 +1,28 @@
> +/* { dg-do compile { target { c || c++11 } } } */
> +/* { dg-options "-Wno-trailing-whitespace" } */
> +
> +int i;
> +int j;
> +int \
> +  k \
> +  ;
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
> +
> +
> +
> +
> +
> +
> +const char *p = R"*|*(
> +
> +
> +.
> +)*|*";
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +// This is a comment with trailing whitespace
> +/* This is a comment with trailing whitespace
> +*/
> +
> \ No newline at end of file
>
>
>         Jakub
>
  
Joseph Myers Oct. 11, 2024, 5:38 p.m. UTC | #2
On Thu, 19 Sep 2024, Jakub Jelinek wrote:

> Here is a patch which currently allows blank (' ' '\t') and space (' ' '\t'
> '\f' '\v'), cntrl not yet added, not anything non-ASCII, but in theory could
> be added later (though, non-ASCII would be just for inside of comments,
> say non-breaking space etc. in the source is otherwise an error).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux.

OK with duplicates of the tests added using CR LF line endings (and 
comments in those duplicates saying they deliberately use CR LF line 
endings), I think this is a case where it's worth having such tests to 
verify that space / blank before CR LF is handled the same as before LF.

It will be good if we can move to using such options for building GCC's 
target libraries, and host code when supported by the host compiler.
  

Patch

--- libcpp/include/cpplib.h.jj	2024-09-13 16:09:32.690455174 +0200
+++ libcpp/include/cpplib.h	2024-09-19 16:59:09.674903649 +0200
@@ -594,6 +594,9 @@  struct cpp_options
   /* True if -finput-charset= option has been used explicitly.  */
   bool cpp_input_charset_explicit;
 
+  /* -Wtrailing-whitespace= value.  */
+  unsigned char cpp_warn_trailing_whitespace;
+
   /* Dependency generation.  */
   struct
   {
@@ -709,7 +712,8 @@  enum cpp_warning_reason {
   CPP_W_EXPANSION_TO_DEFINED,
   CPP_W_BIDIRECTIONAL,
   CPP_W_INVALID_UTF8,
-  CPP_W_UNICODE
+  CPP_W_UNICODE,
+  CPP_W_TRAILING_WHITESPACE
 };
 
 /* Callback for header lookup for HEADER, which is the name of a
--- libcpp/internal.h.jj	2024-09-18 09:45:36.832570227 +0200
+++ libcpp/internal.h	2024-09-19 16:54:56.610321817 +0200
@@ -318,8 +318,8 @@  struct _cpp_line_note
 
   /* Type of note.  The 9 'from' trigraph characters represent those
      trigraphs, '\\' an escaped newline, ' ' an escaped newline with
-     intervening space, 0 represents a note that has already been handled,
-     and anything else is invalid.  */
+     intervening space, 'W' trailing whitespace, 0 represents a note that
+     has already been handled, and anything else is invalid.  */
   unsigned int type;
 };
 
--- libcpp/lex.cc.jj	2024-09-13 16:09:32.720454758 +0200
+++ libcpp/lex.cc	2024-09-19 16:58:37.434339128 +0200
@@ -928,7 +928,7 @@  _cpp_clean_line (cpp_reader *pfile)
 	      if (p == buffer->next_line || p[-1] != '\\')
 		break;
 
-	      add_line_note (buffer, p - 1, p != d ? ' ': '\\');
+	      add_line_note (buffer, p - 1, p != d ? ' ' : '\\');
 	      d = p - 2;
 	      buffer->next_line = p - 1;
 	    }
@@ -943,6 +943,20 @@  _cpp_clean_line (cpp_reader *pfile)
 		}
 	    }
 	}
+     done:
+      if (d > buffer->next_line
+	  && CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
+	switch (CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
+	  {
+	  case 1:
+	    if (ISBLANK (d[-1]))
+	      add_line_note (buffer, d - 1, 'W');
+	    break;
+	  case 2:
+	    if (IS_NVSPACE (d[-1]) && d[-1])
+	      add_line_note (buffer, d - 1, 'W');
+	    break;
+	  }
     }
   else
     {
@@ -955,7 +969,6 @@  _cpp_clean_line (cpp_reader *pfile)
 	s++;
     }
 
- done:
   *d = '\n';
   /* A sentinel note that should never be processed.  */
   add_line_note (buffer, d + 1, '\n');
@@ -1013,13 +1026,23 @@  _cpp_process_line_notes (cpp_reader *pfi
 
       if (note->type == '\\' || note->type == ' ')
 	{
-	  if (note->type == ' ' && !in_comment)
-	    cpp_error_with_line (pfile, CPP_DL_WARNING, pfile->line_table->highest_line, col,
-				 "backslash and newline separated by space");
+	  if (note->type == ' ')
+	    {
+	      if (!in_comment)
+		cpp_error_with_line (pfile, CPP_DL_WARNING,
+				     pfile->line_table->highest_line, col,
+				     "backslash and newline separated by "
+				     "space");
+	      else if (CPP_OPTION (pfile, cpp_warn_trailing_whitespace))
+		cpp_warning_with_line (pfile, CPP_W_TRAILING_WHITESPACE,
+				       pfile->line_table->highest_line, col,
+				       "trailing whitespace");
+	    }
 
 	  if (buffer->next_line > buffer->rlimit)
 	    {
-	      cpp_error_with_line (pfile, CPP_DL_PEDWARN, pfile->line_table->highest_line, col,
+	      cpp_error_with_line (pfile, CPP_DL_PEDWARN,
+				   pfile->line_table->highest_line, col,
 				   "backslash-newline at end of file");
 	      /* Prevent "no newline at end of file" warning.  */
 	      buffer->next_line = buffer->rlimit;
@@ -1040,15 +1063,16 @@  _cpp_process_line_notes (cpp_reader *pfi
 				       note->type,
 				       (int) _cpp_trigraph_map[note->type]);
 	      else
-		{
-		  cpp_warning_with_line 
-		    (pfile, CPP_W_TRIGRAPHS,
-                     pfile->line_table->highest_line, col,
-		     "trigraph ??%c ignored, use -trigraphs to enable",
-		     note->type);
-		}
+		cpp_warning_with_line (pfile, CPP_W_TRIGRAPHS,
+				       pfile->line_table->highest_line, col,
+				       "trigraph ??%c ignored, use -trigraphs "
+				       "to enable", note->type);
 	    }
 	}
+      else if (note->type == 'W')
+	cpp_warning_with_line (pfile, CPP_W_TRAILING_WHITESPACE,
+			       pfile->line_table->highest_line, col,
+			       "trailing whitespace");
       else if (note->type == 0)
 	/* Already processed in lex_raw_string.  */;
       else
@@ -2539,6 +2563,12 @@  lex_raw_string (cpp_reader *pfile, cpp_t
 	    note->type = 0;
 	    note++;
 	    break;
+
+	  case 'W':
+	    /* Don't warn about trailing whitespace in raw string literals.  */
+	    note->type = 0;
+	    note++;
+	    break;
 
 	  default:
 	    gcc_checking_assert (_cpp_trigraph_map[note->type]);
--- gcc/doc/invoke.texi.jj	2024-09-12 18:15:20.458626277 +0200
+++ gcc/doc/invoke.texi	2024-09-19 17:18:14.730458180 +0200
@@ -413,7 +413,8 @@  Objective-C and Objective-C++ Dialects}.
 -Wsuggest-attribute=@r{[}pure@r{|}const@r{|}noreturn@r{|}format@r{|}malloc@r{]}
 -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum
 -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand
--Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs
+-Wsystem-headers  -Wtautological-compare  -Wtrailing-whitespace
+-Wtrailing-whitespace=@var{kind}  -Wtrampolines  -Wtrigraphs
 -Wtrivial-auto-var-init  -Wno-tsan  -Wtype-limits  -Wundef
 -Wuninitialized  -Wunknown-pragmas
 -Wunsuffixed-float-constants
@@ -8996,6 +8997,21 @@  will always be false.
 
 This warning is enabled by @option{-Wall}.
 
+@opindex Wtrailing-whitespace
+@opindex Wno-trailing-whitespace
+@opindex Wtrailing-whitespace=
+@item -Wtrailing-whitespace
+@itemx -Wtrailing-whitespace=@var{kind}
+Warn about trailing whitespace at the end of lines, including inside of
+comments, but excluding trailing whitespace in raw string literals. 
+@code{-Wtrailing-whitespace} is equivalent to
+@code{-Wtrailing-whitespace=blank} and warns just about trailing space and
+horizontal tab characters.  @code{-Wtrailing-whitespace=space} warns about
+those or trailing form feed or vertical tab characters.
+@code{-Wno-trailing-whitespace} or @code{-Wtrailing-whitespace=none}
+disables the warning, which is the default.
+This is a coding style warning.
+
 @opindex Wtrampolines
 @opindex Wno-trampolines
 @item -Wtrampolines
--- gcc/c-family/c.opt.jj	2024-09-12 18:15:20.415626861 +0200
+++ gcc/c-family/c.opt	2024-09-19 17:10:50.463448288 +0200
@@ -1450,6 +1450,26 @@  Wtraditional-conversion
 C ObjC Var(warn_traditional_conversion) Warning
 Warn of prototypes causing type conversions different from what would happen in the absence of prototype.
 
+Enum
+Name(warn_trailing_whitespace_kind) Type(int) UnknownError(argument %qs to %<-Wtrailing-whitespace=%> not recognized)
+
+EnumValue
+Enum(warn_trailing_whitespace_kind) String(none) Value(0)
+
+EnumValue
+Enum(warn_trailing_whitespace_kind) String(blank) Value(1)
+
+EnumValue
+Enum(warn_trailing_whitespace_kind) String(space) Value(2)
+
+Wtrailing-whitespace=
+C ObjC C++ ObjC++ CPP(cpp_warn_trailing_whitespace) CppReason(CPP_W_TRAILING_WHITESPACE) Enum(warn_trailing_whitespace_kind) Joined RejectNegative Var(warn_trailing_whitespace) Init(0) Warning
+Warn about trailing whitespace on lines except when in raw string literals.
+
+Wtrailing-whitespace
+C ObjC C++ ObjC++ Alias(Wtrailing-whitespace=,blank,none) Warning
+Warn about trailing whitespace on lines except when in raw string literals.   Equivalent to Wtrailing-whitespace=blank when enabled or Wtrailing-whitespace=none when disabled.
+
 Wtrigraphs
 C ObjC C++ ObjC++ CPP(warn_trigraphs) CppReason(CPP_W_TRIGRAPHS) Var(cpp_warn_trigraphs) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall)
 Warn if trigraphs are encountered that might affect the meaning of the program.
--- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-1.c.jj	2024-09-18 14:44:22.712636656 +0200
+++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-1.c	2024-09-19 17:40:36.972655199 +0200
@@ -0,0 +1,37 @@ 
+/* { dg-do compile { target { c || c++11 } } } */
+/* { dg-options "-Wtrailing-whitespace" } */
+
+int i;   
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int j;		
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int \	
+  k \	
+  ;	
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
+
+ 
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+
+ 
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+const char *p = R"*|*(		
+  
+        
+.
+)*|*";	 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+// This is a comment with trailing whitespace 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+/* This is a comment with trailing whitespace	
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace 
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
+    
\ No newline at end of file
--- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-2.c.jj	2024-09-19 17:31:57.169567809 +0200
+++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-2.c	2024-09-19 17:40:43.709563785 +0200
@@ -0,0 +1,37 @@ 
+/* { dg-do compile { target { c || c++11 } } } */
+/* { dg-options "-Wtrailing-whitespace=blank" } */
+
+int i;   
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int j;		
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int \	
+  k \	
+  ;	
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
+
+ 
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+
+ 
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+const char *p = R"*|*(		
+  
+        
+.
+)*|*";	 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+// This is a comment with trailing whitespace 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+/* This is a comment with trailing whitespace	
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace 
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
+    
\ No newline at end of file
--- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-3.c.jj	2024-09-19 17:32:20.467260617 +0200
+++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-3.c	2024-09-19 17:40:50.284474568 +0200
@@ -0,0 +1,43 @@ 
+/* { dg-do compile { target { c || c++11 } } } */
+/* { dg-options "-Wtrailing-whitespace=space" } */
+
+int i;   
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int j;		
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+int \	
+  k \	
+  ;	
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-3 } */
+
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+ 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+const char *p = R"*|*(		
+  
+        
+.
+)*|*";	 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+// This is a comment with trailing whitespace 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+/* This is a comment with trailing whitespace	
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
+// This is a comment with trailing whitespace 
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-1 } */
+/* This is a comment with trailing whitespace 
+*/
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .-2 } */
+/* { dg-warning "trailing whitespace" "" { target *-*-* } .+1 } */
+    
\ No newline at end of file
--- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-4.c.jj	2024-09-19 17:33:02.531705971 +0200
+++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-4.c	2024-09-19 17:41:21.131056009 +0200
@@ -0,0 +1,28 @@ 
+/* { dg-do compile { target { c || c++11 } } } */
+/* { dg-options "-Wtrailing-whitespace=none" } */
+
+int i;   
+int j;		
+int \	
+  k \	
+  ;	
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+
+ 
+ 
+
+ 
+ 
+const char *p = R"*|*(		
+  
+        
+.
+)*|*";	 
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace	
+*/
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace 
+*/
+    
\ No newline at end of file
--- gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-5.c.jj	2024-09-19 17:33:45.427140375 +0200
+++ gcc/testsuite/c-c++-common/cpp/Wtrailing-whitespace-5.c	2024-09-19 17:41:28.352958013 +0200
@@ -0,0 +1,28 @@ 
+/* { dg-do compile { target { c || c++11 } } } */
+/* { dg-options "-Wno-trailing-whitespace" } */
+
+int i;   
+int j;		
+int \	
+  k \	
+  ;	
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+/* { dg-warning "backslash and newline separated by space" "" { target *-*-* } .-3 } */
+
+ 
+ 
+
+ 
+ 
+const char *p = R"*|*(		
+  
+        
+.
+)*|*";	 
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace	
+*/
+// This is a comment with trailing whitespace 
+/* This is a comment with trailing whitespace 
+*/
+    
\ No newline at end of file