libcpp: Fix _Pragma in #__VA_ARGS__ [PR103165]
Commit Message
This patch fixes a pseudo-regression of my previous _Pragma patch.
The issue was that a tokenized '_Pragma' (-> CPP_PRAGMA) could end
up as a to-be-quoted argument ('#__VA_ARGS__'); that case was never
handled and gave an ICE with a GCC built after my previous patch and
before this patch.
The expected 'gcc -E' output is in the '#if 0' block in the testcase.
Before my previous patch, "gcc -E" yielded the following, which is
obviously wrong:
const char *str =
#pragma omp error severity(warning) message ("Test") at(compilation)
"\"1,2\" ;" ;
#pragma omp error severity(warning) message ("Test") at(compilation)
;
Built on x86-64-gnu-linux; "make -k check" is running.
OK when it passes?
* * *
Disclaimer: While this patch is a step in the right direction,
it probably does not help with any of the other _Pragma issues. Neither
with 'gcc -E' when the pragma wasn't registered (still expanded too
early) nor with the 'GCC diagnostic' issues in general as there the
input_location is used to decide when to pop - and depending on the
column numbers, this may or may not work.
See https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582927.html
for some more details and links to PRs.
Tobias
Comments
On Wed, 10 Nov 2021, Tobias Burnus wrote:
> Disclaimer: While this patch is a step in the right direction,
> it probably does not help with any of the other _Pragma issues. Neither
> with 'gcc -E' when the pragma wasn't registered (still expanded too
> early) nor with the 'GCC diagnostic' issues in general as there the
> input_location is used to decide when to pop - and depending on the
> column numbers, this may or may not work.
And fully correct stringization of _Pragma should respect the spelling of
the preprocessing tokens (of the string-literal preprocessing token, that
is; spelling variations for the other preprocessing tokens aren't possible
here) and the presence or absence of whitespace between them.
_Pragma("foo")
_Pragma ("foo")
_Pragma("foo" )
_Pragma(L"foo")
_Pragma ( "foo" )
(for example) should all have their spelling preserved by stringization
(but any nonempty white space sequence becomes a single space).
On Wed, Nov 10, 2021 at 09:30:29PM +0000, Joseph Myers wrote:
> On Wed, 10 Nov 2021, Tobias Burnus wrote:
>
> > Disclaimer: While this patch is a step in the right direction,
> > it probably does not help with any of the other _Pragma issues. Neither
> > with 'gcc -E' when the pragma wasn't registered (still expanded too
> > early) nor with the 'GCC diagnostic' issues in general as there the
> > input_location is used to decide when to pop - and depending on the
> > column numbers, this may or may not work.
>
> And fully correct stringization of _Pragma should respect the spelling of
> the preprocessing tokens (of the string-literal preprocessing token, that
> is; spelling variations for the other preprocessing tokens aren't possible
> here) and the presence or absence of whitespace between them.
>
> _Pragma("foo")
> _Pragma ("foo")
> _Pragma("foo" )
> _Pragma(L"foo")
> _Pragma ( "foo" )
>
> (for example) should all have their spelling preserved by stringization
> (but any nonempty white space sequence becomes a single space).
Yeah. And not just that: I think all the exact whitespace inside the
string literal has to be preserved too (this time without replacing
nonempty white space with a single space).
Consider in pragma-3.c e.g.
#define inner(...) #__VA_ARGS__ ; _Pragma ( " omp error severity (warning) message (\"Test\") at(compilation)" )
should yield:
const char *str = "\"1,2\" ; _Pragma ( \" omp error severity (warning) message (\\\"Test\\\") at(compilation)\" )";
I guess we could encode the PREV_WHITE flags from the ( and ) tokens as 2 separate
bits somewhere (e.g. in some bits of the pragma id), but we need to encode the whole
string literal somewhere too.
Now, in cpp_token we have:
  union cpp_token_u {
    /* Caller-supplied identifier for a CPP_PRAGMA.  */
    unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma;
  }
where several other members of the union are structs, either with a pair
of unsigned and pointer or two pointers. So, could we make
the pragma union member also a struct with the pragma id and
pointer to the _Pragma string literal cpp_token?
Though, that doesn't solve the case where in destringize_and_run
pfile->directive_result.type != CPP_PRAGMA.
Are we handling the pragma at a wrong phase of preprocessing?
Jakub
On Thu, 18 Nov 2021, Jakub Jelinek via Gcc-patches wrote:
> Are we handling the pragma at a wrong phase of preprocessing?
I think that converting it to a single preprocessing token (rather than
four separate preprocessing tokens), at a stage when stringizing might
still occur, does indicate it's being processed too soon, and it would be
better to do that only when it's known that the _Pragma preprocessing
token will actually occur in the results of preprocessing the source file.
libcpp: Fix _Pragma in #__VA_ARGS__ [PR103165]
Using '#define inner(...) #__VA_ARGS__ _Pragma("...")' yields a string plus
the _Pragma; passing this to another '#__VA_ARGS__' led to a
CPP_PRAGMA inside stringify_arg, which wasn't handled before this commit and
gave an ICE.
In GCC versions before r12-4797-g0078a058 (cf. PR102409), instead of giving
an ICE, the _Pragma wasn't stringified but output as a #pragma before the
actual macro expansion.
PR preprocessor/103165
libcpp/ChangeLog:
* directives.c (_cpp_get_pragma_by_id): New.
* internal.h (_cpp_get_pragma_by_id): New prototype.
* macro.c (stringify_arg): Use it; handle stringification of _Pragma.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pragma-3.c: New test.
* c-c++-common/gomp/pragma-4.c: New test.
gcc/testsuite/c-c++-common/gomp/pragma-3.c | 20 ++++++++++++++++++++
gcc/testsuite/c-c++-common/gomp/pragma-4.c | 20 ++++++++++++++++++++
libcpp/directives.c | 23 +++++++++++++++++++++++
libcpp/internal.h | 3 +++
libcpp/macro.c | 29 +++++++++++++++++++++++++++--
5 files changed, 93 insertions(+), 2 deletions(-)
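diff --git a/gcc/testsuite/c-c++-common/gomp/pragma-3.c b/gcc/testsuite/c-c++-common/gomp/pragma-3.c
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/pragma-3.c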
@@ -0,0 +1,20 @@
+/* { dg-additional-options "-fdump-tree-original" } */
+/* PR preprocessor/103165 */
+
+#define inner(...) #__VA_ARGS__ ; _Pragma("omp error severity(warning) message (\"Test\") at(compilation)")
+#define outer(...) inner(__VA_ARGS__)
+
+void
+f (void)
+{
+ const char *str = outer(inner(1,2)); /* { dg-warning "'pragma omp error' encountered: Test" } */
+}
+
+#if 0
+After preprocessing, the expected result is the following three lines:
+ const char *str = "\"1,2\" ; _Pragma(\"omp error severity(warning) message (\"Test\") at(compilation)\")" ;
+#pragma omp error severity(warning) message ("Test") at(compilation)
+ ;
+#endif
+
+/* { dg-final { scan-tree-dump "const char \\* str = \\(const char \\*\\) \"\\\\\"1,2\\\\\" ; _Pragma\\(\\\\\"omp error severity\\(warning\\) message \\(\\\\\"Test\\\\\"\\) at\\(compilation\\)\\\\\"\\)\";" "original" } } */
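diff --git a/gcc/testsuite/c-c++-common/gomp/pragma-4.c b/gcc/testsuite/c-c++-common/gomp/pragma-4.c
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/pragma-4.c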
@@ -0,0 +1,20 @@
+/* { dg-additional-options "-fdump-tree-original -save-temps" } */
+/* PR preprocessor/103165 */
+
+#define inner(...) #__VA_ARGS__ ; _Pragma("omp error severity(warning) message (\"Test\") at(compilation)")
+#define outer(...) inner(__VA_ARGS__)
+
+void
+f (void)
+{
+ const char *str = outer(inner(1,2)); /* { dg-warning "'pragma omp error' encountered: Test" } */
+}
+
+#if 0
+After preprocessing, the expected result is the following three lines:
+ const char *str = "\"1,2\" ; _Pragma(\"omp error severity(warning) message (\"Test\") at(compilation)\")" ;
+#pragma omp error severity(warning) message ("Test") at(compilation)
+ ;
+#endif
+
+/* { dg-final { scan-tree-dump "const char \\* str = \\(const char \\*\\) \"\\\\\"1,2\\\\\" ; _Pragma\\(\\\\\"omp error severity\\(warning\\) message \\(\\\\\"Test\\\\\"\\) at\\(compilation\\)\\\\\"\\)\";" "original" } } */
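diff --git a/libcpp/directives.c b/libcpp/directives.c
--- a/libcpp/directives.c
+++ b/libcpp/directives.c
@@ -1236,6 +1236,29 @@ do_ident (cpp_reader *pfile)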
   check_eol (pfile, false);
 }

+/* Convert a pragma id back to the space + name. Currently used by
+   stringify_arg for _Pragma. The assumption is that it is only very rarely
+   called such that O(num-pragmas + num-pragma-spaces) checks is acceptable. */
+void
+_cpp_get_pragma_by_id (cpp_reader *pfile, const cpp_token *token,
+                       cpp_hashnode const **space, cpp_hashnode const **name)
+{
+  struct pragma_entry *s, *n = NULL;
+  for (s = pfile->pragmas; s; s = s->next)
+    {
+      if (!s->is_nspace)
+        continue;
+      for (n = s->u.space; n; n = n->next)
+        if (token->val.pragma == n->u.ident)
+          break;
+      if (n)
+        break;
+    }
+  gcc_assert (s && n);
+  *space = s->pragma;
+  *name = n->pragma;
+}
+
 /* Lookup a PRAGMA name in a singly-linked CHAIN. Returns the
    matching entry, or NULL if none is found. The returned entry could
    be the start of a namespace chain, or a pragma. */
diff --git a/libcpp/internal.h b/libcpp/internal.h
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -762,6 +762,9 @@ extern void _cpp_define_builtin (cpp_reader *, const char *);
extern char ** _cpp_save_pragma_names (cpp_reader *);
extern void _cpp_restore_pragma_names (cpp_reader *, char **);
extern int _cpp_do__Pragma (cpp_reader *, location_t);
+extern void _cpp_get_pragma_by_id (cpp_reader *, const cpp_token*,
+                                   cpp_hashnode const **,
+                                   cpp_hashnode const **);
extern void _cpp_init_directives (cpp_reader *);
extern void _cpp_init_internal_pragmas (cpp_reader *);
extern void _cpp_do_file_change (cpp_reader *, enum lc_reason, const char *,
diff --git a/libcpp/macro.c b/libcpp/macro.c
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -839,6 +839,7 @@ stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
   unsigned int i, escape_it, backslash_count = 0;
   const cpp_token *source = NULL;
   size_t len;
+  cpp_hashnode const *pragma_space = NULL, *pragma_name = NULL;

   if (BUFF_ROOM (pfile->u_buff) < 3)
     _cpp_extend_buff (pfile, &pfile->u_buff, 3);
@@ -887,7 +888,15 @@ stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
       /* Room for each char being written in octal, initial space and
          final quote and NUL. */
-      len = cpp_token_len (token);
+      if (token->type == CPP_PRAGMA)
+        {
+          gcc_assert (token->flags & PRAGMA_OP);
+          _cpp_get_pragma_by_id (pfile, token, &pragma_space, &pragma_name);
+          len = (strlen ("_Pragma(\"") + pragma_space->ident.len
+                 + 1 + pragma_name->ident.len);
+        }
+      else
+        len = cpp_token_len (token);
       if (escape_it)
         len *= 4;
       len += 3;
@@ -909,7 +918,23 @@ stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
         }
       source = NULL;

-      if (escape_it)
+      if (token->type == CPP_PRAGMA)
+        {
+          memcpy (dest, "_Pragma(\\\"", strlen("_Pragma(\\\""));
+          dest += strlen("_Pragma(\\\"");
+          memcpy (dest, pragma_space->ident.str, pragma_space->ident.len);
+          dest += pragma_space->ident.len;
+          *dest = ' ';
+          dest++;
+          memcpy (dest, pragma_name->ident.str, pragma_name->ident.len);
+          dest += pragma_name->ident.len;
+        }
+      else if (token->type == CPP_PRAGMA_EOL)
+        {
+          memcpy (dest, "\\\")", 3);
+          dest += 3;
+        }
+      else if (escape_it)
         {
           _cpp_buff *buff = _cpp_get_buff (pfile, len);
           unsigned char *buf = BUFF_FRONT (buff);