libcpp: Support extended characters for #pragma {push, pop}_macro [PR109704]

Message ID 20240113221251.2180315-1-lhyatt@gmail.com
State New
Series libcpp: Support extended characters for #pragma {push, pop}_macro [PR109704]

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64  success  Testing passed

Commit Message

Lewis Hyatt Jan. 13, 2024, 10:12 p.m. UTC
  Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704

The below patch fixes the issue noted in the PR that extended characters
cannot appear in the identifier passed to a #pragma push_macro or #pragma
pop_macro. Bootstrapped and regtested for all languages on x86-64 Linux. Is
it OK for GCC 14 please?

I know we just entered stage 4; however, I feel this is kind of like an old
regression, given that the issue was not apparent until support for UCNs and
UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
GCC 14, because AFAIK all other UTF-8-related bugs are fixed in this
release. (The other major one was for extended characters in a user-defined
literal, which was fixed by r14-2629.)

Speaking of just entering stage 4, I do have 4 really short patches sent
over the past several months that never got any response. Is there any
chance someone may have a few minutes to look at them please? They are
really just like 1-3 line fixes for PRs.

libcpp (pinged once recently):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html

diagnostics (pinged for 3rd time last week):
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

c-family/PCH (pinged a month ago):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

Thanks!

-Lewis

-- >8 --

The implementation of #pragma push_macro and #pragma pop_macro has to date
made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
identifier out of a string. When support was added for extended characters
in identifiers ($, UCNs, or UTF-8), that support was added only for the
"normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
not for the ad-hoc way. Consequently, extended identifiers are not usable
with these pragmas.
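
For reference, this is how the pragmas already behave for a plain ASCII
name; the PR asks for the same when the name contains $, a UCN, or UTF-8.
A minimal sketch, in which BUFFER_LIMIT and the two functions are made-up
names used only for illustration:

```cpp
// Hypothetical macro and function names, for illustration only.
#define BUFFER_LIMIT 10

// Save the current definition (10) on the push stack.
#pragma push_macro("BUFFER_LIMIT")
#undef BUFFER_LIMIT
// Temporary override, visible until the matching pop_macro.
#define BUFFER_LIMIT 20

int limit_while_overridden () { return BUFFER_LIMIT; }

// Restore the definition saved by push_macro.
#pragma pop_macro("BUFFER_LIMIT")

int limit_after_pop () { return BUFFER_LIMIT; }
```

The first function sees the temporary value and the second sees the
restored one, since each macro use is expanded where it appears.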

The logic for lexing identifiers has become more complicated than it was
when _cpp_lex_identifier() was written -- it now handles things like \N{}
escapes in C++, for instance -- and it no longer seems practical to maintain
a redundant code path for lexing identifiers. Address the issue by changing
the implementation of #pragma {push,pop}_macro to lex identifiers in the
expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
there.
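
Before the string can be lexed from a buffer it has to be destringized. The
helper in this patch, lex_identifier_from_string, skips to the opening quote
and undoes only the \\ and \" escapes, so a UCN such as \u03B1 reaches the
lexer intact. A standalone sketch of that step follows; the name destringize
and the use of std::string are assumptions for illustration, not libcpp
code:

```cpp
#include <cstddef>
#include <string>

// Toy model of the destringizing loop.  TOKEN is the spelling of a string
// literal token, including its quotes and any encoding prefix such as L.
// Returns an empty string if TOKEN does not look like a string literal.
std::string
destringize (const std::string &token)
{
  const std::size_t open = token.find ('"');
  if (open == std::string::npos || token.size () < open + 2
      || token.back () != '"')
    return {};

  std::string out;
  const std::size_t close = token.size () - 1;
  for (std::size_t i = open + 1; i < close; ++i)
    {
      // Drop the backslash only when it escapes a backslash or a quote;
      // other sequences, such as \u03B1, pass through for the lexer.
      if (token[i] == '\\' && (token[i + 1] == '\\' || token[i + 1] == '"'))
        ++i;
      out.push_back (token[i]);
    }
  return out;
}
```

In the real helper the result is then terminated with a newline, pushed with
cpp_push_buffer, and lexed with _cpp_lex_direct; the sketch stops short of
that.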

The existing implementation has some quirks because of the ad-hoc parsing
logic. For example:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X")

will not restore macro X (note the extra space in the first string). However:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X ")

actually does successfully restore "X". This is because the key for looking
up the saved macro on the push stack is the original string passed, so the
string passed to pop_macro needs to match it exactly. It is not that easy to
reproduce this logic in the world of extended characters, given that for
example it should be valid to pass a UCN to push_macro, and the
corresponding UTF-8 to pop_macro. Given that this aspect of the existing
behavior seems unintentional and has no tests (and does not match other
implementations), I opted to make the new logic more straightforward. The
string passed needs to lex to one token, which must be a valid identifier,
or else no action is taken and no error is generated. Any diagnostics
encountered during lexing (e.g., due to a UTF-8 character not permitted to
appear in an identifier) are also suppressed.
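
The suppression is done with a small RAII guard,
cpp_auto_suppress_diagnostics, which swaps the reader's diagnostic callback
for a no-op and restores it on scope exit. A self-contained sketch of the
pattern follows; ToyReader, report, and the other names are hypothetical
stand-ins, not the libcpp types:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cpp_reader: a replaceable diagnostic callback
// plus a log that the default callback appends to.
struct ToyReader
{
  using DiagnosticFn = bool (*) (ToyReader &, const char *);
  DiagnosticFn diagnostic = nullptr;
  std::vector<std::string> log;
};

// Default callback: record the message.
bool
record_diagnostic (ToyReader &reader, const char *msg)
{
  reader.log.emplace_back (msg);
  return true;
}

// Route a message through whichever callback is currently installed.
void
report (ToyReader &reader, const char *msg)
{
  if (reader.diagnostic)
    reader.diagnostic (reader, msg);
}

// RAII guard: install a no-op callback now, restore the old one on exit.
class ToySuppressDiagnostics
{
 public:
  explicit ToySuppressDiagnostics (ToyReader &reader)
    : m_reader (reader), m_saved (reader.diagnostic)
  {
    m_reader.diagnostic = [] (ToyReader &, const char *) { return true; };
  }
  ~ToySuppressDiagnostics () { m_reader.diagnostic = m_saved; }
  ToySuppressDiagnostics (const ToySuppressDiagnostics &) = delete;
  ToySuppressDiagnostics &operator= (const ToySuppressDiagnostics &) = delete;
 private:
  ToyReader &m_reader;
  const ToyReader::DiagnosticFn m_saved;
};

// Emit one message before, one inside, and one after a suppressed scope,
// and return how many were actually recorded.
std::size_t
diagnostics_logged_around_suppression ()
{
  ToyReader reader;
  reader.diagnostic = record_diagnostic;
  report (reader, "before");
  {
    ToySuppressDiagnostics suppress {reader};
    report (reader, "hidden");
  }
  report (reader, "after");
  return reader.log.size ();
}
```

Tying the restore to the destructor means the original handler comes back
on every exit path, which is what lets the patch drop the manual
save/restore pairs in charset.cc.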

It could be nice (for GCC 15) to also add a warning if a pop_macro does not
match a previous push_macro.

libcpp/ChangeLog:

	PR preprocessor/109704
	* include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
	* errors.cc
	(cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
	function.
	(cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
	function.
	* charset.cc (noop_diagnostic_cb): Remove.
	(cpp_interpret_string_ranges): Refactor diagnostic suppression logic
	into new class cpp_auto_suppress_diagnostics.
	(count_source_chars): Likewise.
	* directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
	(lex_identifier_from_string): New static helper function.
	(push_pop_macro_common): Refactor common logic from
	do_pragma_push_macro and do_pragma_pop_macro; use
	lex_identifier_from_string instead of _cpp_lex_identifier.
	(do_pragma_push_macro): Reimplement using push_pop_macro_common.
	(do_pragma_pop_macro): Likewise.
	* internal.h (_cpp_lex_identifier): Remove.
	* lex.cc (lex_identifier_intern): Remove.
	(_cpp_lex_identifier): Remove.

gcc/testsuite/ChangeLog:

	PR preprocessor/109704
	* c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
	* g++.dg/pch/pushpop-2.C: New test.
	* g++.dg/pch/pushpop-2.Hs: New test.
	* gcc.dg/pch/pushpop-2.c: New test.
	* gcc.dg/pch/pushpop-2.hs: New test.
---
 libcpp/charset.cc                             |  33 +--
 libcpp/directives.cc                          | 175 +++++++--------
 libcpp/errors.cc                              |  16 ++
 libcpp/include/cpplib.h                       |  13 ++
 libcpp/internal.h                             |   1 -
 libcpp/lex.cc                                 |  33 ---
 .../c-c++-common/cpp/pragma-push-pop-utf8.c   | 203 ++++++++++++++++++
 gcc/testsuite/g++.dg/pch/pushpop-2.C          |  18 ++
 gcc/testsuite/g++.dg/pch/pushpop-2.Hs         |   9 +
 gcc/testsuite/gcc.dg/pch/pushpop-2.c          |  18 ++
 gcc/testsuite/gcc.dg/pch/pushpop-2.hs         |   9 +
 11 files changed, 378 insertions(+), 150 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
 create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C
 create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs
 create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs
  

Comments

Lewis Hyatt Feb. 10, 2024, 2:02 p.m. UTC | #1
Hello-

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642926.html

May I please ping this one? Thanks!

On Sat, Jan 13, 2024 at 5:12 PM Lewis Hyatt <lhyatt@gmail.com> wrote:
>
> Hello-
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704
>
> The below patch fixes the issue noted in the PR that extended characters
> cannot appear in the identifier passed to a #pragma push_macro or #pragma
> pop_macro. Bootstrapped and regtested for all languages on x86-64 Linux. Is
> it OK for GCC 14 please?
>
> I know we just entered stage 4; however, I feel this is kind of like an old
> regression, given that the issue was not apparent until support for UCNs and
> UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
> GCC 14, because AFAIK all other UTF-8-related bugs are fixed in this
> release. (The other major one was for extended characters in a user-defined
> literal, which was fixed by r14-2629.)
>
> Speaking of just entering stage 4, I do have 4 really short patches sent
> over the past several months that never got any response. Is there any
> chance someone may have a few minutes to look at them please? They are
> really just like 1-3 line fixes for PRs.
>
> libcpp (pinged once recently):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html
>
> diagnostics (pinged for 3rd time last week):
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

> -- >8 --
>
> The implementation of #pragma push_macro and #pragma pop_macro has to date
> made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
> identifier out of a string. When support was added for extended characters
> in identifiers ($, UCNs, or UTF-8), that support was added only for the
> "normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
> not for the ad-hoc way. Consequently, extended identifiers are not usable
> with these pragmas.
>
> The logic for lexing identifiers has become more complicated than it was
> when _cpp_lex_identifier() was written -- it now handles things like \N{}
> escapes in C++, for instance -- and it no longer seems practical to maintain
> a redundant code path for lexing identifiers. Address the issue by changing
> the implementation of #pragma {push,pop}_macro to lex identifiers in the
> expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
> there.
>
> The existing implementation has some quirks because of the ad-hoc parsing
> logic. For example:
>
>  #pragma push_macro("X ")
>  ...
>  #pragma pop_macro("X")
>
> will not restore macro X (note the extra space in the first string). However:
>
>  #pragma push_macro("X ")
>  ...
>  #pragma pop_macro("X ")
>
> actually does successfully restore "X". This is because the key for looking
> up the saved macro on the push stack is the original string passed, so the
> string passed to pop_macro needs to match it exactly. It is not that easy to
> reproduce this logic in the world of extended characters, given that for
> example it should be valid to pass a UCN to push_macro, and the
> corresponding UTF-8 to pop_macro. Given that this aspect of the existing
> behavior seems unintentional and has no tests (and does not match other
> implementations), I opted to make the new logic more straightforward. The
> string passed needs to lex to one token, which must be a valid identifier,
> or else no action is taken and no error is generated. Any diagnostics
> encountered during lexing (e.g., due to a UTF-8 character not permitted to
> appear in an identifier) are also suppressed.
>
> It could be nice (for GCC 15) to also add a warning if a pop_macro does not
> match a previous push_macro.
>
> libcpp/ChangeLog:
>
>         PR preprocessor/109704
>         * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
>         * errors.cc
>         (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
>         function.
>         (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
>         function.
>         * charset.cc (noop_diagnostic_cb): Remove.
>         (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
>         into new class cpp_auto_suppress_diagnostics.
>         (count_source_chars): Likewise.
>         * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
>         (lex_identifier_from_string): New static helper function.
>         (push_pop_macro_common): Refactor common logic from
>         do_pragma_push_macro and do_pragma_pop_macro; use
>         lex_identifier_from_string instead of _cpp_lex_identifier.
>         (do_pragma_push_macro): Reimplement using push_pop_macro_common.
>         (do_pragma_pop_macro): Likewise.
>         * internal.h (_cpp_lex_identifier): Remove.
>         * lex.cc (lex_identifier_intern): Remove.
>         (_cpp_lex_identifier): Remove.
>
> gcc/testsuite/ChangeLog:
>
>         PR preprocessor/109704
>         * c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
>         * g++.dg/pch/pushpop-2.C: New test.
>         * g++.dg/pch/pushpop-2.Hs: New test.
>         * gcc.dg/pch/pushpop-2.c: New test.
>         * gcc.dg/pch/pushpop-2.hs: New test.
> ---
>  libcpp/charset.cc                             |  33 +--
>  libcpp/directives.cc                          | 175 +++++++--------
>  libcpp/errors.cc                              |  16 ++
>  libcpp/include/cpplib.h                       |  13 ++
>  libcpp/internal.h                             |   1 -
>  libcpp/lex.cc                                 |  33 ---
>  .../c-c++-common/cpp/pragma-push-pop-utf8.c   | 203 ++++++++++++++++++
>  gcc/testsuite/g++.dg/pch/pushpop-2.C          |  18 ++
>  gcc/testsuite/g++.dg/pch/pushpop-2.Hs         |   9 +
>  gcc/testsuite/gcc.dg/pch/pushpop-2.c          |  18 ++
>  gcc/testsuite/gcc.dg/pch/pushpop-2.hs         |   9 +
>  11 files changed, 378 insertions(+), 150 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
>  create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C
>  create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs
>  create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs
>
> diff --git a/libcpp/charset.cc b/libcpp/charset.cc
> index 54d7b9e0932..7937df7d78c 100644
> --- a/libcpp/charset.cc
> +++ b/libcpp/charset.cc
> @@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
>    return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
>  }
>
> -/* A "do nothing" diagnostic-handling callback for use by
> -   cpp_interpret_string_ranges, so that it can temporarily suppress
> -   diagnostic-handling.  */
> -
> -static bool
> -noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level,
> -                   enum cpp_warning_reason, rich_location *,
> -                   const char *, va_list *)
> -{
> -  /* no-op.  */
> -  return true;
> -}
> -
>  /* This function mimics the behavior of cpp_interpret_string, but
>     rather than generating a string in the execution character set,
>     *OUT is written to with the source code ranges of the characters
> @@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from,
>       failing, rather than being emitted as a user-visible diagnostic.
>       If an diagnostic does occur, we should see it via the return value of
>       cpp_interpret_string_1.  */
> -  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
> -                                   enum cpp_warning_reason, rich_location *,
> -                                   const char *, va_list *)
> -    ATTRIBUTE_FPTR_PRINTF(5,0);
> -
> -  saved_diagnostic_handler = pfile->cb.diagnostic;
> -  pfile->cb.diagnostic = noop_diagnostic_cb;
> -
> +  cpp_auto_suppress_diagnostics suppress {pfile};
>    bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type,
>                                         loc_readers, out);
>
> -  /* Restore the saved diagnostic-handler.  */
> -  pfile->cb.diagnostic = saved_diagnostic_handler;
> -
>    if (!result)
>      return "cpp_interpret_string_1 failed";
>
> @@ -2691,17 +2668,11 @@ static unsigned
>  count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type)
>  {
>    cpp_string str2 = { 0, 0 };
> -  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
> -                                   enum cpp_warning_reason, rich_location *,
> -                                   const char *, va_list *)
> -    ATTRIBUTE_FPTR_PRINTF(5,0);
> -  saved_diagnostic_handler = pfile->cb.diagnostic;
> -  pfile->cb.diagnostic = noop_diagnostic_cb;
> +  cpp_auto_suppress_diagnostics suppress {pfile};
>    convert_f save_func = pfile->narrow_cset_desc.func;
>    pfile->narrow_cset_desc.func = convert_count_chars;
>    bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type);
>    pfile->narrow_cset_desc.func = save_func;
> -  pfile->cb.diagnostic = saved_diagnostic_handler;
>    if (ret)
>      {
>        if (str2.text != str.text)
> diff --git a/libcpp/directives.cc b/libcpp/directives.cc
> index 479f8c716e8..019e4009dc9 100644
> --- a/libcpp/directives.cc
> +++ b/libcpp/directives.cc
> @@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
>  static void handle_assertion (cpp_reader *, const char *, int);
>  static void do_pragma_push_macro (cpp_reader *);
>  static void do_pragma_pop_macro (cpp_reader *);
> -static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *);
> +static void cpp_pop_definition (cpp_reader *, def_pragma_macro *,
> +                               cpp_hashnode *);
>
>  /* This is the table of directive handlers.  All extensions other than
>     #warning, #include_next, and #import are deprecated.  The name is
> @@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile)
>    _cpp_mark_file_once_only (pfile, pfile->buffer->file);
>  }
>
> -/* Handle #pragma push_macro(STRING).  */
> -static void
> -do_pragma_push_macro (cpp_reader *pfile)
> +/* Helper for #pragma {push,pop}_macro.  Destringize STR and
> +   lex it into an identifier, returning the hash node for it.  */
> +
> +static cpp_hashnode *
> +lex_identifier_from_string (cpp_reader *pfile, cpp_string str)
>  {
> +  auto src = (const uchar *) memchr (str.text, '"', str.len);
> +  gcc_checking_assert (src);
> +  ++src;
> +  const auto limit = str.text + str.len - 1;
> +  gcc_checking_assert (*limit == '"' && limit >= src);
> +  const auto ident = XALLOCAVEC (uchar, limit - src + 1);
> +  auto dest = ident;
> +  while (src != limit)
> +    {
> +      /* We know there is a character following the backslash.  */
> +      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> +       src++;
> +      *dest++ = *src++;
> +    }
> +
> +  /* We reserved a spot for the newline with the + 1 when allocating IDENT.
> +     Push a buffer containing the identifier to lex.  */
> +  *dest = '\n';
> +  cpp_push_buffer (pfile, ident, dest - ident, true);
> +  _cpp_clean_line (pfile);
> +  pfile->cur_token = _cpp_temp_token (pfile);
> +  cpp_token *tok;
> +  {
> +    /* Suppress diagnostics during lexing so that we silently ignore invalid
> +       input, as seems to be the common practice for this pragma.  */
> +    cpp_auto_suppress_diagnostics suppress {pfile};
> +    tok = _cpp_lex_direct (pfile);
> +  }
> +
>    cpp_hashnode *node;
> -  size_t defnlen;
> -  const uchar *defn = NULL;
> -  char *macroname, *dest;
> -  const char *limit, *src;
> -  const cpp_token *txt;
> -  struct def_pragma_macro *c;
> +  if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit)
> +    node = nullptr;
> +  else
> +    node = tok->val.node.node;
>
> -  txt = get__Pragma_string (pfile);
> -  if (!txt)
> +  _cpp_pop_buffer (pfile);
> +  return node;
> +}
> +
> +/* Common processing for #pragma {push,pop}_macro.  */
> +
> +static cpp_hashnode *
> +push_pop_macro_common (cpp_reader *pfile, const char *type)
> +{
> +  const cpp_token *const txt = get__Pragma_string (pfile);
> +  ++pfile->keep_tokens;
> +  cpp_hashnode *node;
> +  if (txt)
>      {
> -      location_t src_loc = pfile->cur_token[-1].src_loc;
> -      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> -                "invalid #pragma push_macro directive");
>        check_eol (pfile, false);
>        skip_rest_of_line (pfile);
> -      return;
> +      node = lex_identifier_from_string (pfile, txt->val.str);
>      }
> -  dest = macroname = (char *) alloca (txt->val.str.len + 2);
> -  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
> -  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
> -  while (src < limit)
> +  else
>      {
> -      /* We know there is a character following the backslash.  */
> -      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> -       src++;
> -      *dest++ = *src++;
> +      node = nullptr;
> +      location_t src_loc = pfile->cur_token[-1].src_loc;
> +      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> +                          "invalid #pragma %s_macro directive", type);
> +      skip_rest_of_line (pfile);
>      }
> -  *dest = 0;
> -  check_eol (pfile, false);
> -  skip_rest_of_line (pfile);
> -  c = XNEW (struct def_pragma_macro);
> -  memset (c, 0, sizeof (struct def_pragma_macro));
> -  c->name = XNEWVAR (char, strlen (macroname) + 1);
> -  strcpy (c->name, macroname);
> +  --pfile->keep_tokens;
> +  return node;
> +}
> +
> +/* Handle #pragma push_macro(STRING).  */
> +static void
> +do_pragma_push_macro (cpp_reader *pfile)
> +{
> +  const auto node = push_pop_macro_common (pfile, "push");
> +  if (!node)
> +    return;
> +  const auto c = XCNEW (def_pragma_macro);
> +  c->name = xstrdup ((const char *) NODE_NAME (node));
>    c->next = pfile->pushed_macros;
> -  node = _cpp_lex_identifier (pfile, c->name);
>    if (node->type == NT_VOID)
>      c->is_undef = 1;
>    else if (node->type == NT_BUILTIN_MACRO)
>      c->is_builtin = 1;
>    else
>      {
> -      defn = cpp_macro_definition (pfile, node);
> -      defnlen = ustrlen (defn);
> +      const auto defn = cpp_macro_definition (pfile, node);
> +      const size_t defnlen = ustrlen (defn);
>        c->definition = XNEWVEC (uchar, defnlen + 2);
>        c->definition[defnlen] = '\n';
>        c->definition[defnlen + 1] = 0;
> @@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile)
>  static void
>  do_pragma_pop_macro (cpp_reader *pfile)
>  {
> -  char *macroname, *dest;
> -  const char *limit, *src;
> -  const cpp_token *txt;
> -  struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros;
> -  txt = get__Pragma_string (pfile);
> -  if (!txt)
> -    {
> -      location_t src_loc = pfile->cur_token[-1].src_loc;
> -      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
> -                "invalid #pragma pop_macro directive");
> -      check_eol (pfile, false);
> -      skip_rest_of_line (pfile);
> -      return;
> -    }
> -  dest = macroname = (char *) alloca (txt->val.str.len + 2);
> -  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
> -  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
> -  while (src < limit)
> -    {
> -      /* We know there is a character following the backslash.  */
> -      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
> -       src++;
> -      *dest++ = *src++;
> -    }
> -  *dest = 0;
> -  check_eol (pfile, false);
> -  skip_rest_of_line (pfile);
> -
> -  while (c != NULL)
> +  const auto node = push_pop_macro_common (pfile, "pop");
> +  if (!node)
> +    return;
> +  for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next)
>      {
> -      if (!strcmp (c->name, macroname))
> +      if (!strcmp (c->name, (const char *) NODE_NAME (node)))
>         {
>           if (!l)
>             pfile->pushed_macros = c->next;
>           else
>             l->next = c->next;
> -         cpp_pop_definition (pfile, c);
> +         cpp_pop_definition (pfile, c, node);
>           free (c->definition);
>           free (c->name);
>           free (c);
>           break;
>         }
>        l = c;
> -      c = c->next;
>      }
>  }
>
> @@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro)
>  /* Replace a previous definition DEF of the macro STR.  If DEF is NULL,
>     or first element is zero, then the macro should be undefined.  */
>  static void
> -cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
> +cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node)
>  {
> -  cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name);
> -  if (node == NULL)
> -    return;
> -
>    if (pfile->cb.before_define)
>      pfile->cb.before_define (pfile);
>
> @@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
>      }
>
>    {
> -    size_t namelen;
> -    const uchar *dn;
> -    cpp_hashnode *h = NULL;
> -    cpp_buffer *nbuf;
> -
> -    namelen = ustrcspn (c->definition, "( \n");
> -    h = cpp_lookup (pfile, c->definition, namelen);
> -    dn = c->definition + namelen;
> -
> -    nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true);
> +    const auto namelen = ustrcspn (c->definition, "( \n");
> +    const auto dn = c->definition + namelen;
> +    const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn,
> +                                      true);
>      if (nbuf != NULL)
>        {
>         _cpp_clean_line (pfile);
>         nbuf->sysp = 1;
> -       if (!_cpp_create_definition (pfile, h, 0))
> +       if (!_cpp_create_definition (pfile, node, 0))
>           abort ();
>         _cpp_pop_buffer (pfile);
>        }
>      else
>        abort ();
> -    h->value.macro->line = c->line;
> -    h->value.macro->syshdr = c->syshdr;
> -    h->value.macro->used = c->used;
> +    node->value.macro->line = c->line;
> +    node->value.macro->syshdr = c->syshdr;
> +    node->value.macro->used = c->used;
>    }
>  }
>
> diff --git a/libcpp/errors.cc b/libcpp/errors.cc
> index 295496df7ed..3228dcbe7f6 100644
> --- a/libcpp/errors.cc
> +++ b/libcpp/errors.cc
> @@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level,
>    return cpp_error_at (pfile, level, loc, "%s: %s", filename,
>                        xstrerror (errno));
>  }
> +
> +cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile)
> +  : m_pfile (pfile), m_cb (pfile->cb.diagnostic)
> +{
> +  m_pfile->cb.diagnostic
> +    = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason,
> +         rich_location *, const char *, va_list *)
> +    {
> +      return true;
> +    };
> +}
> +
> +cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics ()
> +{
> +  m_pfile->cb.diagnostic = m_cb;
> +}
> diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> index 5746aac9ea4..50705e3377a 100644
> --- a/libcpp/include/cpplib.h
> +++ b/libcpp/include/cpplib.h
> @@ -1638,4 +1638,17 @@ enum cpp_xid_property {
>
>  unsigned int cpp_check_xid_property (cppchar_t c);
>
> +/* In errors.cc */
> +
> +/* RAII class to suppress CPP diagnostics in the current scope.  */
> +class cpp_auto_suppress_diagnostics
> +{
> + public:
> +  explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile);
> +  ~cpp_auto_suppress_diagnostics ();
> + private:
> +  cpp_reader *const m_pfile;
> +  const decltype (cpp_callbacks::diagnostic) m_cb;
> +};
> +
>  #endif /* ! LIBCPP_CPPLIB_H */
> diff --git a/libcpp/internal.h b/libcpp/internal.h
> index a20215c5709..6221ef0d1e7 100644
> --- a/libcpp/internal.h
> +++ b/libcpp/internal.h
> @@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *);
>  extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
>  extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
>  extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
> -extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *);
>  extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
>  extern void _cpp_init_lexer (void);
>  static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
> diff --git a/libcpp/lex.cc b/libcpp/lex.cc
> index 5aa379980cf..ba97377417b 100644
> --- a/libcpp/lex.cc
> +++ b/libcpp/lex.cc
> @@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node)
>                  NODE_NAME (node));
>  }
>
> -/* Helper function to get the cpp_hashnode of the identifier BASE.  */
> -static cpp_hashnode *
> -lex_identifier_intern (cpp_reader *pfile, const uchar *base)
> -{
> -  cpp_hashnode *result;
> -  const uchar *cur;
> -  unsigned int len;
> -  unsigned int hash = HT_HASHSTEP (0, *base);
> -
> -  cur = base + 1;
> -  while (ISIDNUM (*cur))
> -    {
> -      hash = HT_HASHSTEP (hash, *cur);
> -      cur++;
> -    }
> -  len = cur - base;
> -  hash = HT_HASHFINISH (hash, len);
> -  result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table,
> -                                             base, len, hash, HT_ALLOC));
> -  identifier_diagnostics_on_lex (pfile, result);
> -  return result;
> -}
> -
> -/* Get the cpp_hashnode of an identifier specified by NAME in
> -   the current cpp_reader object.  If none is found, NULL is returned.  */
> -cpp_hashnode *
> -_cpp_lex_identifier (cpp_reader *pfile, const char *name)
> -{
> -  cpp_hashnode *result;
> -  result = lex_identifier_intern (pfile, (uchar *) name);
> -  return result;
> -}
> -
>  /* Lex an identifier starting at BASE.  BUFFER->CUR is expected to point
>     one past the first character at BASE, which may be a (possibly multi-byte)
>     character if STARTS_UCN is true.  */
> diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
> new file mode 100644
> index 00000000000..c8665960e30
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
> @@ -0,0 +1,203 @@
> +/* { dg-do preprocess } */
> +/* { dg-options "-std=c11 -pedantic" { target c } } */
> +/* { dg-options "-std=c++11 -pedantic" { target c++ } } */
> +/* { dg-additional-options "-Wall" } */
> +
> +/* PR preprocessor/109704 */
> +
> +/* Verify basic operations for different extended identifiers...  */
> +
> +/* ...dollar sign.  */
> +#define $x 1
> +#pragma push_macro("$x")
> +#undef $x
> +#define $x 0
> +#pragma pop_macro("$x")
> +#if !$x
> +#error $x
> +#endif
> +#define $x 1
> +_Pragma("push_macro(\"$x\")")
> +#undef $x
> +#define $x 0
> +_Pragma("pop_macro(\"$x\")")
> +#if !$x
> +#error $x
> +#endif
> +#define x$ 1
> +#pragma push_macro("x$")
> +#undef x$
> +#define x$ 0
> +#pragma pop_macro("x$")
> +#if !x$
> +#error x$
> +#endif
> +#define x$ 1
> +_Pragma("push_macro(\"x$\")")
> +#undef x$
> +#define x$ 0
> +_Pragma("pop_macro(\"x$\")")
> +#if !x$
> +#error x$
> +#endif
> +
> +/* ...UCN.  */
> +#define \u03B1x 1
> +#pragma push_macro("\u03B1x")
> +#undef \u03B1x
> +#define \u03B1x 0
> +#pragma pop_macro("\u03B1x")
> +#if !\u03B1x
> +#error \u03B1x
> +#endif
> +#define \u03B1x 1
> +_Pragma("push_macro(\"\\u03B1x\")")
> +#undef \u03B1x
> +#define \u03B1x 0
> +_Pragma("pop_macro(\"\\u03B1x\")")
> +#if !\u03B1x
> +#error \u03B1x
> +#endif
> +#define x\u03B1 1
> +#pragma push_macro("x\u03B1")
> +#undef x\u03B1
> +#define x\u03B1 0
> +#pragma pop_macro("x\u03B1")
> +#if !x\u03B1
> +#error x\u03B1
> +#endif
> +#define x\u03B1 1
> +_Pragma("push_macro(\"x\\u03B1\")")
> +#undef x\u03B1
> +#define x\u03B1 0
> +_Pragma("pop_macro(\"x\\u03B1\")")
> +#if !x\u03B1
> +#error x\u03B1
> +#endif
> +
> +/* ...UTF-8.  */
> +#define πx 1
> +#pragma push_macro("πx")
> +#undef πx
> +#define πx 0
> +#pragma pop_macro("πx")
> +#if !πx
> +#error πx
> +#endif
> +#define πx 1
> +_Pragma("push_macro(\"πx\")")
> +#undef πx
> +#define πx 0
> +_Pragma("pop_macro(\"πx\")")
> +#if !πx
> +#error πx
> +#endif
> +#define xπ 1
> +#pragma push_macro("xπ")
> +#undef xπ
> +#define xπ 0
> +#pragma pop_macro("xπ")
> +#if !xπ
> +#error xπ
> +#endif
> +#define xπ 1
> +_Pragma("push_macro(\"xπ\")")
> +#undef xπ
> +#define xπ 0
> +_Pragma("pop_macro(\"xπ\")")
> +#if !xπ
> +#error xπ
> +#endif
> +
> +/* Verify UCN and UTF-8 can be intermixed.  */
> +#define ħ_0 1
> +#pragma push_macro("ħ_0")
> +#undef ħ_0
> +#define ħ_0 0
> +#if ħ_0
> +#error ħ_0 ħ_0 \U00000127_0
> +#endif
> +#pragma pop_macro("\U00000127_0")
> +#if !ħ_0
> +#error ħ_0 ħ_0 \U00000127_0
> +#endif
> +#define ħ_1 1
> +#pragma push_macro("\U00000127_1")
> +#undef ħ_1
> +#define ħ_1 0
> +#if ħ_1
> +#error ħ_1 \U00000127_1 ħ_1
> +#endif
> +#pragma pop_macro("ħ_1")
> +#if !ħ_1
> +#error ħ_1 \U00000127_1 ħ_1
> +#endif
> +#define ħ_2 1
> +#pragma push_macro("\U00000127_2")
> +#undef ħ_2
> +#define ħ_2 0
> +#if ħ_2
> +#error ħ_2 \U00000127_2 \U00000127_2
> +#endif
> +#pragma pop_macro("\U00000127_2")
> +#if !ħ_2
> +#error ħ_2 \U00000127_2 \U00000127_2
> +#endif
> +#define \U00000127_3 1
> +#pragma push_macro("ħ_3")
> +#undef \U00000127_3
> +#define \U00000127_3 0
> +#if \U00000127_3
> +#error \U00000127_3 ħ_3 ħ_3
> +#endif
> +#pragma pop_macro("ħ_3")
> +#if !\U00000127_3
> +#error \U00000127_3 ħ_3 ħ_3
> +#endif
> +#define \U00000127_4 1
> +#pragma push_macro("ħ_4")
> +#undef \U00000127_4
> +#define \U00000127_4 0
> +#if \U00000127_4
> +#error \U00000127_4 ħ_4 \U00000127_4
> +#endif
> +#pragma pop_macro("\U00000127_4")
> +#if !\U00000127_4
> +#error \U00000127_4 ħ_4 \U00000127_4
> +#endif
> +#define \U00000127_5 1
> +#pragma push_macro("\U00000127_5")
> +#undef \U00000127_5
> +#define \U00000127_5 0
> +#if \U00000127_5
> +#error \U00000127_5 \U00000127_5 ħ_5
> +#endif
> +#pragma pop_macro("ħ_5")
> +#if !\U00000127_5
> +#error \U00000127_5 \U00000127_5 ħ_5
> +#endif
> +
> +/* Verify invalid input produces no diagnostics.  */
> +#pragma push_macro("") /* { dg-bogus "." } */
> +#pragma push_macro("\u") /* { dg-bogus "." } */
> +#pragma push_macro("\u0000") /* { dg-bogus "." } */
> +#pragma push_macro("not a single identifier") /* { dg-bogus "." } */
> +#pragma push_macro("invalid╬character") /* { dg-bogus "." } */
> +#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */
> +#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." } */
> +
> +/* Verify end-of-line diagnostics for valid and invalid input.  */
> +#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */
> +#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */
> +
> +/* Verify expected diagnostics.  */
> +#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */
> +#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */
> +_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */
> +_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */
> diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C
> new file mode 100644
> index 00000000000..84886aea985
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C
> @@ -0,0 +1,18 @@
> +/* { dg-options -std=c++11 } */
> +#include "pushpop-2.Hs"
> +
> +#if π != 4
> +#error π != 4
> +#endif
> +#pragma pop_macro("\u03C0")
> +#if π != 3
> +#error π != 3
> +#endif
> +
> +#if \u03B1 != 6
> +#error α != 6
> +#endif
> +_Pragma("pop_macro(\"\\u03B1\")")
> +#if α != 5
> +#error α != 5
> +#endif
> diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
> new file mode 100644
> index 00000000000..797139a3196
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
> @@ -0,0 +1,9 @@
> +#define π 3
> +#pragma push_macro ("π")
> +#undef π
> +#define π 4
> +
> +#define \u03B1 5
> +#pragma push_macro ("α")
> +#undef α
> +#define α 6
> diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
> new file mode 100644
> index 00000000000..61b8430c6d2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-options -std=c11 } */
> +#include "pushpop-2.hs"
> +
> +#if π != 4
> +#error π != 4
> +#endif
> +#pragma pop_macro("\u03C0")
> +#if π != 3
> +#error π != 3
> +#endif
> +
> +#if \u03B1 != 6
> +#error α != 6
> +#endif
> +_Pragma("pop_macro(\"\\u03B1\")")
> +#if α != 5
> +#error α != 5
> +#endif
> diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
> new file mode 100644
> index 00000000000..797139a3196
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
> @@ -0,0 +1,9 @@
> +#define π 3
> +#pragma push_macro ("π")
> +#undef π
> +#define π 4
> +
> +#define \u03B1 5
> +#pragma push_macro ("α")
> +#undef α
> +#define α 6
  

Patch

diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 54d7b9e0932..7937df7d78c 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -2590,19 +2590,6 @@  cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
   return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
 }
 
-/* A "do nothing" diagnostic-handling callback for use by
-   cpp_interpret_string_ranges, so that it can temporarily suppress
-   diagnostic-handling.  */
-
-static bool
-noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level,
-		    enum cpp_warning_reason, rich_location *,
-		    const char *, va_list *)
-{
-  /* no-op.  */
-  return true;
-}
-
 /* This function mimics the behavior of cpp_interpret_string, but
    rather than generating a string in the execution character set,
    *OUT is written to with the source code ranges of the characters
@@ -2642,20 +2629,10 @@  cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from,
      failing, rather than being emitted as a user-visible diagnostic.
      If an diagnostic does occur, we should see it via the return value of
      cpp_interpret_string_1.  */
-  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
-				    enum cpp_warning_reason, rich_location *,
-				    const char *, va_list *)
-    ATTRIBUTE_FPTR_PRINTF(5,0);
-
-  saved_diagnostic_handler = pfile->cb.diagnostic;
-  pfile->cb.diagnostic = noop_diagnostic_cb;
-
+  cpp_auto_suppress_diagnostics suppress {pfile};
   bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type,
 					loc_readers, out);
 
-  /* Restore the saved diagnostic-handler.  */
-  pfile->cb.diagnostic = saved_diagnostic_handler;
-
   if (!result)
     return "cpp_interpret_string_1 failed";
 
@@ -2691,17 +2668,11 @@  static unsigned
 count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type)
 {
   cpp_string str2 = { 0, 0 };
-  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
-				    enum cpp_warning_reason, rich_location *,
-				    const char *, va_list *)
-    ATTRIBUTE_FPTR_PRINTF(5,0);
-  saved_diagnostic_handler = pfile->cb.diagnostic;
-  pfile->cb.diagnostic = noop_diagnostic_cb;
+  cpp_auto_suppress_diagnostics suppress {pfile};
   convert_f save_func = pfile->narrow_cset_desc.func;
   pfile->narrow_cset_desc.func = convert_count_chars;
   bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type);
   pfile->narrow_cset_desc.func = save_func;
-  pfile->cb.diagnostic = saved_diagnostic_handler;
   if (ret)
     {
       if (str2.text != str.text)
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 479f8c716e8..019e4009dc9 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -137,7 +137,8 @@  static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
 static void handle_assertion (cpp_reader *, const char *, int);
 static void do_pragma_push_macro (cpp_reader *);
 static void do_pragma_pop_macro (cpp_reader *);
-static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *);
+static void cpp_pop_definition (cpp_reader *, def_pragma_macro *,
+				cpp_hashnode *);
 
 /* This is the table of directive handlers.  All extensions other than
    #warning, #include_next, and #import are deprecated.  The name is
@@ -1595,55 +1596,95 @@  do_pragma_once (cpp_reader *pfile)
   _cpp_mark_file_once_only (pfile, pfile->buffer->file);
 }
 
-/* Handle #pragma push_macro(STRING).  */
-static void
-do_pragma_push_macro (cpp_reader *pfile)
+/* Helper for #pragma {push,pop}_macro.  Destringize STR and
+   lex it into an identifier, returning the hash node for it.  */
+
+static cpp_hashnode *
+lex_identifier_from_string (cpp_reader *pfile, cpp_string str)
 {
+  auto src = (const uchar *) memchr (str.text, '"', str.len);
+  gcc_checking_assert (src);
+  ++src;
+  const auto limit = str.text + str.len - 1;
+  gcc_checking_assert (*limit == '"' && limit >= src);
+  const auto ident = XALLOCAVEC (uchar, limit - src + 1);
+  auto dest = ident;
+  while (src != limit)
+    {
+      /* We know there is a character following the backslash.  */
+      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
+	src++;
+      *dest++ = *src++;
+    }
+
+  /* We reserved a spot for the newline with the + 1 when allocating IDENT.
+     Push a buffer containing the identifier to lex.  */
+  *dest = '\n';
+  cpp_push_buffer (pfile, ident, dest - ident, true);
+  _cpp_clean_line (pfile);
+  pfile->cur_token = _cpp_temp_token (pfile);
+  cpp_token *tok;
+  {
+    /* Suppress diagnostics during lexing so that we silently ignore invalid
+       input, as seems to be the common practice for this pragma.  */
+    cpp_auto_suppress_diagnostics suppress {pfile};
+    tok = _cpp_lex_direct (pfile);
+  }
+
   cpp_hashnode *node;
-  size_t defnlen;
-  const uchar *defn = NULL;
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *c;
+  if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit)
+    node = nullptr;
+  else
+    node = tok->val.node.node;
 
-  txt = get__Pragma_string (pfile);
-  if (!txt)
+  _cpp_pop_buffer (pfile);
+  return node;
+}
+
+/* Common processing for #pragma {push,pop}_macro.  */
+
+static cpp_hashnode *
+push_pop_macro_common (cpp_reader *pfile, const char *type)
+{
+  const cpp_token *const txt = get__Pragma_string (pfile);
+  ++pfile->keep_tokens;
+  cpp_hashnode *node;
+  if (txt)
     {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-		 "invalid #pragma push_macro directive");
       check_eol (pfile, false);
       skip_rest_of_line (pfile);
-      return;
+      node = lex_identifier_from_string (pfile, txt->val.str);
     }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
+  else
     {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-	src++;
-      *dest++ = *src++;
+      node = nullptr;
+      location_t src_loc = pfile->cur_token[-1].src_loc;
+      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
+			   "invalid #pragma %s_macro directive", type);
+      skip_rest_of_line (pfile);
     }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-  c = XNEW (struct def_pragma_macro);
-  memset (c, 0, sizeof (struct def_pragma_macro));
-  c->name = XNEWVAR (char, strlen (macroname) + 1);
-  strcpy (c->name, macroname);
+  --pfile->keep_tokens;
+  return node;
+}
+
+/* Handle #pragma push_macro(STRING).  */
+static void
+do_pragma_push_macro (cpp_reader *pfile)
+{
+  const auto node = push_pop_macro_common (pfile, "push");
+  if (!node)
+    return;
+  const auto c = XCNEW (def_pragma_macro);
+  c->name = xstrdup ((const char *) NODE_NAME (node));
   c->next = pfile->pushed_macros;
-  node = _cpp_lex_identifier (pfile, c->name);
   if (node->type == NT_VOID)
     c->is_undef = 1;
   else if (node->type == NT_BUILTIN_MACRO)
     c->is_builtin = 1;
   else
     {
-      defn = cpp_macro_definition (pfile, node);
-      defnlen = ustrlen (defn);
+      const auto defn = cpp_macro_definition (pfile, node);
+      const size_t defnlen = ustrlen (defn);
       c->definition = XNEWVEC (uchar, defnlen + 2);
       c->definition[defnlen] = '\n';
       c->definition[defnlen + 1] = 0;
@@ -1660,50 +1701,24 @@  do_pragma_push_macro (cpp_reader *pfile)
 static void
 do_pragma_pop_macro (cpp_reader *pfile)
 {
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros;
-  txt = get__Pragma_string (pfile);
-  if (!txt)
-    {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-		 "invalid #pragma pop_macro directive");
-      check_eol (pfile, false);
-      skip_rest_of_line (pfile);
-      return;
-    }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
-    {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-	src++;
-      *dest++ = *src++;
-    }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-
-  while (c != NULL)
+  const auto node = push_pop_macro_common (pfile, "pop");
+  if (!node)
+    return;
+  for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next)
     {
-      if (!strcmp (c->name, macroname))
+      if (!strcmp (c->name, (const char *) NODE_NAME (node)))
 	{
 	  if (!l)
 	    pfile->pushed_macros = c->next;
 	  else
 	    l->next = c->next;
-	  cpp_pop_definition (pfile, c);
+	  cpp_pop_definition (pfile, c, node);
 	  free (c->definition);
 	  free (c->name);
 	  free (c);
 	  break;
 	}
       l = c;
-      c = c->next;
     }
 }
 
@@ -2607,12 +2622,8 @@  cpp_undef (cpp_reader *pfile, const char *macro)
 /* Replace a previous definition DEF of the macro STR.  If DEF is NULL,
    or first element is zero, then the macro should be undefined.  */
 static void
-cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
+cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node)
 {
-  cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name);
-  if (node == NULL)
-    return;
-
   if (pfile->cb.before_define)
     pfile->cb.before_define (pfile);
 
@@ -2634,29 +2645,23 @@  cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
     }
 
   {
-    size_t namelen;
-    const uchar *dn;
-    cpp_hashnode *h = NULL;
-    cpp_buffer *nbuf;
-
-    namelen = ustrcspn (c->definition, "( \n");
-    h = cpp_lookup (pfile, c->definition, namelen);
-    dn = c->definition + namelen;
-
-    nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true);
+    const auto namelen = ustrcspn (c->definition, "( \n");
+    const auto dn = c->definition + namelen;
+    const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn,
+				       true);
     if (nbuf != NULL)
       {
 	_cpp_clean_line (pfile);
 	nbuf->sysp = 1;
-	if (!_cpp_create_definition (pfile, h, 0))
+	if (!_cpp_create_definition (pfile, node, 0))
 	  abort ();
 	_cpp_pop_buffer (pfile);
       }
     else
       abort ();
-    h->value.macro->line = c->line;
-    h->value.macro->syshdr = c->syshdr;
-    h->value.macro->used = c->used;
+    node->value.macro->line = c->line;
+    node->value.macro->syshdr = c->syshdr;
+    node->value.macro->used = c->used;
   }
 }
 
diff --git a/libcpp/errors.cc b/libcpp/errors.cc
index 295496df7ed..3228dcbe7f6 100644
--- a/libcpp/errors.cc
+++ b/libcpp/errors.cc
@@ -350,3 +350,19 @@  cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level,
   return cpp_error_at (pfile, level, loc, "%s: %s", filename,
 		       xstrerror (errno));
 }
+
+cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile)
+  : m_pfile (pfile), m_cb (pfile->cb.diagnostic)
+{
+  m_pfile->cb.diagnostic
+    = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason,
+	  rich_location *, const char *, va_list *)
+    {
+      return true;
+    };
+}
+
+cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics ()
+{
+  m_pfile->cb.diagnostic = m_cb;
+}
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5746aac9ea4..50705e3377a 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -1638,4 +1638,17 @@  enum cpp_xid_property {
 
 unsigned int cpp_check_xid_property (cppchar_t c);
 
+/* In errors.cc */
+
+/* RAII class to suppress CPP diagnostics in the current scope.  */
+class cpp_auto_suppress_diagnostics
+{
+ public:
+  explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile);
+  ~cpp_auto_suppress_diagnostics ();
+ private:
+  cpp_reader *const m_pfile;
+  const decltype (cpp_callbacks::diagnostic) m_cb;
+};
+
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/internal.h b/libcpp/internal.h
index a20215c5709..6221ef0d1e7 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -753,7 +753,6 @@  extern cpp_token *_cpp_lex_direct (cpp_reader *);
 extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
 extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
 extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
-extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *);
 extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
 extern void _cpp_init_lexer (void);
 static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 5aa379980cf..ba97377417b 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -2204,39 +2204,6 @@  identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node)
 		 NODE_NAME (node));
 }
 
-/* Helper function to get the cpp_hashnode of the identifier BASE.  */
-static cpp_hashnode *
-lex_identifier_intern (cpp_reader *pfile, const uchar *base)
-{
-  cpp_hashnode *result;
-  const uchar *cur;
-  unsigned int len;
-  unsigned int hash = HT_HASHSTEP (0, *base);
-
-  cur = base + 1;
-  while (ISIDNUM (*cur))
-    {
-      hash = HT_HASHSTEP (hash, *cur);
-      cur++;
-    }
-  len = cur - base;
-  hash = HT_HASHFINISH (hash, len);
-  result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table,
-					      base, len, hash, HT_ALLOC));
-  identifier_diagnostics_on_lex (pfile, result);
-  return result;
-}
-
-/* Get the cpp_hashnode of an identifier specified by NAME in
-   the current cpp_reader object.  If none is found, NULL is returned.  */
-cpp_hashnode *
-_cpp_lex_identifier (cpp_reader *pfile, const char *name)
-{
-  cpp_hashnode *result;
-  result = lex_identifier_intern (pfile, (uchar *) name);
-  return result;
-}
-
 /* Lex an identifier starting at BASE.  BUFFER->CUR is expected to point
    one past the first character at BASE, which may be a (possibly multi-byte)
    character if STARTS_UCN is true.  */
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
new file mode 100644
index 00000000000..c8665960e30
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
@@ -0,0 +1,203 @@ 
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic" { target c } } */
+/* { dg-options "-std=c++11 -pedantic" { target c++ } } */
+/* { dg-additional-options "-Wall" } */
+
+/* PR preprocessor/109704 */
+
+/* Verify basic operations for different extended identifiers...  */
+
+/* ...dollar sign.  */
+#define $x 1
+#pragma push_macro("$x")
+#undef $x
+#define $x 0
+#pragma pop_macro("$x")
+#if !$x
+#error $x
+#endif
+#define $x 1
+_Pragma("push_macro(\"$x\")")
+#undef $x
+#define $x 0
+_Pragma("pop_macro(\"$x\")")
+#if !$x
+#error $x
+#endif
+#define x$ 1
+#pragma push_macro("x$")
+#undef x$
+#define x$ 0
+#pragma pop_macro("x$")
+#if !x$
+#error x$
+#endif
+#define x$ 1
+_Pragma("push_macro(\"x$\")")
+#undef x$
+#define x$ 0
+_Pragma("pop_macro(\"x$\")")
+#if !x$
+#error x$
+#endif
+
+/* ...UCN.  */
+#define \u03B1x 1
+#pragma push_macro("\u03B1x")
+#undef \u03B1x
+#define \u03B1x 0
+#pragma pop_macro("\u03B1x")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define \u03B1x 1
+_Pragma("push_macro(\"\\u03B1x\")")
+#undef \u03B1x
+#define \u03B1x 0
+_Pragma("pop_macro(\"\\u03B1x\")")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define x\u03B1 1
+#pragma push_macro("x\u03B1")
+#undef x\u03B1
+#define x\u03B1 0
+#pragma pop_macro("x\u03B1")
+#if !x\u03B1
+#error x\u03B1
+#endif
+#define x\u03B1 1
+_Pragma("push_macro(\"x\\u03B1\")")
+#undef x\u03B1
+#define x\u03B1 0
+_Pragma("pop_macro(\"x\\u03B1\")")
+#if !x\u03B1
+#error x\u03B1
+#endif
+
+/* ...UTF-8.  */
+#define πx 1
+#pragma push_macro("πx")
+#undef πx
+#define πx 0
+#pragma pop_macro("πx")
+#if !πx
+#error πx
+#endif
+#define πx 1
+_Pragma("push_macro(\"πx\")")
+#undef πx
+#define πx 0
+_Pragma("pop_macro(\"πx\")")
+#if !πx
+#error πx
+#endif
+#define xπ 1
+#pragma push_macro("xπ")
+#undef xπ
+#define xπ 0
+#pragma pop_macro("xπ")
+#if !xπ
+#error xπ
+#endif
+#define xπ 1
+_Pragma("push_macro(\"xπ\")")
+#undef xπ
+#define xπ 0
+_Pragma("pop_macro(\"xπ\")")
+#if !xπ
+#error xπ
+#endif
+
+/* Verify UCN and UTF-8 can be intermixed.  */
+#define ħ_0 1
+#pragma push_macro("ħ_0")
+#undef ħ_0
+#define ħ_0 0
+#if ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#pragma pop_macro("\U00000127_0")
+#if !ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#define ħ_1 1
+#pragma push_macro("\U00000127_1")
+#undef ħ_1
+#define ħ_1 0
+#if ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#pragma pop_macro("ħ_1")
+#if !ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#define ħ_2 1
+#pragma push_macro("\U00000127_2")
+#undef ħ_2
+#define ħ_2 0
+#if ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#pragma pop_macro("\U00000127_2")
+#if !ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#define \U00000127_3 1
+#pragma push_macro("ħ_3")
+#undef \U00000127_3
+#define \U00000127_3 0
+#if \U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#pragma pop_macro("ħ_3")
+#if !\U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#define \U00000127_4 1
+#pragma push_macro("ħ_4")
+#undef \U00000127_4
+#define \U00000127_4 0
+#if \U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#pragma pop_macro("\U00000127_4")
+#if !\U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#define \U00000127_5 1
+#pragma push_macro("\U00000127_5")
+#undef \U00000127_5
+#define \U00000127_5 0
+#if \U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+#pragma pop_macro("ħ_5")
+#if !\U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+
+/* Verify invalid input produces no diagnostics.  */
+#pragma push_macro("") /* { dg-bogus "." } */
+#pragma push_macro("\u") /* { dg-bogus "." } */
+#pragma push_macro("\u0000") /* { dg-bogus "." } */
+#pragma push_macro("not a single identifier") /* { dg-bogus "." } */
+#pragma push_macro("invalid╬character") /* { dg-bogus "." } */
+#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */
+#pragma push_macro("#include <cstdlib>") /* { dg-bogus "." } */
+
+/* Verify end-of-line diagnostics for valid and invalid input.  */
+#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("#include <cstdlib>") oops /* { dg-warning "extra tokens" } */
+
+/* Verify expected diagnostics.  */
+#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */
+#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */
+_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */
+_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C
new file mode 100644
index 00000000000..84886aea985
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C
@@ -0,0 +1,18 @@ 
+/* { dg-options -std=c++11 } */
+#include "pushpop-2.Hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
@@ -0,0 +1,9 @@ 
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
new file mode 100644
index 00000000000..61b8430c6d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
@@ -0,0 +1,18 @@ 
+/* { dg-options -std=c11 } */
+#include "pushpop-2.hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
@@ -0,0 +1,9 @@ 
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
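
For readers unfamiliar with the pattern behind cpp_auto_suppress_diagnostics, the following is a minimal standalone sketch of the same idea: a scope guard saves a diagnostic callback, installs a no-op in its place, and restores the saved callback in its destructor. It does not use libcpp. The names `reader`, `real_handler`, `auto_suppress`, `lex` and `run_demo` are hypothetical stand-ins invented for illustration, and the callback signature is deliberately simplified compared to the real `cpp_callbacks::diagnostic`.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for cpp_reader; only the callback slot matters here.
struct reader
{
  bool (*diagnostic) (reader *, const char *msg);
};

// Number of messages that reached the real handler.
static int emitted = 0;

static bool
real_handler (reader *, const char *msg)
{
  ++emitted;
  std::fprintf (stderr, "diagnostic: %s\n", msg);
  return true;
}

// Scope guard: install a no-op handler now, restore the saved one on exit.
class auto_suppress
{
 public:
  explicit auto_suppress (reader *r)
    : m_reader (r), m_saved (r->diagnostic)
  {
    // A captureless lambda converts to a plain function pointer.
    m_reader->diagnostic = [] (reader *, const char *) { return true; };
  }
  ~auto_suppress () { m_reader->diagnostic = m_saved; }
  auto_suppress (const auto_suppress &) = delete;
  auto_suppress &operator= (const auto_suppress &) = delete;

 private:
  reader *const m_reader;
  bool (*const m_saved) (reader *, const char *);
};

// Pretend lexer: reports through whatever handler is currently installed.
static bool
lex (reader *r, bool valid)
{
  if (!valid)
    r->diagnostic (r, "invalid identifier");
  return valid;
}

// Returns how many diagnostics were actually emitted.
int
run_demo ()
{
  emitted = 0;
  reader r = { real_handler };
  {
    auto_suppress guard (&r);
    lex (&r, false);	/* Silently ignored while the guard is alive.  */
  }
  assert (r.diagnostic == real_handler);	/* Handler was restored.  */
  lex (&r, false);	/* Reported, now that the guard is gone.  */
  return emitted;
}
```

Calling `run_demo ()` from a `main` exercises both paths. Compared with the manual save/restore that the patch removes from charset.cc, the guard restores the handler on every exit path from the scope, including early returns, which is what lets lex_identifier_from_string wrap just the `_cpp_lex_direct` call in a small nested block.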