From patchwork Sat Jan 13 22:12:51 2024
X-Patchwork-Submitter: Lewis Hyatt
X-Patchwork-Id: 84051
From: Lewis Hyatt
To: gcc-patches@gcc.gnu.org
Cc: Lewis Hyatt
Subject: [PATCH] libcpp: Support extended characters for #pragma {push,pop}_macro [PR109704]
Date: Sat, 13 Jan 2024 17:12:51 -0500
Message-Id: <20240113221251.2180315-1-lhyatt@gmail.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109704

The below patch fixes the issue noted in the PR that extended characters
cannot appear in the identifier passed to a #pragma push_macro or #pragma
pop_macro. Bootstrap + regtest all languages on x86-64 Linux. Is it OK for
GCC 14 please?

I know we just entered stage 4, however I feel this is kinda like an old
regression, given that the issue was not apparent until support for UCNs and
UTF-8 in identifiers got added. FWIW, it would be nice if it makes it into
GCC 14, because AFAIK all other UTF-8-related bugs are fixed in this release.
(The other major one was for extended characters in a user-defined literal,
which was fixed by r14-2629).

Speaking of just entering stage 4, I do have 4 really short patches sent over
the past several months that never got any response.
Is there any chance someone may have a few minutes to look at them please?
They are really just like 1-3 line fixes for PRs.

libcpp (pinged once recently):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641247.html
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640386.html

diagnostics (pinged for 3rd time last week):
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638692.html

c-family/PCH (pinged a month ago):
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639467.html

Thanks!

-Lewis

-- >8 --

The implementation of #pragma push_macro and #pragma pop_macro has to date
made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
identifier out of a string. When support was added for extended characters
in identifiers ($, UCNs, or UTF-8), that support was added only for the
"normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
not for the ad-hoc way. Consequently, extended identifiers are not usable
with these pragmas.

The logic for lexing identifiers has become more complicated than it was
when _cpp_lex_identifier() was written -- it now handles things like \N{}
escapes in C++, for instance -- and it no longer seems practical to maintain
a redundant code path for lexing identifiers. Address the issue by changing
the implementation of #pragma {push,pop}_macro to lex identifiers in the
expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
there.

The existing implementation has some quirks because of the ad-hoc parsing
logic. For example:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X")

will not restore macro X (note the extra space in the first string). However:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X ")

actually does successfully restore "X". This is because the key for looking
up the saved macro on the push stack is the original string passed, so the
string passed to pop_macro needs to match it exactly. It is not that easy to
reproduce this logic in the world of extended characters, given that for
example it should be valid to pass a UCN to push_macro, and the
corresponding UTF-8 to pop_macro. Given that this aspect of the existing
behavior seems unintentional and has no tests (and does not match other
implementations), I opted to make the new logic more straightforward. The
string passed needs to lex to one token, which must be a valid identifier,
or else no action is taken and no error is generated. Any diagnostics
encountered during lexing (e.g., due to a UTF-8 character not permitted to
appear in an identifier) are also suppressed.

It could be nice (for GCC 15) to also add a warning if a pop_macro does not
match a previous push_macro.

libcpp/ChangeLog:

        PR preprocessor/109704
        * include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
        * errors.cc
        (cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
        function.
        (cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
        function.
        * charset.cc (noop_diagnostic_cb): Remove.
        (cpp_interpret_string_ranges): Refactor diagnostic suppression logic
        into new class cpp_auto_suppress_diagnostics.
        (count_source_chars): Likewise.
        * directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
        (lex_identifier_from_string): New static helper function.
        (push_pop_macro_common): Refactor common logic from
        do_pragma_push_macro and do_pragma_pop_macro; use
        lex_identifier_from_string instead of _cpp_lex_identifier.
        (do_pragma_push_macro): Reimplement using push_pop_macro_common.
        (do_pragma_pop_macro): Likewise.
        * internal.h (_cpp_lex_identifier): Remove.
        * lex.cc (lex_identifier_intern): Remove.
        (_cpp_lex_identifier): Remove.

gcc/testsuite/ChangeLog:

        PR preprocessor/109704
        * c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
        * g++.dg/pch/pushpop-2.C: New test.
        * g++.dg/pch/pushpop-2.Hs: New test.
        * gcc.dg/pch/pushpop-2.c: New test.
        * gcc.dg/pch/pushpop-2.hs: New test.
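To make the UCN versus UTF-8 point above concrete, here is a minimal standalone sketch of why the raw pragma string is a poor lookup key. It is not code from this patch or from libcpp: append_utf8, canonical_key and same_identifier are hypothetical names, and the sketch only rewrites \uXXXX and \UXXXXXXXX escapes as UTF-8 without checking that the result is a valid identifier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a libcpp function): append code point CP to OUT
// encoded as UTF-8.
static void append_utf8 (std::string &out, std::uint32_t cp)
{
  assert (cp <= 0x10FFFF);
  if (cp < 0x80)
    out += static_cast<char> (cp);
  else if (cp < 0x800)
    {
      out += static_cast<char> (0xC0 | (cp >> 6));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
  else if (cp < 0x10000)
    {
      out += static_cast<char> (0xE0 | (cp >> 12));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
  else
    {
      out += static_cast<char> (0xF0 | (cp >> 18));
      out += static_cast<char> (0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char> (0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char> (0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: rewrite every \uXXXX or \UXXXXXXXX in SPELLING as
// UTF-8 and store the result in KEY.  Returns false on a malformed escape.
// No identifier validity checks are done here; the real lexer does those.
static bool canonical_key (const std::string &spelling, std::string &key)
{
  key.clear ();
  for (std::size_t i = 0; i < spelling.size (); )
    {
      if (spelling[i] != '\\')
        {
          key += spelling[i++];
          continue;
        }
      const char kind = i + 1 < spelling.size () ? spelling[i + 1] : '\0';
      const std::size_t ndigits = kind == 'u' ? 4 : kind == 'U' ? 8 : 0;
      if (ndigits == 0 || i + 2 + ndigits > spelling.size ())
        return false;
      std::uint32_t cp = 0;
      for (std::size_t j = 0; j < ndigits; ++j)
        {
          const char c = spelling[i + 2 + j];
          std::uint32_t digit;
          if (c >= '0' && c <= '9')
            digit = c - '0';
          else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
          else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
          else
            return false;
          cp = cp * 16 + digit;
        }
      if (cp > 0x10FFFF)
        return false;
      append_utf8 (key, cp);
      i += 2 + ndigits;
    }
  return true;
}

// True if both spellings canonicalize to the same byte sequence.
static bool same_identifier (const std::string &a, const std::string &b)
{
  std::string ka, kb;
  return canonical_key (a, ka) && canonical_key (b, kb) && ka == kb;
}
```

With a helper like this, "\U00000127_0" and the UTF-8 spelling of the same name compare equal, while "X " and "X" still differ as raw strings. The patch sidesteps the need for any such helper by running the string through the regular lexer and keying on the resulting hash node name.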
---
 libcpp/charset.cc                             |  33 +--
 libcpp/directives.cc                          | 175 +++++++--------
 libcpp/errors.cc                              |  16 ++
 libcpp/include/cpplib.h                       |  13 ++
 libcpp/internal.h                             |   1 -
 libcpp/lex.cc                                 |  33 ---
 .../c-c++-common/cpp/pragma-push-pop-utf8.c   | 203 ++++++++++++++++++
 gcc/testsuite/g++.dg/pch/pushpop-2.C          |  18 ++
 gcc/testsuite/g++.dg/pch/pushpop-2.Hs         |   9 +
 gcc/testsuite/gcc.dg/pch/pushpop-2.c          |  18 ++
 gcc/testsuite/gcc.dg/pch/pushpop-2.hs         |   9 +
 11 files changed, 378 insertions(+), 150 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
 create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.C
 create mode 100644 gcc/testsuite/g++.dg/pch/pushpop-2.Hs
 create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.c
 create mode 100644 gcc/testsuite/gcc.dg/pch/pushpop-2.hs

diff --git a/libcpp/charset.cc b/libcpp/charset.cc
index 54d7b9e0932..7937df7d78c 100644
--- a/libcpp/charset.cc
+++ b/libcpp/charset.cc
@@ -2590,19 +2590,6 @@ cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
   return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
 }
 
-/* A "do nothing" diagnostic-handling callback for use by
-   cpp_interpret_string_ranges, so that it can temporarily suppress
-   diagnostic-handling.  */
-
-static bool
-noop_diagnostic_cb (cpp_reader *, enum cpp_diagnostic_level,
-                    enum cpp_warning_reason, rich_location *,
-                    const char *, va_list *)
-{
-  /* no-op.  */
-  return true;
-}
-
 /* This function mimics the behavior of cpp_interpret_string, but
    rather than generating a string in the execution character set,
    *OUT is written to with the source code ranges of the characters
@@ -2642,20 +2629,10 @@ cpp_interpret_string_ranges (cpp_reader *pfile, const cpp_string *from,
      failing, rather than being emitted as a user-visible diagnostic.
      If an diagnostic does occur, we should see it via the return value of
      cpp_interpret_string_1.  */
-  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
-                                    enum cpp_warning_reason, rich_location *,
-                                    const char *, va_list *)
-    ATTRIBUTE_FPTR_PRINTF(5,0);
-
-  saved_diagnostic_handler = pfile->cb.diagnostic;
-  pfile->cb.diagnostic = noop_diagnostic_cb;
-
+  cpp_auto_suppress_diagnostics suppress {pfile};
   bool result = cpp_interpret_string_1 (pfile, from, count, NULL, type,
                                         loc_readers, out);
 
-  /* Restore the saved diagnostic-handler.  */
-  pfile->cb.diagnostic = saved_diagnostic_handler;
-
   if (!result)
     return "cpp_interpret_string_1 failed";
 
@@ -2691,17 +2668,11 @@ static unsigned
 count_source_chars (cpp_reader *pfile, cpp_string str, cpp_ttype type)
 {
   cpp_string str2 = { 0, 0 };
-  bool (*saved_diagnostic_handler) (cpp_reader *, enum cpp_diagnostic_level,
-                                    enum cpp_warning_reason, rich_location *,
-                                    const char *, va_list *)
-    ATTRIBUTE_FPTR_PRINTF(5,0);
-  saved_diagnostic_handler = pfile->cb.diagnostic;
-  pfile->cb.diagnostic = noop_diagnostic_cb;
+  cpp_auto_suppress_diagnostics suppress {pfile};
   convert_f save_func = pfile->narrow_cset_desc.func;
   pfile->narrow_cset_desc.func = convert_count_chars;
   bool ret = cpp_interpret_string (pfile, &str, 1, &str2, type);
   pfile->narrow_cset_desc.func = save_func;
-  pfile->cb.diagnostic = saved_diagnostic_handler;
   if (ret)
     {
       if (str2.text != str.text)
diff --git a/libcpp/directives.cc b/libcpp/directives.cc
index 479f8c716e8..019e4009dc9 100644
--- a/libcpp/directives.cc
+++ b/libcpp/directives.cc
@@ -137,7 +137,8 @@ static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *);
 static void handle_assertion (cpp_reader *, const char *, int);
 static void do_pragma_push_macro (cpp_reader *);
 static void do_pragma_pop_macro (cpp_reader *);
-static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *);
+static void cpp_pop_definition (cpp_reader *, def_pragma_macro *,
+                                cpp_hashnode *);
 
 /* This is the table of directive handlers.  All extensions other than
    #warning, #include_next, and #import are deprecated.  The name is
@@ -1595,55 +1596,95 @@ do_pragma_once (cpp_reader *pfile)
   _cpp_mark_file_once_only (pfile, pfile->buffer->file);
 }
 
-/* Handle #pragma push_macro(STRING).  */
-static void
-do_pragma_push_macro (cpp_reader *pfile)
+/* Helper for #pragma {push,pop}_macro.  Destringize STR and
+   lex it into an identifier, returning the hash node for it.  */
+
+static cpp_hashnode *
+lex_identifier_from_string (cpp_reader *pfile, cpp_string str)
 {
+  auto src = (const uchar *) memchr (str.text, '"', str.len);
+  gcc_checking_assert (src);
+  ++src;
+  const auto limit = str.text + str.len - 1;
+  gcc_checking_assert (*limit == '"' && limit >= src);
+  const auto ident = XALLOCAVEC (uchar, limit - src + 1);
+  auto dest = ident;
+  while (src != limit)
+    {
+      /* We know there is a character following the backslash.  */
+      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
+        src++;
+      *dest++ = *src++;
+    }
+
+  /* We reserved a spot for the newline with the + 1 when allocating IDENT.
+     Push a buffer containing the identifier to lex.  */
+  *dest = '\n';
+  cpp_push_buffer (pfile, ident, dest - ident, true);
+  _cpp_clean_line (pfile);
+  pfile->cur_token = _cpp_temp_token (pfile);
+  cpp_token *tok;
+  {
+    /* Suppress diagnostics during lexing so that we silently ignore invalid
+       input, as seems to be the common practice for this pragma.  */
+    cpp_auto_suppress_diagnostics suppress {pfile};
+    tok = _cpp_lex_direct (pfile);
+  }
+
   cpp_hashnode *node;
-  size_t defnlen;
-  const uchar *defn = NULL;
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *c;
+  if (tok->type != CPP_NAME || pfile->buffer->cur != pfile->buffer->rlimit)
+    node = nullptr;
+  else
+    node = tok->val.node.node;
 
-  txt = get__Pragma_string (pfile);
-  if (!txt)
+  _cpp_pop_buffer (pfile);
+  return node;
+}
+
+/* Common processing for #pragma {push,pop}_macro.  */
+
+static cpp_hashnode *
+push_pop_macro_common (cpp_reader *pfile, const char *type)
+{
+  const cpp_token *const txt = get__Pragma_string (pfile);
+  ++pfile->keep_tokens;
+  cpp_hashnode *node;
+  if (txt)
     {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-                 "invalid #pragma push_macro directive");
       check_eol (pfile, false);
       skip_rest_of_line (pfile);
-      return;
+      node = lex_identifier_from_string (pfile, txt->val.str);
     }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
+  else
     {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-        src++;
-      *dest++ = *src++;
+      node = nullptr;
+      location_t src_loc = pfile->cur_token[-1].src_loc;
+      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
+                           "invalid #pragma %s_macro directive", type);
+      skip_rest_of_line (pfile);
     }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-  c = XNEW (struct def_pragma_macro);
-  memset (c, 0, sizeof (struct def_pragma_macro));
-  c->name = XNEWVAR (char, strlen (macroname) + 1);
-  strcpy (c->name, macroname);
+  --pfile->keep_tokens;
+  return node;
+}
+
+/* Handle #pragma push_macro(STRING).  */
+static void
+do_pragma_push_macro (cpp_reader *pfile)
+{
+  const auto node = push_pop_macro_common (pfile, "push");
+  if (!node)
+    return;
+  const auto c = XCNEW (def_pragma_macro);
+  c->name = xstrdup ((const char *) NODE_NAME (node));
   c->next = pfile->pushed_macros;
-  node = _cpp_lex_identifier (pfile, c->name);
   if (node->type == NT_VOID)
     c->is_undef = 1;
   else if (node->type == NT_BUILTIN_MACRO)
     c->is_builtin = 1;
   else
     {
-      defn = cpp_macro_definition (pfile, node);
-      defnlen = ustrlen (defn);
+      const auto defn = cpp_macro_definition (pfile, node);
+      const size_t defnlen = ustrlen (defn);
       c->definition = XNEWVEC (uchar, defnlen + 2);
       c->definition[defnlen] = '\n';
       c->definition[defnlen + 1] = 0;
@@ -1660,50 +1701,24 @@ do_pragma_push_macro (cpp_reader *pfile)
 static void
 do_pragma_pop_macro (cpp_reader *pfile)
 {
-  char *macroname, *dest;
-  const char *limit, *src;
-  const cpp_token *txt;
-  struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros;
-  txt = get__Pragma_string (pfile);
-  if (!txt)
-    {
-      location_t src_loc = pfile->cur_token[-1].src_loc;
-      cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0,
-                 "invalid #pragma pop_macro directive");
-      check_eol (pfile, false);
-      skip_rest_of_line (pfile);
-      return;
-    }
-  dest = macroname = (char *) alloca (txt->val.str.len + 2);
-  src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L'));
-  limit = (const char *) (txt->val.str.text + txt->val.str.len - 1);
-  while (src < limit)
-    {
-      /* We know there is a character following the backslash.  */
-      if (*src == '\\' && (src[1] == '\\' || src[1] == '"'))
-        src++;
-      *dest++ = *src++;
-    }
-  *dest = 0;
-  check_eol (pfile, false);
-  skip_rest_of_line (pfile);
-
-  while (c != NULL)
+  const auto node = push_pop_macro_common (pfile, "pop");
+  if (!node)
+    return;
+  for (def_pragma_macro *c = pfile->pushed_macros, *l = nullptr; c; c = c->next)
     {
-      if (!strcmp (c->name, macroname))
+      if (!strcmp (c->name, (const char *) NODE_NAME (node)))
         {
           if (!l)
             pfile->pushed_macros = c->next;
           else
             l->next = c->next;
-          cpp_pop_definition (pfile, c);
+          cpp_pop_definition (pfile, c, node);
           free (c->definition);
           free (c->name);
           free (c);
           break;
         }
       l = c;
-      c = c->next;
     }
 }
 
@@ -2607,12 +2622,8 @@ cpp_undef (cpp_reader *pfile, const char *macro)
 /* Replace a previous definition DEF of the macro STR.  If DEF is NULL,
    or first element is zero, then the macro should be undefined.  */
 static void
-cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
+cpp_pop_definition (cpp_reader *pfile, def_pragma_macro *c, cpp_hashnode *node)
 {
-  cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name);
-  if (node == NULL)
-    return;
-
   if (pfile->cb.before_define)
     pfile->cb.before_define (pfile);
 
@@ -2634,29 +2645,23 @@ cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c)
     }
 
   {
-    size_t namelen;
-    const uchar *dn;
-    cpp_hashnode *h = NULL;
-    cpp_buffer *nbuf;
-
-    namelen = ustrcspn (c->definition, "( \n");
-    h = cpp_lookup (pfile, c->definition, namelen);
-    dn = c->definition + namelen;
-
-    nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true);
+    const auto namelen = ustrcspn (c->definition, "( \n");
+    const auto dn = c->definition + namelen;
+    const auto nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn,
+                                       true);
     if (nbuf != NULL)
       {
         _cpp_clean_line (pfile);
         nbuf->sysp = 1;
-        if (!_cpp_create_definition (pfile, h, 0))
+        if (!_cpp_create_definition (pfile, node, 0))
           abort ();
         _cpp_pop_buffer (pfile);
       }
     else
       abort ();
-    h->value.macro->line = c->line;
-    h->value.macro->syshdr = c->syshdr;
-    h->value.macro->used = c->used;
+    node->value.macro->line = c->line;
+    node->value.macro->syshdr = c->syshdr;
+    node->value.macro->used = c->used;
   }
 }
 
diff --git a/libcpp/errors.cc b/libcpp/errors.cc
index 295496df7ed..3228dcbe7f6 100644
--- a/libcpp/errors.cc
+++ b/libcpp/errors.cc
@@ -350,3 +350,19 @@ cpp_errno_filename (cpp_reader *pfile, enum cpp_diagnostic_level level,
   return cpp_error_at (pfile, level, loc, "%s: %s", filename,
                        xstrerror (errno));
 }
+
+cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics (cpp_reader *pfile)
+  : m_pfile (pfile), m_cb (pfile->cb.diagnostic)
+{
+  m_pfile->cb.diagnostic
+    = [] (cpp_reader *, cpp_diagnostic_level, cpp_warning_reason,
+          rich_location *, const char *, va_list *)
+    {
+      return true;
+    };
+}
+
+cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics ()
+{
+  m_pfile->cb.diagnostic = m_cb;
+}
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 5746aac9ea4..50705e3377a 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -1638,4 +1638,17 @@ enum cpp_xid_property {
 
 unsigned int cpp_check_xid_property (cppchar_t c);
 
+/* In errors.cc */
+
+/* RAII class to suppress CPP diagnostics in the current scope.  */
+class cpp_auto_suppress_diagnostics
+{
+ public:
+  explicit cpp_auto_suppress_diagnostics (cpp_reader *pfile);
+  ~cpp_auto_suppress_diagnostics ();
+ private:
+  cpp_reader *const m_pfile;
+  const decltype (cpp_callbacks::diagnostic) m_cb;
+};
+
 #endif /* ! LIBCPP_CPPLIB_H */
diff --git a/libcpp/internal.h b/libcpp/internal.h
index a20215c5709..6221ef0d1e7 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -753,7 +753,6 @@ extern cpp_token *_cpp_lex_direct (cpp_reader *);
 extern unsigned char *_cpp_spell_ident_ucns (unsigned char *, cpp_hashnode *);
 extern int _cpp_equiv_tokens (const cpp_token *, const cpp_token *);
 extern void _cpp_init_tokenrun (tokenrun *, unsigned int);
-extern cpp_hashnode *_cpp_lex_identifier (cpp_reader *, const char *);
 extern int _cpp_remaining_tokens_num_in_context (cpp_context *);
 extern void _cpp_init_lexer (void);
 static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have,
diff --git a/libcpp/lex.cc b/libcpp/lex.cc
index 5aa379980cf..ba97377417b 100644
--- a/libcpp/lex.cc
+++ b/libcpp/lex.cc
@@ -2204,39 +2204,6 @@ identifier_diagnostics_on_lex (cpp_reader *pfile, cpp_hashnode *node)
                NODE_NAME (node));
 }
 
-/* Helper function to get the cpp_hashnode of the identifier BASE.  */
-static cpp_hashnode *
-lex_identifier_intern (cpp_reader *pfile, const uchar *base)
-{
-  cpp_hashnode *result;
-  const uchar *cur;
-  unsigned int len;
-  unsigned int hash = HT_HASHSTEP (0, *base);
-
-  cur = base + 1;
-  while (ISIDNUM (*cur))
-    {
-      hash = HT_HASHSTEP (hash, *cur);
-      cur++;
-    }
-  len = cur - base;
-  hash = HT_HASHFINISH (hash, len);
-  result = CPP_HASHNODE (ht_lookup_with_hash (pfile->hash_table,
-                                              base, len, hash, HT_ALLOC));
-  identifier_diagnostics_on_lex (pfile, result);
-  return result;
-}
-
-/* Get the cpp_hashnode of an identifier specified by NAME in
-   the current cpp_reader object.  If none is found, NULL is returned.  */
-cpp_hashnode *
-_cpp_lex_identifier (cpp_reader *pfile, const char *name)
-{
-  cpp_hashnode *result;
-  result = lex_identifier_intern (pfile, (uchar *) name);
-  return result;
-}
-
 /* Lex an identifier starting at BASE.  BUFFER->CUR is expected to point
    one past the first character at BASE, which may be a (possibly multi-byte)
    character if STARTS_UCN is true.  */
diff --git a/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
new file mode 100644
index 00000000000..c8665960e30
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/pragma-push-pop-utf8.c
@@ -0,0 +1,203 @@
+/* { dg-do preprocess } */
+/* { dg-options "-std=c11 -pedantic" { target c } } */
+/* { dg-options "-std=c++11 -pedantic" { target c++ } } */
+/* { dg-additional-options "-Wall" } */
+
+/* PR preprocessor/109704 */
+
+/* Verify basic operations for different extended identifiers... */
+
+/* ...dollar sign.  */
+#define $x 1
+#pragma push_macro("$x")
+#undef $x
+#define $x 0
+#pragma pop_macro("$x")
+#if !$x
+#error $x
+#endif
+#define $x 1
+_Pragma("push_macro(\"$x\")")
+#undef $x
+#define $x 0
+_Pragma("pop_macro(\"$x\")")
+#if !$x
+#error $x
+#endif
+#define x$ 1
+#pragma push_macro("x$")
+#undef x$
+#define x$ 0
+#pragma pop_macro("x$")
+#if !x$
+#error x$
+#endif
+#define x$ 1
+_Pragma("push_macro(\"x$\")")
+#undef x$
+#define x$ 0
+_Pragma("pop_macro(\"x$\")")
+#if !x$
+#error x$
+#endif
+
+/* ...UCN.  */
+#define \u03B1x 1
+#pragma push_macro("\u03B1x")
+#undef \u03B1x
+#define \u03B1x 0
+#pragma pop_macro("\u03B1x")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define \u03B1x 1
+_Pragma("push_macro(\"\\u03B1x\")")
+#undef \u03B1x
+#define \u03B1x 0
+_Pragma("pop_macro(\"\\u03B1x\")")
+#if !\u03B1x
+#error \u03B1x
+#endif
+#define x\u03B1 1
+#pragma push_macro("x\u03B1")
+#undef x\u03B1
+#define x\u03B1 0
+#pragma pop_macro("x\u03B1")
+#if !x\u03B1
+#error x\u03B1
+#endif
+#define x\u03B1 1
+_Pragma("push_macro(\"x\\u03B1\")")
+#undef x\u03B1
+#define x\u03B1 0
+_Pragma("pop_macro(\"x\\u03B1\")")
+#if !x\u03B1
+#error x\u03B1
+#endif
+
+/* ...UTF-8.  */
+#define πx 1
+#pragma push_macro("πx")
+#undef πx
+#define πx 0
+#pragma pop_macro("πx")
+#if !πx
+#error πx
+#endif
+#define πx 1
+_Pragma("push_macro(\"πx\")")
+#undef πx
+#define πx 0
+_Pragma("pop_macro(\"πx\")")
+#if !πx
+#error πx
+#endif
+#define xπ 1
+#pragma push_macro("xπ")
+#undef xπ
+#define xπ 0
+#pragma pop_macro("xπ")
+#if !xπ
+#error xπ
+#endif
+#define xπ 1
+_Pragma("push_macro(\"xπ\")")
+#undef xπ
+#define xπ 0
+_Pragma("pop_macro(\"xπ\")")
+#if !xπ
+#error xπ
+#endif
+
+/* Verify UCN and UTF-8 can be intermixed.  */
+#define ħ_0 1
+#pragma push_macro("ħ_0")
+#undef ħ_0
+#define ħ_0 0
+#if ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#pragma pop_macro("\U00000127_0")
+#if !ħ_0
+#error ħ_0 ħ_0 \U00000127_0
+#endif
+#define ħ_1 1
+#pragma push_macro("\U00000127_1")
+#undef ħ_1
+#define ħ_1 0
+#if ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#pragma pop_macro("ħ_1")
+#if !ħ_1
+#error ħ_1 \U00000127_1 ħ_1
+#endif
+#define ħ_2 1
+#pragma push_macro("\U00000127_2")
+#undef ħ_2
+#define ħ_2 0
+#if ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#pragma pop_macro("\U00000127_2")
+#if !ħ_2
+#error ħ_2 \U00000127_2 \U00000127_2
+#endif
+#define \U00000127_3 1
+#pragma push_macro("ħ_3")
+#undef \U00000127_3
+#define \U00000127_3 0
+#if \U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#pragma pop_macro("ħ_3")
+#if !\U00000127_3
+#error \U00000127_3 ħ_3 ħ_3
+#endif
+#define \U00000127_4 1
+#pragma push_macro("ħ_4")
+#undef \U00000127_4
+#define \U00000127_4 0
+#if \U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#pragma pop_macro("\U00000127_4")
+#if !\U00000127_4
+#error \U00000127_4 ħ_4 \U00000127_4
+#endif
+#define \U00000127_5 1
+#pragma push_macro("\U00000127_5")
+#undef \U00000127_5
+#define \U00000127_5 0
+#if \U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+#pragma pop_macro("ħ_5")
+#if !\U00000127_5
+#error \U00000127_5 \U00000127_5 ħ_5
+#endif
+
+/* Verify invalid input produces no diagnostics.  */
+#pragma push_macro("") /* { dg-bogus "." } */
+#pragma push_macro("\u") /* { dg-bogus "." } */
+#pragma push_macro("\u0000") /* { dg-bogus "." } */
+#pragma push_macro("not a single identifier") /* { dg-bogus "." } */
+#pragma push_macro("invalid╬character") /* { dg-bogus "." } */
+#pragma push_macro("\u0300invalid_start") /* { dg-bogus "." } */
+#pragma push_macro("#include ") /* { dg-bogus "." } */
+
+/* Verify end-of-line diagnostics for valid and invalid input.  */
+#pragma push_macro("ö") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0000") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("not a single identifier") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("invalid╬character") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("\u0300invalid_start") oops /* { dg-warning "extra tokens" } */
+#pragma push_macro("#include ") oops /* { dg-warning "extra tokens" } */
+
+/* Verify expected diagnostics.  */
+#pragma push_macro() /* { dg-error {invalid #pragma push_macro} } */
+#pragma pop_macro() /* { dg-error {invalid #pragma pop_macro} } */
+_Pragma("push_macro(0)") /* { dg-error {invalid #pragma push_macro} } */
+_Pragma("pop_macro(\"oops\"") /* { dg-error {invalid #pragma pop_macro} } */
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.C b/gcc/testsuite/g++.dg/pch/pushpop-2.C
new file mode 100644
index 00000000000..84886aea985
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.C
@@ -0,0 +1,18 @@
+/* { dg-options -std=c++11 } */
+#include "pushpop-2.Hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/g++.dg/pch/pushpop-2.Hs b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pch/pushpop-2.Hs
@@ -0,0 +1,9 @@
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.c b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
new file mode 100644
index 00000000000..61b8430c6d2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.c
@@ -0,0 +1,18 @@
+/* { dg-options -std=c11 } */
+#include "pushpop-2.hs"
+
+#if π != 4
+#error π != 4
+#endif
+#pragma pop_macro("\u03C0")
+#if π != 3
+#error π != 3
+#endif
+
+#if \u03B1 != 6
+#error α != 6
+#endif
+_Pragma("pop_macro(\"\\u03B1\")")
+#if α != 5
+#error α != 5
+#endif
diff --git a/gcc/testsuite/gcc.dg/pch/pushpop-2.hs b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
new file mode 100644
index 00000000000..797139a3196
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pch/pushpop-2.hs
@@ -0,0 +1,9 @@
+#define π 3
+#pragma push_macro ("π")
+#undef π
+#define π 4
+
+#define \u03B1 5
+#pragma push_macro ("α")
+#undef α
+#define α 6
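
For readers unfamiliar with the scope-guard idiom that cpp_auto_suppress_diagnostics relies on, the following standalone sketch may help. It is not libcpp code: mock_reader, record_diagnostic, auto_suppress and demo_suppression are hypothetical stand-ins, and the callback signature is simplified to a single message argument. It only aims to show that a guard swaps in a no-op callback on construction and restores the saved one on destruction.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cpp_reader: only the diagnostic callback and a
// log of delivered messages matter for this illustration.
struct mock_reader
{
  bool (*diagnostic) (mock_reader *, const char *);
  std::vector<std::string> log;
};

// A "real" handler that records every diagnostic it receives.
static bool record_diagnostic (mock_reader *reader, const char *msg)
{
  reader->log.push_back (msg);
  return true;
}

// Scope guard: install a no-op handler now, restore the saved one on exit.
class auto_suppress
{
 public:
  explicit auto_suppress (mock_reader *reader)
    : m_reader (reader), m_saved (reader->diagnostic)
  {
    // A captureless lambda converts implicitly to the function pointer type.
    m_reader->diagnostic = [] (mock_reader *, const char *) { return true; };
  }
  ~auto_suppress () { m_reader->diagnostic = m_saved; }
  auto_suppress (const auto_suppress &) = delete;
  auto_suppress &operator= (const auto_suppress &) = delete;

 private:
  mock_reader *const m_reader;
  bool (*const m_saved) (mock_reader *, const char *);
};

// Emit one diagnostic inside a suppressed scope and one after it; return
// how many were actually recorded.
static std::size_t demo_suppression ()
{
  mock_reader reader {record_diagnostic, {}};
  {
    auto_suppress guard {&reader};
    reader.diagnostic (&reader, "dropped while suppressed");
    assert (reader.log.empty ());
  }
  reader.diagnostic (&reader, "recorded after restore");
  return reader.log.size ();
}
```

The appeal of the guard over the save/restore pairs it replaces in charset.cc is that the restore cannot be forgotten on an early return, which matters in lex_identifier_from_string where the suppressed region is a small nested block.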