From patchwork Mon Dec 13 11:16:32 2021
X-Patchwork-Submitter: Jonathan Wakely
X-Patchwork-Id: 48862
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [committed] libstdc++: Fix std::regex_replace for strings with embedded null [PR103664]
Date: Mon, 13 Dec 2021 11:16:32 +0000
Message-Id: <20211213111632.1941052-1-jwakely@redhat.com>
List-Id: Gcc-patches mailing list

Tested powerpc64le-linux, pushed to trunk.

The overload of std::regex_replace that takes a std::basic_string as the
fmt argument (for the replacement string) is implemented in terms of the
one taking a const C*, which uses std::char_traits to find the length.
That means it stops at a null character, even though the basic_string
might have additional characters beyond that.

Rather than duplicate the implementation of the const C* one for the
std::basic_string case, this moves that implementation to a new
__regex_replace function which takes a const C* and a length. Then both
the std::basic_string and const C* overloads can call that (with the
latter using char_traits to find the length to pass to the new
function).

libstdc++-v3/ChangeLog:

	PR libstdc++/103664
	* include/bits/regex.h (__regex_replace): Declare.
	(regex_replace): Use it.
	* include/bits/regex.tcc (__regex_replace): Replace regex_replace
	definition with __regex_replace.
	* testsuite/28_regex/algorithms/regex_replace/char/103664.cc: New test.
---
 libstdc++-v3/include/bits/regex.h             | 20 +++++++++++++++++--
 libstdc++-v3/include/bits/regex.tcc           |  9 ++++-----
 .../algorithms/regex_replace/char/103664.cc   | 11 ++++++++++
 3 files changed, 33 insertions(+), 7 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/103664.cc

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 52dcd7f86ae..91c63768581 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -2488,6 +2488,15 @@ _GLIBCXX_END_NAMESPACE_CXX11
                   = regex_constants::match_default) = delete;
 
   // std [28.11.4] Function template regex_replace
+
+  template<typename _Out_iter, typename _Bi_iter,
+           typename _Rx_traits, typename _Ch_type>
+    _Out_iter
+    __regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
+                    const basic_regex<_Ch_type, _Rx_traits>& __e,
+                    const _Ch_type* __fmt, size_t __len,
+                    regex_constants::match_flag_type __flags);
+
   /**
    * @brief Search for a regular expression within a range for multiple times,
    and replace the matched parts through filling a format string.
@@ -2511,7 +2520,8 @@ _GLIBCXX_END_NAMESPACE_CXX11
                   regex_constants::match_flag_type __flags
                   = regex_constants::match_default)
     {
-      return regex_replace(__out, __first, __last, __e, __fmt.c_str(), __flags);
+      return std::__regex_replace(__out, __first, __last, __e, __fmt.c_str(),
+                                  __fmt.length(), __flags);
     }
 
   /**
@@ -2534,7 +2544,13 @@ _GLIBCXX_END_NAMESPACE_CXX11
                   const basic_regex<_Ch_type, _Rx_traits>& __e,
                   const _Ch_type* __fmt,
                   regex_constants::match_flag_type __flags
-                  = regex_constants::match_default);
+                  = regex_constants::match_default)
+    {
+      return std::__regex_replace(__out, __first, __last, __e, __fmt,
+                                  char_traits<_Ch_type>::length(__fmt),
+                                  __flags);
+    }
+
 
   /**
    * @brief Search for a regular expression within a string for multiple times,
diff --git a/libstdc++-v3/include/bits/regex.tcc b/libstdc++-v3/include/bits/regex.tcc
index c8bdd377c18..12ee9f0a989 100644
--- a/libstdc++-v3/include/bits/regex.tcc
+++ b/libstdc++-v3/include/bits/regex.tcc
@@ -461,10 +461,10 @@ namespace __detail
   template<typename _Out_iter, typename _Bi_iter,
            typename _Rx_traits, typename _Ch_type>
     _Out_iter
-    regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
-                  const basic_regex<_Ch_type, _Rx_traits>& __e,
-                  const _Ch_type* __fmt,
-                  regex_constants::match_flag_type __flags)
+    __regex_replace(_Out_iter __out, _Bi_iter __first, _Bi_iter __last,
+                    const basic_regex<_Ch_type, _Rx_traits>& __e,
+                    const _Ch_type* __fmt, size_t __len,
+                    regex_constants::match_flag_type __flags)
     {
       typedef regex_iterator<_Bi_iter, _Ch_type, _Rx_traits> _IterT;
       _IterT __i(__first, __last, __e, __flags);
@@ -477,7 +477,6 @@ namespace __detail
       else
         {
           sub_match<_Bi_iter> __last;
-          auto __len = char_traits<_Ch_type>::length(__fmt);
           for (; __i != __end; ++__i)
             {
               if (!(__flags & regex_constants::format_no_copy))
diff --git a/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/103664.cc b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/103664.cc
new file mode 100644
index 00000000000..ca75e49ed3e
--- /dev/null
+++ b/libstdc++-v3/testsuite/28_regex/algorithms/regex_replace/char/103664.cc
@@ -0,0 +1,11 @@
+// { dg-do run { target c++11 } }
+
+#include <regex>
+#include <testsuite_hooks.h>
+
+int main()
+{
+  // PR libstdc++/103664
+  std::string a = regex_replace("123", std::regex("2"), std::string("a\0b", 3));
+  VERIFY( a == std::string("1a\0b3", 5) );
+}
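
For readers who want to see the length mismatch from the commit message
without building libstdc++, here is a minimal standalone sketch. It is
not the library code: length_seen_via_pointer, append_fmt and
check_embedded_null_sketch are hypothetical names invented for this
illustration. It only mirrors the shape of the fix, with one core
routine taking a pointer plus an explicit length and two thin wrappers
that supply that length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: the length a "const char*" code path sees when it
// measures the string with char_traits, i.e. up to the first null.
std::size_t length_seen_via_pointer(const std::string& fmt)
{
  return std::char_traits<char>::length(fmt.c_str());
}

// Hypothetical core routine (assumed shape, loosely modelled on the new
// __regex_replace): takes a pointer and an explicit length and never scans
// for a terminating null itself.
std::string append_fmt(std::string out, const char* fmt, std::size_t len)
{
  out.append(fmt, len);
  return out;
}

// const char* wrapper: a bare pointer carries no length, so char_traits
// has to find it, which stops at the first null character.
std::string append_fmt(std::string out, const char* fmt)
{
  return append_fmt(std::move(out), fmt, std::char_traits<char>::length(fmt));
}

// std::string wrapper: passes the real length, so characters after an
// embedded null are preserved.
std::string append_fmt(std::string out, const std::string& fmt)
{
  return append_fmt(std::move(out), fmt.c_str(), fmt.length());
}

// Self-check for the sketch; call it from main() to run it.
void check_embedded_null_sketch()
{
  const std::string fmt("a\0b", 3);           // 'a', '\0', 'b'
  assert(fmt.length() == 3);                  // the string knows its size
  assert(length_seen_via_pointer(fmt) == 1);  // the pointer view stops early
  assert(append_fmt("1", fmt) == std::string("1a\0b", 4));
  assert(append_fmt("1", fmt.c_str()) == "1a");
}
```

Routing the std::string wrapper through the const char* wrapper, as the
old regex_replace overload did, would discard the known length and
produce the truncated result in the last assertion.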