[committed] libstdc++: Disable over-zealous warnings about std::string copies [PR103332]

Message ID 20211209232453.1568609-1-jwakely@redhat.com
State Committed
Commit f8463b0e3ec2438b4cfb8c9a468d59761db2c8a0
Headers
Series [committed] libstdc++: Disable over-zealous warnings about std::string copies [PR103332] |

Commit Message

Jonathan Wakely Dec. 9, 2021, 11:24 p.m. UTC
  These warnings are triggered by perfectly valid code using std::string.
They're particularly bad when --enable-fully-dynamic-string is used,
because even std::string().begin() will give a warning.

Use pragmas to stop the troublesome warnings for copies done by
std::char_traits.

libstdc++-v3/ChangeLog:

	PR libstdc++/103332
	PR libstdc++/102958
	PR libstdc++/103483
	* include/bits/char_traits.h: Suppress stringop and array-bounds
	warnings.
---
 libstdc++-v3/include/bits/char_traits.h | 7 +++++++
 1 file changed, 7 insertions(+)
  

Comments

Martin Sebor Dec. 10, 2021, 12:38 a.m. UTC | #1
On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
> These warnings are triggered by perfectly valid code using std::string.
> They're particularly bad when --enable-fully-dynamic-string is used,
> because even std::string().begin() will give a warning.
> 
> Use pragmas to stop the troublesome warnings for copies done by
> std::char_traits.

I'm still experimenting with some of the approaches we discussed
last week, but based on my findings so far this was going to be
my suggestion at lest for now, until or unless the problem turns
out to affect more code than just std::string.

That said, I noticed a typo in the patch:

> 
> libstdc++-v3/ChangeLog:
> 
> 	PR libstdc++/103332
> 	PR libstdc++/102958
> 	PR libstdc++/103483
> 	* include/bits/char_traits.h: Suppress stringop and array-bounds
> 	warnings.
> ---
>   libstdc++-v3/include/bits/char_traits.h | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
> index da3e0ffffaa..3f7befcf8b2 100644
> --- a/libstdc++-v3/include/bits/char_traits.h
> +++ b/libstdc++-v3/include/bits/char_traits.h
> @@ -54,6 +54,11 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
>   {
>   _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wstringop-overflow"
> +#pragma GCC diagnostic ignored "-Wstringop-overread"
> +#pragma GCC diagnostic ignored "-Warray-bounds"

(Just for reference, as I mentioned in my private mail, at -O1
the same code also triggers -Wfree-nonheap-object.)

> +
>     /**
>      *  @brief  Mapping from character type to associated types.
>      *
> @@ -990,6 +995,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>     } // namespace __detail
>   #endif // C++20
>   
> +#pragma GCC diagnostic push

This should be pop.

Martin

> +
>   _GLIBCXX_END_NAMESPACE_VERSION
>   } // namespace
>   
>
  
Martin Sebor Dec. 10, 2021, 1:49 a.m. UTC | #2
On 12/9/21 5:38 PM, Martin Sebor wrote:
> On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
>> These warnings are triggered by perfectly valid code using std::string.
>> They're particularly bad when --enable-fully-dynamic-string is used,
>> because even std::string().begin() will give a warning.
>>
>> Use pragmas to stop the troublesome warnings for copies done by
>> std::char_traits.
> 
> I'm still experimenting with some of the approaches we discussed
> last week, but based on my findings so far this was going to be
> my suggestion at lest for now, until or unless the problem turns
> out to affect more code than just std::string.

Just minutes after I wrote this I tried following the clue
in the note printed for the test case from PR 103534 with
an enhancement I'm experimenting with:

/build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56: 
warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’ 
specified size between 18446744073709551600 and 18446744073709551615 
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
   426 |         return static_cast<char_type*>(__builtin_memcpy(__s1, 
__s2, __n));
       | 
~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
/build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56: 
note: when 
‘(<anonymous>.std::__cxx11::basic_string<char>::_M_string_length > 
18446744073709551599)’

and adding an assert to string::size():

       constexpr
       size_type
       size() const noexcept
       {
         if (_M_string_length >= -1LU >> 1)
           __builtin_unreachable ();
         return _M_string_length;
       }

That gets rid of the false positive in this PR.  I realize
the others happen for other reasons but this approach at least
suggests that there might be other ways to suppress them than
the #pragma.  Unlike it, the alternative approaches should also
improve codegen.

> 
> That said, I noticed a typo in the patch:
> 
>>
>> libstdc++-v3/ChangeLog:
>>
>>     PR libstdc++/103332
>>     PR libstdc++/102958
>>     PR libstdc++/103483
>>     * include/bits/char_traits.h: Suppress stringop and array-bounds
>>     warnings.
>> ---
>>   libstdc++-v3/include/bits/char_traits.h | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/libstdc++-v3/include/bits/char_traits.h 
>> b/libstdc++-v3/include/bits/char_traits.h
>> index da3e0ffffaa..3f7befcf8b2 100644
>> --- a/libstdc++-v3/include/bits/char_traits.h
>> +++ b/libstdc++-v3/include/bits/char_traits.h
>> @@ -54,6 +54,11 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
>>   {
>>   _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> +#pragma GCC diagnostic push
>> +#pragma GCC diagnostic ignored "-Wstringop-overflow"
>> +#pragma GCC diagnostic ignored "-Wstringop-overread"
>> +#pragma GCC diagnostic ignored "-Warray-bounds"
> 
> (Just for reference, as I mentioned in my private mail, at -O1
> the same code also triggers -Wfree-nonheap-object.)
> 
>> +
>>     /**
>>      *  @brief  Mapping from character type to associated types.
>>      *
>> @@ -990,6 +995,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>     } // namespace __detail
>>   #endif // C++20
>> +#pragma GCC diagnostic push
> 
> This should be pop.
> 
> Martin
> 
>> +
>>   _GLIBCXX_END_NAMESPACE_VERSION
>>   } // namespace
>>
>
  
Jonathan Wakely Dec. 10, 2021, 9:51 a.m. UTC | #3
On Fri, 10 Dec 2021 at 00:39, Martin Sebor via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
> On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
> > These warnings are triggered by perfectly valid code using std::string.
> > They're particularly bad when --enable-fully-dynamic-string is used,
> > because even std::string().begin() will give a warning.
> >
> > Use pragmas to stop the troublesome warnings for copies done by
> > std::char_traits.
>
> I'm still experimenting with some of the approaches we discussed
> last week, but based on my findings so far this was going to be
> my suggestion at lest for now, until or unless the problem turns
> out to affect more code than just std::string.
>
> That said, I noticed a typo in the patch:
>
> >
> > libstdc++-v3/ChangeLog:
> >
> >       PR libstdc++/103332
> >       PR libstdc++/102958
> >       PR libstdc++/103483
> >       * include/bits/char_traits.h: Suppress stringop and array-bounds
> >       warnings.
> > ---
> >   libstdc++-v3/include/bits/char_traits.h | 7 +++++++
> >   1 file changed, 7 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
> > index da3e0ffffaa..3f7befcf8b2 100644
> > --- a/libstdc++-v3/include/bits/char_traits.h
> > +++ b/libstdc++-v3/include/bits/char_traits.h
> > @@ -54,6 +54,11 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
> >   {
> >   _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wstringop-overflow"
> > +#pragma GCC diagnostic ignored "-Wstringop-overread"
> > +#pragma GCC diagnostic ignored "-Warray-bounds"
>
> (Just for reference, as I mentioned in my private mail, at -O1
> the same code also triggers -Wfree-nonheap-object.)
>
> > +
> >     /**
> >      *  @brief  Mapping from character type to associated types.
> >      *
> > @@ -990,6 +995,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >     } // namespace __detail
> >   #endif // C++20
> >
> > +#pragma GCC diagnostic push
>
> This should be pop.

Oops! thanks, fixed at r12-5888 by the attached patch. Tested
x86_64-linux, pushed to trunk.
commit db184a3453b6fe810e2d9765ef8ed9028f96e968
Author: Jonathan Wakely <jwakely@redhat.com>
Date:   Fri Dec 10 09:06:37 2021

    libstdc++: Fix diagnostic pragma push that should be pop
    
    libstdc++-v3/ChangeLog:
    
            * include/bits/char_traits.h: Change pragma push to pop.

diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
index 3f7befcf8b2..13239580622 100644
--- a/libstdc++-v3/include/bits/char_traits.h
+++ b/libstdc++-v3/include/bits/char_traits.h
@@ -995,7 +995,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   } // namespace __detail
 #endif // C++20
 
-#pragma GCC diagnostic push
+#pragma GCC diagnostic pop
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
  
Jonathan Wakely Dec. 10, 2021, 10:12 a.m. UTC | #4
On Fri, 10 Dec 2021 at 01:50, Martin Sebor via Libstdc++
<libstdc++@gcc.gnu.org> wrote:
>
> On 12/9/21 5:38 PM, Martin Sebor wrote:
> > On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
> >> These warnings are triggered by perfectly valid code using std::string.
> >> They're particularly bad when --enable-fully-dynamic-string is used,
> >> because even std::string().begin() will give a warning.
> >>
> >> Use pragmas to stop the troublesome warnings for copies done by
> >> std::char_traits.
> >
> > I'm still experimenting with some of the approaches we discussed
> > last week, but based on my findings so far this was going to be
> > my suggestion at lest for now, until or unless the problem turns
> > out to affect more code than just std::string.
>
> Just minutes after I wrote this I tried following the clue
> in the note printed for the test case from PR 103534 with
> an enhancement I'm experimenting with:
>
> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
> warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’
> specified size between 18446744073709551600 and 18446744073709551615
> exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
>    426 |         return static_cast<char_type*>(__builtin_memcpy(__s1,
> __s2, __n));
>        |
> ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
> note: when
> ‘(<anonymous>.std::__cxx11::basic_string<char>::_M_string_length >
> 18446744073709551599)’
>
> and adding an assert to string::size():
>
>        constexpr
>        size_type
>        size() const noexcept
>        {
>          if (_M_string_length >= -1LU >> 1)

I think that will be wrong for 64-bit Windows, where long is 32-bit,
but the string's max size is closer to -1LLU >> 2, and we don't want
to just use -1LLU because that's wrong for 32-bit targets.

Technically the max size is (N - 1)/2 where N is the allocator's max
size, which is PTRDIFF_MAX for std::allocator, SIZE_MAX for allocators
that use the default value from std::allocator_traits, or some other
value that the allocator defines. And the value is reduced
appropriately if sizeof(char_type) > 1.

But if we call this->max_size() which calls _Alloc_traits::max_size()
which potentially calls allocator_type::max_size(), will the compiler
actually make those calls to mark the path unreachable, or will it
just ignore the unreachable if it can't fold all those calls at
compile time?

We could just use __SIZE_MAX__/2 which might be larger than the real
maximum, but will never be smaller.

Jason made a suggestion last week which was that if the warning code
has a range like [2,INT_MAX] or [2UL,ULONG_MAX] that it should assume
that is really [2,infinity] and so not warn. If the upper bound is the
maximum possible value for the type, that's effectively unbounded, and
the warning could assume it's simply not got enough information to
give a useful warning. If it has a range like [2,300] for a buffer of
length 100, that should warn. But [2,infinity] is effectively ranger
saying "I have no idea what this value is" and the warning machinery
should not be issuing warnings on the basis of "I have no idea what
this value is".


>            __builtin_unreachable ();
>          return _M_string_length;
>        }
>
> That gets rid of the false positive in this PR.  I realize
> the others happen for other reasons but this approach at least
> suggests that there might be other ways to suppress them than
> the #pragma.  Unlike it, the alternative approaches should also
> improve codegen.
  
Martin Sebor Dec. 10, 2021, 4:35 p.m. UTC | #5
On 12/10/21 3:12 AM, Jonathan Wakely wrote:
> On Fri, 10 Dec 2021 at 01:50, Martin Sebor via Libstdc++
> <libstdc++@gcc.gnu.org> wrote:
>>
>> On 12/9/21 5:38 PM, Martin Sebor wrote:
>>> On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
>>>> These warnings are triggered by perfectly valid code using std::string.
>>>> They're particularly bad when --enable-fully-dynamic-string is used,
>>>> because even std::string().begin() will give a warning.
>>>>
>>>> Use pragmas to stop the troublesome warnings for copies done by
>>>> std::char_traits.
>>>
>>> I'm still experimenting with some of the approaches we discussed
>>> last week, but based on my findings so far this was going to be
>>> my suggestion at lest for now, until or unless the problem turns
>>> out to affect more code than just std::string.
>>
>> Just minutes after I wrote this I tried following the clue
>> in the note printed for the test case from PR 103534 with
>> an enhancement I'm experimenting with:
>>
>> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
>> warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’
>> specified size between 18446744073709551600 and 18446744073709551615
>> exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
>>     426 |         return static_cast<char_type*>(__builtin_memcpy(__s1,
>> __s2, __n));
>>         |
>> ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
>> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
>> note: when
>> ‘(<anonymous>.std::__cxx11::basic_string<char>::_M_string_length >
>> 18446744073709551599)’
>>
>> and adding an assert to string::size():
>>
>>         constexpr
>>         size_type
>>         size() const noexcept
>>         {
>>           if (_M_string_length >= -1LU >> 1)
> 
> I think that will be wrong for 64-bit Windows, where long is 32-bit,
> but the string's max size is closer to -1LLU >> 2, and we don't want
> to just use -1LLU because that's wrong for 32-bit targets.
> 
> Technically the max size is (N - 1)/2 where N is the allocator's max
> size, which is PTRDIFF_MAX for std::allocator, SIZE_MAX for allocators
> that use the default value from std::allocator_traits, or some other
> value that the allocator defines. And the value is reduced
> appropriately if sizeof(char_type) > 1.
> 
> But if we call this->max_size() which calls _Alloc_traits::max_size()
> which potentially calls allocator_type::max_size(), will the compiler
> actually make those calls to mark the path unreachable, or will it
> just ignore the unreachable if it can't fold all those calls at
> compile time?

The above was just a quick proof of concept experiment.  You're
of course right that the final solution can't be so crude(*).
But if the required functions are always_inline (I think member
functions defined in the body of the class implicitly are) then
I'd expect them to be folded at compile time at all optimization
levels.  That's what makes -Winvalid-memory-order work in C++
even at -O0 in the patch I posted earlier this week.

If I still remember my C++-library-fu, even though the standard
requires containers to call max_size() etc., since std::string
is (or can be) an explicit specialization, there shouldn't be
a way for a conforming program to find out if it does, so it
could take shortcuts.  That makes me wonder if it could even
call malloc instead of operator new to get around the problem
with the operator's replaceability.

> 
> We could just use __SIZE_MAX__/2 which might be larger than the real
> maximum, but will never be smaller.
> 
> Jason made a suggestion last week which was that if the warning code
> has a range like [2,INT_MAX] or [2UL,ULONG_MAX] that it should assume
> that is really [2,infinity] and so not warn. If the upper bound is the
> maximum possible value for the type, that's effectively unbounded, and
> the warning could assume it's simply not got enough information to
> give a useful warning. If it has a range like [2,300] for a buffer of
> length 100, that should warn. But [2,infinity] is effectively ranger
> saying "I have no idea what this value is" and the warning machinery
> should not be issuing warnings on the basis of "I have no idea what
> this value is".

The warnings use the lower bound of the access size and ignore
the upper bound.  They use the upper bound of the space remaining
in the destination(s) and ignore the lower bound.

In a call to memcpy (P + I, s, N) the access size is N, and
the space remaining in A is the upper bound of the size of
the largest array A that P points to (it might point to any
number of them, and some might be dynamically allocated with
a size in some range) minus the lower bound of I or zero,
whichever is greater.

This is the most conservative strategy.  There are other
strategies where we could get more true positives but also
more false positives.  Those might be appropriate for
the "maybe" kind of warnings.

Martin

>>             __builtin_unreachable ();
>>           return _M_string_length;
>>         }
>>
>> That gets rid of the false positive in this PR.  I realize
>> the others happen for other reasons but this approach at least
>> suggests that there might be other ways to suppress them than
>> the #pragma.  Unlike it, the alternative approaches should also
>> improve codegen.
  
Jakub Jelinek Dec. 10, 2021, 4:41 p.m. UTC | #6
On Fri, Dec 10, 2021 at 09:35:50AM -0700, Martin Sebor via Gcc-patches wrote:
> The above was just a quick proof of concept experiment.  You're
> of course right that the final solution can't be so crude(*).
> But if the required functions are always_inline (I think member
> functions defined in the body of the class implicitly are

They are not, and can't be, nothing says that such member functions
can't use constructs that make it uninlinable (with always_inline
that would be an error), or are way too large that inlining is not
desirable, etc.  They are just implicitly inline.

	Jakub
  
Martin Sebor Dec. 10, 2021, 5:11 p.m. UTC | #7
On 12/10/21 9:41 AM, Jakub Jelinek wrote:
> On Fri, Dec 10, 2021 at 09:35:50AM -0700, Martin Sebor via Gcc-patches wrote:
>> The above was just a quick proof of concept experiment.  You're
>> of course right that the final solution can't be so crude(*).
>> But if the required functions are always_inline (I think member
>> functions defined in the body of the class implicitly are
> 
> They are not, and can't be, nothing says that such member functions
> can't use constructs that make it uninlinable (with always_inline
> that would be an error), or are way too large that inlining is not
> desirable, etc.  They are just implicitly inline.

The functions we're talking about are the trivial max_size()
members of std::string and allocator traits.  They just return
a constant.

But I see I was wrong, even member functions have to be explicitly
declared always_inline to be guaranteed to be inlined even at -O0.
I don't think that should be an issue for the trivial max_size()
(at least not for the std::string specialization).

Martin
  
Jakub Jelinek Dec. 10, 2021, 5:17 p.m. UTC | #8
On Fri, Dec 10, 2021 at 10:11:04AM -0700, Martin Sebor via Gcc-patches wrote:
> On 12/10/21 9:41 AM, Jakub Jelinek wrote:
> > On Fri, Dec 10, 2021 at 09:35:50AM -0700, Martin Sebor via Gcc-patches wrote:
> > > The above was just a quick proof of concept experiment.  You're
> > > of course right that the final solution can't be so crude(*).
> > > But if the required functions are always_inline (I think member
> > > functions defined in the body of the class implicitly are
> > 
> > They are not, and can't be, nothing says that such member functions
> > can't use constructs that make it uninlinable (with always_inline
> > that would be an error), or are way too large that inlining is not
> > desirable, etc.  They are just implicitly inline.
> 
> The functions we're talking about are the trivial max_size()
> members of std::string and allocator traits.  They just return
> a constant.
> 
> But I see I was wrong, even member functions have to be explicitly
> declared always_inline to be guaranteed to be inlined even at -O0.
> I don't think that should be an issue for the trivial max_size()
> (at least not for the std::string specialization).

Note, if those functions are declared constexpr, without -fno-inline
(default at -O0), then cp_fold will try to evaluate such calls to constant
expressions already, effectively "inlining" them.

	Jakub
  
Jonathan Wakely Dec. 10, 2021, 6:10 p.m. UTC | #9
On Fri, 10 Dec 2021 at 16:35, Martin Sebor <msebor@gmail.com> wrote:
>
> On 12/10/21 3:12 AM, Jonathan Wakely wrote:
> > On Fri, 10 Dec 2021 at 01:50, Martin Sebor via Libstdc++
> > <libstdc++@gcc.gnu.org> wrote:
> >>
> >> On 12/9/21 5:38 PM, Martin Sebor wrote:
> >>> On 12/9/21 4:24 PM, Jonathan Wakely via Gcc-patches wrote:
> >>>> These warnings are triggered by perfectly valid code using std::string.
> >>>> They're particularly bad when --enable-fully-dynamic-string is used,
> >>>> because even std::string().begin() will give a warning.
> >>>>
> >>>> Use pragmas to stop the troublesome warnings for copies done by
> >>>> std::char_traits.
> >>>
> >>> I'm still experimenting with some of the approaches we discussed
> >>> last week, but based on my findings so far this was going to be
> >>> my suggestion at lest for now, until or unless the problem turns
> >>> out to affect more code than just std::string.
> >>
> >> Just minutes after I wrote this I tried following the clue
> >> in the note printed for the test case from PR 103534 with
> >> an enhancement I'm experimenting with:
> >>
> >> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
> >> warning: ‘void* __builtin_memcpy(void*, const void*, long unsigned int)’
> >> specified size between 18446744073709551600 and 18446744073709551615
> >> exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
> >>     426 |         return static_cast<char_type*>(__builtin_memcpy(__s1,
> >> __s2, __n));
> >>         |
> >> ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
> >> /build/gcc-master/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/char_traits.h:426:56:
> >> note: when
> >> ‘(<anonymous>.std::__cxx11::basic_string<char>::_M_string_length >
> >> 18446744073709551599)’
> >>
> >> and adding an assert to string::size():
> >>
> >>         constexpr
> >>         size_type
> >>         size() const noexcept
> >>         {
> >>           if (_M_string_length >= -1LU >> 1)
> >
> > I think that will be wrong for 64-bit Windows, where long is 32-bit,
> > but the string's max size is closer to -1LLU >> 2, and we don't want
> > to just use -1LLU because that's wrong for 32-bit targets.
> >
> > Technically the max size is (N - 1)/2 where N is the allocator's max
> > size, which is PTRDIFF_MAX for std::allocator, SIZE_MAX for allocators
> > that use the default value from std::allocator_traits, or some other
> > value that the allocator defines. And the value is reduced
> > appropriately if sizeof(char_type) > 1.
> >
> > But if we call this->max_size() which calls _Alloc_traits::max_size()
> > which potentially calls allocator_type::max_size(), will the compiler
> > actually make those calls to mark the path unreachable, or will it
> > just ignore the unreachable if it can't fold all those calls at
> > compile time?
>
> The above was just a quick proof of concept experiment.  You're
> of course right that the final solution can't be so crude(*).
> But if the required functions are always_inline (I think member
> functions defined in the body of the class implicitly are) then

They're implicitly inline, but not implicitly always_inline.

> I'd expect them to be folded at compile time at all optimization
> levels.  That's what makes -Winvalid-memory-order work in C++
> even at -O0 in the patch I posted earlier this week.
>
> If I still remember my C++-library-fu, even though the standard
> requires containers to call max_size() etc., since std::string
> is (or can be) an explicit specialization, there shouldn't be
> a way for a conforming program to find out if it does, so it
> could take shortcuts.  That makes me wonder if it could even
> call malloc instead of operator new to get around the problem
> with the operator's replaceability.

std::string is required to use std::allocator<char> and users can't
specialize that to observe its allocations, but it does require:

The storage for the array is obtained by calling ::operator new
(17.6.3), but it is unspecified when or how often this function is
called.

Users can certainly observe whether std::string *never* calls their
replaced operator new (and if they've replaced it specifically to
ensure allocations use their own source, and not malloc, they wouldn't
be happy).
  
Jonathan Wakely Dec. 10, 2021, 6:16 p.m. UTC | #10
On Fri, 10 Dec 2021 at 17:17, Jakub Jelinek wrote:
>
> On Fri, Dec 10, 2021 at 10:11:04AM -0700, Martin Sebor via Gcc-patches wrote:
> > On 12/10/21 9:41 AM, Jakub Jelinek wrote:
> > > On Fri, Dec 10, 2021 at 09:35:50AM -0700, Martin Sebor via Gcc-patches wrote:
> > > > The above was just a quick proof of concept experiment.  You're
> > > > of course right that the final solution can't be so crude(*).
> > > > But if the required functions are always_inline (I think member
> > > > functions defined in the body of the class implicitly are
> > >
> > > They are not, and can't be, nothing says that such member functions
> > > can't use constructs that make it uninlinable (with always_inline
> > > that would be an error), or are way too large that inlining is not
> > > desirable, etc.  They are just implicitly inline.
> >
> > The functions we're talking about are the trivial max_size()
> > members of std::string and allocator traits.  They just return
> > a constant.
> >
> > But I see I was wrong, even member functions have to be explicitly
> > declared always_inline to be guaranteed to be inlined even at -O0.
> > I don't think that should be an issue for the trivial max_size()
> > (at least not for the std::string specialization).
>
> Note, if those functions are declared constexpr, without -fno-inline
> (default at -O0), then cp_fold will try to evaluate such calls to constant
> expressions already, effectively "inlining" them.

Every member function of std::string is constexpr in C++20, but not
before. But we could add constexpr to internal _M_xxx functions if
that benefits optimization.

For std::basic_string::max_size() it has to call another function
provided by the allocator, but for the std::string (i.e.
std::basic_string<char, char_traits<char>, allocator<char>>)
specialization we know what the allocator is going to tell us.

--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -1071,7 +1071,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
      _GLIBCXX20_CONSTEXPR
      size_type
      max_size() const _GLIBCXX_NOEXCEPT
-      { return (_Alloc_traits::max_size(_M_get_allocator()) - 1) / 2; }
+      {
+       if _GLIBCXX17_CONSTEXPR (__are_same<allocator_type, allocator<char>>)
+         return size_t(-1) / 2;
+       else
+         return (_Alloc_traits::max_size(_M_get_allocator()) - 1) / 2;
+      }

      /**
       *  @brief  Resizes the %string to the specified number of characters.
  

Patch

diff --git a/libstdc++-v3/include/bits/char_traits.h b/libstdc++-v3/include/bits/char_traits.h
index da3e0ffffaa..3f7befcf8b2 100644
--- a/libstdc++-v3/include/bits/char_traits.h
+++ b/libstdc++-v3/include/bits/char_traits.h
@@ -54,6 +54,11 @@  namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstringop-overflow"
+#pragma GCC diagnostic ignored "-Wstringop-overread"
+#pragma GCC diagnostic ignored "-Warray-bounds"
+
   /**
    *  @brief  Mapping from character type to associated types.
    *
@@ -990,6 +995,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
   } // namespace __detail
 #endif // C++20
 
+#pragma GCC diagnostic push
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace