[1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704]

Message ID 20240507140415.3821279-1-jwakely@redhat.com
State New
Headers
Series [1/2] libstdc++: Fix data race in std::basic_ios::fill() [PR77704] |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Testing passed

Commit Message

Jonathan Wakely May 7, 2024, 1:52 p.m. UTC
  Tested x86_64-linux. This seems "obviously correct", and I'd like to
push it. The current code definitely has a data race, i.e. undefined
behaviour.

-- >8 --

The lazy caching in std::basic_ios::fill() updates a mutable member
without synchronization, which can cause a data race if two threads both
call fill() on the same stream object when _M_fill_init is false.

To avoid this we can just cache the _M_fill member and set _M_fill_init
early in std::basic_ios::init, instead of doing it lazily. As explained
by the comment in init, there's a good reason for doing it lazily. When
char_type is neither char nor wchar_t, the locale might not have a
std::ctype<char_type>, so getting the fill character would throw an
exception. The current lazy init allows using unformatted I/O with such
a stream, because the fill character is never needed and so it doesn't
matter if the locale doesn't have a ctype<char_type> facet. We can
maintain this property by only setting the fill character in
std::basic_ios::init if the ctype facet is present at that time. If
fill() is called later and the fill character wasn't set by init, we can
get it from the stream's current locale at the point when fill() is
called (and not try to cache it without synchronization).

This causes a change in behaviour for the following program:

  std::ostringstream out;
  out.imbue(loc);
  auto fill = out.fill();

Previously the fill character would have been set when fill() is called,
and so would have used the new locale. This commit changes it so that
the fill character is set on construction and isn't affected by the new
locale being imbued later. This new behaviour seems to be what the
standard requires, and matches MSVC.

The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
possible to use a std::basic_ios without the ctype<char_type> facet
being present at construction.

libstdc++-v3/ChangeLog:

	PR libstdc++/77704
	* include/bits/basic_ios.h (basic_ios::fill()): Do not modify
	_M_fill and _M_fill_init in a const member function.
	(basic_ios::fill(char_type)): Use _M_fill directly instead of
	calling fill(). Set _M_fill_init to true.
	* include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
	_M_fill_init here instead.
	* testsuite/27_io/basic_ios/fill/char/1.cc: New test.
	* testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
---
 libstdc++-v3/include/bits/basic_ios.h         | 10 +--
 libstdc++-v3/include/bits/basic_ios.tcc       | 15 +++-
 .../testsuite/27_io/basic_ios/fill/char/1.cc  | 78 +++++++++++++++++++
 .../27_io/basic_ios/fill/wchar_t/1.cc         | 55 +++++++++++++
 4 files changed, 148 insertions(+), 10 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
  

Comments

Jonathan Wakely May 15, 2024, 11:22 a.m. UTC | #1
Pushed to trunk.

On Tue, 7 May 2024 at 15:04, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> Tested x86_64-linux. This seems "obviously correct", and I'd like to
> push it. The current code definitely has a data race, i.e. undefined
> behaviour.
>
> -- >8 --
>
> The lazy caching in std::basic_ios::fill() updates a mutable member
> without synchronization, which can cause a data race if two threads both
> call fill() on the same stream object when _M_fill_init is false.
>
> To avoid this we can just cache the _M_fill member and set _M_fill_init
> early in std::basic_ios::init, instead of doing it lazily. As explained
> by the comment in init, there's a good reason for doing it lazily. When
> char_type is neither char nor wchar_t, the locale might not have a
> std::ctype<char_type>, so getting the fill character would throw an
> exception. The current lazy init allows using unformatted I/O with such
> a stream, because the fill character is never needed and so it doesn't
> matter if the locale doesn't have a ctype<char_type> facet. We can
> maintain this property by only setting the fill character in
> std::basic_ios::init if the ctype facet is present at that time. If
> fill() is called later and the fill character wasn't set by init, we can
> get it from the stream's current locale at the point when fill() is
> called (and not try to cache it without synchronization).
>
> This causes a change in behaviour for the following program:
>
>   std::ostringstream out;
>   out.imbue(loc);
>   auto fill = out.fill();
>
> Previously the fill character would have been set when fill() is called,
> and so would have used the new locale. This commit changes it so that
> the fill character is set on construction and isn't affected by the new
> locale being imbued later. This new behaviour seems to be what the
> standard requires, and matches MSVC.
>
> The new 27_io/basic_ios/fill/char/fill.cc test verifies that it's still
> possible to use a std::basic_ios without the ctype<char_type> facet
> being present at construction.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/77704
>         * include/bits/basic_ios.h (basic_ios::fill()): Do not modify
>         _M_fill and _M_fill_init in a const member function.
>         (basic_ios::fill(char_type)): Use _M_fill directly instead of
>         calling fill(). Set _M_fill_init to true.
>         * include/bits/basic_ios.tcc (basic_ios::init): Set _M_fill and
>         _M_fill_init here instead.
>         * testsuite/27_io/basic_ios/fill/char/1.cc: New test.
>         * testsuite/27_io/basic_ios/fill/wchar_t/1.cc: New test.
> ---
>  libstdc++-v3/include/bits/basic_ios.h         | 10 +--
>  libstdc++-v3/include/bits/basic_ios.tcc       | 15 +++-
>  .../testsuite/27_io/basic_ios/fill/char/1.cc  | 78 +++++++++++++++++++
>  .../27_io/basic_ios/fill/wchar_t/1.cc         | 55 +++++++++++++
>  4 files changed, 148 insertions(+), 10 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
>  create mode 100644 libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
>
> diff --git a/libstdc++-v3/include/bits/basic_ios.h b/libstdc++-v3/include/bits/basic_ios.h
> index 258e6042b8f..bc3be4d2e37 100644
> --- a/libstdc++-v3/include/bits/basic_ios.h
> +++ b/libstdc++-v3/include/bits/basic_ios.h
> @@ -373,11 +373,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        char_type
>        fill() const
>        {
> -       if (!_M_fill_init)
> -         {
> -           _M_fill = this->widen(' ');
> -           _M_fill_init = true;
> -         }
> +       if (__builtin_expect(!_M_fill_init, false))
> +         return this->widen(' ');
>         return _M_fill;
>        }
>
> @@ -393,8 +390,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        char_type
>        fill(char_type __ch)
>        {
> -       char_type __old = this->fill();
> +       char_type __old = _M_fill;
>         _M_fill = __ch;
> +       _M_fill_init = true;
>         return __old;
>        }
>
> diff --git a/libstdc++-v3/include/bits/basic_ios.tcc b/libstdc++-v3/include/bits/basic_ios.tcc
> index a9313736e32..0197bdf8f67 100644
> --- a/libstdc++-v3/include/bits/basic_ios.tcc
> +++ b/libstdc++-v3/include/bits/basic_ios.tcc
> @@ -138,13 +138,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        // return without throwing an exception. Unfortunately,
>        // ctype<char_type> is not necessarily a required facet, so
>        // streams with char_type != [char, wchar_t] will not have it by
> -      // default. Because of this, the correct value for _M_fill is
> -      // constructed on the first call of fill(). That way,
> +      // default. If the ctype<char_type> facet is available now,
> +      // _M_fill is set here, but otherwise no fill character will be
> +      // cached and a call to fill() will check for the facet again later
> +      // (and will throw if the facet is still not present). This way
>        // unformatted input and output with non-required basic_ios
>        // instantiations is possible even without imbuing the expected
>        // ctype<char_type> facet.
> -      _M_fill = _CharT();
> -      _M_fill_init = false;
> +      if (_M_ctype)
> +       {
> +         _M_fill = _M_ctype->widen(' ');
> +         _M_fill_init = true;
> +       }
> +      else
> +       _M_fill_init = false;
>
>        _M_tie = 0;
>        _M_exception = goodbit;
> diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
> new file mode 100644
> index 00000000000..d5747c7507f
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
> @@ -0,0 +1,78 @@
> +// { dg-do run }
> +
> +#include <ios>
> +#include <locale>
> +#include <streambuf>
> +#include <testsuite_hooks.h>
> +
> +typedef char C;
> +
> +struct tabby_mctype : std::ctype<C>
> +{
> +  C do_widen(char c) const { return c == ' ' ? '\t' : c; }
> +
> +  const char*
> +  do_widen(const char* lo, const char* hi, C* to) const
> +  {
> +    while (lo != hi)
> +      *to++ = do_widen(*lo++);
> +    return hi;
> +  }
> +};
> +
> +void
> +test01()
> +{
> +  std::basic_ios<C> out(0);
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  out.imbue(loc);
> +  VERIFY( out.fill() == ' ' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test02()
> +{
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  std::locale::global(loc);
> +  std::basic_ios<C> out(0);
> +  VERIFY( out.fill() == '\t' );
> +  out.imbue(std::locale::classic());
> +  VERIFY( out.fill() == '\t' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' );  // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' );  // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test03()
> +{
> +  // This function tests a libstdc++ extension: if no ctype<char_type> facet
> +  // is present when the stream is initialized, a fill character will not be
> +  // cached. Calling fill() will obtain a fill character from the locale each
> +  // time it's called.
> +  typedef signed char C2;
> +  std::basic_ios<C2> out(0);
> +#if __cpp_exceptions
> +  try {
> +    (void) out.fill(); // No ctype<signed char> in the locale.
> +    VERIFY( false );
> +  } catch (...) {
> +  }
> +#endif
> +  out.fill('*');
> +  VERIFY( out.fill() == '*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +int main()
> +{
> +  test01();
> +  test02();
> +  test03();
> +}
> diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
> new file mode 100644
> index 00000000000..2d639a0844d
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
> @@ -0,0 +1,55 @@
> +// { dg-do run }
> +
> +#include <ios>
> +#include <locale>
> +#include <streambuf>
> +#include <testsuite_hooks.h>
> +
> +typedef wchar_t C;
> +
> +struct tabby_mctype : std::ctype<C>
> +{
> +  C do_widen(char c) const { return c == ' ' ? L'\t' : c; }
> +
> +  const char*
> +  do_widen(const char* lo, const char* hi, C* to) const
> +  {
> +    while (lo != hi)
> +      *to++ = do_widen(*lo++);
> +    return hi;
> +  }
> +};
> +
> +void
> +test01()
> +{
> +  std::basic_ios<C> out(0);
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  out.imbue(loc);
> +  VERIFY( out.fill() == L' ' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill(L'*');
> +  VERIFY( out.fill() == L'*' ); // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
> +}
> +
> +void
> +test02()
> +{
> +  std::locale loc(std::locale(), new tabby_mctype);
> +  std::locale::global(loc);
> +  std::basic_ios<C> out(0);
> +  VERIFY( out.fill() == L'\t' );
> +  out.imbue(std::locale::classic());
> +  VERIFY( out.fill() == L'\t' ); // Imbuing a new locale doesn't affect fill().
> +  out.fill(L'*');
> +  VERIFY( out.fill() == L'*' );  // This will be cached now.
> +  out.imbue(std::locale());
> +  VERIFY( out.fill() == L'*' );  // Imbuing a new locale doesn't affect fill().
> +}
> +
> +int main()
> +{
> +  test01();
> +  test02();
> +}
> --
> 2.44.0
>
  

Patch

diff --git a/libstdc++-v3/include/bits/basic_ios.h b/libstdc++-v3/include/bits/basic_ios.h
index 258e6042b8f..bc3be4d2e37 100644
--- a/libstdc++-v3/include/bits/basic_ios.h
+++ b/libstdc++-v3/include/bits/basic_ios.h
@@ -373,11 +373,8 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill() const
       {
-	if (!_M_fill_init)
-	  {
-	    _M_fill = this->widen(' ');
-	    _M_fill_init = true;
-	  }
+	if (__builtin_expect(!_M_fill_init, false))
+	  return this->widen(' ');
 	return _M_fill;
       }
 
@@ -393,8 +390,9 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       char_type
       fill(char_type __ch)
       {
-	char_type __old = this->fill();
+	char_type __old = _M_fill;
 	_M_fill = __ch;
+	_M_fill_init = true;
 	return __old;
       }
 
diff --git a/libstdc++-v3/include/bits/basic_ios.tcc b/libstdc++-v3/include/bits/basic_ios.tcc
index a9313736e32..0197bdf8f67 100644
--- a/libstdc++-v3/include/bits/basic_ios.tcc
+++ b/libstdc++-v3/include/bits/basic_ios.tcc
@@ -138,13 +138,20 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       // return without throwing an exception. Unfortunately,
       // ctype<char_type> is not necessarily a required facet, so
       // streams with char_type != [char, wchar_t] will not have it by
-      // default. Because of this, the correct value for _M_fill is
-      // constructed on the first call of fill(). That way,
+      // default. If the ctype<char_type> facet is available now,
+      // _M_fill is set here, but otherwise no fill character will be
+      // cached and a call to fill() will check for the facet again later
+      // (and will throw if the facet is still not present). This way
       // unformatted input and output with non-required basic_ios
       // instantiations is possible even without imbuing the expected
       // ctype<char_type> facet.
-      _M_fill = _CharT();
-      _M_fill_init = false;
+      if (_M_ctype)
+	{
+	  _M_fill = _M_ctype->widen(' ');
+	  _M_fill_init = true;
+	}
+      else
+	_M_fill_init = false;
 
       _M_tie = 0;
       _M_exception = goodbit;
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
new file mode 100644
index 00000000000..d5747c7507f
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/char/1.cc
@@ -0,0 +1,78 @@ 
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef char C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? '\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == ' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == '\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == '\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill('*');
+  VERIFY( out.fill() == '*' );  // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' );  // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test03()
+{
+  // This function tests a libstdc++ extension: if no ctype<char_type> facet
+  // is present when the stream is initialized, a fill character will not be
+  // cached. Calling fill() will obtain a fill character from the locale each
+  // time it's called.
+  typedef signed char C2;
+  std::basic_ios<C2> out(0);
+#if __cpp_exceptions
+  try {
+    (void) out.fill(); // No ctype<signed char> in the locale.
+    VERIFY( false );
+  } catch (...) {
+  }
+#endif
+  out.fill('*');
+  VERIFY( out.fill() == '*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == '*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
+}
diff --git a/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
new file mode 100644
index 00000000000..2d639a0844d
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_ios/fill/wchar_t/1.cc
@@ -0,0 +1,55 @@ 
+// { dg-do run }
+
+#include <ios>
+#include <locale>
+#include <streambuf>
+#include <testsuite_hooks.h>
+
+typedef wchar_t C;
+
+struct tabby_mctype : std::ctype<C>
+{
+  C do_widen(char c) const { return c == ' ' ? L'\t' : c; }
+
+  const char*
+  do_widen(const char* lo, const char* hi, C* to) const
+  {
+    while (lo != hi)
+      *to++ = do_widen(*lo++);
+    return hi;
+  }
+};
+
+void
+test01()
+{
+  std::basic_ios<C> out(0);
+  std::locale loc(std::locale(), new tabby_mctype);
+  out.imbue(loc);
+  VERIFY( out.fill() == L' ' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' ); // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' ); // Imbuing a new locale doesn't affect fill().
+}
+
+void
+test02()
+{
+  std::locale loc(std::locale(), new tabby_mctype);
+  std::locale::global(loc);
+  std::basic_ios<C> out(0);
+  VERIFY( out.fill() == L'\t' );
+  out.imbue(std::locale::classic());
+  VERIFY( out.fill() == L'\t' ); // Imbuing a new locale doesn't affect fill().
+  out.fill(L'*');
+  VERIFY( out.fill() == L'*' );  // This will be cached now.
+  out.imbue(std::locale());
+  VERIFY( out.fill() == L'*' );  // Imbuing a new locale doesn't affect fill().
+}
+
+int main()
+{
+  test01();
+  test02();
+}