c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4

Message ID 20211130121707.GM2646553@tucnak
State Committed
Headers
Series c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4 |

Commit Message

Jakub Jelinek Nov. 30, 2021, 12:17 p.m. UTC
  On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
> could go in as part of that paper.

Ok, changed to be done unconditionally now.

> Also, allowing indeterminate values that are never read was in C++20
> (P1331).

Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if
some variable is scalar and has indeterminate value or if an aggregate
variable has some members (possibly nested) with indeterminate values,
in constexpr contexts we allow copying those into other vars of the
same type (e.g. the testcases in the patch below test mere copying
of the whole structures or unsigned char result of __builtin_bit_cast),
but we reject if we actually use them in some other way (e.g. try to
read a member from a variable that has that member indeterminate,
see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
unsigned char variable.

Then there is P1331R2 which makes the UB on
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
  unsigned char a = __builtin_bit_cast (unsigned char, u);
  unsigned char b = a;
case non-constant then when __builtin_bit_cast returns indeterminate value?
__builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
in that case, so supposely
  unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but on
  unsigned char b = a;
a is lvalue and is converted to rvalue.
Similarly
  T t = { 1, 2 };
  S s = __builtin_bit_cast (S, t);
  S u = s;
where S s = __builtin_bit_cast (S, t); could be ok even when some or all
members are indeterminate, but u = s; does lvalue-to-rvalue conversion?

Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
what is and isn't UB and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.

> I think in all of them the result of the cast has (some) indeterminate
> value.  So f1-3 are OK because the indeterminate value has unsigned char
> type and is never used; f4() is non-constant because S::f has
> non-byte-access type and so the new wording says it's undefined.

Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek <jakub@redhat.com>

	* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
	(cxx_eval_bit_cast): Don't error about padding bits if target
	type is unsigned char or std::byte, instead return no clearing
	ctor.  Use clear_uchar_or_std_byte_in_mask.

	* g++.dg/cpp2a/bit-cast11.C: New test.
	* g++.dg/cpp2a/bit-cast12.C: New test.
	* g++.dg/cpp2a/bit-cast13.C: New test.
	* g++.dg/cpp2a/bit-cast14.C: New test.



	Jakub
  

Comments

Jason Merrill Nov. 30, 2021, 8:19 p.m. UTC | #1
On 11/30/21 07:17, Jakub Jelinek wrote:
> On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
>> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
>> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
>> could go in as part of that paper.
> 
> Ok, changed to be done unconditionally now.
> 
>> Also, allowing indeterminate values that are never read was in C++20
>> (P1331).
> 
> Reading P1331R2 again, I'm still puzzled.
> Our current behavior (both before and after this patch) is that if
> some variable is scalar and has indeterminate value or if an aggregate
> variable has some members (possibly nested) with indeterminate values,
> in constexpr contexts we allow copying those into other vars of the
> same type (e.g. the testcases in the patch below test mere copying
> of the whole structures or unsigned char result of __builtin_bit_cast),

That seems to be a bug, since the copy involves an lvalue-to-rvalue 
conversion.

> but we reject if we actually use them in some other way (e.g. try to
> read a member from a variable that has that member indeterminate,
> see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
> unsigned char variable.

That's correct.

> Then there is P1331R2 which makes the UB on
> "an lvalue-to-rvalue conversion that is applied to an object with
> indeterminate value ([basic.indet]);"
> but isn't even the
>    unsigned char a = __builtin_bit_cast (unsigned char, u);
>    unsigned char b = a;
> case non-constant then when __builtin_bit_cast returns indeterminate value?

Good point.  So it would seem to follow that if the output is going to 
have an indeterminate value, it's non-constant, we don't have to work 
hard in constexpr evaluation, and f1-4 are all non-constant.  And the 
new bit_cast text is only interesting for non-constant evaluation.

> __builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
> in that case, so supposely
>    unsigned char a = __builtin_bit_cast (unsigned char, u);
> is fine, but on

Eh, there's clearly an lvalue-rvalue conversion involved in reading from 
the source value.

>    unsigned char b = a;
> a is lvalue and is converted to rvalue.
> Similarly
>    T t = { 1, 2 };
>    S s = __builtin_bit_cast (S, t);
>    S u = s;
> where S s = __builtin_bit_cast (S, t); could be ok even when some or all
> members are indeterminate, but u = s; does lvalue-to-rvalue conversion?

> Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
> what is and isn't UB and if C++ wanted to go further and allow all those
> valid cases in there as constant...
> 
> Anyway, I hope this can be dealt with incrementally.

>> I think in all of them the result of the cast has (some) indeterminate
>> value.  So f1-3 are OK because the indeterminate value has unsigned char
>> type and is never used; f4() is non-constant because S::f has
>> non-byte-access type and so the new wording says it's undefined.
> 
> Ok, implemented the bitfield handling then.
> 
> Here is an updated patch, so far lightly tested.
> 
> 2021-11-30  Jakub Jelinek <jakub@redhat.com>
> 
> 	* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
> 	(cxx_eval_bit_cast): Don't error about padding bits if target
> 	type is unsigned char or std::byte, instead return no clearing
> 	ctor.  Use clear_uchar_or_std_byte_in_mask.
> 
> 	* g++.dg/cpp2a/bit-cast11.C: New test.
> 	* g++.dg/cpp2a/bit-cast12.C: New test.
> 	* g++.dg/cpp2a/bit-cast13.C: New test.
> 	* g++.dg/cpp2a/bit-cast14.C: New test.
> 
> --- gcc/cp/constexpr.c.jj	2021-11-30 09:44:46.531607444 +0100
> +++ gcc/cp/constexpr.c	2021-11-30 12:20:29.105251443 +0100
> @@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
>     return false;
>   }
>   
> +/* Helper function for cxx_eval_bit_cast.  For unsigned char or
> +   std::byte members of CONSTRUCTOR (recursively) if they contain
> +   some indeterminate bits (as set in MASK), remove the ctor elts,
> +   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
> +   bits in MASK.  */
> +
> +static void
> +clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
> +{
> +  if (TREE_CODE (t) != CONSTRUCTOR)
> +    return;
> +
> +  unsigned i, j = 0;
> +  tree index, value;
> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
> +    {
> +      tree type = TREE_TYPE (value);
> +      if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
> +	  && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
> +	{
> +	  if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
> +	      && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
> +		  != char_type_node))
> +	    {
> +	      HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
> +	      gcc_assert (fldsz != 0);
> +	      HOST_WIDE_INT pos = int_byte_position (index);
> +	      HOST_WIDE_INT bpos
> +		= tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
> +	      bpos %= BITS_PER_UNIT;
> +	      HOST_WIDE_INT end
> +		= ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
> +	      gcc_assert (end == 1 || end == 2);
> +	      unsigned char *p = mask + pos;
> +	      unsigned char mask_save[2];
> +	      mask_save[0] = mask[pos];
> +	      mask_save[1] = end == 2 ? mask[pos + 1] : 0;
> +	      if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
> +		sorry_at (loc, "PDP11 bit-field handling unsupported"
> +			       " in %qs", "__builtin_bit_cast");
> +	      else if (BYTES_BIG_ENDIAN)
> +		{
> +		  /* Big endian.  */
> +		  if (bpos + fldsz <= BITS_PER_UNIT)
> +		    *p &= ~(((1 << fldsz) - 1)
> +			    << (BITS_PER_UNIT - bpos - fldsz));
> +		  else
> +		    {
> +		      gcc_assert (bpos);
> +		      *p &= ~(((1U << BITS_PER_UNIT) - 1) >> bpos);
> +		      p++;
> +		      fldsz -= BITS_PER_UNIT - bpos;
> +		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
> +		      *p &= ((1U << BITS_PER_UNIT) - 1) >> fldsz;
> +		    }
> +		}
> +	      else
> +		{
> +		  /* Little endian.  */
> +		  if (bpos + fldsz <= BITS_PER_UNIT)
> +		    *p &= ~(((1 << fldsz) - 1) << bpos);
> +		  else
> +		    {
> +		      gcc_assert (bpos);
> +		      *p &= ~(((1 << BITS_PER_UNIT) - 1) << bpos);
> +		      p++;
> +		      fldsz -= BITS_PER_UNIT - bpos;
> +		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
> +		      *p &= ~((1 << fldsz) - 1);
> +		    }
> +		}
> +	      if (mask_save[0] != mask[pos]
> +		  || (end == 2 && mask_save[1] != mask[pos + 1]))
> +		{
> +		  CONSTRUCTOR_NO_CLEARING (t) = 1;
> +		  continue;
> +		}
> +	    }
> +	}
> +      else if (is_byte_access_type (type)
> +	       && TYPE_MAIN_VARIANT (type) != char_type_node)
> +	{
> +	  HOST_WIDE_INT pos;
> +	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
> +	    pos = tree_to_shwi (index);
> +	  else
> +	    pos = int_byte_position (index);
> +	  if (mask[pos])
> +	    {
> +	      CONSTRUCTOR_NO_CLEARING (t) = 1;
> +	      mask[pos] = 0;
> +	      continue;
> +	    }
> +	}
> +      if (TREE_CODE (value) == CONSTRUCTOR)
> +	{
> +	  HOST_WIDE_INT pos;
> +	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
> +	    pos = tree_to_shwi (index)
> +		  * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t))));
> +	  else
> +	    pos = int_byte_position (index);
> +	  clear_uchar_or_std_byte_in_mask (loc, value, mask + pos);
> +	}
> +      if (i != j)
> +	{
> +	  CONSTRUCTOR_ELT (t, j)->index = index;
> +	  CONSTRUCTOR_ELT (t, j)->value = value;
> +	}
> +      ++j;
> +    }
> +  if (CONSTRUCTOR_NELTS (t) != j)
> +    vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
> +}
> +
>   /* Subroutine of cxx_eval_constant_expression.
>      Attempt to evaluate a BIT_CAST_EXPR.  */
>   
> @@ -4344,12 +4459,28 @@ cxx_eval_bit_cast (const constexpr_ctx *
>   
>     tree r = NULL_TREE;
>     if (can_native_interpret_type_p (TREE_TYPE (t)))
> -    r = native_interpret_expr (TREE_TYPE (t), ptr, len);
> +    {
> +      r = native_interpret_expr (TREE_TYPE (t), ptr, len);
> +      if (is_byte_access_type (TREE_TYPE (t))
> +	  && TYPE_MAIN_VARIANT (TREE_TYPE (t)) != char_type_node)
> +	{
> +	  gcc_assert (len == 1);
> +	  if (mask[0])
> +	    {
> +	      memset (mask, 0, len);
> +	      r = build_constructor (TREE_TYPE (r), NULL);
> +	      CONSTRUCTOR_NO_CLEARING (r) = 1;
> +	    }
> +	}
> +    }
>     else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
>       {
>         r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
>         if (r != NULL_TREE)
> -	clear_type_padding_in_mask (TREE_TYPE (t), mask);
> +	{
> +	  clear_type_padding_in_mask (TREE_TYPE (t), mask);
> +	  clear_uchar_or_std_byte_in_mask (loc, r, mask);
> +	}
>       }
>   
>     if (r != NULL_TREE)
> --- gcc/testsuite/g++.dg/cpp2a/bit-cast11.C.jj	2021-11-30 10:42:21.301491218 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/bit-cast11.C	2021-11-30 12:09:07.967090417 +0100
> @@ -0,0 +1,69 @@
> +// P1272R4
> +// { dg-do compile { target c++14 } }
> +
> +struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
> +struct T { char a; alignas(sizeof 0) int b; };
> +struct U { char a : 1; char : 6; char b : 1; };
> +struct V { int a; S b; };
> +struct W { unsigned a; T b; };
> +
> +constexpr bool
> +f1 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return u.a[0] == 1;
> +}
> +
> +constexpr bool
> +f2 ()
> +{
> +  U u = { 0, 0 };
> +  unsigned char a = __builtin_bit_cast (unsigned char, u);
> +  unsigned char b = a;
> +  return true;
> +}
> +
> +constexpr bool
> +f3 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return u.a[1] == 0;
> +}
> +
> +constexpr bool
> +f4 ()
> +{
> +  U u = { 0, 0 };
> +  unsigned char a = __builtin_bit_cast (unsigned char, u);
> +  unsigned char b = a;
> +  return b == 0;		// { dg-error "is not a constant expression" }
> +}
> +
> +constexpr bool
> +f5 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);
> +  V u = s;
> +  return u.b.a[0] == 1;
> +}
> +
> +constexpr bool
> +f6 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);
> +  V u = s;
> +  return u.b.a[1] == 1;
> +}
> +
> +constexpr bool a = f1 ();
> +constexpr bool b = f2 ();
> +constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
> +constexpr bool d = f4 ();
> +constexpr bool e = f5 ();
> +constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
> --- gcc/testsuite/g++.dg/cpp2a/bit-cast12.C.jj	2021-11-30 10:42:21.301491218 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/bit-cast12.C	2021-11-30 12:09:52.178449320 +0100
> @@ -0,0 +1,74 @@
> +// P1272R4
> +// { dg-do compile { target c++14 } }
> +
> +namespace std
> +{
> +  enum class byte : unsigned char {};
> +}
> +
> +struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
> +struct T { char a; alignas(sizeof 0) int b; };
> +struct U { char a : 1; char : 6; char b : 1; };
> +struct V { int a; S b; };
> +struct W { unsigned a; T b; };
> +
> +constexpr bool
> +f1 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return u.a[0] == 1;
> +}
> +
> +constexpr bool
> +f2 ()
> +{
> +  U u = { 0, 0 };
> +  unsigned char a = __builtin_bit_cast (unsigned char, u);
> +  unsigned char b = a;
> +  return true;
> +}
> +
> +constexpr bool
> +f3 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return u.a[1] == 0;
> +}
> +
> +constexpr bool
> +f4 ()
> +{
> +  U u = { 0, 0 };
> +  unsigned char a = __builtin_bit_cast (unsigned char, u);
> +  unsigned char b = a;
> +  return b == 0;		// { dg-error "is not a constant expression" }
> +}
> +
> +constexpr bool
> +f5 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);
> +  V u = s;
> +  return u.b.a[0] == 1;
> +}
> +
> +constexpr bool
> +f6 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);
> +  V u = s;
> +  return u.b.a[1] == 1;
> +}
> +
> +constexpr bool a = f1 ();
> +constexpr bool b = f2 ();
> +constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
> +constexpr bool d = f4 ();
> +constexpr bool e = f5 ();
> +constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
> --- gcc/testsuite/g++.dg/cpp2a/bit-cast13.C.jj	2021-11-30 10:42:21.302491204 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/bit-cast13.C	2021-11-30 10:42:21.302491204 +0100
> @@ -0,0 +1,69 @@
> +// P1272R4
> +// { dg-do compile { target c++14 } }
> +
> +struct S { char a[2]; alignas(sizeof 0) int b; };
> +struct T { char a; alignas(sizeof 0) int b; };
> +struct U { char a : 1; char : 6; char b : 1; };
> +struct V { int a; S b; };
> +struct W { unsigned a; T b; };
> +
> +constexpr bool
> +f1 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
> +  S u = s;
> +  return u.a[0] == 1;
> +}
> +
> +constexpr bool
> +f2 ()
> +{
> +  U u = { 0, 0 };
> +  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
> +  char b = a;
> +  return true;
> +}
> +
> +constexpr bool
> +f3 ()
> +{
> +  T t = { 1, 2 };
> +  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
> +  S u = s;
> +  return u.a[1] == 0;
> +}
> +
> +constexpr bool
> +f4 ()
> +{
> +  U u = { 0, 0 };
> +  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
> +  char b = a;
> +  return b == 0;
> +}
> +
> +constexpr bool
> +f5 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
> +  V u = s;
> +  return u.b.a[0] == 1;
> +}
> +
> +constexpr bool
> +f6 ()
> +{
> +  W t = { 1, 2 };
> +  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
> +  V u = s;
> +  return u.b.a[1] == 1;
> +}
> +
> +constexpr bool a = f1 ();
> +constexpr bool b = f2 ();
> +constexpr bool c = f3 ();
> +constexpr bool d = f4 ();
> +constexpr bool e = f5 ();
> +constexpr bool f = f6 ();
> --- gcc/testsuite/g++.dg/cpp2a/bit-cast14.C.jj	2021-11-30 12:10:24.413981883 +0100
> +++ gcc/testsuite/g++.dg/cpp2a/bit-cast14.C	2021-11-30 12:33:36.343979395 +0100
> @@ -0,0 +1,78 @@
> +// P1272R4
> +// { dg-do compile { target c++14 } }
> +
> +struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
> +struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
> +struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g : 24; };
> +struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g : 24; };
> +struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g : 24; };
> +
> +constexpr bool
> +f1 ()
> +{
> +  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return true;
> +}
> +
> +constexpr bool
> +f2 ()
> +{
> +  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return true;
> +}
> +
> +constexpr bool
> +f3 ()
> +{
> +  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  S u = s;
> +  return true;
> +}
> +
> +constexpr bool
> +f4 ()
> +{
> +  T4 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
> +  return true;
> +}
> +
> +constexpr bool
> +f5 ()
> +{
> +  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  unsigned char a = s.a;
> +  return true;
> +}
> +
> +constexpr bool
> +f6 ()
> +{
> +  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  unsigned char b = s.b;
> +  return true;
> +}
> +
> +constexpr bool
> +f7 ()
> +{
> +  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
> +  S s = __builtin_bit_cast (S, t);
> +  unsigned char c = s.c;
> +  return true;
> +}
> +
> +constexpr bool a = f1 ();
> +constexpr bool b = f2 ();
> +constexpr bool c = f3 ();
> +constexpr bool d = f4 ();
> +constexpr bool e = f5 ();	// { dg-error "accessing uninitialized member" }
> +constexpr bool f = f6 ();	// { dg-error "accessing uninitialized member" }
> +constexpr bool g = f7 ();	// { dg-error "accessing uninitialized member" }
> 
> 
> 	Jakub
>
  
Jakub Jelinek Dec. 1, 2021, 9:46 a.m. UTC | #2
On Tue, Nov 30, 2021 at 03:19:19PM -0500, Jason Merrill wrote:
> On 11/30/21 07:17, Jakub Jelinek wrote:
> > On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
> > > It's a DR.  Really, it was intended to be part of C++20; at the Cologne
> > > meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
> > > could go in as part of that paper.
> > 
> > Ok, changed to be done unconditionally now.
> > 
> > > Also, allowing indeterminate values that are never read was in C++20
> > > (P1331).
> > 
> > Reading P1331R2 again, I'm still puzzled.
> > Our current behavior (both before and after this patch) is that if
> > some variable is scalar and has indeterminate value or if an aggregate
> > variable has some members (possibly nested) with indeterminate values,
> > in constexpr contexts we allow copying those into other vars of the
> > same type (e.g. the testcases in the patch below test mere copying
> > of the whole structures or unsigned char result of __builtin_bit_cast),
> 
> That seems to be a bug, since the copy involves an lvalue-to-rvalue
> conversion.

Yeah.  The questions are
1) where in constexpr.c to check for this and diagnose it
2) if we do it quite often, whether check for it won't be too expensive,
   because we want to check recursively for CONSTRUCTORs with
   CONSTRUCTOR_NO_CLEARING and possibly whether it has any members or
   array elements that are omitted in the CONSTRUCTOR (or can we rely
   on CONSTRUCTOR_NO_CLEARING being set only if it is incomplete?  Then
   we'd need to make sure we clear that flag if we store the last member
   or array element into it)
E.g. for 1), I wonder where exactly the lvalue-to-rvalue conversions occur,
say for
struct S { int a, b, c; };
constexpr int
foo ()
{
  struct S s;
  s.a = 1;
  s.c = 3;
  return s.a;	// Does lvalue-to-rvalue conversion happen here on whole s
		// and then on s.a too, or just on s.a ?
}
E.g. when cxx_eval_component_reference or cxx_eval_array_reference
or cxx_eval_bit_field_ref we cxx_eval_constant_expression the base
first, so if the 1) diagnostics would be in cxx_eval_constant_expression
when it is called on an lvalue_p with !lval, we'd trigger it even when
we might not want to trigger it...

> > but we reject if we actually use them in some other way (e.g. try to
> > read a member from a variable that has that member indeterminate,
> > see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
> > unsigned char variable.
> 
> That's correct.
> 
> > Then there is P1331R2 which makes the UB on
> > "an lvalue-to-rvalue conversion that is applied to an object with
> > indeterminate value ([basic.indet]);"
> > but isn't even the
> >    unsigned char a = __builtin_bit_cast (unsigned char, u);
> >    unsigned char b = a;
> > case non-constant then when __builtin_bit_cast returns indeterminate value?
> 
> Good point.  So it would seem to follow that if the output is going to have
> an indeterminate value, it's non-constant, we don't have to work hard in
> constexpr evaluation, and f1-4 are all non-constant.  And the new bit_cast
> text is only interesting for non-constant evaluation.

I think it isn't that simple.  One thing is that std::bit_cast is described
as a black box that takes a reference argument, so whether lvalue-to-rvalue
conversion happens there is just something that isn't guaranteed (though,
the way __builtin_bit_cast is implemented which is called with the value
rather than its address there is lvalue-to-rvalue conversion on it).
But another more important thing is that at least in the testcases in the
patch if one kills those extra copies of __builtin_bit_cast lhs to other
vars which are non-constant according to P1331R2, I don't see anything
that would imply they are non-constant.  If there is the lvalue-to-rvalue
conversion on what the std::bit_cast argument refers to, that would make
non-constant e.g. class objects with some members indeterminate (e.g.
in the above foo s is like that), but in the testcases the
__builtin_bit_cast arguments have all named members initialized, all they
have is some padding bits in between those members and that can't be
non-constant in itself, otherwise most of constant evaluation would be
non-constant.  And the std::bit_cast rules say what happens in that case.

> > __builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
> > in that case, so supposely
> >    unsigned char a = __builtin_bit_cast (unsigned char, u);
> > is fine, but on
> 
> Eh, there's clearly an lvalue-rvalue conversion involved in reading from the
> source value.

Yes, but if u doesn't contain any indeterminate (leaf) members, then
it shouldn't be non-constant.

So IMHO we need the patch I've posted (with the testcases slightly adjusted
not to do those extra copies afterwards for now),
and try to deal with the lvalue-to-rvalue conversion of indeterminate later,
and once done, perhaps in a copy of those testcases readd those extra copies
afterwards and verify it is rejected.

	Jakub
  
Jason Merrill Dec. 2, 2021, 8:32 p.m. UTC | #3
On 12/1/21 04:46, Jakub Jelinek wrote:
> On Tue, Nov 30, 2021 at 03:19:19PM -0500, Jason Merrill wrote:
>> On 11/30/21 07:17, Jakub Jelinek wrote:
>>> On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
>>>> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
>>>> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
>>>> could go in as part of that paper.
>>>
>>> Ok, changed to be done unconditionally now.
>>>
>>>> Also, allowing indeterminate values that are never read was in C++20
>>>> (P1331).
>>>
>>> Reading P1331R2 again, I'm still puzzled.
>>> Our current behavior (both before and after this patch) is that if
>>> some variable is scalar and has indeterminate value or if an aggregate
>>> variable has some members (possibly nested) with indeterminate values,
>>> in constexpr contexts we allow copying those into other vars of the
>>> same type (e.g. the testcases in the patch below test mere copying
>>> of the whole structures or unsigned char result of __builtin_bit_cast),
>>
>> That seems to be a bug, since the copy involves an lvalue-to-rvalue
>> conversion.
> 
> Yeah.  The questions are
> 1) where in constexpr.c to check for this and diagnose it
> 2) if we do it quite often, whether check for it won't be too expensive,
>     because we want to check recursively for CONSTRUCTORs with
>     CONSTRUCTOR_NO_CLEARING and possibly whether it has any members or
>     array elements that are omitted in the CONSTRUCTOR (or can we rely
>     on CONSTRUCTOR_NO_CLEARING being set only if it is incomplete?  Then
>     we'd need to make sure we clear that flag if we store the last member
>     or array element into it)
> E.g. for 1), I wonder where exactly the lvalue-to-rvalue conversions occur,
> say for
> struct S { int a, b, c; };
> constexpr int
> foo ()
> {
>    struct S s;
>    s.a = 1;
>    s.c = 3;
>    return s.a;	// Does lvalue-to-rvalue conversion happen here on whole s
> 		// and then on s.a too, or just on s.a ?

Only on s.a.  Class copy is modeled as memberwise copy, even though we 
optimize it into block copy for trivially copyable classes.

> }
> E.g. when cxx_eval_component_reference or cxx_eval_array_reference
> or cxx_eval_bit_field_ref we cxx_eval_constant_expression the base
> first, so if the 1) diagnostics would be in cxx_eval_constant_expression
> when it is called on an lvalue_p with !lval, we'd trigger it even when
> we might not want to trigger it...

I think we can work around that so that we complain about the full 
CONSTRUCTOR when we're doing a full copy, and otherwise only complain 
about the member.

>>> but we reject if we actually use them in some other way (e.g. try to
>>> read a member from a variable that has that member indeterminate,
>>> see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
>>> unsigned char variable.
>>
>> That's correct.
>>
>>> Then there is P1331R2 which makes the UB on
>>> "an lvalue-to-rvalue conversion that is applied to an object with
>>> indeterminate value ([basic.indet]);"
>>> but isn't even the
>>>     unsigned char a = __builtin_bit_cast (unsigned char, u);
>>>     unsigned char b = a;
>>> case non-constant then when __builtin_bit_cast returns indeterminate value?
>>
>> Good point.  So it would seem to follow that if the output is going to have
>> an indeterminate value, it's non-constant, we don't have to work hard in
>> constexpr evaluation, and f1-4 are all non-constant.  And the new bit_cast
>> text is only interesting for non-constant evaluation.
> 
> I think it isn't that simple.  One thing is that std::bit_cast is described
> as a black box that takes a reference argument, so whether lvalue-to-rvalue
> conversion happens there is just something that isn't guaranteed (though,
> the way __builtin_bit_cast is implemented which is called with the value
> rather than its address there is lvalue-to-rvalue conversion on it).

Eh, all functions in the library are described as black boxes.  But your 
next argument is more persuasive:

> But another more important thing is that at least in the testcases in the
> patch if one kills those extra copies of __builtin_bit_cast lhs to other
> vars which are non-constant according to P1331R2, I don't see anything
> that would imply they are non-constant.  If there is the lvalue-to-rvalue
> conversion on what the std::bit_cast argument refers to, that would make
> non-constant e.g. class objects with some members indeterminate (e.g.
> in the above foo s is like that), but in the testcases the
> __builtin_bit_cast arguments have all named members initialized, all they
> have is some padding bits in between those members and that can't be
> non-constant in itself, otherwise most of constant evaluation would be
> non-constant.  And the std::bit_cast rules say what happens in that case.

Ah, good point.

>>> __builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
>>> in that case, so supposely
>>>     unsigned char a = __builtin_bit_cast (unsigned char, u);
>>> is fine, but on
>>
>> Eh, there's clearly an lvalue-rvalue conversion involved in reading from the
>> source value.
> 
> Yes, but if u doesn't contain any indeterminate (leaf) members, then
> it shouldn't be non-constant.
> 
> So IMHO we need the patch I've posted (with the testcases slightly adjusted
> not to do those extra copies afterwards for now),
> and try to deal with the lvalue-to-rvalue conversion of indeterminate later,
> and once done, perhaps in a copy of those testcases readd those extra copies
> afterwards and verify it is rejected.

Makes sense, except:

> +	      gcc_assert (end == 1 || end == 2);

This seems to assert that the value representation of a bit-field can't 
span more than two bytes, which is wrong for, say,

struct A
{
   int : 1;
   unsigned __int128 c : 128; // value representation spans 17 bytes
};

Jason
  
Jakub Jelinek Dec. 2, 2021, 8:56 p.m. UTC | #4
On Thu, Dec 02, 2021 at 03:32:58PM -0500, Jason Merrill wrote:
> > So IMHO we need the patch I've posted (with the testcases slightly adjusted
> > not to do those extra copies afterwards for now),
> > and try to deal with the lvalue-to-rvalue conversion of indeterminate later,
> > and once done, perhaps in a copy of those testcases readd those extra copies
> > afterwards and verify it is rejected.
> 
> Makes sense, except:
> 
> > +	      gcc_assert (end == 1 || end == 2);
> 
> This seems to assert that the value representation of a bit-field can't span
> more than two bytes, which is wrong for, say,
> 
> struct A
> {
>   int : 1;
>   unsigned __int128 c : 128; // value representation spans 17 bytes
> };

Sure, for arbitrary bitfields yes.  But, the patch only deals with
unsigned char and std::byte TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (fld))
bitfields (others don't have the type listed in the standard and so should
error right away, which is done by keeping the mask bits uncleared).

For PCC_BITFIELD_TYPE_MATTERS I think end must be 1 even, if a bit-field
would cross the byte boundary, it would be allocated in the next byte.
But !PCC_BITFIELD_TYPE_MATTERS targets can cross the boundary, but as it
can't be more than BITS_PER_UNIT bits (oversized bitfields are with warning
downsized to the underlying's type precision), it can occupy at most 2
bytes.

	Jakub
  
Jason Merrill Dec. 3, 2021, 6:56 p.m. UTC | #5
On 12/2/21 15:56, Jakub Jelinek wrote:
> On Thu, Dec 02, 2021 at 03:32:58PM -0500, Jason Merrill wrote:
>>> So IMHO we need the patch I've posted (with the testcases slightly adjusted
>>> not to do those extra copies afterwards for now),
>>> and try to deal with the lvalue-to-rvalue conversion of indeterminate later,
>>> and once done, perhaps in a copy of those testcases readd those extra copies
>>> afterwards and verify it is rejected.
>>
>> Makes sense, except:
>>
>>> +	      gcc_assert (end == 1 || end == 2);
>>
>> This seems to assert that the value representation of a bit-field can't span
>> more than two bytes, which is wrong for, say,
>>
>> struct A
>> {
>>    int : 1;
>>    unsigned __int128 c : 128; // value representation spans 17 bytes
>> };
> 
> Sure, for arbitrary bitfields yes.  But, the patch only deals with
> unsigned char and std::byte TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (fld))
> bitfields (others don't have the type listed in the standard and so should
> error right away, which is done by keeping the mask bits uncleared).

Ah, of course.  The patch is OK, though you might factor

> +      else if (is_byte_access_type (type)
> +	       && TYPE_MAIN_VARIANT (type) != char_type_node)

into is_byte_access_type_not_plain_char.  OK either way.

Jason
  
Jakub Jelinek Dec. 4, 2021, 10:12 a.m. UTC | #6
On Fri, Dec 03, 2021 at 01:56:56PM -0500, Jason Merrill wrote:
> Ah, of course.  The patch is OK, though you might factor
> 
> > +      else if (is_byte_access_type (type)
> > +	       && TYPE_MAIN_VARIANT (type) != char_type_node)
> 
> into is_byte_access_type_not_plain_char.  OK either way.

Thanks, here is what I've committed in the end after another
bootstrap/regtest.

2021-12-04  Jakub Jelinek <jakub@redhat.com>

	* cp-tree.h (is_byte_access_type_not_plain_char): Declare.
	* tree.c (is_byte_access_type_not_plain_char): New function.
	* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
	(cxx_eval_bit_cast): Don't error about padding bits if target
	type is unsigned char or std::byte, instead return no clearing
	ctor.  Use clear_uchar_or_std_byte_in_mask.

	* g++.dg/cpp2a/bit-cast11.C: New test.
	* g++.dg/cpp2a/bit-cast12.C: New test.
	* g++.dg/cpp2a/bit-cast13.C: New test.
	* g++.dg/cpp2a/bit-cast14.C: New test.

--- gcc/cp/cp-tree.h.jj	2021-11-26 10:10:53.216121541 +0100
+++ gcc/cp/cp-tree.h	2021-12-03 21:20:39.880277132 +0100
@@ -7791,6 +7791,7 @@ extern tree build_dummy_object			(tree);
 extern tree maybe_dummy_object			(tree, tree *);
 extern bool is_dummy_object			(const_tree);
 extern bool is_byte_access_type			(tree);
+extern bool is_byte_access_type_not_plain_char	(tree);
 extern const struct attribute_spec cxx_attribute_table[];
 extern tree make_ptrmem_cst			(tree, tree);
 extern tree cp_build_type_attribute_variant     (tree, tree);
--- gcc/cp/tree.c.jj	2021-11-26 10:10:53.218121512 +0100
+++ gcc/cp/tree.c	2021-12-03 21:23:37.373738339 +0100
@@ -4311,6 +4311,18 @@ is_byte_access_type (tree type)
 	  && !strcmp ("byte", TYPE_NAME_STRING (type)));
 }
 
+/* Returns true if TYPE is unsigned char or std::byte.  */
+
+bool
+is_byte_access_type_not_plain_char (tree type)
+{
+  type = TYPE_MAIN_VARIANT (type);
+  if (type == char_type_node)
+    return false;
+
+  return is_byte_access_type (type);
+}
+
 /* Returns 1 iff type T is something we want to treat as a scalar type for
    the purpose of deciding whether it is trivial/POD/standard-layout.  */
 
--- gcc/cp/constexpr.c.jj	2021-12-02 19:41:52.578553507 +0100
+++ gcc/cp/constexpr.c	2021-12-03 21:24:29.870987445 +0100
@@ -4275,6 +4275,118 @@ check_bit_cast_type (const constexpr_ctx
   return false;
 }
 
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+    return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+    {
+      tree type = TREE_TYPE (value);
+      if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+	  && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+	{
+	  if (is_byte_access_type_not_plain_char (DECL_BIT_FIELD_TYPE (index)))
+	    {
+	      HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+	      gcc_assert (fldsz != 0);
+	      HOST_WIDE_INT pos = int_byte_position (index);
+	      HOST_WIDE_INT bpos
+		= tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+	      bpos %= BITS_PER_UNIT;
+	      HOST_WIDE_INT end
+		= ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+	      gcc_assert (end == 1 || end == 2);
+	      unsigned char *p = mask + pos;
+	      unsigned char mask_save[2];
+	      mask_save[0] = mask[pos];
+	      mask_save[1] = end == 2 ? mask[pos + 1] : 0;
+	      if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+		sorry_at (loc, "PDP11 bit-field handling unsupported"
+			       " in %qs", "__builtin_bit_cast");
+	      else if (BYTES_BIG_ENDIAN)
+		{
+		  /* Big endian.  */
+		  if (bpos + fldsz <= BITS_PER_UNIT)
+		    *p &= ~(((1 << fldsz) - 1)
+			    << (BITS_PER_UNIT - bpos - fldsz));
+		  else
+		    {
+		      gcc_assert (bpos);
+		      *p &= ~(((1U << BITS_PER_UNIT) - 1) >> bpos);
+		      p++;
+		      fldsz -= BITS_PER_UNIT - bpos;
+		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+		      *p &= ((1U << BITS_PER_UNIT) - 1) >> fldsz;
+		    }
+		}
+	      else
+		{
+		  /* Little endian.  */
+		  if (bpos + fldsz <= BITS_PER_UNIT)
+		    *p &= ~(((1 << fldsz) - 1) << bpos);
+		  else
+		    {
+		      gcc_assert (bpos);
+		      *p &= ~(((1 << BITS_PER_UNIT) - 1) << bpos);
+		      p++;
+		      fldsz -= BITS_PER_UNIT - bpos;
+		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+		      *p &= ~((1 << fldsz) - 1);
+		    }
+		}
+	      if (mask_save[0] != mask[pos]
+		  || (end == 2 && mask_save[1] != mask[pos + 1]))
+		{
+		  CONSTRUCTOR_NO_CLEARING (t) = 1;
+		  continue;
+		}
+	    }
+	}
+      else if (is_byte_access_type_not_plain_char (type))
+	{
+	  HOST_WIDE_INT pos;
+	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+	    pos = tree_to_shwi (index);
+	  else
+	    pos = int_byte_position (index);
+	  if (mask[pos])
+	    {
+	      CONSTRUCTOR_NO_CLEARING (t) = 1;
+	      mask[pos] = 0;
+	      continue;
+	    }
+	}
+      if (TREE_CODE (value) == CONSTRUCTOR)
+	{
+	  HOST_WIDE_INT pos;
+	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+	    pos = tree_to_shwi (index)
+		  * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t))));
+	  else
+	    pos = int_byte_position (index);
+	  clear_uchar_or_std_byte_in_mask (loc, value, mask + pos);
+	}
+      if (i != j)
+	{
+	  CONSTRUCTOR_ELT (t, j)->index = index;
+	  CONSTRUCTOR_ELT (t, j)->value = value;
+	}
+      ++j;
+    }
+  if (CONSTRUCTOR_NELTS (t) != j)
+    vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
+}
+
 /* Subroutine of cxx_eval_constant_expression.
    Attempt to evaluate a BIT_CAST_EXPR.  */
 
@@ -4351,12 +4463,27 @@ cxx_eval_bit_cast (const constexpr_ctx *
 
   tree r = NULL_TREE;
   if (can_native_interpret_type_p (TREE_TYPE (t)))
-    r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+    {
+      r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+      if (is_byte_access_type_not_plain_char (TREE_TYPE (t)))
+	{
+	  gcc_assert (len == 1);
+	  if (mask[0])
+	    {
+	      memset (mask, 0, len);
+	      r = build_constructor (TREE_TYPE (r), NULL);
+	      CONSTRUCTOR_NO_CLEARING (r) = 1;
+	    }
+	}
+    }
   else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
     {
       r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
       if (r != NULL_TREE)
-	clear_type_padding_in_mask (TREE_TYPE (t), mask);
+	{
+	  clear_type_padding_in_mask (TREE_TYPE (t), mask);
+	  clear_uchar_or_std_byte_in_mask (loc, r, mask);
+	}
     }
 
   if (r != NULL_TREE)
--- gcc/testsuite/g++.dg/cpp2a/bit-cast11.C.jj	2021-12-03 21:19:29.273286900 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast11.C	2021-12-03 21:28:17.871726216 +0100
@@ -0,0 +1,63 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  return s.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  return s.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  return a == 0;		// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  return s.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  return s.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast12.C.jj	2021-12-03 21:19:29.273286900 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast12.C	2021-12-03 21:28:41.353390350 +0100
@@ -0,0 +1,68 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+namespace std
+{
+  enum class byte : unsigned char {};
+}
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  return s.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  return s.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  return a == 0;		// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  return s.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  return s.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast13.C.jj	2021-12-03 21:19:29.273286900 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast13.C	2021-12-03 21:29:01.434103124 +0100
@@ -0,0 +1,63 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  return s.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  return s.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  return a == 0;
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  return s.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  return s.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();
--- gcc/testsuite/g++.dg/cpp2a/bit-cast14.C.jj	2021-12-03 21:19:29.274286886 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast14.C	2021-12-03 21:29:48.724426700 +0100
@@ -0,0 +1,75 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g : 24; };
+struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g : 24; };
+
+constexpr bool
+f1 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  return true;
+}
+
+constexpr bool
+f2 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  return true;
+}
+
+constexpr bool
+f4 ()
+{
+  T4 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  return true;
+}
+
+constexpr bool
+f5 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char a = s.a;
+  return true;
+}
+
+constexpr bool
+f6 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char b = s.b;
+  return true;
+}
+
+constexpr bool
+f7 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char c = s.c;
+  return true;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool g = f7 ();	// { dg-error "accessing uninitialized member" }


	Jakub
  

Patch

--- gcc/cp/constexpr.c.jj	2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c	2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@  check_bit_cast_type (const constexpr_ctx
   return false;
 }
 
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+    return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+    {
+      tree type = TREE_TYPE (value);
+      if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+	  && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+	{
+	  if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+	      && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+		  != char_type_node))
+	    {
+	      HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+	      gcc_assert (fldsz != 0);
+	      HOST_WIDE_INT pos = int_byte_position (index);
+	      HOST_WIDE_INT bpos
+		= tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+	      bpos %= BITS_PER_UNIT;
+	      HOST_WIDE_INT end
+		= ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+	      gcc_assert (end == 1 || end == 2);
+	      unsigned char *p = mask + pos;
+	      unsigned char mask_save[2];
+	      mask_save[0] = mask[pos];
+	      mask_save[1] = end == 2 ? mask[pos + 1] : 0;
+	      if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+		sorry_at (loc, "PDP11 bit-field handling unsupported"
+			       " in %qs", "__builtin_bit_cast");
+	      else if (BYTES_BIG_ENDIAN)
+		{
+		  /* Big endian.  */
+		  if (bpos + fldsz <= BITS_PER_UNIT)
+		    *p &= ~(((1 << fldsz) - 1)
+			    << (BITS_PER_UNIT - bpos - fldsz));
+		  else
+		    {
+		      gcc_assert (bpos);
+		      *p &= ~(((1U << BITS_PER_UNIT) - 1) >> bpos);
+		      p++;
+		      fldsz -= BITS_PER_UNIT - bpos;
+		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+		      *p &= ((1U << BITS_PER_UNIT) - 1) >> fldsz;
+		    }
+		}
+	      else
+		{
+		  /* Little endian.  */
+		  if (bpos + fldsz <= BITS_PER_UNIT)
+		    *p &= ~(((1 << fldsz) - 1) << bpos);
+		  else
+		    {
+		      gcc_assert (bpos);
+		      *p &= ~(((1 << BITS_PER_UNIT) - 1) << bpos);
+		      p++;
+		      fldsz -= BITS_PER_UNIT - bpos;
+		      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+		      *p &= ~((1 << fldsz) - 1);
+		    }
+		}
+	      if (mask_save[0] != mask[pos]
+		  || (end == 2 && mask_save[1] != mask[pos + 1]))
+		{
+		  CONSTRUCTOR_NO_CLEARING (t) = 1;
+		  continue;
+		}
+	    }
+	}
+      else if (is_byte_access_type (type)
+	       && TYPE_MAIN_VARIANT (type) != char_type_node)
+	{
+	  HOST_WIDE_INT pos;
+	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+	    pos = tree_to_shwi (index);
+	  else
+	    pos = int_byte_position (index);
+	  if (mask[pos])
+	    {
+	      CONSTRUCTOR_NO_CLEARING (t) = 1;
+	      mask[pos] = 0;
+	      continue;
+	    }
+	}
+      if (TREE_CODE (value) == CONSTRUCTOR)
+	{
+	  HOST_WIDE_INT pos;
+	  if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+	    pos = tree_to_shwi (index)
+		  * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t))));
+	  else
+	    pos = int_byte_position (index);
+	  clear_uchar_or_std_byte_in_mask (loc, value, mask + pos);
+	}
+      if (i != j)
+	{
+	  CONSTRUCTOR_ELT (t, j)->index = index;
+	  CONSTRUCTOR_ELT (t, j)->value = value;
+	}
+      ++j;
+    }
+  if (CONSTRUCTOR_NELTS (t) != j)
+    vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
+}
+
 /* Subroutine of cxx_eval_constant_expression.
    Attempt to evaluate a BIT_CAST_EXPR.  */
 
@@ -4344,12 +4459,28 @@  cxx_eval_bit_cast (const constexpr_ctx *
 
   tree r = NULL_TREE;
   if (can_native_interpret_type_p (TREE_TYPE (t)))
-    r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+    {
+      r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+      if (is_byte_access_type (TREE_TYPE (t))
+	  && TYPE_MAIN_VARIANT (TREE_TYPE (t)) != char_type_node)
+	{
+	  gcc_assert (len == 1);
+	  if (mask[0])
+	    {
+	      memset (mask, 0, len);
+	      r = build_constructor (TREE_TYPE (r), NULL);
+	      CONSTRUCTOR_NO_CLEARING (r) = 1;
+	    }
+	}
+    }
   else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
     {
       r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
       if (r != NULL_TREE)
-	clear_type_padding_in_mask (TREE_TYPE (t), mask);
+	{
+	  clear_type_padding_in_mask (TREE_TYPE (t), mask);
+	  clear_uchar_or_std_byte_in_mask (loc, r, mask);
+	}
     }
 
   if (r != NULL_TREE)
--- gcc/testsuite/g++.dg/cpp2a/bit-cast11.C.jj	2021-11-30 10:42:21.301491218 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast11.C	2021-11-30 12:09:07.967090417 +0100
@@ -0,0 +1,69 @@ 
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return b == 0;		// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast12.C.jj	2021-11-30 10:42:21.301491218 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast12.C	2021-11-30 12:09:52.178449320 +0100
@@ -0,0 +1,74 @@ 
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+namespace std
+{
+  enum class byte : unsigned char {};
+}
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return b == 0;		// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast13.C.jj	2021-11-30 10:42:21.302491204 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast13.C	2021-11-30 10:42:21.302491204 +0100
@@ -0,0 +1,69 @@ 
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  char b = a;
+  return b == 0;
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();
--- gcc/testsuite/g++.dg/cpp2a/bit-cast14.C.jj	2021-11-30 12:10:24.413981883 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast14.C	2021-11-30 12:33:36.343979395 +0100
@@ -0,0 +1,78 @@ 
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g : 24; };
+struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g : 24; };
+
+constexpr bool
+f1 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f2 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f4 ()
+{
+  T4 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  return true;
+}
+
+constexpr bool
+f5 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char a = s.a;
+  return true;
+}
+
+constexpr bool
+f6 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char b = s.b;
+  return true;
+}
+
+constexpr bool
+f7 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char c = s.c;
+  return true;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool g = f7 ();	// { dg-error "accessing uninitialized member" }