From patchwork Tue Nov 30 12:17:07 2021
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 48277
Date: Tue, 30 Nov 2021 13:17:07 +0100
From: Jakub Jelinek
To: Jason Merrill
Cc: Jonathan Wakely, gcc-patches@gcc.gnu.org
Subject: [PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4
Message-ID: <20211130121707.GM2646553@tucnak>
References: <20211129163142.GG2646553@tucnak>

On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
> could go in as part of that paper.

Ok, changed to be done unconditionally now.

> Also, allowing indeterminate values that are never read was in C++20
> (P1331).

Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if some
variable is scalar and has indeterminate value, or if an aggregate variable
has some members (possibly nested) with indeterminate values, in constexpr
contexts we allow copying those into other vars of the same type (e.g.
the testcases in the patch below test mere copying of the whole structures
or of the unsigned char result of __builtin_bit_cast), but we reject it if
we actually use them in some other way (e.g. try to read a member from a
variable that has that member indeterminate, see e.g. bit-cast14.C
(f5, f6, f7)), even when reading it into an unsigned char variable.

Then there is P1331R2, which adds the wording
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
  unsigned char a = __builtin_bit_cast (unsigned char, u);
  unsigned char b = a;
case non-constant then, when __builtin_bit_cast returns an indeterminate
value?  __builtin_bit_cast returns an rvalue, so no lvalue-to-rvalue
conversion happens in that case, so supposedly
  unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but in
  unsigned char b = a;
a is an lvalue and is converted to an rvalue.
Similarly
  T t = { 1, 2 };
  S s = __builtin_bit_cast (S, t);
  S u = s;
where
  S s = __builtin_bit_cast (S, t);
could be ok even when some or all members are indeterminate, but
u = s; does an lvalue-to-rvalue conversion?

Or there is http://eel.is/c++draft/basic.indet, which has quite clear rules
on what is and isn't UB, and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.

> I think in all of them the result of the cast has (some) indeterminate
> value.
> So f1-3 are OK because the indeterminate value has unsigned char
> type and is never used; f4() is non-constant because S::f has
> non-byte-access type and so the new wording says it's undefined.

Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek

        * constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
        (cxx_eval_bit_cast): Don't error about padding bits if target
        type is unsigned char or std::byte, instead return no clearing
        ctor.  Use clear_uchar_or_std_byte_in_mask.

        * g++.dg/cpp2a/bit-cast11.C: New test.
        * g++.dg/cpp2a/bit-cast12.C: New test.
        * g++.dg/cpp2a/bit-cast13.C: New test.
        * g++.dg/cpp2a/bit-cast14.C: New test.

        Jakub

--- gcc/cp/constexpr.c.jj	2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c	2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
   return false;
 }
 
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+    return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+    {
+      tree type = TREE_TYPE (value);
+      if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+          && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+        {
+          if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+              && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+                  != char_type_node))
+            {
+              HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+              gcc_assert (fldsz != 0);
+              HOST_WIDE_INT pos = int_byte_position (index);
+              HOST_WIDE_INT bpos
+                = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+              bpos %= BITS_PER_UNIT;
+              HOST_WIDE_INT end
+                = ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+              gcc_assert (end == 1 || end == 2);
+              unsigned char *p = mask + pos;
+              unsigned char mask_save[2];
+              mask_save[0] = mask[pos];
+              mask_save[1] = end == 2 ? mask[pos + 1] : 0;
+              if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+                sorry_at (loc, "PDP11 bit-field handling unsupported"
+                               " in %qs", "__builtin_bit_cast");
+              else if (BYTES_BIG_ENDIAN)
+                {
+                  /* Big endian.  */
+                  if (bpos + fldsz <= BITS_PER_UNIT)
+                    *p &= ~(((1 << fldsz) - 1)
+                            << (BITS_PER_UNIT - bpos - fldsz));
+                  else
+                    {
+                      gcc_assert (bpos);
+                      *p &= ~(((1U << BITS_PER_UNIT) - 1) >> bpos);
+                      p++;
+                      fldsz -= BITS_PER_UNIT - bpos;
+                      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+                      *p &= ((1U << BITS_PER_UNIT) - 1) >> fldsz;
+                    }
+                }
+              else
+                {
+                  /* Little endian.  */
+                  if (bpos + fldsz <= BITS_PER_UNIT)
+                    *p &= ~(((1 << fldsz) - 1) << bpos);
+                  else
+                    {
+                      gcc_assert (bpos);
+                      *p &= ~(((1 << BITS_PER_UNIT) - 1) << bpos);
+                      p++;
+                      fldsz -= BITS_PER_UNIT - bpos;
+                      gcc_assert (fldsz && fldsz < BITS_PER_UNIT);
+                      *p &= ~((1 << fldsz) - 1);
+                    }
+                }
+              if (mask_save[0] != mask[pos]
+                  || (end == 2 && mask_save[1] != mask[pos + 1]))
+                {
+                  CONSTRUCTOR_NO_CLEARING (t) = 1;
+                  continue;
+                }
+            }
+        }
+      else if (is_byte_access_type (type)
+               && TYPE_MAIN_VARIANT (type) != char_type_node)
+        {
+          HOST_WIDE_INT pos;
+          if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+            pos = tree_to_shwi (index);
+          else
+            pos = int_byte_position (index);
+          if (mask[pos])
+            {
+              CONSTRUCTOR_NO_CLEARING (t) = 1;
+              mask[pos] = 0;
+              continue;
+            }
+        }
+      if (TREE_CODE (value) == CONSTRUCTOR)
+        {
+          HOST_WIDE_INT pos;
+          if (TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE)
+            pos = tree_to_shwi (index)
+                  * tree_to_shwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (t))));
+          else
+            pos = int_byte_position (index);
+          clear_uchar_or_std_byte_in_mask (loc, value, mask + pos);
+        }
+      if (i != j)
+        {
+          CONSTRUCTOR_ELT (t, j)->index = index;
+          CONSTRUCTOR_ELT (t, j)->value = value;
+        }
+      ++j;
+    }
+  if (CONSTRUCTOR_NELTS (t) != j)
+    vec_safe_truncate (CONSTRUCTOR_ELTS (t), j);
+}
+
 /* Subroutine of cxx_eval_constant_expression.
    Attempt to evaluate a BIT_CAST_EXPR.  */
 
@@ -4344,12 +4459,28 @@ cxx_eval_bit_cast (const constexpr_ctx *
 
   tree r = NULL_TREE;
   if (can_native_interpret_type_p (TREE_TYPE (t)))
-    r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+    {
+      r = native_interpret_expr (TREE_TYPE (t), ptr, len);
+      if (is_byte_access_type (TREE_TYPE (t))
+          && TYPE_MAIN_VARIANT (TREE_TYPE (t)) != char_type_node)
+        {
+          gcc_assert (len == 1);
+          if (mask[0])
+            {
+              memset (mask, 0, len);
+              r = build_constructor (TREE_TYPE (r), NULL);
+              CONSTRUCTOR_NO_CLEARING (r) = 1;
+            }
+        }
+    }
   else if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
     {
       r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
       if (r != NULL_TREE)
-        clear_type_padding_in_mask (TREE_TYPE (t), mask);
+        {
+          clear_type_padding_in_mask (TREE_TYPE (t), mask);
+          clear_uchar_or_std_byte_in_mask (loc, r, mask);
+        }
     }
 
   if (r != NULL_TREE)
--- gcc/testsuite/g++.dg/cpp2a/bit-cast11.C.jj	2021-11-30 10:42:21.301491218 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast11.C	2021-11-30 12:09:07.967090417 +0100
@@ -0,0 +1,69 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return b == 0;	// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast12.C.jj	2021-11-30 10:42:21.301491218 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast12.C	2021-11-30 12:09:52.178449320 +0100
@@ -0,0 +1,74 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+namespace std
+{
+  enum class byte : unsigned char {};
+}
+
+struct S { unsigned char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  unsigned char a = __builtin_bit_cast (unsigned char, u);
+  unsigned char b = a;
+  return b == 0;	// { dg-error "is not a constant expression" }
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();	// { dg-error "accessing uninitialized array element" }
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized array element" }
--- gcc/testsuite/g++.dg/cpp2a/bit-cast13.C.jj	2021-11-30 10:42:21.302491204 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast13.C	2021-11-30 10:42:21.302491204 +0100
@@ -0,0 +1,69 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { char a[2]; alignas(sizeof 0) int b; };
+struct T { char a; alignas(sizeof 0) int b; };
+struct U { char a : 1; char : 6; char b : 1; };
+struct V { int a; S b; };
+struct W { unsigned a; T b; };
+
+constexpr bool
+f1 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  S u = s;
+  return u.a[0] == 1;
+}
+
+constexpr bool
+f2 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  char b = a;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T t = { 1, 2 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  S u = s;
+  return u.a[1] == 0;
+}
+
+constexpr bool
+f4 ()
+{
+  U u = { 0, 0 };
+  char a = __builtin_bit_cast (char, u);	// { dg-error "accessing uninitialized byte" }
+  char b = a;
+  return b == 0;
+}
+
+constexpr bool
+f5 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  V u = s;
+  return u.b.a[0] == 1;
+}
+
+constexpr bool
+f6 ()
+{
+  W t = { 1, 2 };
+  V s = __builtin_bit_cast (V, t);	// { dg-error "accessing uninitialized byte" }
+  V u = s;
+  return u.b.a[1] == 1;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();
--- gcc/testsuite/g++.dg/cpp2a/bit-cast14.C.jj	2021-11-30 12:10:24.413981883 +0100
+++ gcc/testsuite/g++.dg/cpp2a/bit-cast14.C	2021-11-30 12:33:36.343979395 +0100
@@ -0,0 +1,78 @@
+// P1272R4
+// { dg-do compile { target c++14 } }
+
+struct S { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T1 { unsigned char a : 1, : 7, b : 5, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T2 { unsigned char a : 8, b : 1, : 4, c : 3, d, e; unsigned int f : 8, g : 24; };
+struct T3 { unsigned char a : 8, b : 5, c : 1, : 2, d, e; unsigned int f : 8, g : 24; };
+struct T4 { unsigned char a : 8, b : 5, c : 3, d, e; unsigned int f : 1, : 7, g : 24; };
+
+constexpr bool
+f1 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f2 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f3 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  S u = s;
+  return true;
+}
+
+constexpr bool
+f4 ()
+{
+  T4 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);	// { dg-error "accessing uninitialized byte" }
+  return true;
+}
+
+constexpr bool
+f5 ()
+{
+  T1 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char a = s.a;
+  return true;
+}
+
+constexpr bool
+f6 ()
+{
+  T2 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char b = s.b;
+  return true;
+}
+
+constexpr bool
+f7 ()
+{
+  T3 t = { 0, 0, 0, 0, 0, 0, 0 };
+  S s = __builtin_bit_cast (S, t);
+  unsigned char c = s.c;
+  return true;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool f = f6 ();	// { dg-error "accessing uninitialized member" }
+constexpr bool g = f7 ();	// { dg-error "accessing uninitialized member" }
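
As an aside for readers of the little-endian branch of clear_uchar_or_std_byte_in_mask in the patch: the standalone sketch below mirrors only the mask arithmetic, outside of GCC. The name clear_field_bits_le and its bool return value are hypothetical and not part of the patch. It assumes CHAR_BIT stands in for BITS_PER_UNIT, that set bits in the mask mean "indeterminate", and that the field is at most one byte wide, so it can straddle at most two mask bytes.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper, not part of GCC.  MASK points at the byte holding
// the first bit of the bit-field, BPOS is the bit offset within that byte
// (little-endian bit numbering) and FLDSZ is the field width in bits.
// Clears the field's bits in MASK and returns true if any of them had
// been set, i.e. if the field overlapped indeterminate bits.
bool
clear_field_bits_le (unsigned char *mask, unsigned bpos, unsigned fldsz)
{
  assert (bpos < CHAR_BIT && fldsz != 0 && fldsz <= CHAR_BIT);
  const bool two_bytes = bpos + fldsz > CHAR_BIT;
  const unsigned char save0 = mask[0];
  const unsigned char save1 = two_bytes ? mask[1] : 0;
  if (!two_bytes)
    // The whole field lives in the first byte.
    mask[0] = static_cast<unsigned char> (mask[0]
                                          & ~(((1u << fldsz) - 1) << bpos));
  else
    {
      // Upper bits of the first byte, then the low bits of the second.
      mask[0] = static_cast<unsigned char> (mask[0]
                                            & ~(((1u << CHAR_BIT) - 1) << bpos));
      fldsz -= CHAR_BIT - bpos;
      mask[1] = static_cast<unsigned char> (mask[1] & ~((1u << fldsz) - 1));
    }
  return save0 != mask[0] || (two_bytes && save1 != mask[1]);
}
```

For instance, with mask bytes { 0xff, 0xff }, clearing a 5-bit field at bit offset 6 leaves { 0x3f, 0xf8 } and reports a change; that corresponds to the case where the patch sets CONSTRUCTOR_NO_CLEARING and drops the constructor element.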