From patchwork Tue Jan 18 10:17:41 2022
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 50150
Date: Tue, 18 Jan 2022 11:17:41 +0100
From: Jakub Jelinek
To: Jason Merrill
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] c++: Further address_compare fixes [PR89074]
Message-ID: <20220118101741.GW2646553@tucnak>
References: <20220106092416.GX2646553@tucnak>
 <494ae254-aff9-8be9-1cee-5ee42b6db593@redhat.com>
In-Reply-To: <494ae254-aff9-8be9-1cee-5ee42b6db593@redhat.com>

On Thu, Jan 13, 2022 at 04:18:33PM -0500, Jason Merrill wrote:
> > Note, address_compare has some special cases, e.g. it assumes that
> > static vars are never adjacent to automatic vars, which is the case
> > for the usual layout where automatic vars are on the stack and after
> > .rodata/.data sections there is heap:
> >   /* Assume that automatic variables can't be adjacent to global
> >      variables.  */
> >   else if (is_global_var (base0) != is_global_var (base1))
> >     ;
> > Is it ok that during constant evaluation we don't treat those as undefined
> > behavior, or shall that be with !folding_initializer && too?
>
> I guess that's undefined as well.

Ok, the following patch guards that too

> > Another special case is:
> >   if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
> >       || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
> >       || (TREE_CODE (base0) == STRING_CST
> >           && TREE_CODE (base1) == STRING_CST
> >           && ioff0 >= 0 && ioff1 >= 0
> >           && ioff0 < TREE_STRING_LENGTH (base0)
> >           && ioff1 < TREE_STRING_LENGTH (base1)
> >           /* This is a too conservative test that the STRING_CSTs
> >              will not end up being string-merged.  */
> >           && strncmp (TREE_STRING_POINTER (base0) + ioff0,
> >                       TREE_STRING_POINTER (base1) + ioff1,
> >                       MIN (TREE_STRING_LENGTH (base0) - ioff0,
> >                            TREE_STRING_LENGTH (base1) - ioff1)) != 0))
> >     ;
> >   else if (!DECL_P (base0) || !DECL_P (base1))
> >     return 2;
> > Here we similarly assume that vars aren't adjacent to string literals
> > or vice versa.  Do we need to stick !folding_initializer && to those
> > DECL_P vs. STRING_CST cases?
>
> Seems so.

and this too

> > Though, because of the return 2; for
> > non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
> > etc. which ought to be fine, no?  So perhaps we need to watch for
> > decls. vs. STRING_CSTs like for DECLs whether the address is at the start
> > or at the end of the string literal or somewhere in between (at least
> > for folding_initializer)?
>
> Agreed.

and this as well.  Furthermore, I've fixed constexpr-compare2.C by assuming,
when folding_initializer is set, that addresses of non-aliased (visibly to
the compiler) FUNCTION_DECLs are different and that functions are non-zero
sized for the purpose of var vs. function comparisons.

> > And yet another chapter but probably unsolvable is comparison of
> > string literal addresses.  I think pedantically in C++
> > &"foo"[0] == &"foo"[0] is undefined behavior, different occurrences of
> > the same string literals might still not be merged in some implementations.
>
> I disagree; it's unspecified whether string literals are merged, but I think
> the comparison result is well specified depending on that implementation
> behavior.

Can you please comment on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86369#c1
then?

Anyway, the following has been successfully bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?

2022-01-18  Jakub Jelinek

	PR c++/89074
	PR c++/104033
	* fold-const.c (address_compare): Restrict the decl vs. STRING_CST
	or vice versa or STRING_CST vs. STRING_CST or
	is_global_var != is_global_var optimizations to !folding_initializer.
	Punt for FUNCTION_DECLs with non-zero offsets.  If folding_initializer,
	assume non-aliased functions have non-zero size and have different
	addresses.  For folding_initializer, punt on comparisons of start
	of some object and end of another one, regardless whether it is a decl
	or string literal.  Also punt for folding_initializer of STRING_CST
	vs. STRING_CST comparisons if the two literals could be overlapping.

	* g++.dg/cpp1y/constexpr-89074-3.C: New test.
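
For readers following the ChangeLog, the "start of some object vs. end of
another one" rule can be modeled outside the compiler.  The sketch below is
only an illustration under simplifying assumptions: offsets and sizes are
plain constants rather than poly_int64 values, and the names Verdict and
compare_distinct_objects are hypothetical, not GCC identifiers.  The values
0 and 2 follow the convention visible in the patch (0 for known different,
2 for punt).

```cpp
#include <cstdint>

// Illustrative model only; these names are hypothetical and not GCC code.
// 0 means the addresses are known to differ, 2 means "cannot tell, punt".
enum Verdict { kDifferent = 0, kUnknown = 2 };

// Two distinct objects of size0 and size1 bytes, and constant byte offsets
// off0 and off1 into them, with 0 <= off <= size.
constexpr Verdict compare_distinct_objects(std::int64_t off0, std::int64_t size0,
                                           std::int64_t off1, std::int64_t size1) {
  // Start of one object against one past the end of the other: the two
  // objects might be laid out back to back, so the result is unknown.
  const bool start_vs_end = (off0 == 0 && off1 == size1)
                            || (off1 == 0 && off0 == size0);
  return start_vs_end ? kUnknown : kDifferent;
}

// char a[4] against the literal "foo" (4 bytes including the NUL).
static_assert(compare_distinct_objects(1, 4, 0, 4) == kDifferent,
              "&a[1] == \"foo\" can be folded to false");
static_assert(compare_distinct_objects(0, 4, 4, 4) == kUnknown,
              "&a[0] == &\"foo\"[4] must be rejected");
```

The two assertions correspond to the first accepted and the first rejected
comparison in f1 of the new test.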

	Jakub

--- gcc/fold-const.c.jj	2022-01-17 14:19:08.817376382 +0100
+++ gcc/fold-const.c	2022-01-17 15:50:16.687211071 +0100
@@ -16608,21 +16608,27 @@ address_compare (tree_code code, tree ty
       HOST_WIDE_INT ioff0 = -1, ioff1 = -1;
       off0.is_constant (&ioff0);
       off1.is_constant (&ioff1);
-      if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
-          || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
-          || (TREE_CODE (base0) == STRING_CST
-              && TREE_CODE (base1) == STRING_CST
-              && ioff0 >= 0 && ioff1 >= 0
-              && ioff0 < TREE_STRING_LENGTH (base0)
-              && ioff1 < TREE_STRING_LENGTH (base1)
-              /* This is a too conservative test that the STRING_CSTs
-                 will not end up being string-merged.  */
-              && strncmp (TREE_STRING_POINTER (base0) + ioff0,
-                          TREE_STRING_POINTER (base1) + ioff1,
-                          MIN (TREE_STRING_LENGTH (base0) - ioff0,
-                               TREE_STRING_LENGTH (base1) - ioff1)) != 0))
+      if (!folding_initializer
+          && ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
+              || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
+              || (TREE_CODE (base0) == STRING_CST
+                  && TREE_CODE (base1) == STRING_CST
+                  && ioff0 >= 0 && ioff1 >= 0
+                  && ioff0 < TREE_STRING_LENGTH (base0)
+                  && ioff1 < TREE_STRING_LENGTH (base1)
+                  /* This is a too conservative test that the STRING_CSTs
+                     will not end up being string-merged.  */
+                  && strncmp (TREE_STRING_POINTER (base0) + ioff0,
+                              TREE_STRING_POINTER (base1) + ioff1,
+                              MIN (TREE_STRING_LENGTH (base0) - ioff0,
+                                   TREE_STRING_LENGTH (base1) - ioff1)) != 0)))
         ;
-      else if (!DECL_P (base0) || !DECL_P (base1))
+      /* Punt on non-zero offsets from functions.  */
+      else if ((TREE_CODE (base0) == FUNCTION_DECL && ioff0)
+               || (TREE_CODE (base1) == FUNCTION_DECL && ioff1))
+        return 2;
+      else if ((!DECL_P (base0) && TREE_CODE (base0) != STRING_CST)
+               || (!DECL_P (base1) && TREE_CODE (base1) != STRING_CST))
         return 2;
       /* If this is a pointer comparison, ignore for now even
          valid equalities where one pointer is the offset zero
@@ -16631,18 +16637,62 @@ address_compare (tree_code code, tree ty
         ;
       /* Assume that automatic variables can't be adjacent to global
          variables.  */
-      else if (is_global_var (base0) != is_global_var (base1))
+      else if (!folding_initializer
+               && is_global_var (base0) != is_global_var (base1))
+        ;
+      /* For initializers, assume addresses of different functions are
+         different.  */
+      else if (folding_initializer
+               && TREE_CODE (base0) == FUNCTION_DECL
+               && TREE_CODE (base1) == FUNCTION_DECL)
         ;
       else
         {
-          tree sz0 = DECL_SIZE_UNIT (base0);
-          tree sz1 = DECL_SIZE_UNIT (base1);
-          /* If sizes are unknown, e.g. VLA or not representable, punt.  */
-          if (!tree_fits_poly_int64_p (sz0) || !tree_fits_poly_int64_p (sz1))
-            return 2;
+          poly_int64 size0, size1;
+          if (TREE_CODE (base0) == STRING_CST)
+            {
+              if (!folding_initializer
+                  || ioff0 < 0
+                  || ioff0 > TREE_STRING_LENGTH (base0))
+                return 2;
+              size0 = TREE_STRING_LENGTH (base0);
+            }
+          /* For initializers, assume function decls don't overlap and have
+             non-empty size.  */
+          else if (folding_initializer && TREE_CODE (base0) == FUNCTION_DECL)
+            size0 = 1;
+          else
+            {
+              tree sz0 = DECL_SIZE_UNIT (base0);
+              /* If sizes are unknown, e.g. VLA or not representable, punt.  */
+              if (!tree_fits_poly_int64_p (sz0))
+                return 2;
+
+              size0 = tree_to_poly_int64 (sz0);
+            }
+
+          if (TREE_CODE (base1) == STRING_CST)
+            {
+              if (!folding_initializer
+                  || ioff1 < 0
+                  || ioff1 > TREE_STRING_LENGTH (base1))
+                return 2;
+              size1 = TREE_STRING_LENGTH (base1);
+            }
+          /* For initializers, assume function decls don't overlap and have
+             non-empty size.  */
+          else if (folding_initializer && TREE_CODE (base1) == FUNCTION_DECL)
+            size1 = 1;
+          else
+            {
+              tree sz1 = DECL_SIZE_UNIT (base1);
+              /* If sizes are unknown, e.g. VLA or not representable, punt.  */
+              if (!tree_fits_poly_int64_p (sz1))
+                return 2;
+
+              size1 = tree_to_poly_int64 (sz1);
+            }
 
-          poly_int64 size0 = tree_to_poly_int64 (sz0);
-          poly_int64 size1 = tree_to_poly_int64 (sz1);
           /* If one offset is pointing (or could be) to the beginning of one
              object and the other is pointing to one past the last byte of the
              other object, punt.  */
@@ -16658,6 +16708,27 @@ address_compare (tree_code code, tree ty
               && (known_ne (off0, 0)
                   || (known_ne (size0, 0) && known_ne (size1, 0))))
             equal = 0;
+          if (equal == 0
+              && TREE_CODE (base0) == STRING_CST
+              && TREE_CODE (base1) == STRING_CST)
+            {
+              /* If the bytes in the string literals starting at the pointers
+                 differ, the pointers need to be different.  */
+              if (memcmp (TREE_STRING_POINTER (base0) + ioff0,
+                          TREE_STRING_POINTER (base1) + ioff1,
+                          MIN (TREE_STRING_LENGTH (base0) - ioff0,
+                               TREE_STRING_LENGTH (base1) - ioff1)) == 0)
+                {
+                  HOST_WIDE_INT ioffmin = MIN (ioff0, ioff1);
+                  if (memcmp (TREE_STRING_POINTER (base0) + ioff0 - ioffmin,
+                              TREE_STRING_POINTER (base1) + ioff1 - ioffmin,
+                              ioffmin) == 0)
+                    /* If even the bytes in the string literal before the
+                       pointers are the same, the string literals could be
+                       tail merged.  */
+                    return 2;
+                }
+            }
         }
       return equal;
     }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-89074-3.C.jj	2022-01-17 15:22:43.743566175 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-89074-3.C	2022-01-17 16:10:19.182230570 +0100
@@ -0,0 +1,132 @@
+// PR c++/89074
+// { dg-do compile { target c++14 } }
+
+int fn1 (void) { return 0; }
+int fn2 (void) { return 1; }
+
+constexpr bool
+f1 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  if (&a[1] == "foo")
+    return false;
+
+  if (&a[1] == &"foo"[4])
+    return false;
+
+  if (&"foo"[1] == &a[0])
+    return false;
+
+  if (&"foo"[3] == &a[4])
+    return false;
+
+  if (&a[0] == "foo")
+    return false;
+
+  // Pointer to start of one object (var) and end of another one (literal)
+  if (&a[0] == &"foo"[4]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f2 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  // Pointer to end of one object (var) and start of another one (literal)
+  if (&a[4] == "foo") // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+char v[] = { 1, 2, 3, 4 };
+
+constexpr bool
+f3 ()
+{
+  char a[] = { 1, 2, 3, 4 };
+
+  if (&a[1] == &v[1])
+    return false;
+
+  if (&a[0] == &v[3])
+    return false;
+
+  if (&a[2] == &v[4])
+    return false;
+
+  // Pointer to start of one object (automatic var) and end of another one (non-automagic var)
+  if (&a[0] == &v[4]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f4 ()
+{
+  char a[] = { 1, 2, 3, 4, 5 };
+
+  // Pointer to end of one object (automatic var) and start of another one (non-automagic var)
+  if (&a[5] == &v[0]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f5 ()
+{
+  if (fn1 != fn1)
+    return false;
+
+  if (fn1 == fn2)
+    return false;
+
+  if (&"abcde"[0] == &"edcba"[1])
+    return false;
+
+  if (&"abcde"[1] == &"edcba"[6])
+    return false;
+
+  // Pointer to start of one object (literal) and end of another one (literal)
+  if (&"abcde"[0] == &"edcba"[6]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f6 ()
+{
+  // Pointer to start of one object (literal) and end of another one (literal)
+  if (&"abcde"[6] == &"edcba"[0]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool
+f7 ()
+{
+  if (&"abcde"[3] == &"fabcde"[3])
+    return false;
+
+  // These could be suffix merged, with &"abcde"[0] == &"fabcde"[1].
+  if (&"abcde"[3] == &"fabcde"[4]) // { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool a = f1 ();
+constexpr bool b = f2 ();
+constexpr bool c = f3 ();
+constexpr bool d = f4 ();
+constexpr bool e = f5 ();
+constexpr bool f = f6 ();
+constexpr bool g = f7 ();
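
The tail-merge check added in the last fold-const.c hunk may be easier to
follow in isolation.  The following is a minimal sketch, assuming C++17 and
that each literal is passed with its terminating NUL included; the helpers
with_nul and could_be_tail_merged are hypothetical names for illustration
and do not exist in GCC.  It mirrors the two memcmp calls: first the bytes
from each pointer onward, then the bytes preceding each pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical helper: view a string literal including its terminating NUL.
template <std::size_t N>
constexpr std::string_view with_nul(const char (&s)[N]) { return {s, N}; }

// Hypothetical helper, not part of GCC: true when lit0 + off0 and
// lit1 + off1 could compare equal if the two literals were merged so that
// they share storage.  Requires off0 <= lit0.size() and off1 <= lit1.size().
constexpr bool could_be_tail_merged(std::string_view lit0, std::size_t off0,
                                    std::string_view lit1, std::size_t off1) {
  // Bytes from each pointer up to the end of the shorter remainder.
  const std::size_t after = std::min(lit0.size() - off0, lit1.size() - off1);
  if (lit0.substr(off0, after) != lit1.substr(off1, after))
    return false;  // Contents differ at the pointers, so addresses differ.
  // Bytes preceding each pointer, as far back as the shorter prefix reaches.
  const std::size_t before = std::min(off0, off1);
  return lit0.substr(off0 - before, before)
         == lit1.substr(off1 - before, before);
}

// The two comparisons from f7 in the new test.
static_assert(!could_be_tail_merged(with_nul("abcde"), 3, with_nul("fabcde"), 3),
              "bytes at the pointers differ, so the comparison folds");
static_assert(could_be_tail_merged(with_nul("abcde"), 3, with_nul("fabcde"), 4),
              "\"abcde\" could be stored as the tail of \"fabcde\"");
```

In the patch a true result corresponds to return 2 (punt), which is what
makes the second f7 comparison a non-constant expression.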