Message ID | 20220516091616.7D3DF13ADC@imap2.suse-dmz.suse.de |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8DB14385780F for <patchwork@sourceware.org>; Mon, 16 May 2022 09:24:57 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8DB14385780F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1652693097; bh=762mO84rsa2CngJrbgUCf376jeqNE+KfI2+PGpF9bJ8=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=eBZ1dwHttCvS6SYcEnPkh9y4b7QZwVPS6J/bex1+yxYm+wzhi5OJKmwKDHOch9tGW +bAXDuf6NmWLGfV0nbGcBOD5eFTDeyW0KxuT2pu3E0wgpe+VO5OUC0NdDhZz6RhhKp h6mMCFl0piaxDaYTGM3UzPyvOOaCVyWldlougioA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id B60893857831 for <gcc-patches@gcc.gnu.org>; Mon, 16 May 2022 09:16:17 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org B60893857831 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 95DDD1FAF7; Mon, 16 May 2022 09:16:16 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 7D3DF13ADC; Mon, 16 May 2022 09:16:16 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 2kKBHWAWgmIVCQAAMHmgww (envelope-from <rguenther@suse.de>); Mon, 16 May 2022 09:16:16 +0000 Date: Mon, 16 May 2022 11:16:16 +0200 (CEST) To: gcc-patches@gcc.gnu.org Subject: [PATCH] middle-end/105604 - snprintf dianostics and non-constant sizes/offsets MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Message-Id: <20220516091616.7D3DF13ADC@imap2.suse-dmz.suse.de> X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Richard Biener <rguenther@suse.de> Cc: msebor@redhat.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
middle-end/105604 - snprintf dianostics and non-constant sizes/offsets
|
|
Commit Message
Richard Biener
May 16, 2022, 9:16 a.m. UTC
The following tries to correct get_origin_and_offset_r not handling non-constant sizes of array elements in ARRAY_REFs and non-constant offsets of COMPONENT_REFs. It isn't exactly clear how such failures should be treated in this API and existing handling isn't consistent here either. The following applies two different variants, treating non-constant array sizes like non-constant array indices and treating non-constant offsets of COMPONENT_REFs by terminating the recursion (not sure what that means to the callers). Basically the code failed to use component_ref_field_offset and array_ref_element_size and instead relies on inappropriate helpers (that shouldn't exist in the first place ...). The code is also not safe-guarded against overflows in the final offset/size computations but I'm not trying to rectify that. Martin - can you comment on how the API should handle such situations? Bootstrapped and tested on x86_64-unknown-linux-gnu. OK for trunk and branches? Thanks, Richard. 2022-05-16 Richard Biener <rguenther@suse.de> PR middle-end/105604 * gimple-ssa-sprintf.cc (get_origin_and_offset_r): Handle non-constant ARRAY_REF element size and non-constant COMPONENT_REF field offset. * gcc.dg/torture/pr105604.c: New testcase. --- gcc/gimple-ssa-sprintf.cc | 14 +++++++++++--- gcc/testsuite/gcc.dg/torture/pr105604.c | 24 ++++++++++++++++++++++++ 2 files changed, 35 insertions(+), 3 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c
Comments
On 5/16/22 03:16, Richard Biener wrote: > The following tries to correct get_origin_and_offset_r not handling > non-constant sizes of array elements in ARRAY_REFs and non-constant > offsets of COMPONENT_REFs. It isn't exactly clear how such failures > should be treated in this API and existing handling isn't consistent > here either. The following applies two different variants, treating > non-constant array sizes like non-constant array indices and > treating non-constant offsets of COMPONENT_REFs by terminating > the recursion (not sure what that means to the callers). > > Basically the code failed to use component_ref_field_offset and > array_ref_element_size and instead relies on inappropriate > helpers (that shouldn't exist in the first place ...). The code > is also not safe-guarded against overflows in the final offset/size > computations but I'm not trying to rectify that. > > Martin - can you comment on how the API should handle such > situations? It looks like the -Wrestrict warning here ignores offsets equal to HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to that should avoid it. Or maybe to HWI_MAX as it does for variable offsets. It also looks like the function only handles constant offsets and sizes, and I have a vague recollection of enhancing it to work with ranges. That should avoid the overflow problem too. Martin > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > OK for trunk and branches? > > Thanks, > Richard. > > 2022-05-16 Richard Biener <rguenther@suse.de> > > PR middle-end/105604 > * gimple-ssa-sprintf.cc (get_origin_and_offset_r): > Handle non-constant ARRAY_REF element size and non-constant > COMPONENT_REF field offset. > > * gcc.dg/torture/pr105604.c: New testcase. > --- > gcc/gimple-ssa-sprintf.cc | 14 +++++++++++--- > gcc/testsuite/gcc.dg/torture/pr105604.c | 24 ++++++++++++++++++++++++ > 2 files changed, 35 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c > > diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc > index c93f12f90b5..14e215ce69c 100644 > --- a/gcc/gimple-ssa-sprintf.cc > +++ b/gcc/gimple-ssa-sprintf.cc > @@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, > HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset) > ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX); > > + tree elsz = array_ref_element_size (x); > tree eltype = TREE_TYPE (x); > if (TREE_CODE (eltype) == INTEGER_TYPE) > { > if (off) > *off = idx; > } > - else if (idx < HOST_WIDE_INT_MAX) > - *fldoff += idx * int_size_in_bytes (eltype); > + else if (idx < HOST_WIDE_INT_MAX > + && tree_fits_shwi_p (elsz)) > + *fldoff += idx * tree_to_shwi (elsz); > else > *fldoff = idx; > > @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, > > case COMPONENT_REF: > { > + tree foff = component_ref_field_offset (x); > tree fld = TREE_OPERAND (x, 1); > - *fldoff += int_byte_position (fld); > + if (!tree_fits_shwi_p (foff) > + || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld))) > + return x; > + *fldoff += (tree_to_shwi (foff) > + + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld)) > + / BITS_PER_UNIT)); > > get_origin_and_offset_r (fld, fldoff, fldsize, off); > x = TREE_OPERAND (x, 0); > diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c b/gcc/testsuite/gcc.dg/torture/pr105604.c > new file mode 100644 > index 00000000000..b002251df10 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/pr105604.c > @@ -0,0 +1,24 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-Wall" } */ > + > +struct { > + long users; > + long size; > + char *data; > +} * main_trans; > +void *main___trans_tmp_1; > +int sprintf(char *, char *, ...); > +int main() { > + int users = 0; > + struct { > + long users; > + long size; > + char *data; > + int links[users]; > + char buf[]; > + } *trans = trans; > + trans->data = trans->buf; > + main___trans_tmp_1 = trans; > + main_trans = main___trans_tmp_1; > + sprintf(main_trans->data, "test"); > +}
On Tue, 17 May 2022, Martin Sebor wrote: > On 5/16/22 03:16, Richard Biener wrote: > > The following tries to correct get_origin_and_offset_r not handling > > non-constant sizes of array elements in ARRAY_REFs and non-constant > > offsets of COMPONENT_REFs. It isn't exactly clear how such failures > > should be treated in this API and existing handling isn't consistent > > here either. The following applies two different variants, treating > > non-constant array sizes like non-constant array indices and > > treating non-constant offsets of COMPONENT_REFs by terminating > > the recursion (not sure what that means to the callers). > > > > Basically the code failed to use component_ref_field_offset and > > array_ref_element_size and instead relies on inappropriate > > helpers (that shouldn't exist in the first place ...). The code > > is also not safe-guarded against overflows in the final offset/size > > computations but I'm not trying to rectify that. > > > > Martin - can you comment on how the API should handle such > > situations? > > It looks like the -Wrestrict warning here ignores offsets equal to > HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to > that should avoid it. Or maybe to HWI_MAX as it does for variable > offsets. Can you suggest wording for the function comment as to how it handles the case when offset or size cannot be determined exactly? The comment currently only suggests that the caller possibly cannot trust fldsize or off when the function returns NULL but the actual implementation differs from that. > It also looks like the function only handles constant offsets and > sizes, and I have a vague recollection of enhancing it to work with > ranges. That should avoid the overflow problem too. So the correct thing is to return NULL? Is the patch OK as-is? As said, I'm not sure how the caller interprets the result and how it can distinguish the exact vs. non-exact cases or what a "conservative" inexact answer would be. Please help properly documenting this API. Thanks, Richard. > Martin > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > > > OK for trunk and branches? > > > > Thanks, > > Richard. > > > > 2022-05-16 Richard Biener <rguenther@suse.de> > > > > PR middle-end/105604 > > * gimple-ssa-sprintf.cc (get_origin_and_offset_r): > > Handle non-constant ARRAY_REF element size and non-constant > > COMPONENT_REF field offset. > > > > * gcc.dg/torture/pr105604.c: New testcase. > > --- > > gcc/gimple-ssa-sprintf.cc | 14 +++++++++++--- > > gcc/testsuite/gcc.dg/torture/pr105604.c | 24 ++++++++++++++++++++++++ > > 2 files changed, 35 insertions(+), 3 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c > > > > diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc > > index c93f12f90b5..14e215ce69c 100644 > > --- a/gcc/gimple-ssa-sprintf.cc > > +++ b/gcc/gimple-ssa-sprintf.cc > > @@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT > > *fldoff, HOST_WIDE_INT *fldsize, > > HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset) > > ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX); > > + tree elsz = array_ref_element_size (x); > > tree eltype = TREE_TYPE (x); > > if (TREE_CODE (eltype) == INTEGER_TYPE) > > { > > if (off) > > *off = idx; > > } > > - else if (idx < HOST_WIDE_INT_MAX) > > - *fldoff += idx * int_size_in_bytes (eltype); > > + else if (idx < HOST_WIDE_INT_MAX > > + && tree_fits_shwi_p (elsz)) > > + *fldoff += idx * tree_to_shwi (elsz); > > else > > *fldoff = idx; > > @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT > > *fldoff, HOST_WIDE_INT *fldsize, > > > > case COMPONENT_REF: > > { > > + tree foff = component_ref_field_offset (x); > > tree fld = TREE_OPERAND (x, 1); > > - *fldoff += int_byte_position (fld); > > + if (!tree_fits_shwi_p (foff) > > + || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld))) > > + return x; > > + *fldoff += (tree_to_shwi (foff) > > + + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld)) > > + / BITS_PER_UNIT)); > > > > get_origin_and_offset_r (fld, fldoff, fldsize, off); > > x = TREE_OPERAND (x, 0); > > diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c > > b/gcc/testsuite/gcc.dg/torture/pr105604.c > > new file mode 100644 > > index 00000000000..b002251df10 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/torture/pr105604.c > > @@ -0,0 +1,24 @@ > > +/* { dg-do compile } */ > > +/* { dg-additional-options "-Wall" } */ > > + > > +struct { > > + long users; > > + long size; > > + char *data; > > +} * main_trans; > > +void *main___trans_tmp_1; > > +int sprintf(char *, char *, ...); > > +int main() { > > + int users = 0; > > + struct { > > + long users; > > + long size; > > + char *data; > > + int links[users]; > > + char buf[]; > > + } *trans = trans; > > + trans->data = trans->buf; > > + main___trans_tmp_1 = trans; > > + main_trans = main___trans_tmp_1; > > + sprintf(main_trans->data, "test"); > > +} > > >
On 5/18/22 00:26, Richard Biener wrote: > On Tue, 17 May 2022, Martin Sebor wrote: > >> On 5/16/22 03:16, Richard Biener wrote: >>> The following tries to correct get_origin_and_offset_r not handling >>> non-constant sizes of array elements in ARRAY_REFs and non-constant >>> offsets of COMPONENT_REFs. It isn't exactly clear how such failures >>> should be treated in this API and existing handling isn't consistent >>> here either. The following applies two different variants, treating >>> non-constant array sizes like non-constant array indices and >>> treating non-constant offsets of COMPONENT_REFs by terminating >>> the recursion (not sure what that means to the callers). >>> >>> Basically the code failed to use component_ref_field_offset and >>> array_ref_element_size and instead relies on inappropriate >>> helpers (that shouldn't exist in the first place ...). The code >>> is also not safe-guarded against overflows in the final offset/size >>> computations but I'm not trying to rectify that. >>> >>> Martin - can you comment on how the API should handle such >>> situations? >> >> It looks like the -Wrestrict warning here ignores offsets equal to >> HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to >> that should avoid it. Or maybe to HWI_MAX as it does for variable >> offsets. > > Can you suggest wording for the function comment as to how it handles > the case when offset or size cannot be determined exactly? The > comment currently only suggests that the caller possibly cannot > trust fldsize or off when the function returns NULL but the actual > implementation differs from that. > >> It also looks like the function only handles constant offsets and >> sizes, and I have a vague recollection of enhancing it to work with >> ranges. That should avoid the overflow problem too. > > So the correct thing is to return NULL? No, I don't think so. The recursive get_origin_and_offset_r() assumes its own invocations never return null (the one place it does that should probably be moved to the nonrecursive caller). > > Is the patch OK as-is? It's an improvement but it's not complete as the following also ICEs (albeit somewhere else): void* f (void); void g (int n) { struct { char a[n], b[]; } *p = f (); __builtin_sprintf (p->b, "%s", p->a); } With the ICE fixed the warning triggers. That's not ideal but it's unavoidable given the IR (I believe I mentioned this caveat some time back). This is the same as for: struct { char a[8], b[8]; } *p = f (); __builtin_sprintf (&p->b[n], "%s", p->a); because the IR looks more or less the same for &p->a[n] as it is for &p->b[n]. > As said, I'm not sure how the caller interprets > the result and how it can distinguish the exact vs. non-exact cases > or what a "conservative" inexact answer would be. The warning triggers in both the certain cases and the inexact ones like the one above when an overlap cannot be ruled out. To differentiate the two it's phrased as "may overlap". The handling is in maybe_warn_overlap(). > > Please help properly documenting this API. I can spend some time in the next few days to page it all in, see if I can clean it up a bit in addition to fixing the ICEs and improve the comment. Let me know if you have a different preference. Martin > > Thanks, > Richard. > >> Martin >> >>> >>> Bootstrapped and tested on x86_64-unknown-linux-gnu. >>> >>> OK for trunk and branches? >>> >>> Thanks, >>> Richard. >>> >>> 2022-05-16 Richard Biener <rguenther@suse.de> >>> >>> PR middle-end/105604 >>> * gimple-ssa-sprintf.cc (get_origin_and_offset_r): >>> Handle non-constant ARRAY_REF element size and non-constant >>> COMPONENT_REF field offset. >>> >>> * gcc.dg/torture/pr105604.c: New testcase. >>> --- >>> gcc/gimple-ssa-sprintf.cc | 14 +++++++++++--- >>> gcc/testsuite/gcc.dg/torture/pr105604.c | 24 ++++++++++++++++++++++++ >>> 2 files changed, 35 insertions(+), 3 deletions(-) >>> create mode 100644 gcc/testsuite/gcc.dg/torture/pr105604.c >>> >>> diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc >>> index c93f12f90b5..14e215ce69c 100644 >>> --- a/gcc/gimple-ssa-sprintf.cc >>> +++ b/gcc/gimple-ssa-sprintf.cc >>> @@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT >>> *fldoff, HOST_WIDE_INT *fldsize, >>> HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset) >>> ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX); >>> + tree elsz = array_ref_element_size (x); >>> tree eltype = TREE_TYPE (x); >>> if (TREE_CODE (eltype) == INTEGER_TYPE) >>> { >>> if (off) >>> *off = idx; >>> } >>> - else if (idx < HOST_WIDE_INT_MAX) >>> - *fldoff += idx * int_size_in_bytes (eltype); >>> + else if (idx < HOST_WIDE_INT_MAX >>> + && tree_fits_shwi_p (elsz)) >>> + *fldoff += idx * tree_to_shwi (elsz); >>> else >>> *fldoff = idx; >>> @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT >>> *fldoff, HOST_WIDE_INT *fldsize, >>> >>> case COMPONENT_REF: >>> { >>> + tree foff = component_ref_field_offset (x); >>> tree fld = TREE_OPERAND (x, 1); >>> - *fldoff += int_byte_position (fld); >>> + if (!tree_fits_shwi_p (foff) >>> + || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld))) >>> + return x; >>> + *fldoff += (tree_to_shwi (foff) >>> + + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld)) >>> + / BITS_PER_UNIT)); >>> >>> get_origin_and_offset_r (fld, fldoff, fldsize, off); >>> x = TREE_OPERAND (x, 0); >>> diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c >>> b/gcc/testsuite/gcc.dg/torture/pr105604.c >>> new file mode 100644 >>> index 00000000000..b002251df10 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.dg/torture/pr105604.c >>> @@ -0,0 +1,24 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-additional-options "-Wall" } */ >>> + >>> +struct { >>> + long users; >>> + long size; >>> + char *data; >>> +} * main_trans; >>> +void *main___trans_tmp_1; >>> +int sprintf(char *, char *, ...); >>> +int main() { >>> + int users = 0; >>> + struct { >>> + long users; >>> + long size; >>> + char *data; >>> + int links[users]; >>> + char buf[]; >>> + } *trans = trans; >>> + trans->data = trans->buf; >>> + main___trans_tmp_1 = trans; >>> + main_trans = main___trans_tmp_1; >>> + sprintf(main_trans->data, "test"); >>> +} >> >> >> >
On Wed, 18 May 2022, Martin Sebor wrote: > On 5/18/22 00:26, Richard Biener wrote: > > On Tue, 17 May 2022, Martin Sebor wrote: > > > >> On 5/16/22 03:16, Richard Biener wrote: > >>> The following tries to correct get_origin_and_offset_r not handling > >>> non-constant sizes of array elements in ARRAY_REFs and non-constant > >>> offsets of COMPONENT_REFs. It isn't exactly clear how such failures > >>> should be treated in this API and existing handling isn't consistent > >>> here either. The following applies two different variants, treating > >>> non-constant array sizes like non-constant array indices and > >>> treating non-constant offsets of COMPONENT_REFs by terminating > >>> the recursion (not sure what that means to the callers). > >>> > >>> Basically the code failed to use component_ref_field_offset and > >>> array_ref_element_size and instead relies on inappropriate > >>> helpers (that shouldn't exist in the first place ...). The code > >>> is also not safe-guarded against overflows in the final offset/size > >>> computations but I'm not trying to rectify that. > >>> > >>> Martin - can you comment on how the API should handle such > >>> situations? > >> > >> It looks like the -Wrestrict warning here ignores offsets equal to > >> HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to > >> that should avoid it. Or maybe to HWI_MAX as it does for variable > >> offsets. > > > > Can you suggest wording for the function comment as to how it handles > > the case when offset or size cannot be determined exactly? The > > comment currently only suggests that the caller possibly cannot > > trust fldsize or off when the function returns NULL but the actual > > implementation differs from that. > > > > > > >> It also looks like the function only handles constant offsets and > >> sizes, and I have a vague recollection of enhancing it to work with > >> ranges. That should avoid the overflow problem too. > > > > So the correct thing is to return NULL? > > No, I don't think so. The recursive get_origin_and_offset_r() assumes > its own invocations never return null (the one place it does that should > probably be moved to the nonrecursive caller). > > > > > Is the patch OK as-is? > > It's an improvement but it's not complete as the following also ICEs > (albeit somewhere else): > > void* f (void); > > void g (int n) > { > struct { > char a[n], b[]; > } *p = f (); > > __builtin_sprintf (p->b, "%s", p->a); > } > > With the ICE fixed the warning triggers. That's not ideal but it's > unavoidable given the IR (I believe I mentioned this caveat some time > back). This is the same as for: > > struct { > char a[8], b[8]; > } *p = f (); > > __builtin_sprintf (&p->b[n], "%s", p->a); > > because the IR looks more or less the same for &p->a[n] as it is for > &p->b[n]. > > > As said, I'm not sure how the caller interprets > > the result and how it can distinguish the exact vs. non-exact cases > > or what a "conservative" inexact answer would be. > > The warning triggers in both the certain cases and the inexact > ones like the one above when an overlap cannot be ruled out. To > differentiate the two it's phrased as "may overlap". The handling > is in maybe_warn_overlap(). > > > > > Please help properly documenting this API. > > I can spend some time in the next few days to page it all in, see > if I can clean it up a bit in addition to fixing the ICEs and > improve the comment. Let me know if you have a different > preference. That works for me - thanks for taking it from here. Richard.
On 5/19/22 05:39, Richard Biener wrote: > On Wed, 18 May 2022, Martin Sebor wrote: > >> On 5/18/22 00:26, Richard Biener wrote: >>> On Tue, 17 May 2022, Martin Sebor wrote: >>> >>>> On 5/16/22 03:16, Richard Biener wrote: >>>>> The following tries to correct get_origin_and_offset_r not handling >>>>> non-constant sizes of array elements in ARRAY_REFs and non-constant >>>>> offsets of COMPONENT_REFs. It isn't exactly clear how such failures >>>>> should be treated in this API and existing handling isn't consistent >>>>> here either. The following applies two different variants, treating >>>>> non-constant array sizes like non-constant array indices and >>>>> treating non-constant offsets of COMPONENT_REFs by terminating >>>>> the recursion (not sure what that means to the callers). >>>>> >>>>> Basically the code failed to use component_ref_field_offset and >>>>> array_ref_element_size and instead relies on inappropriate >>>>> helpers (that shouldn't exist in the first place ...). The code >>>>> is also not safe-guarded against overflows in the final offset/size >>>>> computations but I'm not trying to rectify that. >>>>> >>>>> Martin - can you comment on how the API should handle such >>>>> situations? >>>> >>>> It looks like the -Wrestrict warning here ignores offsets equal to >>>> HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to >>>> that should avoid it. Or maybe to HWI_MAX as it does for variable >>>> offsets. >>> >>> Can you suggest wording for the function comment as to how it handles >>> the case when offset or size cannot be determined exactly? The >>> comment currently only suggests that the caller possibly cannot >>> trust fldsize or off when the function returns NULL but the actual >>> implementation differs from that. >> >> >> >>> >>>> It also looks like the function only handles constant offsets and >>>> sizes, and I have a vague recollection of enhancing it to work with >>>> ranges. That should avoid the overflow problem too. >>> >>> So the correct thing is to return NULL? >> >> No, I don't think so. The recursive get_origin_and_offset_r() assumes >> its own invocations never return null (the one place it does that should >> probably be moved to the nonrecursive caller). >> >>> >>> Is the patch OK as-is? >> >> It's an improvement but it's not complete as the following also ICEs >> (albeit somewhere else): >> >> void* f (void); >> >> void g (int n) >> { >> struct { >> char a[n], b[]; >> } *p = f (); >> >> __builtin_sprintf (p->b, "%s", p->a); >> } >> >> With the ICE fixed the warning triggers. That's not ideal but it's >> unavoidable given the IR (I believe I mentioned this caveat some time >> back). This is the same as for: >> >> struct { >> char a[8], b[8]; >> } *p = f (); >> >> __builtin_sprintf (&p->b[n], "%s", p->a); >> >> because the IR looks more or less the same for &p->a[n] as it is for >> &p->b[n]. >> >>> As said, I'm not sure how the caller interprets >>> the result and how it can distinguish the exact vs. non-exact cases >>> or what a "conservative" inexact answer would be. >> >> The warning triggers in both the certain cases and the inexact >> ones like the one above when an overlap cannot be ruled out. To >> differentiate the two it's phrased as "may overlap". The handling >> is in maybe_warn_overlap(). >> >>> >>> Please help properly documenting this API. >> >> I can spend some time in the next few days to page it all in, see >> if I can clean it up a bit in addition to fixing the ICEs and >> improve the comment. Let me know if you have a different >> preference. > > That works for me - thanks for taking it from here. Attached is a slightly enhanced patch that fixes both of the ICEs, improves the comments, and adds more tests. I tested it on x86_64. Let me know if there's something else you'd like me to do here. Martin
On Mon, 23 May 2022, Martin Sebor wrote: > On 5/19/22 05:39, Richard Biener wrote: > > On Wed, 18 May 2022, Martin Sebor wrote: > > > >> On 5/18/22 00:26, Richard Biener wrote: > >>> On Tue, 17 May 2022, Martin Sebor wrote: > >>> > >>>> On 5/16/22 03:16, Richard Biener wrote: > >>>>> The following tries to correct get_origin_and_offset_r not handling > >>>>> non-constant sizes of array elements in ARRAY_REFs and non-constant > >>>>> offsets of COMPONENT_REFs. It isn't exactly clear how such failures > >>>>> should be treated in this API and existing handling isn't consistent > >>>>> here either. The following applies two different variants, treating > >>>>> non-constant array sizes like non-constant array indices and > >>>>> treating non-constant offsets of COMPONENT_REFs by terminating > >>>>> the recursion (not sure what that means to the callers). > >>>>> > >>>>> Basically the code failed to use component_ref_field_offset and > >>>>> array_ref_element_size and instead relies on inappropriate > >>>>> helpers (that shouldn't exist in the first place ...). The code > >>>>> is also not safe-guarded against overflows in the final offset/size > >>>>> computations but I'm not trying to rectify that. > >>>>> > >>>>> Martin - can you comment on how the API should handle such > >>>>> situations? > >>>> > >>>> It looks like the -Wrestrict warning here ignores offsets equal to > >>>> HOST_WIDE_INT_MIN so presumably setting dst_offset (via *fldoff) to > >>>> that should avoid it. Or maybe to HWI_MAX as it does for variable > >>>> offsets. > >>> > >>> Can you suggest wording for the function comment as to how it handles > >>> the case when offset or size cannot be determined exactly? The > >>> comment currently only suggests that the caller possibly cannot > >>> trust fldsize or off when the function returns NULL but the actual > >>> implementation differs from that. > >> > >> > >> > >>> > >>>> It also looks like the function only handles constant offsets and > >>>> sizes, and I have a vague recollection of enhancing it to work with > >>>> ranges. That should avoid the overflow problem too. > >>> > >>> So the correct thing is to return NULL? > >> > >> No, I don't think so. The recursive get_origin_and_offset_r() assumes > >> its own invocations never return null (the one place it does that should > >> probably be moved to the nonrecursive caller). > >> > >>> > >>> Is the patch OK as-is? > >> > >> It's an improvement but it's not complete as the following also ICEs > >> (albeit somewhere else): > >> > >> void* f (void); > >> > >> void g (int n) > >> { > >> struct { > >> char a[n], b[]; > >> } *p = f (); > >> > >> __builtin_sprintf (p->b, "%s", p->a); > >> } > >> > >> With the ICE fixed the warning triggers. That's not ideal but it's > >> unavoidable given the IR (I believe I mentioned this caveat some time > >> back). This is the same as for: > >> > >> struct { > >> char a[8], b[8]; > >> } *p = f (); > >> > >> __builtin_sprintf (&p->b[n], "%s", p->a); > >> > >> because the IR looks more or less the same for &p->a[n] as it is for > >> &p->b[n]. > >> > >>> As said, I'm not sure how the caller interprets > >>> the result and how it can distinguish the exact vs. non-exact cases > >>> or what a "conservative" inexact answer would be. > >> > >> The warning triggers in both the certain cases and the inexact > >> ones like the one above when an overlap cannot be ruled out. To > >> differentiate the two it's phrased as "may overlap". The handling > >> is in maybe_warn_overlap(). > >> > >>> > >>> Please help properly documenting this API. > >> > >> I can spend some time in the next few days to page it all in, see > >> if I can clean it up a bit in addition to fixing the ICEs and > >> improve the comment. Let me know if you have a different > >> preference. > > > > That works for me - thanks for taking it from here. > Attached is a slightly enhanced patch that fixes both of the ICEs, > improves the comments, and adds more tests. I tested it on x86_64. > Let me know if there's something else you'd like me to do here. Looks good to me! Thanks for fixing. Richard.
diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc index c93f12f90b5..14e215ce69c 100644 --- a/gcc/gimple-ssa-sprintf.cc +++ b/gcc/gimple-ssa-sprintf.cc @@ -2312,14 +2312,16 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, HOST_WIDE_INT idx = (tree_fits_uhwi_p (offset) ? tree_to_uhwi (offset) : HOST_WIDE_INT_MAX); + tree elsz = array_ref_element_size (x); tree eltype = TREE_TYPE (x); if (TREE_CODE (eltype) == INTEGER_TYPE) { if (off) *off = idx; } - else if (idx < HOST_WIDE_INT_MAX) - *fldoff += idx * int_size_in_bytes (eltype); + else if (idx < HOST_WIDE_INT_MAX + && tree_fits_shwi_p (elsz)) + *fldoff += idx * tree_to_shwi (elsz); else *fldoff = idx; @@ -2350,8 +2352,14 @@ get_origin_and_offset_r (tree x, HOST_WIDE_INT *fldoff, HOST_WIDE_INT *fldsize, case COMPONENT_REF: { + tree foff = component_ref_field_offset (x); tree fld = TREE_OPERAND (x, 1); - *fldoff += int_byte_position (fld); + if (!tree_fits_shwi_p (foff) + || !tree_fits_shwi_p (DECL_FIELD_BIT_OFFSET (fld))) + return x; + *fldoff += (tree_to_shwi (foff) + + (tree_to_shwi (DECL_FIELD_BIT_OFFSET (fld)) + / BITS_PER_UNIT)); get_origin_and_offset_r (fld, fldoff, fldsize, off); x = TREE_OPERAND (x, 0); diff --git a/gcc/testsuite/gcc.dg/torture/pr105604.c b/gcc/testsuite/gcc.dg/torture/pr105604.c new file mode 100644 index 00000000000..b002251df10 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr105604.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-Wall" } */ + +struct { + long users; + long size; + char *data; +} * main_trans; +void *main___trans_tmp_1; +int sprintf(char *, char *, ...); +int main() { + int users = 0; + struct { + long users; + long size; + char *data; + int links[users]; + char buf[]; + } *trans = trans; + trans->data = trans->buf; + main___trans_tmp_1 = trans; + main_trans = main___trans_tmp_1; + sprintf(main_trans->data, "test"); +}