From patchwork Wed May 31 18:06:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Malcolm
X-Patchwork-Id: 70403
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm
Subject: [PATCH 1/3] testsuite: move handle-multiline-outputs to before check for blank lines
Date: Wed, 31 May 2023 14:06:28 -0400
Message-Id: <20230531180630.3127108-2-dmalcolm@redhat.com>
In-Reply-To: <20230531180630.3127108-1-dmalcolm@redhat.com>
References: <20230531180630.3127108-1-dmalcolm@redhat.com>
From: David Malcolm
Reply-To: David Malcolm

I have followup patches that require checking for multiline patterns
that have blank lines within them, so this moves the handling of
multiline patterns before the check for blank lines, allowing for such
multiline patterns.

Doing so uncovers some issues with existing multiline directives, which
the patch fixes.

gcc/testsuite/ChangeLog:
	* c-c++-common/Wlogical-not-parentheses-2.c: Split up the
	multiline directive.
	* gcc.dg/analyzer/malloc-macro-inline-events.c: Remove redundant
	dg-regexp directives.
	* gcc.dg/missing-header-fixit-5.c: Split up the multiline
	directives.
	* lib/gcc-dg.exp (gcc-dg-prune): Move call to
	handle-multiline-outputs from prune_gcc_output to here.
* lib/multiline.exp (dg-end-multiline-output): Move call to maybe-handle-nn-line-numbers from prune_gcc_output to here. * lib/prune.exp (prune_gcc_output): Move calls to maybe-handle-nn-line-numbers and handle-multiline-outputs from here to the above. --- .../c-c++-common/Wlogical-not-parentheses-2.c | 2 ++ .../gcc.dg/analyzer/malloc-macro-inline-events.c | 5 ----- gcc/testsuite/gcc.dg/missing-header-fixit-5.c | 10 ++++++++-- gcc/testsuite/lib/gcc-dg.exp | 5 +++++ gcc/testsuite/lib/multiline.exp | 7 ++++++- gcc/testsuite/lib/prune.exp | 7 ------- 6 files changed, 21 insertions(+), 15 deletions(-) diff --git a/gcc/testsuite/c-c++-common/Wlogical-not-parentheses-2.c b/gcc/testsuite/c-c++-common/Wlogical-not-parentheses-2.c index ba8dce84f5d..2d9382014c4 100644 --- a/gcc/testsuite/c-c++-common/Wlogical-not-parentheses-2.c +++ b/gcc/testsuite/c-c++-common/Wlogical-not-parentheses-2.c @@ -12,6 +12,8 @@ foo (int aaa, int bbb) /* { dg-begin-multiline-output "" } r += !aaa == bbb; ^~ + { dg-end-multiline-output "" } */ +/* { dg-begin-multiline-output "" } r += !aaa == bbb; ^~~~ ( ) diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-macro-inline-events.c b/gcc/testsuite/gcc.dg/analyzer/malloc-macro-inline-events.c index f08aee626a5..9134bb4781e 100644 --- a/gcc/testsuite/gcc.dg/analyzer/malloc-macro-inline-events.c +++ b/gcc/testsuite/gcc.dg/analyzer/malloc-macro-inline-events.c @@ -12,11 +12,6 @@ int test (void *ptr) WRAPPED_FREE (ptr); /* { dg-message "in expansion of macro 'WRAPPED_FREE'" } */ WRAPPED_FREE (ptr); /* { dg-message "in expansion of macro 'WRAPPED_FREE'" } */ - /* Erase the spans indicating the header file - (to avoid embedding path assumptions). 
*/ - /* { dg-regexp "\[^|\]+/malloc-macro.h:\[0-9\]+:\[0-9\]+:" } */ - /* { dg-regexp "\[^|\]+/malloc-macro.h:\[0-9\]+:\[0-9\]+:" } */ - /* { dg-begin-multiline-output "" } NN | #define WRAPPED_FREE(PTR) free(PTR) | ^~~~~~~~~ diff --git a/gcc/testsuite/gcc.dg/missing-header-fixit-5.c b/gcc/testsuite/gcc.dg/missing-header-fixit-5.c index 916033c689c..bf44feb24a9 100644 --- a/gcc/testsuite/gcc.dg/missing-header-fixit-5.c +++ b/gcc/testsuite/gcc.dg/missing-header-fixit-5.c @@ -12,14 +12,18 @@ foo (char *m, int i) /* { dg-begin-multiline-output "" } 11 | if (isdigit (m[0])) | ^~~~~~~ + { dg-end-multiline-output "" } */ + /* { dg-begin-multiline-output "" } +++ |+#include 1 | { dg-end-multiline-output "" } */ { return abs (i); /* { dg-warning "implicit declaration of function" } */ /* { dg-begin-multiline-output "" } - 19 | return abs (i); + 21 | return abs (i); | ^~~ + { dg-end-multiline-output "" } */ + /* { dg-begin-multiline-output "" } +++ |+#include 1 | { dg-end-multiline-output "" } */ @@ -27,8 +31,10 @@ foo (char *m, int i) else putchar (m[0]); /* { dg-warning "implicit declaration of function" } */ /* { dg-begin-multiline-output "" } - 28 | putchar (m[0]); + 32 | putchar (m[0]); | ^~~~~~~ + { dg-end-multiline-output "" } */ + /* { dg-begin-multiline-output "" } +++ |+#include 1 | { dg-end-multiline-output "" } */ diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp index 4ed4233efff..6475cab46de 100644 --- a/gcc/testsuite/lib/gcc-dg.exp +++ b/gcc/testsuite/lib/gcc-dg.exp @@ -364,6 +364,11 @@ proc gcc-dg-prune { system text } { # Always remember to clear it in .exp file after executed all tests. global dg_runtest_extra_prunes + # Call into multiline.exp to handle any multiline output directives. + # This is done before the check for blank lines so that multiline + # output directives can have blank lines within them. 
+ set text [handle-multiline-outputs $text] + # Complain about blank lines in the output (PR other/69006) global allow_blank_lines if { !$allow_blank_lines } { diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp index 73621a0bdbd..4c25bb76f43 100644 --- a/gcc/testsuite/lib/multiline.exp +++ b/gcc/testsuite/lib/multiline.exp @@ -139,7 +139,7 @@ proc dg-end-multiline-output { args } { verbose "within dg-end-multiline-output: multiline_expected_outputs: $multiline_expected_outputs" 3 } -# Hook to be called by prune.exp's prune_gcc_output to +# Hook to be called by gcc-dg.exp's gcc-dg-prune to # look for the expected multiline outputs, pruning them, # reporting PASS for those that are found, and FAIL for # those that weren't found. @@ -149,6 +149,11 @@ proc dg-end-multiline-output { args } { proc handle-multiline-outputs { text } { global multiline_expected_outputs global testname_with_flags + + # If dg-enable-nn-line-numbers was provided, then obscure source-margin + # line numbers by converting them to "NN" form. + set text [maybe-handle-nn-line-numbers $text] + set index 0 foreach entry $multiline_expected_outputs { verbose " entry: $entry" 3 diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp index cfe427c99ac..8d37b24e59b 100644 --- a/gcc/testsuite/lib/prune.exp +++ b/gcc/testsuite/lib/prune.exp @@ -108,13 +108,6 @@ proc prune_gcc_output { text } { # Many tests that use visibility will still pass on platforms that don't support it. regsub -all "(^|\n)\[^\n\]*lto1: warning: visibility attribute not supported in this configuration; ignored\[^\n\]*" $text "" text - # If dg-enable-nn-line-numbers was provided, then obscure source-margin - # line numbers by converting them to "NN" form. - set text [maybe-handle-nn-line-numbers $text] - - # Call into multiline.exp to handle any multiline output directives. 
- set text [handle-multiline-outputs $text] - #send_user "After:$text\n" return $text

From patchwork Wed May 31 18:06:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Malcolm
X-Patchwork-Id: 70404
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm
Subject: [PATCH 3/3] analyzer: add text-art visualizations of out-of-bounds accesses [PR106626]
Date: Wed, 31 May 2023 14:06:30 -0400
Message-Id: <20230531180630.3127108-4-dmalcolm@redhat.com>
In-Reply-To: <20230531180630.3127108-1-dmalcolm@redhat.com>
References: <20230531180630.3127108-1-dmalcolm@redhat.com>
From: David Malcolm
Reply-To: David Malcolm

This patch extends -Wanalyzer-out-of-bounds so that, where possible, it
will emit a text art diagram visualizing the spatial relationship between
(a) the memory region that the analyzer predicts would be accessed, versus
(b) the range of memory that is valid to access - whether they overlap,
are touching, are close or far apart; which one is before or after in
memory, the relative sizes involved, the direction of the access (read vs
write), and, in some cases, the values of data involved.  This diagram
can be suppressed using -fdiagnostics-text-art-charset=none.
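
As a rough illustration of the spatial relationships listed above, the
following standalone C++ sketch classifies an accessed byte range against
a valid byte range.  It is not code from this patch: byte_span and
describe_access are hypothetical names chosen for the illustration, and
the analyzer's own logic (which also handles symbolic offsets) is more
involved.

```cpp
#include <cstdint>
#include <string>

// Hypothetical half-open byte range [start, start + size); this is an
// illustration only, not a type from the patch.
struct byte_span
{
  std::int64_t start;
  std::int64_t size;

  std::int64_t end () const { return start + size; }
};

// Describe where ACCESS lies relative to VALID: within it, before or
// after it (touching or some distance apart), or partially overlapping.
std::string
describe_access (const byte_span &valid, const byte_span &access)
{
  if (access.start >= valid.start && access.end () <= valid.end ())
    return "within valid range";

  if (access.end () <= valid.start)
    {
      // Entirely before the valid range; a gap of zero means touching.
      const std::int64_t gap = valid.start - access.end ();
      if (gap == 0)
        return "before valid range, touching";
      return "before valid range, " + std::to_string (gap) + " bytes apart";
    }

  if (access.start >= valid.end ())
    {
      // Entirely after the valid range; a gap of zero means touching.
      const std::int64_t gap = access.start - valid.end ();
      if (gap == 0)
        return "after valid range, touching";
      return "after valid range, " + std::to_string (gap) + " bytes apart";
    }

  // Neither contained nor disjoint: the access straddles a boundary.
  return "partially overlapping valid range";
}
```

With a valid range of {0, 40} (ten 4-byte elements) and an access of
{-400, 4}, describe_access returns "before valid range, 396 bytes apart";
with an access of {40, 4} it returns "after valid range, touching".
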
For example, given:

    int32_t arr[10];

    int32_t int_arr_read_element_before_start_far(void)
    {
      return arr[-100];
    }

it emits:

    demo-1.c: In function ‘int_arr_read_element_before_start_far’:
    demo-1.c:7:13: warning: buffer under-read [CWE-127] [-Wanalyzer-out-of-bounds]
        7 |   return arr[-100];
          |          ~~~^~~~~~
      ‘int_arr_read_element_before_start_far’: event 1
        |
        |    7 |   return arr[-100];
        |      |          ~~~^~~~~~
        |      |             |
        |      |             (1) out-of-bounds read from byte -400 till byte -397 but ‘arr’ starts at byte 0
        |
    demo-1.c:7:13: note: valid subscripts for ‘arr’ are ‘[0]’ to ‘[9]’

    ┌───────────────────────────┐
    │read of ‘int32_t’ (4 bytes)│
    └───────────────────────────┘
                  ^
                  │
                  │
    ┌───────────────────────────┐              ┌────────┬────────┬─────────┐
    │                           │              │  [0]   │  ...   │   [9]   │
    │    before valid range     │              ├────────┴────────┴─────────┤
    │                           │              │‘arr’ (type: ‘int32_t[10]’)│
    └───────────────────────────┘              └───────────────────────────┘
    ├─────────────┬─────────────┤├─────┬──────┤├─────────────┬─────────────┤
                  │                    │                     │
     ╭────────────┴───────────╮   ╭────┴────╮        ╭───────┴──────╮
     │⚠️ under-read of 4 bytes│   │396 bytes│        │size: 40 bytes│
     ╰────────────────────────╯   ╰─────────╯        ╰──────────────╯

and given:

    #include <string.h>

    void
    test_non_ascii ()
    {
      char buf[5];
      strcpy (buf, "文字化け");
    }

it emits:

    demo-2.c: In function ‘test_non_ascii’:
    demo-2.c:7:3: warning: stack-based buffer overflow [CWE-121] [-Wanalyzer-out-of-bounds]
        7 |   strcpy (buf, "文字化け");
          |   ^~~~~~~~~~~~~~~~~~~~~~~~
      ‘test_non_ascii’: events 1-2
        |
        |    6 |   char buf[5];
        |      |        ^~~
        |      |        |
        |      |        (1) capacity: 5 bytes
        |    7 |   strcpy (buf, "文字化け");
        |      |   ~~~~~~~~~~~~~~~~~~~~~~~~
        |      |   |
        |      |   (2) out-of-bounds write from byte 5 till byte 12 but ‘buf’ ends at byte 5
        |
    demo-2.c:7:3: note: write of 8 bytes to beyond the end of ‘buf’
        7 |   strcpy (buf, "文字化け");
          |   ^~~~~~~~~~~~~~~~~~~~~~~~
    demo-2.c:7:3: note: valid subscripts for ‘buf’ are ‘[0]’ to ‘[4]’

    ┌─────┬─────┬─────┬────┬────┐┌────┬────┬────┬────┬────┬────┬────┬──────┐
    │ [0] │ [1] │ [2] │[3] │[4] ││[5] │[6] │[7] │[8] │[9] │[10]│[11]│ [12] │
    ├─────┼─────┼─────┼────┼────┤├────┼────┼────┼────┼────┼────┼────┼──────┤
    │0xe6 │0x96 │0x87 │0xe5│0xad││0x97│0xe5│0x8c│0x96│0xe3│0x81│0x91│ 0x00 │
    ├─────┴─────┴─────┼────┴────┴┴────┼────┴────┴────┼────┴────┴────┼──────┤
    │     U+6587      │    U+5b57     │    U+5316    │    U+3051    │U+0000│
    ├─────────────────┼───────────────┼──────────────┼──────────────┼──────┤
    │       文        │      字       │      化      │      け      │ NUL  │
    ├─────────────────┴───────────────┴──────────────┴──────────────┴──────┤
    │                  string literal (type: ‘char[13]’)                   │
    └──────────────────────────────────────────────────────────────────────┘
       │     │     │     │    │     │    │    │    │    │    │    │     │
       │     │     │     │    │     │    │    │    │    │    │    │     │
       v     v     v     v    v     v    v    v    v    v    v    v     v
    ┌─────┬────────────────┬────┐┌─────────────────────────────────────────┐
    │ [0] │      ...       │[4] ││                                         │
    ├─────┴────────────────┴────┤│            after valid range            │
    │  ‘buf’ (type: ‘char[5]’)  ││                                         │
    └───────────────────────────┘└─────────────────────────────────────────┘
    ├─────────────┬─────────────┤├────────────────────┬────────────────────┤
                  │                                   │
         ╭────────┴────────╮              ╭───────────┴──────────╮
         │capacity: 5 bytes│              │⚠️ overflow of 8 bytes│
         ╰─────────────────╯              ╰──────────────────────╯

showing that the overflow occurs partway through the UTF-8 encoding of
the U+5b57 code point.

There are lots more examples in the test suite.

It doesn't show up in this email, but the above diagrams are colorized
to contrast the valid and invalid access ranges.

gcc/ChangeLog:
	PR analyzer/106626
	* Makefile.in (ANALYZER_OBJS): Add analyzer/access-diagram.o.
	* doc/invoke.texi (Wanalyzer-out-of-bounds): Add description of
	text art.
	(fanalyzer-debug-text-art): New.

gcc/analyzer/ChangeLog:
	PR analyzer/106626
	* access-diagram.cc: New file.
	* access-diagram.h: New file.
	* analyzer.h (class region_offset): Add default ctor.
	(region_offset::make_byte_offset): New decl.
	(region_offset::concrete_p): New.
	(region_offset::get_concrete_byte_offset): New.
	(region_offset::calc_symbolic_bit_offset): New decl.
	(region_offset::calc_symbolic_byte_offset): New decl.
	(region_offset::dump_to_pp): New decl.
	(region_offset::dump): New decl.
	(operator<, operator<=, operator>, operator>=): New decls for
	region_offset.
	* analyzer.opt
	(-param=analyzer-text-art-string-ellipsis-threshold=): New.
	(-param=analyzer-text-art-string-ellipsis-head-len=): New.
	(-param=analyzer-text-art-string-ellipsis-tail-len=): New.
	(-param=analyzer-text-art-ideal-canvas-width=): New.
	(fanalyzer-debug-text-art): New.
	* bounds-checking.cc: Include "intl.h", "diagnostic-diagram.h",
	and "analyzer/access-diagram.h".
	(class out_of_bounds::oob_region_creation_event_capacity): New.
	(out_of_bounds::out_of_bounds): Add "model" and "sval_hint"
	params.
	(out_of_bounds::mark_interesting_stuff): Use the base region.
	(out_of_bounds::add_region_creation_events): Use
	oob_region_creation_event_capacity.
	(out_of_bounds::get_dir): New pure vfunc.
	(out_of_bounds::maybe_show_notes): New.
	(out_of_bounds::maybe_show_diagram): New.
	(out_of_bounds::make_access_diagram): New.
	(out_of_bounds::m_model): New field.
	(out_of_bounds::m_sval_hint): New field.
	(out_of_bounds::m_region_creation_event_id): New field.
	(concrete_out_of_bounds::concrete_out_of_bounds): Update for new
	fields.
	(concrete_past_the_end::concrete_past_the_end): Likewise.
	(concrete_past_the_end::add_region_creation_events): Use
	oob_region_creation_event_capacity.
	(concrete_buffer_overflow::concrete_buffer_overflow): Update for
	new fields.
	(concrete_buffer_overflow::emit): Replace call to
	maybe_describe_array_bounds with maybe_show_notes.
	(concrete_buffer_overflow::get_dir): New.
	(concrete_buffer_over_read::concrete_buffer_over_read): Update
	for new fields.
	(concrete_buffer_over_read::emit): Replace call to
	maybe_describe_array_bounds with maybe_show_notes.
	(concrete_buffer_over_read::get_dir): New.
	(concrete_buffer_underwrite::concrete_buffer_underwrite): Update
	for new fields.
	(concrete_buffer_underwrite::emit): Replace call to
	maybe_describe_array_bounds with maybe_show_notes.
	(concrete_buffer_underwrite::get_dir): New.
	(concrete_buffer_under_read::concrete_buffer_under_read): Update
	for new fields.
	(concrete_buffer_under_read::emit): Replace call to
	maybe_describe_array_bounds with maybe_show_notes.
	(concrete_buffer_under_read::get_dir): New.
	(symbolic_past_the_end::symbolic_past_the_end): Update for new
	fields.
	(symbolic_buffer_overflow::symbolic_buffer_overflow): Likewise.
	(symbolic_buffer_overflow::emit): Call maybe_show_notes.
	(symbolic_buffer_overflow::get_dir): New.
	(symbolic_buffer_over_read::symbolic_buffer_over_read): Update
	for new fields.
	(symbolic_buffer_over_read::emit): Call maybe_show_notes.
	(symbolic_buffer_over_read::get_dir): New.
	(region_model::check_symbolic_bounds): Add "sval_hint" param.
	Pass it and sized_offset_reg to diagnostics.
	(region_model::check_region_bounds): Add "sval_hint" param,
	passing it to diagnostics.
	* diagnostic-manager.cc
	(diagnostic_manager::emit_saved_diagnostic): Pass logger to
	pending_diagnostic::emit.
	* engine.cc: Add logger param to pending_diagnostic::emit
	implementations.
	* infinite-recursion.cc: Likewise.
	* kf-analyzer.cc: Likewise.
	* kf.cc: Likewise.  Add nullptr for new param of
	check_region_for_write.
	* pending-diagnostic.h: Likewise in decl.
	* region-model-manager.cc
	(region_model_manager::get_or_create_int_cst): Convert param from
	poly_int64 to const poly_wide_int_ref &.
	(region_model_manager::maybe_fold_binop): Support type being NULL
	when checking for floating-point types.  Check for (X + Y) - X => Y.
	Be less strict about types when folding associative ops.  Check
	for (X + Y) * CST => (X * CST) + (Y * CST).
	* region-model-manager.h
	(region_model_manager::get_or_create_int_cst): Convert param from
	poly_int64 to const poly_wide_int_ref &.
	* region-model.cc: Add logger param to pending_diagnostic::emit
	implementations.
	(region_model::check_external_function_for_access_attr): Update
	for new param of check_region_for_write.
	(region_model::deref_rvalue): Use nullptr rather than NULL.
	(region_model::get_capacity): Handle RK_STRING.
	(region_model::check_region_access): Add "sval_hint" param; pass
	it to check_region_bounds.
	(region_model::check_region_for_write): Add "sval_hint" param;
	pass it to check_region_access.
	(region_model::check_region_for_read): Add NULL for new param to
	check_region_access.
	(region_model::set_value): Pass rhs_sval to
	check_region_for_write.
	(region_model::get_representative_path_var_1): Handle SK_CONSTANT
	in the check for infinite recursion.
	* region-model.h (region_model::check_region_for_write): Add
	"sval_hint" param.
	(region_model::check_region_access): Likewise.
	(region_model::check_symbolic_bounds): Likewise.
	(region_model::check_region_bounds): Likewise.
	* region.cc (region_offset::make_byte_offset): New.
	(region_offset::calc_symbolic_bit_offset): New.
	(region_offset::calc_symbolic_byte_offset): New.
	(region_offset::dump_to_pp): New.
	(region_offset::dump): New.
	(struct linear_op): New.
	(operator<, operator<=, operator>, operator>=): New, for
	region_offset.
	(region::get_next_offset): New.
	(region::get_relative_symbolic_offset): Use ptrdiff_type_node.
	(field_region::get_relative_symbolic_offset): Likewise.
	(element_region::get_relative_symbolic_offset): Likewise.
	(bit_range_region::get_relative_symbolic_offset): Likewise.
	* region.h (region::get_next_offset): New decl.
	* sm-fd.cc: Add logger param to pending_diagnostic::emit
	implementations.
	* sm-file.cc: Likewise.
	* sm-malloc.cc: Likewise.
	* sm-pattern-test.cc: Likewise.
	* sm-sensitive.cc: Likewise.
	* sm-signal.cc: Likewise.
	* sm-taint.cc: Likewise.
	* store.cc (bit_range::contains_p): Allow "out" to be null.
	* store.h (byte_range::get_start_bit_offset): New.
	(byte_range::get_next_bit_offset): New.
	* varargs.cc: Add logger param to pending_diagnostic::emit
	implementations.

gcc/testsuite/ChangeLog:
	PR analyzer/106626
	* gcc.dg/analyzer/data-model-1.c (test_16): Update for
	out-of-bounds working.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-ascii.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-json.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-sarif.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-1-unicode.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-10.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-11.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-12.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-13.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-14.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-15.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-2.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-3.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-4.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-5-ascii.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-5-unicode.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-6.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-7.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-8.c: New test.
	* gcc.dg/analyzer/out-of-bounds-diagram-9.c: New test.
	* gcc.dg/analyzer/pattern-test-2.c: Update expected results.
	* gcc.dg/plugin/analyzer_gil_plugin.c: Add logger param to
	pending_diagnostic::emit implementations.
--- gcc/Makefile.in | 1 + gcc/analyzer/access-diagram.cc | 2405 +++++++++++++++++ gcc/analyzer/access-diagram.h | 165 ++ gcc/analyzer/analyzer.h | 30 + gcc/analyzer/analyzer.opt | 20 + gcc/analyzer/bounds-checking.cc | 270 +- gcc/analyzer/diagnostic-manager.cc | 2 +- gcc/analyzer/engine.cc | 4 +- gcc/analyzer/infinite-recursion.cc | 2 +- gcc/analyzer/kf-analyzer.cc | 2 +- gcc/analyzer/kf.cc | 6 +- gcc/analyzer/pending-diagnostic.h | 2 +- gcc/analyzer/region-model-manager.cc | 32 +- gcc/analyzer/region-model-manager.h | 2 +- gcc/analyzer/region-model.cc | 52 +- gcc/analyzer/region-model.h | 4 + gcc/analyzer/region.cc | 369 ++- gcc/analyzer/region.h | 1 + gcc/analyzer/sm-fd.cc | 14 +- gcc/analyzer/sm-file.cc | 4 +- gcc/analyzer/sm-malloc.cc | 20 +- gcc/analyzer/sm-pattern-test.cc | 2 +- gcc/analyzer/sm-sensitive.cc | 3 +- gcc/analyzer/sm-signal.cc | 2 +- gcc/analyzer/sm-taint.cc | 16 +- gcc/analyzer/store.cc | 11 +- gcc/analyzer/store.h | 9 + gcc/analyzer/varargs.cc | 8 +- gcc/doc/invoke.texi | 15 + gcc/testsuite/gcc.dg/analyzer/data-model-1.c | 4 +- .../analyzer/out-of-bounds-diagram-1-ascii.c | 55 + .../analyzer/out-of-bounds-diagram-1-debug.c | 40 + .../analyzer/out-of-bounds-diagram-1-emoji.c | 55 + .../analyzer/out-of-bounds-diagram-1-json.c | 13 + .../analyzer/out-of-bounds-diagram-1-sarif.c | 24 + .../out-of-bounds-diagram-1-unicode.c | 55 + .../analyzer/out-of-bounds-diagram-10.c | 29 + .../analyzer/out-of-bounds-diagram-11.c | 82 + .../analyzer/out-of-bounds-diagram-12.c | 54 + .../analyzer/out-of-bounds-diagram-13.c | 43 + .../analyzer/out-of-bounds-diagram-14.c | 110 + .../analyzer/out-of-bounds-diagram-15.c | 42 + .../gcc.dg/analyzer/out-of-bounds-diagram-2.c | 30 + .../gcc.dg/analyzer/out-of-bounds-diagram-3.c | 45 + .../gcc.dg/analyzer/out-of-bounds-diagram-4.c | 45 + .../analyzer/out-of-bounds-diagram-5-ascii.c | 40 + .../out-of-bounds-diagram-5-unicode.c | 42 + .../gcc.dg/analyzer/out-of-bounds-diagram-6.c | 125 + 
.../gcc.dg/analyzer/out-of-bounds-diagram-7.c | 36 + .../gcc.dg/analyzer/out-of-bounds-diagram-8.c | 34 + .../gcc.dg/analyzer/out-of-bounds-diagram-9.c | 42 + .../gcc.dg/analyzer/pattern-test-2.c | 4 +- .../gcc.dg/plugin/analyzer_gil_plugin.c | 6 +- 53 files changed, 4382 insertions(+), 146 deletions(-) create mode 100644 gcc/analyzer/access-diagram.cc create mode 100644 gcc/analyzer/access-diagram.h create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-ascii.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-debug.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-emoji.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-json.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-sarif.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-1-unicode.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-10.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-11.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-12.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-13.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-14.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-15.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-2.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-3.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-4.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-5-ascii.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-5-unicode.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-6.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-7.c create mode 100644 gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-8.c create mode 100644 
gcc/testsuite/gcc.dg/analyzer/out-of-bounds-diagram-9.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index c1e7257ed24..1be7460b9d0 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1275,6 +1275,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \ # Analyzer object files ANALYZER_OBJS = \ + analyzer/access-diagram.o \ analyzer/analysis-plan.o \ analyzer/analyzer.o \ analyzer/analyzer-language.o \ diff --git a/gcc/analyzer/access-diagram.cc b/gcc/analyzer/access-diagram.cc new file mode 100644 index 00000000000..968ff50a0b7 --- /dev/null +++ b/gcc/analyzer/access-diagram.cc @@ -0,0 +1,2405 @@ +/* Text art visualizations within -fanalyzer. + Copyright (C) 2023 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#include "config.h" +#define INCLUDE_ALGORITHM +#define INCLUDE_MEMORY +#define INCLUDE_MAP +#define INCLUDE_SET +#include "system.h" +#include "coretypes.h" +#include "coretypes.h" +#include "tree.h" +#include "function.h" +#include "basic-block.h" +#include "gimple.h" +#include "diagnostic.h" +#include "intl.h" +#include "make-unique.h" +#include "tree-diagnostic.h" /* for default_tree_printer. 
*/ +#include "analyzer/analyzer.h" +#include "analyzer/region-model.h" +#include "analyzer/access-diagram.h" +#include "text-art/ruler.h" +#include "fold-const.h" + +#if ENABLE_ANALYZER + +/* Consider this code: + int32_t arr[10]; + arr[10] = x; + where we've emitted a buffer overflow diagnostic like this: + out-of-bounds write from byte 40 till byte 43 but 'arr' ends at byte 40 + + We want to emit a diagram that visualizes: + - the spatial relationship between the valid region to access, versus + the region that was actually accessed: does it overlap, was it touching, + close, or far away? Was it before or after in memory? What are the + relative sizes involved? + - the direction of the access (read vs write) + + The following code supports emitting diagrams similar to the following: + + # +--------------------------------+ + # |write from ‘x’ (type: ‘int32_t’)| + # +--------------------------------+ + # | + # | + # v + # +---------+-----------+-----------+ +--------------------------------+ + # | [0] | ... | [9] | | after valid range | + # +---------+-----------+-----------+ | | + # | ‘arr’ (type: ‘int32_t[10]’) | | | + # +---------------------------------+ +--------------------------------+ + # |~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~| |~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~| + # | | + # +---------+--------+ +---------+---------+ + # |capacity: 40 bytes| |overflow of 4 bytes| + # +------------------+ +-------------------+ + + where the diagram is laid out via table columns where each table column + represents either a range of bits/bytes, or is a spacing column (to highlight + the boundary between valid vs invalid accesses). The table columns can be + seen via -fanalyzer-debug-text-art. 
For example, here there are 5 table + columns ("tc0" through "tc4"): + + # +---------+-----------+-----------+---+--------------------------------+ + # | tc0 | tc1 | tc2 |tc3| tc4 | + # +---------+-----------+-----------+---+--------------------------------+ + # |bytes 0-3|bytes 4-35 |bytes 36-39| | bytes 40-43 | + # +---------+-----------+-----------+ +--------------------------------+ + # + # +--------------------------------+ + # |write from ‘x’ (type: ‘int32_t’)| + # +--------------------------------+ + # | + # | + # v + # +---------+-----------+-----------+ +--------------------------------+ + # | [0] | ... | [9] | | after valid range | + # +---------+-----------+-----------+ | | + # | ‘arr’ (type: ‘int32_t[10]’) | | | + # +---------------------------------+ +--------------------------------+ + # |~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~| |~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~| + # | | + # +---------+--------+ +---------+---------+ + # |capacity: 40 bytes| |overflow of 4 bytes| + # +------------------+ +-------------------+ + + The diagram is built up from the following: + + # +--------------------------------+ + # | ITEM FOR SVALUE/ACCESSED REGION| + # +--------------------------------+ + # | + # | DIRECTION WIDGET + # v + # +---------------------------------+ +--------------------------------+ + # | VALID REGION | | INVALID ACCESS | + # +---------------------------------+ +--------------------------------+ + # + # | VALID-VS-INVALID RULER | + + i.e. a vbox_widget containing 4 child widgets laid out vertically: + - ALIGNED CHILD WIDGET: ITEM FOR SVALUE/ACCESSED REGION + - DIRECTION WIDGET + - ALIGNED CHILD WIDGET: VALID AND INVALID ACCESSES + - VALID-VS-INVALID RULER. + + A more complicated example, given this overflow: + char buf[100]; + strcpy (buf, LOREM_IPSUM); + + 01| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 02| |[0]|[1]|[2]|[3]|[4]|[5]| ... 
|[440]|[441]|[442]|[443]|[444]|[445]| + 03| +---+---+---+---+---+---+ +-----+-----+-----+-----+-----+-----+ + 04| |'L'|'o'|'r'|'e'|'m'|' '| | 'o' | 'r' | 'u' | 'm' | '.' | NUL | + 05| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 06| | string literal (type: 'char[446]') | + 07| +----------------------------------------------------------------------+ + 08| | | | | | | | | | | | | | | | + 09| | | | | | | | | | | | | | | | + 10| v v v v v v v v v v v v v v v + 11| +---+---------------------+----++--------------------------------------+ + 12| |[0]| ... |[99]|| after valid range | + 13| +---+---------------------+----+| | + 14| | 'buf' (type: 'char[100]') || | + 15| +------------------------------++--------------------------------------+ + 16| |~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~||~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~| + 17| | | + 18| +---------+---------+ +----------+----------+ + 19| |capacity: 100 bytes| |overflow of 346 bytes| + 20| +-------------------+ +---------------------+ + + which is: + + 01| ALIGNED CHILD WIDGET (lines 01-07): (string_region_spatial_item)-+-----+ + 02| |[0]|[1]|[2]|[3]|[4]|[5]| ... |[440]|[441]|[442]|[443]|[444]|[445]| + 03| +---+---+---+---+---+---+ +-----+-----+-----+-----+-----+-----+ + 04| |'L'|'o'|'r'|'e'|'m'|' '| | 'o' | 'r' | 'u' | 'm' | '.' | NUL | + 05| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 06| | string literal (type: 'char[446]') | + 07| +----------------------------------------------------------------------+ + 08| DIRECTION WIDGET (lines 08-10) | | | | | | | + 09| | | | | | | | | | | | | | | | + 10| v v v v v v v v v v v v v v v + 11| ALIGNED CHILD WIDGET (lines 11-15)-------------------------------------+ + 12| VALID REGION ... 
|[99]|| INVALID ACCESS | + 13| +---+---------------------+----+| | + 14| | 'buf' (type: 'char[100]') || | + 15| +------------------------------++--------------------------------------+ + 16| VALID-VS-INVALID RULER (lines 16-20): ~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~| + 17| | | + 18| +---------+---------+ +----------+----------+ + 19| |capacity: 100 bytes| |overflow of 346 bytes| + 20| +-------------------+ +---------------------+ + + We build the diagram in several phases: + - (1) we construct an access_diagram_impl widget. Within the ctor, we have + these subphases: + - (1.1) find all of the boundaries of interest + - (1.2) use the boundaries to build a bit_to_table_map, associating bit ranges + with table columns (e.g. "byte 0 is column 0, bytes 1-98 are column 2" etc) + - (1.3) create child widgets that share this table-based geometry + - (2) ask the widget for its size request + - (2.1) column widths and row heights for the table are computed by + access_diagram_impl::calc_req_size + - (2.2) child widgets request sizes based on these widths/heights + - (3) create a canvas of the appropriate size + - (4) paint the widget hierarchy to the canvas. */ + + +using namespace text_art; + +namespace ana { + +static styled_string +fmt_styled_string (style_manager &sm, + const char *fmt, ...) + ATTRIBUTE_GCC_DIAG(2, 3); + +static styled_string +fmt_styled_string (style_manager &sm, + const char *fmt, ...)
+{ + va_list ap; + va_start (ap, fmt); + styled_string result + = styled_string::from_fmt_va (sm, default_tree_printer, fmt, &ap); + va_end (ap); + return result; +} + +class access_diagram_impl; +class bit_to_table_map; + +static void +pp_bit_size_t (pretty_printer *pp, bit_size_t num_bits) +{ + if (num_bits % BITS_PER_UNIT == 0) + { + byte_size_t num_bytes = num_bits / BITS_PER_UNIT; + if (num_bytes == 1) + pp_printf (pp, _("%wi byte"), num_bytes.to_uhwi ()); + else + pp_printf (pp, _("%wi bytes"), num_bytes.to_uhwi ()); + } + else + { + if (num_bits == 1) + pp_printf (pp, _("%wi bit"), num_bits.to_uhwi ()); + else + pp_printf (pp, _("%wi bits"), num_bits.to_uhwi ()); + } +} + +static styled_string +get_access_size_str (style_manager &sm, + const access_operation &op, + access_range accessed_range, + tree type) +{ + bit_size_expr num_bits; + if (accessed_range.get_size (op.m_model, &num_bits)) + { + if (type) + { + styled_string s; + + pretty_printer pp; + num_bits.print (&pp); + + if (op.m_dir == DIR_READ) + return fmt_styled_string (sm, + _("read of %qT (%s)"), + type, + pp_formatted_text (&pp)); + else + return fmt_styled_string (sm, + _("write of %qT (%s)"), + type, + pp_formatted_text (&pp)); + } + if (op.m_dir == DIR_READ) + return num_bits.get_formatted_str (sm, + _("read of %wi bit"), + _("read of %wi bits"), + _("read of %wi byte"), + _("read of %wi bytes"), + _("read of %qE bits"), + _("read of %qE bytes")); + else + return num_bits.get_formatted_str (sm, + _("write of %wi bit"), + _("write of %wi bits"), + _("write of %wi byte"), + _("write of %wi bytes"), + _("write of %qE bits"), + _("write of %qE bytes")); + } + + if (type) + { + if (op.m_dir == DIR_READ) + return fmt_styled_string (sm, _("read of %qT"), type); + else + return fmt_styled_string (sm, _("write of %qT"), type); + } + + if (op.m_dir == DIR_READ) + return styled_string (sm, _("read")); + else + return styled_string (sm, _("write")); +} + +/* Subroutine of clean_up_for_diagram. 
*/ + +static tree +strip_any_cast (tree expr) +{ + if (TREE_CODE (expr) == NOP_EXPR + || TREE_CODE (expr) == NON_LVALUE_EXPR) + expr = TREE_OPERAND (expr, 0); + return expr; +} + +/* Subroutine of clean_up_for_diagram. */ + +static tree +remove_ssa_names (tree expr) +{ + if (TREE_CODE (expr) == SSA_NAME + && SSA_NAME_VAR (expr)) + return SSA_NAME_VAR (expr); + tree t = copy_node (expr); + for (int i = 0; i < TREE_OPERAND_LENGTH (expr); i++) + TREE_OPERAND (t, i) = remove_ssa_names (TREE_OPERAND (expr, i)); + return t; +} + +/* We want to be able to print tree expressions from the analyzer, + which is in the middle end. + + We could use the front-end pretty_printer's formatting routine, + but: + (a) some have additional state in a pretty_printer subclass, so we'd + need to clone global_dc->printer + (b) the "aka" type information added by the C and C++ frontends are + too verbose when building a diagram, and there isn't a good way to ask + for a less verbose version of them. + + Hence we use default_tree_printer. + However, we want to avoid printing SSA names, and instead print the + underlying var name. + Ideally there would be a better tree printer for use by middle end + warnings, but as workaround, this function clones a tree, replacing + SSA names with the var names. */ + +tree +clean_up_for_diagram (tree expr) +{ + tree without_ssa_names = remove_ssa_names (expr); + return strip_any_cast (without_ssa_names); +} + +/* struct bit_size_expr. 
*/ + +text_art::styled_string +bit_size_expr::get_formatted_str (text_art::style_manager &sm, + const char *concrete_single_bit_fmt, + const char *concrete_plural_bits_fmt, + const char *concrete_single_byte_fmt, + const char *concrete_plural_bytes_fmt, + const char *symbolic_bits_fmt, + const char *symbolic_bytes_fmt) const +{ + if (TREE_CODE (m_num_bits) == INTEGER_CST) + { + bit_size_t concrete_num_bits = wi::to_offset (m_num_bits); + if (concrete_num_bits % BITS_PER_UNIT == 0) + { + byte_size_t concrete_num_bytes = concrete_num_bits / BITS_PER_UNIT; + if (concrete_num_bytes == 1) + return fmt_styled_string (sm, concrete_single_byte_fmt, + concrete_num_bytes.to_uhwi ()); + else + return fmt_styled_string (sm, concrete_plural_bytes_fmt, + concrete_num_bytes.to_uhwi ()); + } + else + { + if (concrete_num_bits == 1) + return fmt_styled_string (sm, concrete_single_bit_fmt, + concrete_num_bits.to_uhwi ()); + else + return fmt_styled_string (sm, concrete_plural_bits_fmt, + concrete_num_bits.to_uhwi ()); + } + } + else + { + if (tree bytes_expr = maybe_get_as_bytes ()) + return fmt_styled_string (sm, + symbolic_bytes_fmt, + clean_up_for_diagram (bytes_expr)); + return fmt_styled_string (sm, + symbolic_bits_fmt, + clean_up_for_diagram (m_num_bits)); + } +} + +void +bit_size_expr::print (pretty_printer *pp) const +{ + if (TREE_CODE (m_num_bits) == INTEGER_CST) + { + bit_size_t concrete_num_bits = wi::to_offset (m_num_bits); + pp_bit_size_t (pp, concrete_num_bits); + } + else + { + if (tree bytes_expr = maybe_get_as_bytes ()) + pp_printf (pp, _("%qE bytes"), bytes_expr); + else + pp_printf (pp, _("%qE bits"), m_num_bits); + } +} + +tree +bit_size_expr::maybe_get_as_bytes () const +{ + switch (TREE_CODE (m_num_bits)) + { + default: + break; + case INTEGER_CST: + { + const bit_size_t num_bits = wi::to_offset (m_num_bits); + if (num_bits % BITS_PER_UNIT != 0) + return NULL_TREE; + const bit_size_t num_bytes = num_bits / BITS_PER_UNIT; + return wide_int_to_tree 
(size_type_node, num_bytes); + } + break; + case PLUS_EXPR: + case MINUS_EXPR: + { + bit_size_expr op0 + = bit_size_expr (TREE_OPERAND (m_num_bits, 0)); + tree op0_as_bytes = op0.maybe_get_as_bytes (); + if (!op0_as_bytes) + return NULL_TREE; + bit_size_expr op1 + = bit_size_expr (TREE_OPERAND (m_num_bits, 1)); + tree op1_as_bytes = op1.maybe_get_as_bytes (); + if (!op1_as_bytes) + return NULL_TREE; + return fold_build2 (TREE_CODE (m_num_bits), size_type_node, + op0_as_bytes, op1_as_bytes); + } + break; + case MULT_EXPR: + { + bit_size_expr op1 + = bit_size_expr (TREE_OPERAND (m_num_bits, 1)); + if (tree op1_as_bytes = op1.maybe_get_as_bytes ()) + return fold_build2 (MULT_EXPR, size_type_node, + TREE_OPERAND (m_num_bits, 0), + op1_as_bytes); + } + break; + } + return NULL_TREE; +} + +/* struct access_range. */ + +access_range::access_range (const region *base_region, const bit_range &bits) +: m_start (region_offset::make_concrete (base_region, + bits.get_start_bit_offset ())), + m_next (region_offset::make_concrete (base_region, + bits.get_next_bit_offset ())) +{ +} + +access_range::access_range (const region *base_region, const byte_range &bytes) +: m_start (region_offset::make_concrete (base_region, + bytes.get_start_bit_offset ())), + m_next (region_offset::make_concrete (base_region, + bytes.get_next_bit_offset ())) +{ +} + +access_range::access_range (const region &reg, region_model_manager *mgr) +: m_start (reg.get_offset (mgr)), + m_next (reg.get_next_offset (mgr)) +{ +} + +bool +access_range::get_size (const region_model &model, bit_size_expr *out) const +{ + tree start_expr = m_start.calc_symbolic_bit_offset (model); + if (!start_expr) + return false; + tree next_expr = m_next.calc_symbolic_bit_offset (model); + if (!next_expr) + return false; + *out = bit_size_expr (fold_build2 (MINUS_EXPR, size_type_node, + next_expr, start_expr)); + return true; +} + +bool +access_range::contains_p (const access_range &other) const +{ + return (m_start <= other.m_start +
&& other.m_next <= m_next); +} + +bool +access_range::empty_p () const +{ + bit_range concrete_bits (0, 0); + if (!as_concrete_bit_range (&concrete_bits)) + return false; + return concrete_bits.empty_p (); +} + +void +access_range::dump_to_pp (pretty_printer *pp, bool simple) const +{ + if (m_start.concrete_p () && m_next.concrete_p ()) + { + bit_range bits (m_start.get_bit_offset (), + m_next.get_bit_offset () - m_start.get_bit_offset ()); + bits.dump_to_pp (pp); + return; + } + pp_character (pp, '['); + m_start.dump_to_pp (pp, simple); + pp_string (pp, " to "); + m_next.dump_to_pp (pp, simple); + pp_character (pp, ')'); +} + +DEBUG_FUNCTION void +access_range::dump (bool simple) const +{ + pretty_printer pp; + pp_format_decoder (&pp) = default_tree_printer; + pp_show_color (&pp) = pp_show_color (global_dc->printer); + pp.buffer->stream = stderr; + dump_to_pp (&pp, simple); + pp_newline (&pp); + pp_flush (&pp); +} + +void +access_range::log (const char *title, logger &logger) const +{ + logger.start_log_line (); + logger.log_partial ("%s: ", title); + dump_to_pp (logger.get_printer (), true); + logger.end_log_line (); +} + +/* struct access_operation. */ + +access_range +access_operation::get_valid_bits () const +{ + const svalue *capacity_in_bytes_sval = m_model.get_capacity (m_base_region); + return access_range + (region_offset::make_concrete (m_base_region, 0), + region_offset::make_byte_offset (m_base_region, capacity_in_bytes_sval)); +} + +access_range +access_operation::get_actual_bits () const +{ + return access_range (m_reg, get_manager ()); +} + +/* If there are any bits accessed invalidly before the valid range, + return true and write their range to *OUT. + Return false if there aren't, or if there's a problem + (e.g. symbolic ranges).
*/ + +bool +access_operation::maybe_get_invalid_before_bits (access_range *out) const +{ + access_range valid_bits (get_valid_bits ()); + access_range actual_bits (get_actual_bits ()); + + if (actual_bits.m_start >= valid_bits.m_start) + { + /* No part of accessed range is before the valid range. */ + return false; + } + else if (actual_bits.m_next > valid_bits.m_start) + { + /* Get part of accessed range that's before the valid range. */ + *out = access_range (actual_bits.m_start, valid_bits.m_start); + return true; + } + else + { + /* Accessed range is fully before valid range. */ + *out = actual_bits; + return true; + } +} + +/* If there are any bits accessed invalidly after the valid range, + return true and write their range to *OUT. + Return false if there aren't, or if there's a problem. */ + +bool +access_operation::maybe_get_invalid_after_bits (access_range *out) const +{ + access_range valid_bits (get_valid_bits ()); + access_range actual_bits (get_actual_bits ()); + + if (actual_bits.m_next <= valid_bits.m_next) + { + /* No part of accessed range is after the valid range. */ + return false; + } + else if (actual_bits.m_start < valid_bits.m_next) + { + /* Get part of accessed range that's after the valid range. */ + *out = access_range (valid_bits.m_next, actual_bits.m_next); + return true; + } + else + { + /* Accessed range is fully after valid range. */ + *out = actual_bits; + return true; + } +} + +/* A class for capturing all of the region offsets of interest (both concrete + and symbolic), to help align everything in the diagram. + Boundaries can be soft or hard; hard boundaries are emphasized visually + (e.g. the boundary between valid vs invalid accesses). + + Offsets in the boundaries are all expressed relative to the base + region of the access_operation. 
*/ + +class boundaries +{ +public: + enum class kind { HARD, SOFT}; + + boundaries (const region &base_reg) + : m_base_reg (base_reg) + { + } + + void add (region_offset offset, enum kind k) + { + m_all_offsets.insert (offset); + if (k == kind::HARD) + m_hard_offsets.insert (offset); + } + + void add (const access_range &range, enum kind kind) + { + add (range.m_start, kind); + add (range.m_next, kind); + } + + void add (const region &reg, region_model_manager *mgr, enum kind kind) + { + add (access_range (reg.get_offset (mgr), + reg.get_next_offset (mgr)), + kind); + } + + void add (const byte_range bytes, enum kind kind) + { + add (access_range (&m_base_reg, bytes), kind); + } + + void add_all_bytes_in_range (const byte_range &bytes) + { + for (byte_offset_t byte_idx = bytes.get_start_byte_offset (); + byte_idx <= bytes.get_next_byte_offset (); + byte_idx = byte_idx + 1) + add (region_offset::make_concrete (&m_base_reg, byte_idx * 8), + kind::SOFT); + } + + void add_all_bytes_in_range (const access_range &range) + { + byte_range bytes (0, 0); + bool valid = range.as_concrete_byte_range (&bytes); + gcc_assert (valid); + add_all_bytes_in_range (bytes); + } + + void log (logger &logger) const + { + logger.log ("boundaries:"); + logger.inc_indent (); + for (auto offset : m_all_offsets) + { + enum kind k = get_kind (offset); + logger.start_log_line (); + logger.log_partial ("%s: ", (k == kind::HARD) ?
"HARD" : "soft"); + offset.dump_to_pp (logger.get_printer (), true); + logger.end_log_line (); + } + logger.dec_indent (); + } + + enum kind get_kind (region_offset offset) const + { + gcc_assert (m_all_offsets.find (offset) != m_all_offsets.end ()); + if (m_hard_offsets.find (offset) != m_hard_offsets.end ()) + return kind::HARD; + else + return kind::SOFT; + } + + std::set<region_offset>::const_iterator begin () const + { + return m_all_offsets.begin (); + } + std::set<region_offset>::const_iterator end () const + { + return m_all_offsets.end (); + } + std::set<region_offset>::size_type size () const + { + return m_all_offsets.size (); + } + +private: + const region &m_base_reg; + std::set<region_offset> m_all_offsets; + std::set<region_offset> m_hard_offsets; +}; + +/* A widget that wraps a table but offloads column-width calculation + to a shared object, so that we can vertically line up multiple tables + and have them all align their columns. + + For example, in: + + 01| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 02| |[0]|[1]|[2]|[3]|[4]|[5]| ... |[440]|[441]|[442]|[443]|[444]|[445]| + 03| +---+---+---+---+---+---+ +-----+-----+-----+-----+-----+-----+ + 04| |'L'|'o'|'r'|'e'|'m'|' '| | 'o' | 'r' | 'u' | 'm' | '.' | NUL | + 05| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 06| | string literal (type: 'char[446]') | + 07| +----------------------------------------------------------------------+ + 08| | | | | | | | | | | | | | | | + 09| | | | | | | | | | | | | | | | + 10| v v v v v v v v v v v v v v v + 11|+---+---------------------+----++--------------------------------------+ + 12||[0]| ...
|[99]|| after valid range | + 13|+---+---------------------+----+| | + 14|| 'buf' (type: 'char[100]') || | + 15|+------------------------------++--------------------------------------+ + 16||~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~||~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~| + 17| | | + 18| +---------+---------+ +----------+----------+ + 19| |capacity: 100 bytes| |overflow of 346 bytes| + 20| +-------------------+ +---------------------+ + + rows 01-07 and rows 11-15 are x_aligned_table_widget instances. */ + +class x_aligned_table_widget : public leaf_widget +{ +public: + x_aligned_table_widget (table t, + const theme &theme, + table_dimension_sizes &col_widths) + : m_table (std::move (t)), + m_theme (theme), + m_col_widths (col_widths), + m_row_heights (t.get_size ().h), + m_cell_sizes (m_col_widths, m_row_heights), + m_tg (m_table, m_cell_sizes) + { + } + + const char *get_desc () const override + { + return "x_aligned_table_widget"; + } + + canvas::size_t calc_req_size () final override + { + /* We don't compute the size requirements; + the parent should have done this. */ + return m_tg.get_canvas_size (); + } + + void paint_to_canvas (canvas &canvas) final override + { + m_table.paint_to_canvas (canvas, + get_top_left (), + m_tg, + m_theme); + } + + const table &get_table () const { return m_table; } + table_cell_sizes &get_cell_sizes () { return m_cell_sizes; } + void recalc_coords () + { + m_tg.recalc_coords (); + } + +private: + table m_table; + const theme &m_theme; + table_dimension_sizes &m_col_widths; // Reference to shared column widths + table_dimension_sizes m_row_heights; // Unique row heights + table_cell_sizes m_cell_sizes; + table_geometry m_tg; +}; + +/* A widget for printing arrows between the accessed region + and the svalue, showing the direction of the access. + + For example, in: + + 01| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 02| |[0]|[1]|[2]|[3]|[4]|[5]| ... 
|[440]|[441]|[442]|[443]|[444]|[445]| + 03| +---+---+---+---+---+---+ +-----+-----+-----+-----+-----+-----+ + 04| |'L'|'o'|'r'|'e'|'m'|' '| | 'o' | 'r' | 'u' | 'm' | '.' | NUL | + 05| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 06| | string literal (type: 'char[446]') | + 07| +----------------------------------------------------------------------+ + 08| | | | | | | | | | | | | | | | + 09| | | | | | | | | | | | | | | | + 10| v v v v v v v v v v v v v v v + 11|+---+---------------------+----++--------------------------------------+ + 12||[0]| ... |[99]|| after valid range | + 13|+---+---------------------+----+| | + 14|| 'buf' (type: 'char[100]') || | + 15|+------------------------------++--------------------------------------+ + 16||~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~||~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~| + 17| | | + 18| +---------+---------+ +----------+----------+ + 19| |capacity: 100 bytes| |overflow of 346 bytes| + 20| +-------------------+ +---------------------+ + + rows 8-10 are the direction widget. */ + +class direction_widget : public leaf_widget +{ +public: + direction_widget (const access_diagram_impl &dia_impl, + const bit_to_table_map &btm) + : leaf_widget (), + m_dia_impl (dia_impl), + m_btm (btm) + { + } + const char *get_desc () const override + { + return "direction_widget"; + } + canvas::size_t calc_req_size () final override + { + /* Get our width from our siblings. */ + return canvas::size_t (0, 3); + } + void paint_to_canvas (canvas &canvas) final override; + +private: + const access_diagram_impl &m_dia_impl; + const bit_to_table_map &m_btm; +}; + +/* A widget for adding an x_ruler to a diagram based on table columns, + offloading column-width calculation to shared objects, so that the ruler + lines up with other tables in the diagram. + + For example, in: + + 01| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 02| |[0]|[1]|[2]|[3]|[4]|[5]| ... 
|[440]|[441]|[442]|[443]|[444]|[445]| + 03| +---+---+---+---+---+---+ +-----+-----+-----+-----+-----+-----+ + 04| |'L'|'o'|'r'|'e'|'m'|' '| | 'o' | 'r' | 'u' | 'm' | '.' | NUL | + 05| +---+---+---+---+---+---+----------+-----+-----+-----+-----+-----+-----+ + 06| | string literal (type: 'char[446]') | + 07| +----------------------------------------------------------------------+ + 08| | | | | | | | | | | | | | | | + 09| | | | | | | | | | | | | | | | + 10| v v v v v v v v v v v v v v v + 11|+---+---------------------+----++--------------------------------------+ + 12||[0]| ... |[99]|| after valid range | + 13|+---+---------------------+----+| | + 14|| 'buf' (type: 'char[100]') || | + 15|+------------------------------++--------------------------------------+ + 16||~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~||~~~~~~~~~~~~~~~~~~+~~~~~~~~~~~~~~~~~~~| + 17| | | + 18| +---------+---------+ +----------+----------+ + 19| |capacity: 100 bytes| |overflow of 346 bytes| + 20| +-------------------+ +---------------------+ + + rows 16-20 are the x_aligned_x_ruler_widget. 
*/ + +class x_aligned_x_ruler_widget : public leaf_widget +{ +public: + x_aligned_x_ruler_widget (const access_diagram_impl &dia_impl, + const theme &theme, + table_dimension_sizes &col_widths) + : m_dia_impl (dia_impl), + m_theme (theme), + m_col_widths (col_widths) + { + } + + const char *get_desc () const override + { + return "x_aligned_ruler_widget"; + } + + void add_range (const table::range_t &x_range, + styled_string text, + style::id_t style_id) + { + m_labels.push_back (label (x_range, std::move (text), style_id)); + } + + canvas::size_t calc_req_size () final override + { + x_ruler r (make_x_ruler ()); + return r.get_size (); + } + + void paint_to_canvas (canvas &canvas) final override + { + x_ruler r (make_x_ruler ()); + r.paint_to_canvas (canvas, + get_top_left (), + m_theme); + } + +private: + struct label + { + label (const table::range_t &table_x_range, + styled_string text, + style::id_t style_id) + : m_table_x_range (table_x_range), + m_text (std::move (text)), + m_style_id (style_id) + { + } + table::range_t m_table_x_range; + styled_string m_text; + style::id_t m_style_id; + }; + + x_ruler make_x_ruler () const; + + const access_diagram_impl &m_dia_impl; + const theme &m_theme; + table_dimension_sizes &m_col_widths; + std::vector