From patchwork Thu Dec 1 02:41:56 2022
X-Patchwork-Submitter: David Malcolm
X-Patchwork-Id: 61290
From: David Malcolm
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm
Subject: [committed 3/7] analyzer: add note about valid subscripts [PR106626]
Date: Wed, 30 Nov 2022 21:41:56 -0500
Message-Id: <20221201024200.3722982-3-dmalcolm@redhat.com>
In-Reply-To: <20221201024200.3722982-1-dmalcolm@redhat.com>
References: <20221201024200.3722982-1-dmalcolm@redhat.com>

Consider -fanalyzer on:

#include <stdint.h>

int32_t arr[10];

void int_arr_write_element_after_end_off_by_one(int32_t x)
{
  arr[10] = x;
}

Trunk x86_64: https://godbolt.org/z/17zn3qYY4

Currently we emit:

<source>: In function 'int_arr_write_element_after_end_off_by_one':
<source>:7:11: warning: buffer overflow [CWE-787] [-Wanalyzer-out-of-bounds]
    7 |   arr[10] = x;
      |   ~~~~~~~~^~~
  event 1
    |
    |    3 | int32_t arr[10];
    |      |         ^~~
    |      |         |
    |      |         (1) capacity is 40 bytes
    |
    +--> 'int_arr_write_element_after_end_off_by_one': events 2-3
           |
           |    5 | void int_arr_write_element_after_end_off_by_one(int32_t x)
           |      |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           |      |      |
           |      |      (2) entry to 'int_arr_write_element_after_end_off_by_one'
           |    6 | {
           |    7 |   arr[10] = x;
           |      |   ~~~~~~~~~~~
           |      |           |
           |      |           (3) out-of-bounds write from byte 40 till byte 43 but 'arr' ends at byte 40
           |
<source>:7:11: note: write of 4 bytes to beyond the end of 'arr'
    7 |   arr[10] = x;
      |   ~~~~~~~~^~~

This is worded in terms of bytes, due to the way -Wanalyzer-out-of-bounds
is implemented, but this isn't what the user wrote.

This patch tries to get closer to the user's code by adding a note about
array bounds when we're referring to an array.  In the above example it
adds this trailing note:

  note: valid subscripts for 'arr' are '[0]' to '[9]'

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4427-g7c655699ed51b0.

gcc/analyzer/ChangeLog:
	PR analyzer/106626
	* bounds-checking.cc
	(out_of_bounds::maybe_describe_array_bounds): New.
	(buffer_overflow::emit): Call maybe_describe_array_bounds.
	(buffer_overread::emit): Likewise.
	(buffer_underflow::emit): Likewise.
	(buffer_underread::emit): Likewise.

gcc/testsuite/ChangeLog:
	PR analyzer/106626
	* gcc.dg/analyzer/call-summaries-2.c: Add dg-message for expected
	note about valid indexes.
	* gcc.dg/analyzer/out-of-bounds-1.c: Likewise, fixing up existing
	dg-message directives.
	* gcc.dg/analyzer/out-of-bounds-write-char-arr.c: Likewise.
	* gcc.dg/analyzer/out-of-bounds-write-int-arr.c: Likewise.

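To make the byte-based wording above concrete, here is a minimal standalone sketch (not GCC code) of how an element index maps to the byte range quoted in the warning and to the subscript range quoted in the new note. The names access_info, describe_access and valid_subscripts_note are hypothetical and exist only for this illustration; the sketch assumes a zero-based array with a fixed element size.

```cpp
#include <string>

// Hypothetical illustration only; none of these names exist in GCC.
// Describes an access to element `index` of an array holding `count`
// elements of `elem_size` bytes each, with subscripts starting at 0.
struct access_info
{
  long long first_byte;     // offset of the first byte accessed
  long long last_byte;      // offset of the last byte accessed
  long long capacity;       // total size of the array in bytes
  long long min_subscript;  // lowest valid subscript
  long long max_subscript;  // highest valid subscript
  bool in_bounds;           // whether `index` is a valid subscript
};

access_info
describe_access (long long count, long long elem_size, long long index)
{
  access_info a;
  a.first_byte = index * elem_size;
  a.last_byte = a.first_byte + elem_size - 1;
  a.capacity = count * elem_size;
  a.min_subscript = 0;
  a.max_subscript = count - 1;
  a.in_bounds = index >= 0 && index < count;
  return a;
}

// Builds text in the same shape as the note added by the patch.
std::string
valid_subscripts_note (const char *name, const access_info &a)
{
  return std::string ("valid subscripts for '") + name + "' are '["
	 + std::to_string (a.min_subscript) + "]' to '["
	 + std::to_string (a.max_subscript) + "]'";
}
```

For the example above, describe_access (10, 4, 10) gives bytes 40 to 43 against a capacity of 40, and valid_subscripts_note ("arr", ...) gives "valid subscripts for 'arr' are '[0]' to '[9]'".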
Signed-off-by: David Malcolm
---
 gcc/analyzer/bounds-checking.cc               | 46 +++++++++++++++++--
 .../gcc.dg/analyzer/call-summaries-2.c        |  1 +
 .../gcc.dg/analyzer/out-of-bounds-1.c         | 16 ++++---
 .../analyzer/out-of-bounds-write-char-arr.c   |  6 +++
 .../analyzer/out-of-bounds-write-int-arr.c    |  6 +++
 5 files changed, 64 insertions(+), 11 deletions(-)

diff --git a/gcc/analyzer/bounds-checking.cc b/gcc/analyzer/bounds-checking.cc
index ad7f431ea2f..b02bc79a926 100644
--- a/gcc/analyzer/bounds-checking.cc
+++ b/gcc/analyzer/bounds-checking.cc
@@ -71,6 +71,34 @@ public:
   }
 
 protected:
+  /* Potentially add a note about valid ways to index this array, such
+     as (given "int arr[10];"):
+       note: valid subscripts for 'arr' are '[0]' to '[9]'
+     We print the '[' and ']' characters so as to express the valid
+     subscripts using C syntax, rather than just as byte ranges,
+     which hopefully is more clear to the user.  */
+  void
+  maybe_describe_array_bounds (location_t loc) const
+  {
+    if (!m_diag_arg)
+      return;
+    tree t = TREE_TYPE (m_diag_arg);
+    if (!t)
+      return;
+    if (TREE_CODE (t) != ARRAY_TYPE)
+      return;
+    tree domain = TYPE_DOMAIN (t);
+    if (!domain)
+      return;
+    tree max_idx = TYPE_MAX_VALUE (domain);
+    if (!max_idx)
+      return;
+    tree min_idx = TYPE_MIN_VALUE (domain);
+    inform (loc,
+            "valid subscripts for %qE are %<[%E]%> to %<[%E]%>",
+            m_diag_arg, min_idx, max_idx);
+  }
+
   const region *m_reg;
   tree m_diag_arg;
   byte_range m_out_of_bounds_range;
@@ -165,6 +193,8 @@ public:
           inform (rich_loc->get_loc (),
                   "write to beyond the end of %qE",
                   m_diag_arg);
+
+        maybe_describe_array_bounds (rich_loc->get_loc ());
       }
 
     return warned;
@@ -245,6 +275,8 @@ public:
           inform (rich_loc->get_loc (),
                   "read from after the end of %qE",
                   m_diag_arg);
+
+        maybe_describe_array_bounds (rich_loc->get_loc ());
       }
 
     return warned;
@@ -297,8 +329,11 @@ public:
   {
     diagnostic_metadata m;
     m.add_cwe (124);
-    return warning_meta (rich_loc, m, get_controlling_option (),
-                         "buffer underflow");
+    bool warned = warning_meta (rich_loc, m, get_controlling_option (),
+                                "buffer underflow");
+    if (warned)
+      maybe_describe_array_bounds (rich_loc->get_loc ());
+    return warned;
   }
 
   label_text describe_final_event (const evdesc::final_event &ev)
@@ -346,8 +381,11 @@ public:
   {
     diagnostic_metadata m;
     m.add_cwe (127);
-    return warning_meta (rich_loc, m, get_controlling_option (),
-                         "buffer underread");
+    bool warned = warning_meta (rich_loc, m, get_controlling_option (),
+                                "buffer underread");
+    if (warned)
+      maybe_describe_array_bounds (rich_loc->get_loc ());
+    return warned;
   }
 
   label_text describe_final_event (const evdesc::final_event &ev)
diff --git a/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c b/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
index 953cbd32f5a..a7a17dbd358 100644
--- a/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
+++ b/gcc/testsuite/gcc.dg/analyzer/call-summaries-2.c
@@ -330,6 +330,7 @@ int test_returns_element_ptr (int j)
   __analyzer_eval (*returns_element_ptr (1) == 8); /* { dg-warning "TRUE" } */
   __analyzer_eval (*returns_element_ptr (2) == 9); /* { dg-warning "TRUE" } */
   return *returns_element_ptr (3); /* { dg-warning "buffer overread" } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[2\\\]'" "valid subscript note" { target *-*-* } .-1 } */
 }
 
 int returns_offset (int arr[3], int i)
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c
index 9f3cda6e02b..dc4de9b28a6 100644
--- a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-1.c
@@ -25,8 +25,9 @@ void test1 (void)
   id_sequence[2] = 345;
   id_sequence[3] = 456; /* { dg-line test1 } */
 
-  /* { dg-warning "overflow" "warning" { target *-*-* } test1 } */
-  /* { dg-message "" "note" { target *-*-* } test1 } */
+  /* { dg-warning "stack-based buffer overflow" "warning" { target *-*-* } test1 } */
+  /* { dg-message "write of 4 bytes to beyond the end of 'id_sequence'" "num bad bytes note" { target *-*-* } test1 } */
+  /* { dg-message "valid subscripts for 'id_sequence' are '\\\[0\\\]' to '\\\[2\\\]'" "valid subscript note" { target *-*-* } test1 } */
 }
 
 void test2 (void)
@@ -46,8 +47,9 @@ void test3 (void)
   for (int i = n; i >= 0; i--)
     arr[i] = i; /* { dg-line test3 } */
 
-  /* { dg-warning "overflow" "warning" { target *-*-* } test3 } */
-  /* { dg-message "" "note" { target *-*-* } test3 } */
+  /* { dg-warning "stack-based buffer overflow" "warning" { target *-*-* } test3 } */
+  /* { dg-message "write of 4 bytes to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } test3 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[3\\\]'" "valid subscript note" { target *-*-* } test3 } */
 }
 
 void test4 (void)
@@ -72,7 +74,7 @@ void test5 (void)
   *last_el = 4; /* { dg-line test5 } */
 
   free (arr);
-  /* { dg-warning "overflow" "warning" { target *-*-* } test5 } */
+  /* { dg-warning "heap-based buffer overflow" "warning" { target *-*-* } test5 } */
   /* { dg-message "" "note" { target *-*-* } test5 } */
 }
 
@@ -89,9 +91,9 @@ void test6 (void)
       printf ("x=%d y=%d *p=%d *q=%d\n" , x, y, *p, *q); /* { dg-line test6c } */
     }
 
-  /* { dg-warning "overflow" "warning" { target *-*-* } test6b } */
+  /* { dg-warning "buffer overflow" "warning" { target *-*-* } test6b } */
   /* { dg-message "" "note" { target *-*-* } test6b } */
-  /* { dg-warning "overread" "warning" { target *-*-* } test6c } */
+  /* { dg-warning "buffer overread" "warning" { target *-*-* } test6c } */
   /* { dg-message "" "note" { target *-*-* } test6c } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-char-arr.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-char-arr.c
index 3564476c322..739ebb11590 100644
--- a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-char-arr.c
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-char-arr.c
@@ -4,18 +4,21 @@ void int_arr_write_element_before_start_far(char x)
 {
   arr[-100] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte -100 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_before_start_near(char x)
 {
   arr[-2] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte -2 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_before_start_off_by_one(char x)
 {
   arr[-1] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte -1 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_at_start(char x)
@@ -33,6 +36,7 @@ void int_arr_write_element_after_end_off_by_one(char x)
   arr[10] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte 10 but 'arr' ends at byte 10" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 1 byte to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
 
 void int_arr_write_element_after_end_near(char x)
@@ -40,6 +44,7 @@ void int_arr_write_element_after_end_near(char x)
   arr[11] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte 11 but 'arr' ends at byte 10" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 1 byte to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
 
 void int_arr_write_element_after_end_far(char x)
@@ -47,4 +52,5 @@ void int_arr_write_element_after_end_far(char x)
   arr[100] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write at byte 100 but 'arr' ends at byte 10" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 1 byte to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
diff --git a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-int-arr.c b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-int-arr.c
index 24a9a6bfa18..b2b37b92e01 100644
--- a/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-int-arr.c
+++ b/gcc/testsuite/gcc.dg/analyzer/out-of-bounds-write-int-arr.c
@@ -6,18 +6,21 @@ void int_arr_write_element_before_start_far(int32_t x)
 {
   arr[-100] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte -400 till byte -397 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_before_start_near(int32_t x)
 {
   arr[-2] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte -8 till byte -5 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_before_start_off_by_one(int32_t x)
 {
   arr[-1] = x; /* { dg-warning "buffer underflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte -4 till byte -1 but 'arr' starts at byte 0" "final event" { target *-*-* } .-1 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-2 } */
 }
 
 void int_arr_write_element_at_start(int32_t x)
@@ -35,6 +38,7 @@ void int_arr_write_element_after_end_off_by_one(int32_t x)
   arr[10] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte 40 till byte 43 but 'arr' ends at byte 40" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 4 bytes to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
 
 void int_arr_write_element_after_end_near(int32_t x)
@@ -42,6 +46,7 @@ void int_arr_write_element_after_end_near(int32_t x)
   arr[11] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte 44 till byte 47 but 'arr' ends at byte 40" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 4 bytes to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
 
 void int_arr_write_element_after_end_far(int32_t x)
@@ -49,4 +54,5 @@ void int_arr_write_element_after_end_far(int32_t x)
   arr[100] = x; /* { dg-warning "buffer overflow" "warning" } */
   /* { dg-message "out-of-bounds write from byte 400 till byte 403 but 'arr' ends at byte 40" "final event" { target *-*-* } .-1 } */
   /* { dg-message "write of 4 bytes to beyond the end of 'arr'" "num bad bytes note" { target *-*-* } .-2 } */
+  /* { dg-message "valid subscripts for 'arr' are '\\\[0\\\]' to '\\\[9\\\]'" "valid subscript note" { target *-*-* } .-3 } */
 }
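
The guard chain in maybe_describe_array_bounds decides when the note is emitted and when it is silently skipped. The following is a rough standalone model of that chain, assuming simplified stand-ins for GCC's tree nodes: index_domain, type_model, decl_model and describe_array_bounds are all hypothetical names that do not exist in GCC, and std::optional plays the role of a NULL tree. It is a sketch of the control flow only, not the analyzer's implementation.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for an array type's index domain.
struct index_domain
{
  long long min_value;
  std::optional<long long> max_value;  // empty: no known upper bound
};

// Hypothetical stand-in for a type node.
struct type_model
{
  bool is_array;                       // models TREE_CODE (t) == ARRAY_TYPE
  std::optional<index_domain> domain;  // models TYPE_DOMAIN (t)
};

// Hypothetical stand-in for the expression named in the diagnostic.
struct decl_model
{
  std::string name;
  std::optional<type_model> type;      // models TREE_TYPE (m_diag_arg)
};

// Returns the note text, or nothing when any guard fails, mirroring
// the early returns in the patch.
std::optional<std::string>
describe_array_bounds (const decl_model *arg)
{
  if (!arg)
    return std::nullopt;
  if (!arg->type)
    return std::nullopt;
  if (!arg->type->is_array)
    return std::nullopt;
  if (!arg->type->domain)
    return std::nullopt;
  const index_domain &d = *arg->type->domain;
  if (!d.max_value)
    return std::nullopt;
  return "valid subscripts for '" + arg->name + "' are '["
	 + std::to_string (d.min_value) + "]' to '["
	 + std::to_string (*d.max_value) + "]'";
}
```

In this model a ten-element array named arr with domain 0 to 9 produces the note text, while a null argument, a non-array type, or an array with no known upper bound produces nothing.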