From: Jakub Jelinek
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Date: Tue, 12 Mar 2024 09:51:41 +0100
Subject: [PATCH] asan: Instrument stores in callees rather than callers [PR112709]
X-Patchwork-Id: 87069

Hi!

Since the PR69276 fix (r6-6758), asan instruments calls which store their
return value into memory on the caller side: before the call, it verifies
that the memory is writable.  PR112709, where we ICE while trying to
instrument such calls, made me think about whether that is what we want
to do.

There are 3 different cases.

One is when a function returns an aggregate which is passed e.g. in
registers, say struct S { int a[4]; }; on x86_64.  Ideally that would be
instrumented between the actual call and the store of the aggregate into
memory, but asan currently mostly works as a GIMPLE pass, and arranging for
the instrumentation to happen at that spot would be really hard.
We could diagnose after the call, but asan generally attempts to diagnose
stuff before something is overwritten rather than after.  Or we could keep
the current behavior (that is what this patch does); the disadvantage is
that it can complain about UB even for functions which never return and so
never actually store, and that it doesn't check whether the memory was e.g.
poisoned during the call.  Or we could instrument both before and after the
call; that would have the same disadvantage as the current state, but would
at least check the store again post-factum.

Another case is when a function returns an aggregate through a hidden
reference, e.g. struct T { int a[128]; }; on x86_64, or even the above
struct S on ia32.
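To make the first two cases concrete, here is a minimal sketch in C (not
part of the patch).  The names make_s, make_t, make_t_lowered and check_same
are hypothetical, and make_t_lowered is only a hand-written approximation of
what the ABI does for a return through a hidden reference; the real lowering
is done by the compiler and is target-dependent.

```c
#include <assert.h>
#include <string.h>

/* 16 bytes: on x86_64 this is returned in registers, so a statement like
   *p = make_s (v); performs its store in the caller, after the call.  */
struct S { int a[4]; };

/* 512 bytes: returned through a hidden pointer supplied by the caller,
   so the bytes are written by the callee.  */
struct T { int a[128]; };

struct S
make_s (int v)
{
  struct S s = { { v, v, v, v } };
  return s;
}

struct T
make_t (int v)
{
  struct T t;
  for (int i = 0; i < 128; i++)
    t.a[i] = v;
  return t;
}

/* Hypothetical hand-lowered form of make_t: the caller passes the
   destination and every store to the result goes through it.  These are
   the stores that are naturally checked in the callee.  */
void
make_t_lowered (struct T *hidden_retval, int v)
{
  for (int i = 0; i < 128; i++)
    hidden_retval->a[i] = v;
}

/* Both forms produce the same object; only the place of the store
   differs.  struct T has no padding, so memcmp is a valid comparison.  */
void
check_same (int v)
{
  struct T x = make_t (v);
  struct T y;
  make_t_lowered (&y, v);
  assert (memcmp (&x, &y, sizeof x) == 0);
}
```

With the lowered form, a bad destination pointer is dereferenced inside
make_t_lowered, which is the frame where a report would be expected for the
hidden reference case.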
In the actual program, such stores happen in the callee, when it stores
something to the RESULT_DECL or to parts of it, because there the
RESULT_DECL expands to *hidden_retval.  So, IMHO we should instrument those
in the callee rather than in the caller; that is where the writes are, and
we can do that easily.  This is what the patch below does.

And the last case is for builtins/internal functions.  Usually those don't
return aggregates, but in case they do and can be expanded inline, it is
better to instrument them in the caller (as before) than not to instrument
the return stores at all.

I had to tweak the expected output of the PR69276 testcase.  With the patch,
x86_64 keeps the previous behavior (the structure is returned in registers
and stored in the caller, so the UB is reported in A::A()), while on i686
the behavior changes: the UB is reported in vnull::operator vec, which
stores the structure, and A::A() is then one frame above it in the
backtrace.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-03-12  Jakub Jelinek

        PR sanitizer/112709
        * asan.cc (has_stmt_been_instrumented_p): Don't instrument call
        stores on the caller side unless it is a call to a builtin or
        internal function or function doesn't return by hidden reference.
        (maybe_instrument_call): Likewise.
        (instrument_derefs): Instrument stores to RESULT_DECL if
        returning by hidden reference.

        * gcc.dg/asan/pr112709-1.c: New test.
        * g++.dg/asan/pr69276.C: Adjust expected output for some targets.

        Jakub

--- gcc/asan.cc.jj 2024-03-06 09:35:04.132894608 +0100
+++ gcc/asan.cc 2024-03-11 13:49:58.931045179 +0100
@@ -1372,7 +1372,12 @@ has_stmt_been_instrumented_p (gimple *st
           return true;
         }
     }
-  else if (is_gimple_call (stmt) && gimple_store_p (stmt))
+  else if (is_gimple_call (stmt)
+           && gimple_store_p (stmt)
+           && (gimple_call_builtin_p (stmt)
+               || gimple_call_internal_p (stmt)
+               || !aggregate_value_p (TREE_TYPE (gimple_call_lhs (stmt)),
+                                      gimple_call_fntype (stmt))))
     {
       asan_mem_ref r;
       asan_mem_ref_init (&r, NULL, 1);
@@ -2751,7 +2756,9 @@ instrument_derefs (gimple_stmt_iterator
     return;
 
   poly_int64 decl_size;
-  if ((VAR_P (inner) || TREE_CODE (inner) == RESULT_DECL)
+  if ((VAR_P (inner)
+       || (TREE_CODE (inner) == RESULT_DECL
+           && !aggregate_value_p (inner, current_function_decl)))
       && offset == NULL_TREE
       && DECL_SIZE (inner)
       && poly_int_tree_p (DECL_SIZE (inner), &decl_size)
@@ -3023,7 +3030,11 @@ maybe_instrument_call (gimple_stmt_itera
     }
 
   bool instrumented = false;
-  if (gimple_store_p (stmt))
+  if (gimple_store_p (stmt)
+      && (gimple_call_builtin_p (stmt)
+          || gimple_call_internal_p (stmt)
+          || !aggregate_value_p (TREE_TYPE (gimple_call_lhs (stmt)),
+                                 gimple_call_fntype (stmt))))
     {
       tree ref_expr = gimple_call_lhs (stmt);
       instrument_derefs (iter, ref_expr,
--- gcc/testsuite/gcc.dg/asan/pr112709-1.c.jj 2024-03-11 13:59:15.300408140 +0100
+++ gcc/testsuite/gcc.dg/asan/pr112709-1.c 2024-03-11 13:59:58.626813417 +0100
@@ -0,0 +1,52 @@
+/* PR sanitizer/112709 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address -O2" } */
+
+struct S { char c[1024]; };
+int foo (int);
+
+__attribute__((returns_twice, noipa)) struct S
+bar (int x)
+{
+  (void) x;
+  struct S s = {};
+  s.c[42] = 42;
+  return s;
+}
+
+void
+baz (struct S *p)
+{
+  foo (1);
+  *p = bar (0);
+}
+
+void
+qux (int x, struct S *p)
+{
+  if (x == 25)
+    x = foo (2);
+  else if (x == 42)
+    x = foo (foo (3));
+  *p = bar (x);
+}
+
+void
+corge (int x, struct S *p)
+{
+  void *q[] = { &&l1, &&l2, &&l3, &&l3 };
+  if (x == 25)
+    {
+    l1:
+      x = foo (2);
+    }
+  else if (x == 42)
+    {
+    l2:
+      x = foo (foo (3));
+    }
+l3:
+  *p = bar (x);
+  if (x < 4)
+    goto *q[x & 3];
+}
--- gcc/testsuite/g++.dg/asan/pr69276.C.jj 2020-01-14 20:02:46.691611212 +0100
+++ gcc/testsuite/g++.dg/asan/pr69276.C 2024-03-12 09:09:05.901446463 +0100
@@ -35,4 +35,5 @@ int main()
 }
 
 /* { dg-output "ERROR: AddressSanitizer: heap-buffer-overflow.*(\n|\r\n|\r)" } */
-/* { dg-output " #0 0x\[0-9a-f\]+ +in A::A()" } */
+/* { dg-output " #0 0x\[0-9a-f\]+ +in (A::A\\\(\\\)|vnull::operator vec\\\(\\\).*(\n|\r\n|\r)" } */
+/* { dg-output " #1 0x\[0-9a-f\]+ +in A::A\\\(\\\))" } */