From patchwork Mon Apr 18 19:36:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Faust
X-Patchwork-Id: 53025
Date: Mon, 18 Apr 2022 12:36:45 -0700
Subject: [ping][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations
To: gcc-patches@gcc.gnu.org
References: <20220401194216.16469-1-david.faust@oracle.com>
In-Reply-To: <20220401194216.16469-1-david.faust@oracle.com>
List-Id: Gcc-patches mailing list
From: David Faust
Reply-To: David Faust
Cc: Yonghong Song

Gentle ping :)

Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/592685.html

The series adds support for new attributes btf_type_tag and btf_decl_tag,
for recording arbitrary string tags in DWARF and BTF debug info. The
feature is to support kernel use cases.

Thanks,
David

On 4/1/22 12:42, David Faust via Gcc-patches wrote:
> Hello,
>
> This patch series is a first attempt at adding support for:
>
> - Two new C-language-level attributes that allow one to associate (to "tag")
>   particular declarations and types with arbitrary strings. As explained below,
>   this is intended to be used to, for example, characterize certain pointer
>   types.
>
> - The conveyance of that information in the DWARF output in the form of a new
>   DIE: DW_TAG_GNU_annotation.
>
> - The conveyance of that information in the BTF output in the form of two new
>   kinds of BTF objects: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
>
> All of these facilities are being added to the eBPF ecosystem, and support for
> them exists in some form in LLVM. However, as we shall see, we have found some
> problems implementing them, so some discussion is in order.
>
> Purpose
> =======
>
> 1) Addition of C-family language constructs (attributes) to specify free-text
>    tags on certain language elements, such as struct fields.
>
>    The purpose of these annotations is to provide additional information about
>    types, variables, and function parameters of interest to the kernel. A
>    driving use case is to tag pointer types within the Linux kernel and eBPF
>    programs with additional semantic information, such as '__user' or '__rcu'.
>
>    For example, consider the Linux kernel function do_execve with the
>    following declaration:
>
>      static int do_execve(struct filename *filename,
>         const char __user *const __user *__argv,
>         const char __user *const __user *__envp);
>
>    Here, __user could be defined with these annotations to record semantic
>    information about the pointer parameters (e.g., they are user-provided) in
>    DWARF and BTF information. Other kernel facilities such as the eBPF verifier
>    can read the tags and make use of the information.
>
> 2) Conveying the tags in the generated DWARF debug info.
>
>    The main motivation for emitting the tags in DWARF is that the Linux kernel
>    generates its BTF information via pahole, using DWARF as a source:
>
>        +--------+  BTF                  BTF   +----------+
>        | pahole |-------> vmlinux.btf ------->| verifier |
>        +--------+                             +----------+
>            ^                                       ^
>            |                                       |
>      DWARF |                                   BTF |
>            |                                       |
>         vmlinux                             +-------------+
>         module1.ko                          | BPF program |
>         module2.ko                          +-------------+
>           ...
>
>    This is because:
>
>    a) Unlike GCC, LLVM will only generate BTF for BPF programs.
>
>    b) GCC can generate BTF for whatever target with -gbtf, but there is no
>       support for linking/deduplicating BTF in the linker.
>
>    In the scenario above, the verifier needs access to the pointer tags of
>    both the kernel types/declarations (conveyed in the DWARF and translated
>    to BTF by pahole) and those of the BPF program (available directly in BTF).
>
>    Another motivation for having the tag information in DWARF, unrelated to
>    BPF and BTF, is that the drgn project (another DWARF consumer) also wants
>    to benefit from these tags in order to differentiate between different
>    kinds of pointers in the kernel.
>
> 3) Conveying the tags in the generated BTF debug info.
>
>    This is easy: the main purpose of having this info in BTF is for the
>    compiled eBPF programs. The kernel verifier can then access the tags
>    of pointers used by the eBPF programs.
>
>
> For more information about these tags and the motivation behind them, please
> refer to the following Linux kernel discussions:
>
>   https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>   https://lore.kernel.org/bpf/20211012164838.3345699-1-yhs@fb.com/
>   https://lore.kernel.org/bpf/20211112012604.1504583-1-yhs@fb.com/
>
>
> What is in this patch series
> ============================
>
> This patch series adds support for these annotations in GCC. The implementation
> is largely complete. However, in some cases the produced debug info (both DWARF
> and BTF) differs significantly from that produced by LLVM. This issue is
> discussed in detail below, along with a few specific questions for both GCC and
> LLVM. Any input would be much appreciated.
>
>
> Implementation Overview
> =======================
>
> To enable these annotations, two new C language attributes are added:
> __attribute__((btf_decl_tag("foo"))) and __attribute__((btf_type_tag("bar"))).
> Both attributes accept a single arbitrary string constant argument, which will
> be recorded in the generated DWARF and/or BTF debugging information. They have
> no effect on code generation.
>
> Note that we are using the same attribute names as LLVM, which include "btf"
> in the name. This may be controversial, as these tags are not really
> BTF-specific. A different name may be more appropriate.
> There was much discussion about naming in the proposal for the functionality
> in LLVM; the full thread can be found here:
>
>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151023.html
>
> The name debug_info_annotate, suggested here, might better suit the attribute:
>
>   https://lists.llvm.org/pipermail/llvm-dev/2021-June/151042.html
>
> DWARF support is enabled via a new DW_TAG_GNU_annotation. When generating DWARF,
> declarations and types will be checked for the corresponding attributes. If
> present, a DW_TAG_GNU_annotation DIE will be created as a child of the DIE for
> the annotated type or declaration, one for each tag. These DIEs link the
> arbitrary tag value to the item they annotate.
>
> For example, the following variable declaration:
>
>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>   #define __decltag1 __attribute__((btf_decl_tag("decl-tag-1")))
>   #define __decltag2 __attribute__((btf_decl_tag("decl-tag-2")))
>
>   int __typetag1 * x __decltag1 __decltag2;
>
> Produces the following DIEs:
>
>  <1><1e>: Abbrev Number: 3 (DW_TAG_variable)
>     <1f>   DW_AT_name        : x
>     <21>   DW_AT_decl_file   : 1
>     <22>   DW_AT_decl_line   : 6
>     <23>   DW_AT_decl_column : 18
>     <24>   DW_AT_type        : <0x49>
>     <28>   DW_AT_external    : 1
>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>     <32>   DW_AT_sibling     : <0x49>
>  <2><36>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <37>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>     <3b>   DW_AT_const_value : (indirect string, offset: 0x0): decl-tag-2
>  <2><3f>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <40>   DW_AT_name        : (indirect string, offset: 0x10): btf_decl_tag
>     <44>   DW_AT_const_value : (indirect string, offset: 0x1d): decl-tag-1
>  <2><48>: Abbrev Number: 0
>  <1><49>: Abbrev Number: 4 (DW_TAG_pointer_type)
>     <4a>   DW_AT_byte_size   : 8
>     <4b>   DW_AT_type        : <0x5d>
>     <4f>   DW_AT_sibling     : <0x5d>
>  <2><53>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <54>   DW_AT_name        : (indirect string, offset: 0x28): btf_type_tag
>     <58>   DW_AT_const_value : (indirect string, offset: 0xd7): type-tag-1
>  <2><5c>: Abbrev Number: 0
>  <1><5d>: Abbrev Number: 5 (DW_TAG_base_type)
>     <5e>   DW_AT_byte_size   : 4
>     <5f>   DW_AT_encoding    : 5 (signed)
>     <60>   DW_AT_name        : int
>  <1><64>: Abbrev Number: 0
>
> Please note that currently, the annotation DWARF DIEs will be generated only if
> BTF debug information is requested (via -gbtf). Therefore, the annotation DIEs
> will only be output if both BTF and DWARF are requested (e.g. -gbtf -gdwarf).
> This will change, since these tags are needed even when not generating BTF,
> for example in a GCC-built Linux kernel.
>
> In the case of BTF, the annotations are recorded in two type kinds recently
> added to the BTF specification: BTF_KIND_DECL_TAG and BTF_KIND_TYPE_TAG.
> The above example declaration produces the following BTF information:
>
>   [1] int 'int'(1U#B) size=4U#B offset=0UB#b bits=32UB#b SIGNED
>   [2] ptr type=3
>   [3] type_tag 'type-tag-1'(5U#B) type=1
>   [4] decl_tag 'decl-tag-1'(18U#B) type=6 component_idx=-1
>   [5] decl_tag 'decl-tag-2'(29U#B) type=6 component_idx=-1
>   [6] var 'x'(16U#B) type=2 linkage=1 (global)
>
>
> Current issues in the implementation
> ====================================
>
> The __attribute__((btf_type_tag ("foo"))) syntax does not work correctly for
> types involving multiple pointers.
>
> Consider the following example:
>
>   #define __typetag1 __attribute__((btf_type_tag("type-tag-1")))
>   #define __typetag2 __attribute__((btf_type_tag("type-tag-2")))
>   #define __typetag3 __attribute__((btf_type_tag("type-tag-3")))
>
>   int __typetag1 * __typetag2 __typetag3 * g;
>
> The current implementation produces the following DWARF:
>
>  <1><1e>: Abbrev Number: 4 (DW_TAG_variable)
>     <1f>   DW_AT_name        : g
>     <21>   DW_AT_decl_file   : 1
>     <22>   DW_AT_decl_line   : 6
>     <23>   DW_AT_decl_column : 42
>     <24>   DW_AT_type        : <0x32>
>     <28>   DW_AT_external    : 1
>     <28>   DW_AT_location    : 9 byte block: 3 0 0 0 0 0 0 0 0 (DW_OP_addr: 0)
>  <1><32>: Abbrev Number: 2 (DW_TAG_pointer_type)
>     <33>   DW_AT_byte_size   : 8
>     <33>   DW_AT_type        : <0x45>
>     <37>   DW_AT_sibling     : <0x45>
>  <2><3b>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <3c>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>     <40>   DW_AT_const_value : (indirect string, offset: 0xc7): type-tag-1
>  <2><44>: Abbrev Number: 0
>  <1><45>: Abbrev Number: 2 (DW_TAG_pointer_type)
>     <46>   DW_AT_byte_size   : 8
>     <46>   DW_AT_type        : <0x61>
>     <4a>   DW_AT_sibling     : <0x61>
>  <2><4e>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <4f>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>     <53>   DW_AT_const_value : (indirect string, offset: 0xd): type-tag-3
>  <2><57>: Abbrev Number: 1 (User TAG value: 0x6000)
>     <58>   DW_AT_name        : (indirect string, offset: 0x18): btf_type_tag
>     <5c>   DW_AT_const_value : (indirect string, offset: 0xd2): type-tag-2
>  <2><60>: Abbrev Number: 0
>  <1><61>: Abbrev Number: 5 (DW_TAG_base_type)
>     <62>   DW_AT_byte_size   : 4
>     <63>   DW_AT_encoding    : 5 (signed)
>     <64>   DW_AT_name        : int
>  <1><68>: Abbrev Number: 0
>
> This does not agree with the DWARF produced by LLVM/clang for the same case:
> (clang 15.0.0 git 142501117a78080d2615074d3986fa42aa6a0734)
>
>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>     <1f>   DW_AT_name        : (indexed string: 0x3): g
>     <20>   DW_AT_type        : <0x29>
>     <24>   DW_AT_external    : 1
>     <24>   DW_AT_decl_file   : 0
>     <25>   DW_AT_decl_line   : 6
>     <26>   DW_AT_location    : 2 byte block: a1 0 ((Unknown location op 0xa1))
>  <1><29>: Abbrev Number: 3 (DW_TAG_pointer_type)
>     <2a>   DW_AT_type        : <0x35>
>  <2><2e>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <2f>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>     <30>   DW_AT_const_value : (indexed string: 0x7): type-tag-2
>  <2><31>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <32>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>     <33>   DW_AT_const_value : (indexed string: 0x8): type-tag-3
>  <2><34>: Abbrev Number: 0
>  <1><35>: Abbrev Number: 3 (DW_TAG_pointer_type)
>     <36>   DW_AT_type        : <0x3e>
>  <2><3a>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <3b>   DW_AT_name        : (indexed string: 0x5): btf_type_tag
>     <3c>   DW_AT_const_value : (indexed string: 0x6): type-tag-1
>  <2><3d>: Abbrev Number: 0
>  <1><3e>: Abbrev Number: 5 (DW_TAG_base_type)
>     <3f>   DW_AT_name        : (indexed string: 0x4): int
>     <40>   DW_AT_encoding    : 5 (signed)
>     <41>   DW_AT_byte_size   : 4
>  <1><42>: Abbrev Number: 0
>
> Notice the structural difference. From the DWARF produced by GCC (i.e. this
> patch series), variable 'g' is a pointer with tag 'type-tag-1' to a pointer
> with tags 'type-tag-2' and 'type-tag-3' to an int. But from the LLVM DWARF,
> variable 'g' is a pointer with tags 'type-tag-2' and 'type-tag-3' to a pointer
> with tag 'type-tag-1' to an int.
>
> Because GCC produces BTF from the internal DWARF DIE tree, the BTF also differs.
> This can be seen most obviously in the BTF type reference chains:
>
>   GCC
>     VAR (g) -> ptr -> tag1 -> ptr -> tag3 -> tag2 -> int
>
>   LLVM
>     VAR (g) -> ptr -> tag3 -> tag2 -> ptr -> tag1 -> int
>
> It seems that the ultimate cause for this is the structure of the TREE
> produced by the C frontend parsing and attribute handling. I believe this may
> be due to differences in __attribute__ syntax parsing between GCC and LLVM.
>
> This is the TREE for variable 'g':
>   int __typetag1 * __typetag2 __typetag3 * g;
>
>   type type
>   asm_written unsigned DI
>   size
>   unit-size
>   align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7450888
>   attributes purpose
>   value value
>   readonly constant static "type-tag-3\000">>
>   chain
>   value value
>   readonly constant static "type-tag-2\000">>>>
>   pointer_to_this >
>   asm_written unsigned DI size unit-size
>   align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x7ffff7509930
>   attributes
>   value value
>   readonly constant static "type-tag-1\000">>>>
>   public static unsigned DI defer-output /home/dfaust/playpen/btf/annotate.c:29:42 size unit-size
>   align:64 warn_if_not_align:0>
>
> To me this is surprising. I would have expected the int** type of "g" to have
> the tags 'type-tag-2' and 'type-tag-3', and the inner (int*) pointer type to
> have the 'type-tag-1' tag. So far my attempts at resolving this difference in
> the new attribute handlers for the tag attributes have not been successful.
>
> I do not understand exactly why the attributes are attached in this way. I
> think that it may be related to the pointer cases discussed in the "All other
> attributes" section here:
>
>   https://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html
>
> In particular it seems similar to this example:
>
>   char *__attribute__((aligned(8))) *f;
>
>   specifies the type “pointer to 8-byte-aligned pointer to char”. Note again
>   that this does not work with most attributes; for example, the usage of
>   ‘aligned’ and ‘noreturn’ attributes given above is not yet supported.
>
> I am not sure if this section of the documentation is outdated, if scenarios
> like this one have not been an issue before now, or if there is a way to
> resolve this within the attribute handler. I am by no means an expert in the C
> frontend or in attribute handling; if someone with more knowledge could help me
> understand this case I would be very grateful.
> :)
>
> Questions for GCC
> =================
>
> 1) How can this issue with the type tags be resolved? Is this a bug or
>    limitation in the attribute parsing that hasn't been an issue until now?
>    Or is it that the above case is somehow a "weird" usage of attributes?
>
> 2) Are attributes the right tool for this? Is there some other mechanism that
>    would better fit the design of these tags? In some ways the type tags seem
>    more similar to const/volatile/restrict qualifiers than to most other
>    attributes.
>
>
> Questions for LLVM / kernel BPF
> ===============================
>
> 1) What special handling does the LLVM frontend/clang do for these attributes?
>    Is there anything specific? Or does it simply follow whatever is default?
>
> 2) What is the correct BTF representation for type tags? The documentation for
>    BTF_KIND_TYPE_TAG in linux/Documentation/bpf/btf.rst seems to conflict with
>    the output of clang, and the format change that was discussed here:
>    https://reviews.llvm.org/D113496
>    I assume the kernel btf.rst might simply be outdated, but I want to be sure.
>
> 3) Is the ordering of multiple type tags on the same type important?
>    e.g. for this variable:
>      int __tag1 __tag2 __tag3 * b;
>
>    would it be "correct" (or at least, acceptable) to produce:
>      VAR(b) -> ptr -> tag2 -> tag3 -> tag1 -> int
>
>    or _must_ it be:
>      VAR(b) -> ptr -> tag3 -> tag2 -> tag1 -> int
>
>    In the DWARF representation, all tags are equal sibling children of the type
>    they annotate, so this 'ordering' problem seems to arise only because of
>    the BTF format for type tags.
>
> 4) Are types with the same tags in different orders considered distinct types?
>    I think the answer is "no", but since the format of the tags in BTF gives
>    distinct chains for such types, I am curious.
>    e.g.
>
>      int __tag1 __tag2 * x;
>      int __tag2 __tag1 * y;
>
>    produces
>      VAR(x) -> ptr -> tag2 -> tag1 -> int
>      VAR(y) -> ptr -> tag1 -> tag2 -> int
>
>    but would
>      VAR(y) -> ptr -> tag2 -> tag1 -> int
>
>    be just as correct?
>
> 5) According to the clang docs, type tags are currently ignored for non-pointer
>    types. Is pointer tagging e.g. '__user' the only use case so far?
>
>    This GCC implementation allows type tags on non-pointer types. Such tags
>    can be represented in the DWARF but don't make much sense in BTF output,
>    e.g.
>
>      struct __typetag1 S {
>        int a;
>        int b;
>      } __decltag1;
>
>      struct S my_s;
>
>    This will produce a type tag child DIE of S. In the current implementation,
>    it will also produce a BTF type tag type, which refers to the __decltag1 BTF
>    decl tag, which in turn refers to the struct type. But nothing refers to
>    the type tag type; currently, variable my_s in BTF refers to the struct type
>    directly.
>
>    In my opinion, the DWARF here is useful but the BTF seems odd. What would be
>    "correct" BTF in such a case?
>
> 6) Would LLVM be open to changing the name of the attribute, for example to
>    'debug_info_annotate' (or any other suggestion)? The use cases for these
>    tags have grown (e.g. drgn) since they were originally proposed, and the
>    scope is no longer limited to BTF.
>
>    The kernel eBPF developers have said they can accommodate whatever name we
>    would like to use. So although we in GCC are not tied to the name LLVM
>    uses, it would be ideal for everyone to use the same attribute name.
>
> Thanks!
>
> David
>
> David Faust (8):
>   dwarf: Add dw_get_die_parent function
>   include: Add BTF tag defines to dwarf2 and btf
>   c-family: Add BTF tag attribute handlers
>   dwarf: create BTF decl and type tag DIEs
>   ctfc: Add support to pass through BTF annotations
>   dwarf2ctf: convert tag DIEs to CTF types
>   Output BTF DECL_TAG and TYPE_TAG types
>   testsuite: Add tests for BTF tags
>
>  gcc/btfout.cc                                 |  28 +++++
>  gcc/c-family/c-attribs.cc                     |  45 +++++++
>  gcc/ctf-int.h                                 |  29 +++++
>  gcc/ctfc.cc                                   |  11 +-
>  gcc/ctfc.h                                    |  17 ++-
>  gcc/dwarf2ctf.cc                              | 115 +++++++++++++++++-
>  gcc/dwarf2out.cc                              | 110 +++++++++++++++++
>  gcc/dwarf2out.h                               |   1 +
>  .../gcc.dg/debug/btf/btf-decltag-func.c       |  18 +++
>  .../gcc.dg/debug/btf/btf-decltag-sou.c        |  34 ++++++
>  .../gcc.dg/debug/btf/btf-decltag-typedef.c    |  15 +++
>  .../gcc.dg/debug/btf/btf-typetag-1.c          |  20 +++
>  .../gcc.dg/debug/dwarf2/annotation-1.c        |  29 +++++
>  include/btf.h                                 |  17 ++-
>  include/dwarf2.def                            |   4 +
>  15 files changed, 482 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/ctf-int.h
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-typedef.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-typetag-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-1.c