Message ID | 20240411074407.1429624-1-indu.bhagat@oracle.com |
---|---|
Headers | From: Indu Bhagat <indu.bhagat@oracle.com> To: binutils@sourceware.org Subject: [PATCH 0/2] Add SCFI support for aarch64 Date: Thu, 11 Apr 2024 00:44:05 -0700 |
Series | Add SCFI support for aarch64 |
Message
Indu Bhagat
April 11, 2024, 7:44 a.m. UTC
Hello,

This patch series extends GAS support for SCFI to aarch64.

Since Binutils 2.42, GAS has experimental support for synthesizing CFI (SCFI)
for hand-written asm for the x86 backend.  This is invoked via
--scfi=experimental on the hand-written asm.  SCFI aims to relieve users from
the overhead of writing and maintaining CFI directives in hand-written asm.

One of the ways of hardening the SCFI feature in GAS is to extend support to
another major architecture.  This would also allow exercising SCFI on more
workloads.

Background
-----------
Some background notes on SCFI are present on the wiki
https://sourceware.org/binutils/wiki/gas/SCFI.  I will refrain from repeating
some of that content here for the sake of brevity.

Additionally, the commit log for the first commit which added the support on
x86 may also be helpful in reviewing this series.
  - gas: x86: synthesize CFI for hand-written asm
    c7defc5386cc53a4abbb7c53a924cdac3f16aa33

For synthesizing (DWARF) CFI, the SCFI machinery requires the programmer
to adhere to some pre-requisites for their asm:
  - Hand-written asm block must begin with a .type foo, %function
It is highly recommended to, additionally, also ensure that:
  - Hand-written asm block ends with a .size foo, .-foo

ginsns, SCFI constraints, etc.
------------------------------
ginsn is an acronym for generic GAS instruction.  This is intended to be an
architecture-neutral abstraction that can be used to convey and keep semantic
information about machine instructions in an arch-neutral way in GAS.  ginsn
specification and associated interfaces can be seen in gas/ginsn.c and
gas/ginsn.h.

The SCFI algorithm itself is implemented as a couple of passes.  The following
is a gross over-simplification of the overall process; simplified to hopefully
aid the review process:

  - Create the GCFG (control flow graph) of the ginsns.
  - Process each basic block and make a note of how each instruction changes the
    SCFI state (CFA, callee-saved registers, RA).  This is done via two passes:
    forward_flow_scfi_state () and backward_flow_scfi_state ().
  - Translate SCFI ops to equivalent DWARF CFI ops or directives.

The above is implemented in gas/scfi.h and gas/scfi.c.  Also see the
gas/scfidw2gen.h and gas/scfidw2gen.c where SCFI ops are processed to finally
create the DWARF CFI directives.

Lastly, I think stating some specifics of the SCFI core algorithm itself may be
helpful for the review process: Basically the SCFI machinery encodes some rules
specified in the standard ABI calling convention (e.g., set of callee-saved
registers, how the return address is managed, etc.).  Apart from the rules, the
SCFI machinery employs some heuristics.  A few examples of heuristics:

  - The base register for CFA tracking may be either REG_SP or REG_FP.
  - If the base register for CFA tracking is REG_SP, the precise amount of
    stack usage (and hence, the value of REG_SP) must be known at all times.
  - If using dynamic stack allocation, the function must switch to
    FP-based CFA.  This means using instructions like the following (in
    AMD64) in prologue:
       pushq %rbp
       movq %rsp, %rbp
    and analogous instructions in epilogue.  In case of aarch64, this simply
    means creation of the frame record.
  - Save and Restore of callee-saved registers must be symmetrical.
    However, the SCFI machinery at this time only warns if any such
    asymmetry is seen.

These heuristics/rules are architecture-independent and are meant to be
employed for all architectures/ABIs using SCFI in the future.

The SCFI paper published some time ago
(https://sourceware.org/pipermail/binutils/2023-September/129558.html) may be a
useful resource to get additional understanding of the above.

Known limitations
-----------------
These are planned to be worked on in the near future:

  - The SCFI machinery does not currently synthesize the PAC-related
    aarch64-specific CFI directives: .cfi_b_key_frame.  Other opcodes used when
    pointer authentication is enabled also need to be handled (braa, brab,
    retaa, etc.).

  - Supporting the following pattern:
       mov x16,4266
       add sp, x16, sp
       ...

  - Not a limitation per se, but a note that ATM, predicated insns are
    skipped from ginsn translation.  IIUC, these instructions are not such that
    can be used alongside stack management ops.  To be double-checked.

Thanks,

Indu Bhagat (2):
  gas: aarch64: add experimental support for SCFI
  gas: aarch64: testsuite: add new tests for SCFI

 gas/config/tc-aarch64.c                       | 744 ++++++++++++++++++
 gas/config/tc-aarch64.h                       |  20 +
 gas/testsuite/gas/scfi/README                 |   2 +-
 gas/testsuite/gas/scfi/aarch64/ginsn-cofi-1.l |  30 +
 gas/testsuite/gas/scfi/aarch64/ginsn-cofi-1.s |  16 +
 gas/testsuite/gas/scfi/aarch64/ginsn-ldst-1.l |  40 +
 gas/testsuite/gas/scfi/aarch64/ginsn-ldst-1.s |  21 +
 gas/testsuite/gas/scfi/aarch64/ginsn-misc-1.l |  32 +
 gas/testsuite/gas/scfi/aarch64/ginsn-misc-1.s |  15 +
 .../gas/scfi/aarch64/scfi-aarch64.exp         |  60 ++
 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.d    |  20 +
 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.l    |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.s    |  14 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.d   |  31 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.l   |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.s   |  46 ++
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.d   |  40 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.l   |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.s   |  42 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.d   |  32 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.l   |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.s   |  34 +
 .../gas/scfi/aarch64/scfi-cond-br-1.d         |  20 +
 .../gas/scfi/aarch64/scfi-cond-br-1.l         |   2 +
 .../gas/scfi/aarch64/scfi-cond-br-1.s         |  13 +
 gas/testsuite/gas/scfi/aarch64/scfi-diag-1.l  |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-diag-1.s  |   6 +
 gas/testsuite/gas/scfi/aarch64/scfi-diag-2.l  |   3 +
 gas/testsuite/gas/scfi/aarch64/scfi-diag-2.s  |  25 +
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.d  |  59 ++
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.l  |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.s  |  52 ++
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.d  |  33 +
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.l  |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.s  |  26 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.d  |  39 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.l  |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.s  |  37 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.d  |  35 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.l  |   2 +
 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.s  |  30 +
 .../gas/scfi/aarch64/scfi-unsupported-1.l     |   4 +
 .../gas/scfi/aarch64/scfi-unsupported-1.s     |  31 +
 43 files changed, 1671 insertions(+), 1 deletion(-)
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-cofi-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-cofi-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-ldst-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-ldst-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-misc-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/ginsn-misc-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-aarch64.exp
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cb-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-2.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cfg-3.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cond-br-1.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cond-br-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-cond-br-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-diag-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-diag-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-diag-2.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-diag-2.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-ldrp-2.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-1.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.d
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-strp-2.s
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-unsupported-1.l
 create mode 100644 gas/testsuite/gas/scfi/aarch64/scfi-unsupported-1.s
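The CFA-tracking heuristics listed in the cover letter (SP-based CFA needs a known stack offset; dynamic allocation requires a switch to FP-based CFA) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_*` names and the two-field state are hypothetical and are not the data structures in gas/scfi.c, and the real machinery decides when to switch the CFA base with more care than this unconditional version does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified CFA state; not the gas/scfi.c representation.  */
enum toy_reg { TOY_REG_SP, TOY_REG_FP };

struct toy_cfa
{
  enum toy_reg base;   /* Register the CFA is expressed against.  */
  long offset;         /* CFA = base + offset.  */
};

/* Model "sp = sp + delta" (delta < 0 allocates stack).  While the CFA is
   SP-based, the offset must follow SP so that the CFA itself stays fixed.
   Once the CFA is FP-based, SP adjustments no longer affect it.  */
static void
toy_adjust_sp (struct toy_cfa *cfa, long delta)
{
  assert (cfa != NULL);
  if (cfa->base == TOY_REG_SP)
    cfa->offset -= delta;
}

/* Model "fp = sp" (e.g. "mov x29, sp" after the frame record is created).
   FP equals SP at this point, so only the base register changes.  */
static void
toy_set_fp_from_sp (struct toy_cfa *cfa)
{
  assert (cfa != NULL);
  if (cfa->base == TOY_REG_SP)
    cfa->base = TOY_REG_FP;
}
```

For an aarch64 prologue such as `stp x29, x30, [sp, -32]!` followed by `mov x29, sp`, the model starts at CFA = sp + 0, moves to sp + 32 after the pre-indexed store, and then to fp + 32; any later SP adjustment of unknown size leaves the CFA untouched, which is why the FP switch is needed for dynamic stack allocation.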
Comments
Ping.

On 4/11/24 12:44 AM, Indu Bhagat wrote:
> Hello,
>
> This patch series extends GAS support for SCFI to aarch64.
>
> [...]
Hi, I was having a look at the v3 series, but had a question about the design & known limitations that I thought was better to ask here: Indu Bhagat <indu.bhagat@oracle.com> writes: > Known limitations > ----------------- > These are planned to be worked on in the near future: > > - The current SCFI machinery does not currently synthesize the PAC-related > aarch64-specific CFI directives: .cfi_b_key_frame. Other opcodes used when > pointer authentication is enabled also need to be handled (braa, brab, > retaa, etc.). > > - Supporting the following pattern: > mov x16,4266 > add sp, x16, sp > ... > > - Not a limitation per se, but a note that ATM, that predicated insns are > skipped from ginsn translation. IIUC, these instructions are not such that > can be used alongside stack management ops. To be double-checked. AFAICT, the current code only handles GPRs. It doesn't handle D8-D15, which are also call-preserved under the base AAPCS64. Is that right? I think we should try to handle those as well. D8-D15 are "interesting" because they are the low 64 bits of Q8-Q15, and of Z8-Z15 if SVE is used. However, a CFI save slot always represents the low 64 bits, regardless of whether a save occurs on D, Q or Z registers. This matters for big-endian code, because there are two additional PCS variants: * the "vector PCS", which preserves Q8-Q23 * the "SVE PCS", which preserves Z8-Z23 and P3-P15 So vector PCS functions might need to save and restore Q8 when returning normally, but the CFI only describes the save of the D8 portion (since that's the only portion that is preserved by exceptions). This means that, on big-endian: str q8, [sp, #16] should record D8 as being saved at sp+24 rather than sp+16. A further complication is that STR Qn and STR Zn do not store in the same byte order for big-endian: STR Qn stores as a 128-bit integer (MSB first), whereas STR Zn stores as a stream of bytes (LSB first). 
This means that GCC-generated big-endian SVE PCS
functions use things like:

    st1d z8.d, p2, [sp, #1, mul vl]

with the D8 save slot then being at sp + 2*VL - 64.

I think it's OK to punt on the big-endian SVE PCS case for now (provided
that there's a warning that the code isn't understood, which it looks
like there is). But I think it's worth handling the Q register saves.

Other comments:

- I like the new approach of using a combination of the iclass and a
  "subclass" field of the flags. How about making aarch64-gen.c enforce
  that:

  - if aarch64-ginsn.c looks at the subclass of a particular iclass,
    every instruction of that iclass has a nonzero subclass field

  - every other instruction has a zero subclass field

  This would help to ensure that the data stays up to date.
  The subclass enum could include a nonzero "other" value where
  necessary.

- I think we should only add things like F_LDST_LOAD and F_LDST_STORE
  to instructions that are semantically simple loads and stores
  (unless the iclass gives us the information needed to handle
  more complicated cases). E.g. it looks like patch 2/7 adds
  F_LDST_LOAD to things like ld4, which are AoS->SoA loads.
  It would not be correct to interpret an LD4 on byte elements
  (say) as a register restore for CFI purposes.

  I realise the information could be useful for other things
  besides ginsns. But while ginsns are the only things using the
  information, I think we should be careful to make sure that the
  information can't be misunderstood.

Thanks,
Richard
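The offset rule for D and Q register saves described in the message above can be sketched in C as follows. This is only an illustration under stated assumptions: the helper name fp_save_slot_offset and its parameters are hypothetical and are not taken from the patch series, and the Z register (SVE) case is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch series.  Given the byte
   offset at which a D or Q register is stored, return the offset of
   its low 64 bits, which is what the CFI save slot describes.  */
static int64_t
fp_save_slot_offset (int64_t store_offset, unsigned int reg_size,
                     bool big_endian)
{
  /* Only 8-byte (D) and 16-byte (Q) stores are modelled here.  */
  assert (reg_size == 8 || reg_size == 16);

  /* STR Qn stores a 128-bit integer.  On big-endian the most
     significant half comes first, so the low 64 bits (the D
     portion) sit 8 bytes after the start of the store.  */
  if (big_endian && reg_size == 16)
    return store_offset + 8;

  /* D register stores, and all little-endian stores, keep the low
     64 bits at the start of the stored value.  */
  return store_offset;
}
```

With these assumptions, the big-endian `str q8, [sp, #16]` example maps to offset 24, while the same store on little-endian, or a `str d8, [sp, #16]` on either endianness, maps to offset 16.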
On 6/26/24 04:01, Richard Sandiford wrote:
> Hi,
>
> I was having a look at the v3 series, but had a question about the
> design & known limitations that I thought was better to ask here:
>
> Indu Bhagat <indu.bhagat@oracle.com> writes:
>> Known limitations
>> -----------------
>> These are planned to be worked on in the near future:
>>
>> - The current SCFI machinery does not currently synthesize the PAC-related
>>   aarch64-specific CFI directives: .cfi_b_key_frame. Other opcodes used when
>>   pointer authentication is enabled also need to be handled (braa, brab,
>>   retaa, etc.).
>>
>> - Supporting the following pattern:
>>       mov x16,4266
>>       add sp, x16, sp
>>   ...
>>
>> - Not a limitation per se, but a note that ATM, that predicated insns are
>>   skipped from ginsn translation. IIUC, these instructions are not such that
>>   can be used alongside stack management ops. To be double-checked.
>
> AFAICT, the current code only handles GPRs. It doesn't handle D8-D15,
> which are also call-preserved under the base AAPCS64. Is that right?
> I think we should try to handle those as well.

Ah yes, the current code only handles GPRs. I will need to add the D8-D15
registers. The code was also explicitly skipping ldstp with FP
registers. Thanks.

>
> D8-D15 are "interesting" because they are the low 64 bits of Q8-Q15,
> and of Z8-Z15 if SVE is used. However, a CFI save slot always represents
> the low 64 bits, regardless of whether a save occurs on D, Q or Z registers.
> This matters for big-endian code, because there are two additional
> PCS variants:
>
> * the "vector PCS", which preserves Q8-Q23
> * the "SVE PCS", which preserves Z8-Z23 and P3-P15
>

Is there a way to annotate that a (hand-written asm) function adheres to
vector PCS or SVE PCS? I see that there is a .variant_pcs but that
does not help differentiate between the above two?

I _think_ gas will need to know which of SVE vs vector PCS is in effect
for a specific function so that the P3-P15 can be added to the set of
callee-saved registers being tracked for SCFI for SVE PCS but not for
vector PCS.

> So vector PCS functions might need to save and restore Q8 when returning
> normally, but the CFI only describes the save of the D8 portion (since
> that's the only portion that is preserved by exceptions). This means
> that, on big-endian:
>
>     str q8, [sp, #16]
>
> should record D8 as being saved at sp+24 rather than sp+16.
>
> A further complication is that STR Qn and STR Zn do not store in
> the same byte order for big-endian: STR Qn stores as a 128-bit
> integer (MSB first), whereas STR Zn stores as a stream of bytes
> (LSB first). This means that GCC-generated big-endian SVE PCS
> functions use things like:
>
>     st1d z8.d, p2, [sp, #1, mul vl]
>
> with the D8 save slot then being at sp + 2*VL - 64.
>
> I think it's OK to punt on the big-endian SVE PCS case for now (provided
> that there's a warning that the code isn't understood, which it looks
> like there is). But I think it's worth handling the Q register saves.

It looks to me that using reg name / size is an unambiguous proxy to
deciding whether SVE PCS is in effect. Is this correct?

>
> Other comments:
>
> - I like the new approach of using a combination of the iclass and a
>   "subclass" field of the flags. How about making aarch64-gen.c enforce
>   that:
>
>   - if aarch64-ginsn.c looks at the subclass of a particular iclass,
>     every instruction of that iclass has a nonzero subclass field
>

(Let me refer to the above as #1). I can see that there can be ways to
achieve this...

>   - every other instruction has a zero subclass field
>

..but I am not sure I follow this statement. (Let me refer to the above
as #2).

>   This would help to ensure that the data stays up to date.
>   The subclass enum could include a nonzero "other" value where
>   necessary.
>

Currently, we are using the opcode->flags bits to encode:

In include/opcode/aarch64.h:

/* 4-bit flag field to indicate subclass of operations.
   Note that there is an (intended) overlap between the three flag sets
   (F_LDST*, F_ARITH* and F_BRANCH*). This allows space savings. */
#define F_LDST_LOAD (1ULL << 36)
#define F_LDST_STORE (2ULL << 36)
/* A load followed by a store (using the same address). */
#define F_LDST_SWAP (F_LDST_LOAD | F_LDST_STORE)
/* Subclasses to denote add, sub and mov insns. */
#define F_ARITH_ADD (1ULL << 36)
#define F_ARITH_SUB (2ULL << 36)
#define F_ARITH_MOV (4ULL << 36)
/* Subclasses to denote call and ret insns. */
#define F_BRANCH_CALL (1ULL << 36)
#define F_BRANCH_RET (2ULL << 36)

We can dedicate F_SUBCLASS_NONE (8ULL << 36) and enforce this subclass
on all insns which use none of the above subclasses in a specific
iclass. This can help address (#1), but not sure about (#2).

> - I think we should only add things like F_LDST_LOAD and F_LDST_STORE
>   to instructions that are semantically simple loads and stores
>   (unless the iclass gives us the information needed to handle
>   more complicated cases). E.g. it looks like patch 2/7 adds
>   F_LDST_LOAD to things like ld4, which are AoS->SoA loads.
>   It would not be correct to interpret an LD4 on byte elements
>   (say) as a register restore for CFI purposes.
>
>   I realise the information could be useful for other things
>   besides ginsns. But while ginsns are the only things using the
>   information, I think we should be careful to make sure that the
>   information can't be misunderstood.
>

If some safeguards like #1 are placed for the specific iclasses, and
further we only allow subclass information retrieval for selected
iclasses in aarch64-gen.c, I think we can afford to go this route as you
suggest: only add subclasses for those iclasses relevant for SCFI
purposes ATM.

Thanks for the review and feedback
Indu
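Because the F_LDST*, F_ARITH* and F_BRANCH* values quoted above overlap on purpose, the same bit pattern means different things depending on the iclass. The sketch below illustrates that point only. The F_* values are the ones quoted in the message; the names SUBCLASS_MASK, demo_iclass, demo_subclass and decode_subclass are assumptions made for this example and do not exist in binutils.

```c
#include <assert.h>
#include <stdint.h>

/* Values as quoted from include/opcode/aarch64.h in the thread.  */
#define F_LDST_LOAD   (1ULL << 36)
#define F_LDST_STORE  (2ULL << 36)
#define F_LDST_SWAP   (F_LDST_LOAD | F_LDST_STORE)
#define F_ARITH_ADD   (1ULL << 36)
#define F_ARITH_SUB   (2ULL << 36)
#define F_ARITH_MOV   (4ULL << 36)
#define F_BRANCH_CALL (1ULL << 36)
#define F_BRANCH_RET  (2ULL << 36)

/* Assumed mask covering the whole 4-bit subclass field.  */
#define SUBCLASS_MASK (15ULL << 36)

/* Hypothetical stand-ins for the real instruction classes.  */
enum demo_iclass { DEMO_LDST, DEMO_ARITH, DEMO_BRANCH };

enum demo_subclass
{
  SUB_NONE, SUB_LOAD, SUB_STORE, SUB_SWAP,
  SUB_ADD, SUB_SUB, SUB_MOV, SUB_CALL, SUB_RET
};

/* Decode the subclass bits in FLAGS.  The iclass picks which of the
   overlapping flag sets applies.  */
static enum demo_subclass
decode_subclass (enum demo_iclass iclass, uint64_t flags)
{
  uint64_t sub = flags & SUBCLASS_MASK;

  switch (iclass)
    {
    case DEMO_LDST:
      if (sub == F_LDST_SWAP) return SUB_SWAP;
      if (sub == F_LDST_LOAD) return SUB_LOAD;
      if (sub == F_LDST_STORE) return SUB_STORE;
      break;
    case DEMO_ARITH:
      if (sub == F_ARITH_ADD) return SUB_ADD;
      if (sub == F_ARITH_SUB) return SUB_SUB;
      if (sub == F_ARITH_MOV) return SUB_MOV;
      break;
    case DEMO_BRANCH:
      if (sub == F_BRANCH_CALL) return SUB_CALL;
      if (sub == F_BRANCH_RET) return SUB_RET;
      break;
    default:
      assert (!"unknown demo iclass");
    }
  return SUB_NONE;
}
```

In this sketch the value (1ULL << 36) decodes as a load, an add or a call depending on the iclass passed in, which is why the subclass bits cannot be interpreted on their own.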
Indu Bhagat <indu.bhagat@oracle.com> writes:
> On 6/26/24 04:01, Richard Sandiford wrote:
>> D8-D15 are "interesting" because they are the low 64 bits of Q8-Q15,
>> and of Z8-Z15 if SVE is used. However, a CFI save slot always represents
>> the low 64 bits, regardless of whether a save occurs on D, Q or Z registers.
>> This matters for big-endian code, because there are two additional
>> PCS variants:
>>
>> * the "vector PCS", which preserves Q8-Q23
>> * the "SVE PCS", which preserves Z8-Z23 and P3-P15
>>
>
> Is there a way to annotate that a (hand-written asm) function adheres to
> vector PCS or SVE PCS? I see that there is a .variant_pcs but that
> does not help differentiate between the above two?
>
> I _think_ gas will need to know which of SVE vs vector PCS is in effect
> for a specific function so that the P3-P15 can be added to the set of
> callee-saved registers being tracked for SCFI for SVE PCS but not for
> vector PCS.

Only the normal base AAPCS64 register set is preserved across abnormal
control flow (setjmp/longjmp, exceptions, etc.). The extra call-preserved
guarantees for vector and SVE PCS functions only apply to normal returns.

[This means, for example, that:

  void foo();
  svbool_t f() {
    try {
      foo();
    } catch (...) {};
    return svptrue_b8();
  }

must manually restore the additional register state when catching
and returning normally.]

The CFI requirements therefore don't change: only D8-D15 matter,
like for normal functions. But that's also where the big-endian
complications that I mentioned come from.

So I don't think the code needs to know which kind of function is
being assembled. The code just needs to be able to recognise Q-based
and Z-based loads and stores of D8-D15 and work out the correct offset
of the low 64 bits. (Although, like I say, I think we can punt on
big-endian SVE PCS functions.)

>> So vector PCS functions might need to save and restore Q8 when returning
>> normally, but the CFI only describes the save of the D8 portion (since
>> that's the only portion that is preserved by exceptions). This means
>> that, on big-endian:
>>
>>     str q8, [sp, #16]
>>
>> should record D8 as being saved at sp+24 rather than sp+16.
>>
>> A further complication is that STR Qn and STR Zn do not store in
>> the same byte order for big-endian: STR Qn stores as a 128-bit
>> integer (MSB first), whereas STR Zn stores as a stream of bytes
>> (LSB first). This means that GCC-generated big-endian SVE PCS
>> functions use things like:
>>
>>     st1d z8.d, p2, [sp, #1, mul vl]
>>
>> with the D8 save slot then being at sp + 2*VL - 64.
>>
>> I think it's OK to punt on the big-endian SVE PCS case for now (provided
>> that there's a warning that the code isn't understood, which it looks
>> like there is). But I think it's worth handling the Q register saves.
>
> It looks to me that using reg name / size is an unambiguous proxy to
> deciding whether SVE PCS is in effect. Is this correct?

Not necessarily. There's nothing stopping code from using Q-based
loads and stores for normal functions (although it would be an
odd choice).

There's also the possibility of ad-hoc PCSes, but the assumption there
too would be that only the base AAPCS64 set needs to be preserved
through unwinding.

>> Other comments:
>>
>> - I like the new approach of using a combination of the iclass and a
>>   "subclass" field of the flags. How about making aarch64-gen.c enforce
>>   that:
>>
>>   - if aarch64-ginsn.c looks at the subclass of a particular iclass,
>>     every instruction of that iclass has a nonzero subclass field
>>
>
> (Let me refer to the above as #1). I can see that there can be ways to
> achieve this...
>
>>   - every other instruction has a zero subclass field
>>
>
> ..but I am not sure I follow this statement. (Let me refer to the above
> as #2).
>
>>   This would help to ensure that the data stays up to date.
>>   The subclass enum could include a nonzero "other" value where
>>   necessary.
>>
>
> Currently, we are using the opcode->flags bits to encode:
>
> In include/opcode/aarch64.h:
>
> /* 4-bit flag field to indicate subclass of operations.
>    Note that there is an (intended) overlap between the three flag sets
>    (F_LDST*, F_ARITH* and F_BRANCH*). This allows space savings. */
> #define F_LDST_LOAD (1ULL << 36)
> #define F_LDST_STORE (2ULL << 36)
> /* A load followed by a store (using the same address). */
> #define F_LDST_SWAP (F_LDST_LOAD | F_LDST_STORE)
> /* Subclasses to denote add, sub and mov insns. */
> #define F_ARITH_ADD (1ULL << 36)
> #define F_ARITH_SUB (2ULL << 36)
> #define F_ARITH_MOV (4ULL << 36)
> /* Subclasses to denote call and ret insns. */
> #define F_BRANCH_CALL (1ULL << 36)
> #define F_BRANCH_RET (2ULL << 36)
>
> We can dedicate F_SUBCLASS_NONE (8ULL << 36) and enforce this subclass
> on all insns which use none of the above subclasses in a specific
> iclass. This can help address (#1), but not sure about (#2).

I think the 4 bits are really an enum rather than true independent flags.
So it might be better to use 15ULL, so that the other 14 nonzero values are
consecutive.

But yeah, I think it addresses both #1 and #2. #2 makes sure that a
subclass is only present when we expect one. If we define:

  #define F_SUBCLASS (15ULL << 36)

then #2 makes sure that (flags & F_SUBCLASS) == 0 for classes that
are not interpreted by ginsns.

Thanks,
Richard
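The two checks referred to as #1 and #2 in this exchange could look roughly like the following. This is a sketch under assumptions, not the code from the series: F_SUBCLASS and the F_LDST_* values are as quoted in the thread, while demo_iclass, demo_opcode, iclass_uses_subclass, subclass_ok and count_subclass_violations are hypothetical names, and the choice of which iclass is "interpreted by ginsns" is made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values quoted in the thread.  */
#define F_LDST_LOAD  (1ULL << 36)
#define F_LDST_STORE (2ULL << 36)
/* Mask suggested in the thread for the whole 4-bit subclass field.  */
#define F_SUBCLASS   (15ULL << 36)

/* Hypothetical stand-ins for the real iclass enum and opcode table.  */
enum demo_iclass { DEMO_LDST, DEMO_OTHER };

struct demo_opcode
{
  const char *name;
  enum demo_iclass iclass;
  uint64_t flags;
};

/* Assumption for this sketch: only DEMO_LDST has its subclass
   inspected by the ginsn code.  */
static bool
iclass_uses_subclass (enum demo_iclass iclass)
{
  return iclass == DEMO_LDST;
}

/* #1: an opcode in an inspected iclass must have a nonzero subclass.
   #2: every other opcode must have a zero subclass.  */
static bool
subclass_ok (const struct demo_opcode *op)
{
  bool has_subclass = (op->flags & F_SUBCLASS) != 0;
  return has_subclass == iclass_uses_subclass (op->iclass);
}

/* Count the table entries that break either rule.  */
static size_t
count_subclass_violations (const struct demo_opcode *table, size_t n)
{
  size_t bad = 0;

  assert (table != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (!subclass_ok (&table[i]))
      bad++;
  return bad;
}
```

A generator-time check along these lines would flag both an inspected-iclass entry that was left without a subclass and an entry outside those iclasses that carries one, which is the "data stays up to date" property asked for earlier in the thread.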
On 6/27/24 02:40, Richard Sandiford wrote:
> Indu Bhagat <indu.bhagat@oracle.com> writes:
>> On 6/26/24 04:01, Richard Sandiford wrote:
>>> D8-D15 are "interesting" because they are the low 64 bits of Q8-Q15,
>>> and of Z8-Z15 if SVE is used. However, a CFI save slot always represents
>>> the low 64 bits, regardless of whether a save occurs on D, Q or Z registers.
>>> This matters for big-endian code, because there are two additional
>>> PCS variants:
>>>
>>> * the "vector PCS", which preserves Q8-Q23
>>> * the "SVE PCS", which preserves Z8-Z23 and P3-P15
>>>
>>
>> Is there a way to annotate that a (hand-written asm) function adheres to
>> vector PCS or SVE PCS? I see that there is a .variant_pcs but that
>> does not help differentiate between the above two?
>>
>> I _think_ gas will need to know which of SVE vs vector PCS is in effect
>> for a specific function so that the P3-P15 can be added to the set of
>> callee-saved registers being tracked for SCFI for SVE PCS but not for
>> vector PCS.
>
> Only the normal base AAPCS64 register set is preserved across abnormal
> control flow (setjmp/longjmp, exceptions, etc.). The extra call-preserved
> guarantees for vector and SVE PCS functions only apply to normal returns.
>
> [This means, for example, that:
>
>   void foo();
>   svbool_t f() {
>     try {
>       foo();
>     } catch (...) {};
>     return svptrue_b8();
>   }
>
> must manually restore the additional register state when catching
> and returning normally.]
>
> The CFI requirements therefore don't change: only D8-D15 matter,
> like for normal functions. But that's also where the big-endian
> complications that I mentioned come from.
>
> So I don't think the code needs to know which kind of function is
> being assembled. The code just needs to be able to recognise Q-based
> and Z-based loads and stores of D8-D15 and work out the correct offset
> of the low 64 bits. (Although, like I say, I think we can punt on
> big-endian SVE PCS functions.)
>
>>> So vector PCS functions might need to save and restore Q8 when returning
>>> normally, but the CFI only describes the save of the D8 portion (since
>>> that's the only portion that is preserved by exceptions). This means
>>> that, on big-endian:
>>>
>>>     str q8, [sp, #16]
>>>
>>> should record D8 as being saved at sp+24 rather than sp+16.
>>>
>>> A further complication is that STR Qn and STR Zn do not store in
>>> the same byte order for big-endian: STR Qn stores as a 128-bit
>>> integer (MSB first), whereas STR Zn stores as a stream of bytes
>>> (LSB first). This means that GCC-generated big-endian SVE PCS
>>> functions use things like:
>>>
>>>     st1d z8.d, p2, [sp, #1, mul vl]
>>>
>>> with the D8 save slot then being at sp + 2*VL - 64.
>>>
>>> I think it's OK to punt on the big-endian SVE PCS case for now (provided
>>> that there's a warning that the code isn't understood, which it looks
>>> like there is). But I think it's worth handling the Q register saves.
>>
>> It looks to me that using reg name / size is an unambiguous proxy to
>> deciding whether SVE PCS is in effect. Is this correct?
>
> Not necessarily. There's nothing stopping code from using Q-based
> loads and stores for normal functions (although it would be an
> odd choice).

Of course, I don't know what I was thinking when I wrote that.

As for Z registers, I realized that I need more time to take a look at
the SVE insns and see what patterns need to be handled etc. for memory
offset calculation.

For V4 (to be posted soon), I have added handling for D and Q registers
(little-endian and big-endian), but skipped Z altogether for now (SCFI
errors out when correctness is affected). Also added this to the set of
known limitations to be addressed in a future patch.

>
> There's also the possibility of ad-hoc PCSes, but the assumption there
> too would be that only the base AAPCS64 set needs to be preserved
> through unwinding.
>
>>> Other comments:
>>>
>>> - I like the new approach of using a combination of the iclass and a
>>>   "subclass" field of the flags. How about making aarch64-gen.c enforce
>>>   that:
>>>
>>>   - if aarch64-ginsn.c looks at the subclass of a particular iclass,
>>>     every instruction of that iclass has a nonzero subclass field
>>>
>>
>> (Let me refer to the above as #1). I can see that there can be ways to
>> achieve this...
>>
>>>   - every other instruction has a zero subclass field
>>>
>>
>> ..but I am not sure I follow this statement. (Let me refer to the above
>> as #2).
>>
>>>   This would help to ensure that the data stays up to date.
>>>   The subclass enum could include a nonzero "other" value where
>>>   necessary.
>>>
>>
>> Currently, we are using the opcode->flags bits to encode:
>>
>> In include/opcode/aarch64.h:
>>
>> /* 4-bit flag field to indicate subclass of operations.
>>    Note that there is an (intended) overlap between the three flag sets
>>    (F_LDST*, F_ARITH* and F_BRANCH*). This allows space savings. */
>> #define F_LDST_LOAD (1ULL << 36)
>> #define F_LDST_STORE (2ULL << 36)
>> /* A load followed by a store (using the same address). */
>> #define F_LDST_SWAP (F_LDST_LOAD | F_LDST_STORE)
>> /* Subclasses to denote add, sub and mov insns. */
>> #define F_ARITH_ADD (1ULL << 36)
>> #define F_ARITH_SUB (2ULL << 36)
>> #define F_ARITH_MOV (4ULL << 36)
>> /* Subclasses to denote call and ret insns. */
>> #define F_BRANCH_CALL (1ULL << 36)
>> #define F_BRANCH_RET (2ULL << 36)
>>
>> We can dedicate F_SUBCLASS_NONE (8ULL << 36) and enforce this subclass
>> on all insns which use none of the above subclasses in a specific
>> iclass. This can help address (#1), but not sure about (#2).
>
> I think the 4 bits are really an enum rather than true independent flags.
> So it might be better to use 15ULL, so that the other 14 nonzero values are
> consecutive.
>
> But yeah, I think it addresses both #1 and #2. #2 makes sure that a
> subclass is only present when we expect one. If we define:
>
>   #define F_SUBCLASS (15ULL << 36)
>
> then #2 makes sure that (flags & F_SUBCLASS) == 0 for classes that
> are not interpreted by ginsns.
>

Thanks. This should be addressed in the V4 series which I will be
posting soon.

Thanks
Indu