From patchwork Fri Sep 3 12:14:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Huang Shijie
X-Patchwork-Id: 44839
From: Huang Shijie
To: carlos@systemhalted.org
Cc: Huang Shijie, zwang@amperecomputing.com, patches@amperecomputing.com,
 libc-alpha@sourceware.org
Subject: [PATCH] Add LD_NUMA_REPLICATION for glibc
Date: Fri, 3 Sep 2021 12:14:34 +0000
Message-Id: <20210903121434.12162-1-shijie@os.amperecomputing.com>
List-Id: Libc-alpha mailing list

This patch adds LD_NUMA_REPLICATION, an environment variable that
influences how ld.so maps shared libraries at run time.

If LD_NUMA_REPLICATION is set for a program foo like this:

    #LD_NUMA_REPLICATION=1 ./foo

then, when ld.so maps a shared library, it uses

    mmap(, c->prot | PROT_WRITE, MAP_COPY | MAP_FILE | MAP_POPULATE,)

for it. This mmap triggers COW (copy on write) for the shared library
on the NUMA node where the program `foo` runs. After the COW, foo has
its own copy of the shared library segment (the part covered by the
mmap), and that copy belongs to the same NUMA node.

Enabling LD_NUMA_REPLICATION therefore consumes more memory, but it
reduces remote memory accesses on NUMA systems.

Signed-off-by: Huang Shijie
---
 elf/dl-map-segments.h      | 28 ++++++++++++++++++++++++----
 elf/dl-support.c           |  4 ++++
 elf/rtld.c                 |  4 ++++
 sysdeps/generic/ldsodefs.h |  4 ++++
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
index f9fb110e..ae6661a7 100644
--- a/elf/dl-map-segments.h
+++ b/elf/dl-map-segments.h
@@ -52,13 +52,33 @@ _dl_map_segments (struct link_map *l, int fd,
                                   c->mapstart & GLRO(dl_use_load_bias))
            - MAP_BASE_ADDR (l));
 
-      /* Remember which part of the address space this object uses. */
-      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
+      if (__glibc_unlikely(GLRO(dl_numa_replication)))
+        {
+          /* Trigger the linux kernel COW(copy on write) on purpose */
+          l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
+                                            c->prot|PROT_WRITE,
+                                            MAP_COPY|MAP_FILE|MAP_POPULATE,
+                                            fd, c->mapoff);
+          if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
+            return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
+
+          /* Change back to c->prot if needed */
+          if (!(c->prot & PROT_WRITE))
+            {
+              if (__mprotect((caddr_t)l->l_map_start, maplength, c->prot))
+                return DL_MAP_SEGMENTS_ERROR_MPROTECT;
+            }
+        }
+      else
+        {
+          /* Remember which part of the address space this object uses. */
+          l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
                                             c->prot,
                                             MAP_COPY|MAP_FILE,
                                             fd, c->mapoff);
-      if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
-        return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
+          if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
+            return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
+        }
 
       l->l_map_end = l->l_map_start + maplength;
       l->l_addr = l->l_map_start - c->mapstart;
diff --git a/elf/dl-support.c b/elf/dl-support.c
index 01557181..d2eb3164 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -79,6 +79,10 @@ const char *_dl_origin_path;
 /* Nonzero if runtime lookup should not update the .got/.plt. */
 int _dl_bind_not;
 
+/* Do we want to do the replication(by linux copy on write) for shared libraries in NUMA?
+   Only valid in the linux system. */
+int _dl_numa_replication;
+
 /* A dummy link map for the executable, used by dlopen to access the global
    scope. We don't export any symbols ourselves, so this can be minimal. */
 static struct link_map _dl_main_map =
diff --git a/elf/rtld.c b/elf/rtld.c
index d733359e..10378c00 100644
--- a/elf/rtld.c
+++ b/elf/rtld.c
@@ -2788,7 +2788,11 @@ process_envvars (struct dl_main_state *state)
               GLRO(dl_verbose) = 1;
               GLRO(dl_debug_mask) |= DL_DEBUG_PRELINK;
               GLRO(dl_trace_prelink) = &envline[17];
+              break;
             }
+
+          if (memcmp (envline, "NUMA_REPLICATION", 16) == 0)
+            GLRO(dl_numa_replication) = true;
           break;
 
         case 20:
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 9c152592..f6114522 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -569,6 +569,10 @@ struct rtld_global_ro
   /* Nonzero if runtime lookups should not update the .got/.plt. */
   EXTERN int _dl_bind_not;
 
+  /* Do we want to do the replication(by linux copy on write) for shared libraries in NUMA?
+     Only valid in the linux system. */
+  EXTERN int _dl_numa_replication;
+
   /* Nonzero if references should be treated as weak during runtime
      linking. */
   EXTERN int _dl_dynamic_weak;
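
For reference, the mapping sequence described in the commit message can
be tried outside ld.so. The following is a minimal sketch under stated
assumptions, not the loader code itself: map_replicated and
replication_demo are hypothetical names, MAP_PRIVATE stands in for the
glibc-internal MAP_COPY, MAP_FILE is omitted (it is defined as 0 on
Linux), and a temporary file under /tmp stands in for a shared library.
It assumes Linux, where MAP_POPULATE on a private, writable file mapping
is expected to pre-fault the pages as private copies.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not part of glibc): map LEN bytes of FD as a
   private mapping that is temporarily writable and pre-populated, then
   drop PROT_WRITE again if the caller did not ask for it.  This mirrors
   the mmap + mprotect sequence in the patch.  */
void *
map_replicated (int fd, size_t len, int prot)
{
  void *p = mmap (NULL, len, prot | PROT_WRITE,
                  MAP_PRIVATE | MAP_POPULATE, fd, 0);
  if (p == MAP_FAILED)
    return MAP_FAILED;

  /* Change back to the requested protection if needed.  */
  if (!(prot & PROT_WRITE) && mprotect (p, len, prot) != 0)
    {
      munmap (p, len);
      return MAP_FAILED;
    }
  return p;
}

/* Hypothetical demo: write one page of data to a temporary file, map it
   with map_replicated, and check that the mapping still shows the file
   contents.  Returns 0 on success, -1 on any failure.  */
int
replication_demo (void)
{
  char path[] = "/tmp/numa-repl-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);   /* The open descriptor keeps the file alive.  */

  size_t len = (size_t) sysconf (_SC_PAGESIZE);
  char *ref = malloc (len);
  if (ref == NULL)
    {
      close (fd);
      return -1;
    }
  memset (ref, 'x', len);

  int rc = -1;
  if (write (fd, ref, len) == (ssize_t) len)
    {
      void *p = map_replicated (fd, len, PROT_READ);
      if (p != MAP_FAILED)
        {
          rc = memcmp (p, ref, len) == 0 ? 0 : -1;
          munmap (p, len);
        }
    }

  free (ref);
  close (fd);
  return rc;
}
```

A caller can check the result with `assert (replication_demo () == 0);`.
The sketch only shows that the mapping sequence succeeds and preserves
the file contents. It does not verify which NUMA node the copied pages
land on, which depends on the memory policy of the running task.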