From patchwork Mon Jul 31 18:35:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sunil Pandey
X-Patchwork-Id: 73396
From: Sunil Pandey
Reply-To: Sunil K Pandey
To: libc-alpha@sourceware.org
Cc: hjl.tools@gmail.com
Subject: [PATCH v2] x86_64: Optimize ffsll function code size.
Date: Mon, 31 Jul 2023 11:35:49 -0700
Message-ID: <20230731183549.2396362-1-skpgkp2@gmail.com>
X-Mailer: git-send-email 2.41.0

The ffsll function is currently 17 bytes; this patch reduces it to
12 bytes.

Currently the ffsll function randomly regresses by ~20%, depending on
how the code gets aligned.

This patch fixes the random ffsll performance regression.

Changes from v1:
- Further reduce ffsll function size to 12 bytes.

Reviewed-by: Carlos O'Donell
---
 sysdeps/x86_64/ffsll.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/sysdeps/x86_64/ffsll.c b/sysdeps/x86_64/ffsll.c
index a1c13d4906..6a5803c7c1 100644
--- a/sysdeps/x86_64/ffsll.c
+++ b/sysdeps/x86_64/ffsll.c
@@ -26,13 +26,13 @@ int
 ffsll (long long int x)
 {
   long long int cnt;
-  long long int tmp;
 
-  asm ("bsfq %2,%0\n"		/* Count low bits in X and store in %1.  */
-       "cmoveq %1,%0\n"		/* If number was zero, use -1 as result.  */
-       : "=&r" (cnt), "=r" (tmp) : "rm" (x), "1" (-1));
+  asm ("mov $-1,%k0\n"	/* Initialize CNT to -1.  */
+       "bsf %1,%0\n"	/* Count low bits in X and store in CNT.  */
+       "inc %k0\n"	/* Increment CNT by 1.  */
+       : "=&r" (cnt) : "r" (x));
 
-  return cnt + 1;
+  return cnt;
 }
 
 #ifndef __ILP32__
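
A note on the alignment sensitivity mentioned in the commit message: one plausible reading is that a 17-byte function can straddle a cache-line boundary at some start addresses, while a function of 16 bytes or fewer cannot. The sketch below only checks that arithmetic. It assumes 16-byte function alignment and 64-byte cache lines, neither of which is stated in the patch, and the helper names spans_two_lines and count_split_placements are made up for illustration; they are not part of glibc.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if a function of SIZE bytes that starts OFFSET bytes
   past a cache-line boundary touches more than one cache line of
   LINE bytes.  SIZE must be at least 1.  Hypothetical helper.  */
static bool
spans_two_lines (size_t offset, size_t size, size_t line)
{
  size_t first = offset / line;              /* Line holding the first byte.  */
  size_t last = (offset + size - 1) / line;  /* Line holding the last byte.  */
  return first != last;
}

/* Count how many ALIGN-aligned start offsets within one cache line
   make a SIZE-byte function span two lines.  The alignment and line
   size passed in are assumptions, not values taken from the patch.  */
static size_t
count_split_placements (size_t size, size_t align, size_t line)
{
  size_t n = 0;
  for (size_t off = 0; off < line; off += align)
    if (spans_two_lines (off, size, line))
      n++;
  return n;
}
```

Under these assumptions, count_split_placements (17, 16, 64) reports one split placement out of four (the one starting at offset 48), while sizes 16 and 12 report none. Whether this is the actual cause of the ~20% regression is not established by the patch itself.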