From patchwork Wed Jun 29 09:21:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aldy Hernandez <aldyh@redhat.com>
X-Patchwork-Id: 55517
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 0FCFD3851148
	for <patchwork@sourceware.org>; Wed, 29 Jun 2022 09:22:43 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0FCFD3851148
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1656494563;
	bh=liiJ7hI912ozsKcMhEuHCvQsTpAs3Sw0/jGsMoUIkfI=;
	h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:Cc:From;
	b=aicQCvkCJ3oKLfMrE1gWTW/XmrzXocMppL0Mr8fPT8Neb3Tc5Ec3+EhHo0SZ5VrBy
	 G7IhYSNjDEhMMqklFIXOJNwtCLtRg2GGPmZ2O+zD6q25mKPgTwMPfJom3/wkwwu3+O
	 Bu0sL11w5iwYES94S4vCO44iIrXuORQ4Ukx1W03g=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTPS id 1AB8F38582B1
 for <gcc-patches@gcc.gnu.org>; Wed, 29 Jun 2022 09:22:12 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 1AB8F38582B1
Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com
 [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS
 (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 us-mta-118-V-_yJpflNnu3DX1ytqkQ3g-1; Wed, 29 Jun 2022 05:22:08 -0400
X-MC-Unique: V-_yJpflNnu3DX1ytqkQ3g-1
Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com
 [10.11.54.3])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6111118F0241;
 Wed, 29 Jun 2022 09:22:08 +0000 (UTC)
Received: from abulafia.quesejoda.com (unknown [10.39.192.137])
 by smtp.corp.redhat.com (Postfix) with ESMTPS id B9D041121314;
 Wed, 29 Jun 2022 09:22:07 +0000 (UTC)
Received: from abulafia.quesejoda.com (localhost [127.0.0.1])
 by abulafia.quesejoda.com (8.17.1/8.17.1) with ESMTPS id 25T9M4YX230309
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT);
 Wed, 29 Jun 2022 11:22:04 +0200
Received: (from aldyh@localhost)
 by abulafia.quesejoda.com (8.17.1/8.17.1/Submit) id 25T9M4ND230308;
 Wed, 29 Jun 2022 11:22:04 +0200
To: GCC patches <gcc-patches@gcc.gnu.org>
Subject: [RFC] trailing_wide_ints with runtime variable lengths
Date: Wed, 29 Jun 2022 11:21:57 +0200
Message-Id: <20220629092157.230287-1-aldyh@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Aldy Hernandez via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Aldy Hernandez <aldyh@redhat.com>
Reply-To: Aldy Hernandez <aldyh@redhat.com>
Cc: richard.sandiford@arm.com, Jakub Jelinek <jakub@redhat.com>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

Currently global ranges are stored in SSA_NAME_RANGE_INFO as a pair of
wide_int-like objects along with the nonzero bits.  We frequently lose
precision when streaming out our higher resolution iranges.  The plan
was always to store the full irange between passes.  However, as was
originally discussed eons ago:

	https://gcc.gnu.org/pipermail/gcc-patches/2017-May/475139.html

...we need a memory efficient way of saving iranges, preferably using
the trailing_wide_ints idiom.

The problem with doing so is that trailing_wide_ints assumes a
compile-time specified number of elements.  For irange, we need to
determine the size at run-time.

One solution is to adapt trailing_wide_ints such that N is the maximum
number of elements allowed, and allow setting the actual number at
run-time (defaulting to N).  The attached patch does this, while
requiring no changes to existing users.
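
As a rough illustration of the idea, here is a minimal standalone
sketch.  It is not the GCC class: trailing_ints_model, the use of
int64_t in place of HOST_WIDE_INT, and the plain unsigned char m_len[]
are simplifying assumptions for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical simplified model of trailing_wide_ints <N>; not the GCC class.
template <int N>
struct trailing_ints_model
{
  unsigned short m_precision;    // bits per element
  unsigned char m_max_len;       // 64-bit words per element
  unsigned char m_num_elements;  // element count chosen at run time (<= N)
  unsigned char m_len[N];        // current length of each element
  int64_t m_val[1];              // first word of the trailing storage

  // Bytes to allocate beyond sizeof (*this) for NUM_ELEMENTS elements.
  // NUM_ELEMENTS defaults to N and is capped at N.
  static std::size_t extra_size (unsigned int precision,
                                 unsigned int num_elements = N)
  {
    unsigned int max_len = (precision + 63) / 64;
    if (num_elements > N)
      num_elements = N;
    return (num_elements * max_len - 1) * sizeof (int64_t);
  }

  // Record the precision and the run-time element count.
  void set_precision (unsigned int precision, unsigned int num_elements = N)
  {
    assert (num_elements <= N);
    m_num_elements = num_elements;
    m_precision = precision;
    m_max_len = (precision + 63) / 64;
  }
};
```

Callers that omit the second argument get N elements, which is how
existing users would stay unchanged; a caller such as irange would
pass the actual count to both extra_size and set_precision.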

It uses a byte to store the number of elements in the
trailing_wide_ints control word.  The control word currently holds a
16-bit precision and an 8-bit max-length, with the remaining bytes
used for m_len[N].  On a 64-bit architecture, this leaves room for 5
elements in m_len without spilling into an extra word.  With this
patch, m_len[] has one byte less (room for 4 elements) before the
padding is used up.  This shouldn't be a problem, as the only users of
trailing_wide_ints use N=2 for NUM_POLY_INT_COEFFS on aarch64 and N=3
for range_info_def.
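
The byte counting above can be checked with a small standalone sketch.
The three structs are hypothetical stand-ins, not GCC types, and the
offsets assume a target where int64_t is 8 bytes with 8-byte alignment
(as on x86-64 Linux).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layouts; field names mirror the patch but these are not
// the GCC definitions.
struct header_before       // current layout with N = 5
{
  unsigned short m_precision;
  unsigned char m_max_len;
  unsigned char m_len[5];
  int64_t m_val[1];
};

struct header_after_4      // patched layout with N = 4
{
  unsigned short m_precision;
  unsigned char m_max_len;
  unsigned char m_num_elements;
  unsigned char m_len[4];
  int64_t m_val[1];
};

struct header_after_5      // patched layout with N = 5
{
  unsigned short m_precision;
  unsigned char m_max_len;
  unsigned char m_num_elements;
  unsigned char m_len[5];
  int64_t m_val[1];
};

// Offset of the first trailing word, i.e. the size of the control data
// including any padding.
constexpr std::size_t bytes_before = offsetof (header_before, m_val);
constexpr std::size_t bytes_after_4 = offsetof (header_after_4, m_val);
constexpr std::size_t bytes_after_5 = offsetof (header_after_5, m_val);

// Checks that hold under the alignment assumption stated above.
inline void check_layouts ()
{
  assert (bytes_before == 8);    // 5 lengths fit in one word today
  assert (bytes_after_4 == 8);   // 4 lengths still fit with the new byte
  assert (bytes_after_5 == 16);  // a 5th length spills into a second word
}
```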

For irange, my plan is to use one more word to fit a maximum of 12
elements (the above 4 plus 8 more).  That is room for up to 6 pairs of
sub-range bounds, which would be more than adequate for our needs.  In
previous tests we found that 99% of ranges fit within 3-4 pairs.  In
practice the layout would be 5 pairs, plus the nonzero bits, plus a
spare wide-int for future development.

Ultimately this means that streaming an irange would consume one more
word than what we currently do for range_info_def.  IMO this is a nice
trade-off considering we started storing a slew of wide-ints directly
;-).
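
To make the trade-off concrete, here is a back-of-the-envelope sketch
of the storage cost.  The function names are made up for the example,
and it assumes 8-byte words, a one-word header for the current
range_info_def layout, a two-word header for the proposed irange
layout, and that only the sub-range bounds plus the nonzero bits are
stored.

```cpp
#include <cassert>
#include <cstddef>

// Assumed word size; stands in for sizeof (HOST_WIDE_INT) on a 64-bit host.
constexpr std::size_t word_bytes = 8;

// Words needed to hold one element of the given precision.
constexpr std::size_t words_per_element (unsigned int precision)
{
  return (precision + 63) / 64;
}

// Current scheme (assumed): one header word plus three elements
// (min, max, nonzero bits).
constexpr std::size_t range_info_bytes (unsigned int precision)
{
  return word_bytes + 3 * words_per_element (precision) * word_bytes;
}

// Proposed scheme (assumed): two header words plus 2 * NUM_PAIRS bounds
// and one element for the nonzero bits.
constexpr std::size_t irange_bytes (unsigned int precision,
                                    unsigned int num_pairs)
{
  return 2 * word_bytes
         + (2 * num_pairs + 1) * words_per_element (precision) * word_bytes;
}
```

Under these assumptions a single-pair 64-bit range costs
range_info_bytes (64) = 32 bytes today and irange_bytes (64, 1) = 40
bytes with the proposal, which is the one extra word mentioned above.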

I'm not above rolling an altogether different approach, but would
prefer to use the existing trailing infrastructure since it's mostly
what we need.

Thoughts?

p.s. Tested and benchmarked on x86-64 Linux.  There was no discernible
performance change in our benchmark suite.

gcc/ChangeLog:

	* wide-int.h (struct trailing_wide_ints): Add m_num_elements.
	(trailing_wide_ints::set_precision): Add num_elements argument.
	(trailing_wide_ints::extra_size): Same.
---
 gcc/wide-int.h | 42 +++++++++++++++++++++++++++++-------------
 1 file changed, 29 insertions(+), 13 deletions(-)

diff --git a/gcc/wide-int.h b/gcc/wide-int.h
index 8041b6104f9..f68ccf0a0c5 100644
--- a/gcc/wide-int.h
+++ b/gcc/wide-int.h
@@ -1373,10 +1373,13 @@ namespace wi
     : public int_traits <wide_int_storage> {};
 }
 
-/* An array of N wide_int-like objects that can be put at the end of
-   a variable-sized structure.  Use extra_size to calculate how many
-   bytes beyond the sizeof need to be allocated.  Use set_precision
-   to initialize the structure.  */
+/* A variable-length array of wide_int-like objects that can be put
+   at the end of a variable-sized structure.  The number of objects is
+   at most N and can be set at runtime by using set_precision().
+
+   Use extra_size to calculate how many bytes beyond the
+   sizeof need to be allocated.  Use set_precision to initialize the
+   structure.  */
 template <int N>
 struct GTY((user)) trailing_wide_ints
 {
@@ -1387,6 +1390,9 @@ private:
   /* The shared maximum length of each number.  */
   unsigned char m_max_len;
 
+  /* The number of elements.  */
+  unsigned char m_num_elements;
+
   /* The current length of each number.
      Avoid char array so the whole structure is not a typeless storage
      that will, in turn, turn off TBAA on gimple, trees and RTL.  */
@@ -1399,12 +1405,15 @@ private:
 public:
   typedef WIDE_INT_REF_FOR (trailing_wide_int_storage) const_reference;
 
-  void set_precision (unsigned int);
+  void set_precision (unsigned int precision, unsigned int num_elements = N);
   unsigned int get_precision () const { return m_precision; }
+  unsigned int num_elements () const { return m_num_elements; }
   trailing_wide_int operator [] (unsigned int);
   const_reference operator [] (unsigned int) const;
-  static size_t extra_size (unsigned int);
-  size_t extra_size () const { return extra_size (m_precision); }
+  static size_t extra_size (unsigned int precision,
+			    unsigned int num_elements = N);
+  size_t extra_size () const { return extra_size (m_precision,
+						  m_num_elements); }
 };
 
 inline trailing_wide_int_storage::
@@ -1457,11 +1466,14 @@ trailing_wide_int_storage::operator = (const T &x)
 }
 
 /* Initialize the structure and record that all elements have precision
-   PRECISION.  */
+   PRECISION.  NUM_ELEMENTS can be no more than N.  */
 template <int N>
 inline void
-trailing_wide_ints <N>::set_precision (unsigned int precision)
+trailing_wide_ints <N>::set_precision (unsigned int precision,
+				       unsigned int num_elements)
 {
+  gcc_checking_assert (num_elements <= N);
+  m_num_elements = num_elements;
   m_precision = precision;
   m_max_len = ((precision + HOST_BITS_PER_WIDE_INT - 1)
 	       / HOST_BITS_PER_WIDE_INT);
@@ -1484,15 +1496,19 @@ trailing_wide_ints <N>::operator [] (unsigned int index) const
 			  m_len[index].len, m_precision);
 }
 
-/* Return how many extra bytes need to be added to the end of the structure
-   in order to handle N wide_ints of precision PRECISION.  */
+/* Return how many extra bytes need to be added to the end of the
+   structure in order to handle NUM_ELEMENTS wide_ints of precision
+   PRECISION.  NUM_ELEMENTS defaults to N and is capped at N.  */
 template <int N>
 inline size_t
-trailing_wide_ints <N>::extra_size (unsigned int precision)
+trailing_wide_ints <N>::extra_size (unsigned int precision,
+				    unsigned int num_elements)
 {
   unsigned int max_len = ((precision + HOST_BITS_PER_WIDE_INT - 1)
 			  / HOST_BITS_PER_WIDE_INT);
-  return (N * max_len - 1) * sizeof (HOST_WIDE_INT);
+  if (num_elements > N)
+    num_elements = N;
+  return (num_elements * max_len - 1) * sizeof (HOST_WIDE_INT);
 }
 
 /* This macro is used in structures that end with a trailing_wide_ints field