rtl-optimization/117467 - limit ext-dce memory use

Message ID 20250110114127.F342113A86@imap1.dmz-prg2.suse.org
State New
Series rtl-optimization/117467 - limit ext-dce memory use

Checks

Context                                         Check  Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm      fail   Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64  fail   Patch failed to apply

Commit Message

Richard Biener Jan. 10, 2025, 11:41 a.m. UTC
The following puts in a hard limit on ext-dce because it might end
up requiring memory on the order of the number of basic blocks
times the number of pseudo registers.  The limiting follows what
GCSE based passes do and thus I re-use --param max-gcse-memory here.

This doesn't in any way address the implementation issues of the pass,
but it reduces the memory-use when compiling the
module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

I plan to push this later today unless I hear objection.

	PR rtl-optimization/117467
	PR rtl-optimization/117934
	* ext-dce.cc (ext_dce_execute): Do nothing if a memory
	allocation estimate exceeds what is allowed by
	--param max-gcse-memory.
---
 gcc/ext-dce.cc | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
  

Comments

Jeff Law Jan. 10, 2025, 9:15 p.m. UTC | #1
On 1/10/25 4:41 AM, Richard Biener wrote:
> The following puts in a hard limit on ext-dce because it might end
> up requiring memory on the order of the number of basic blocks
> times the number of pseudo registers.  The limiting follows what
> GCSE based passes do and thus I re-use --param max-gcse-memory here.
> 
> This doesn't in any way address the implementation issues of the pass,
> but it reduces the memory-use when compiling the
> module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> I plan to push this later today unless I hear objection.
> 
> 	PR rtl-optimization/117467
> 	PR rtl-optimization/117934
> 	* ext-dce.cc (ext_dce_execute): Do nothing if a memory
> 	allocation estimate exceeds what is allowed by
> 	--param max-gcse-memory.
No objections.  Thanks for taking care of it.  A bit surprised it's this 
bad, though I guess those bitmaps might be dense for pathological cases.

Jeff
  
Richard Biener Jan. 11, 2025, 8:45 a.m. UTC | #2
> On 10.01.2025 at 22:17, Jeff Law <jeffreyalaw@gmail.com> wrote:
> 
> 
> 
>> On 1/10/25 4:41 AM, Richard Biener wrote:
>> The following puts in a hard limit on ext-dce because it might end
>> up requiring memory on the order of the number of basic blocks
>> times the number of pseudo registers.  The limiting follows what
>> GCSE based passes do and thus I re-use --param max-gcse-memory here.
>> This doesn't in any way address the implementation issues of the pass,
>> but it reduces the memory-use when compiling the
>> module_first_rk_step_part1.F90 TU from 521.wrf_r from 25GB to 1GB.
>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> I plan to push this later today unless I hear objection.
>>    PR rtl-optimization/117467
>>    PR rtl-optimization/117934
>>    * ext-dce.cc (ext_dce_execute): Do nothing if a memory
>>    allocation estimate exceeds what is allowed by
>>    --param max-gcse-memory.
> No objections.  Thanks for taking care of it.  A bit surprised it's this bad, though I guess those bitmaps might be dense for pathological cases.

I'm quite sure there's something wrong with the pass, in particular in how it treats liveness (it doesn't seem to prune based on kills?).  But I don't have time to investigate, much less to rewrite its 'dataflow'.

Richard 

> 
> Jeff
>
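
To illustrate what "pruning based on kills" means in the reply above, here is the textbook backward liveness transfer function as a minimal sketch. It is not ext-dce code and does not confirm the suspicion raised in the thread. The names `RegSet`, `reg`, `live_in` and `live_in_without_kills` are hypothetical, and a register set is modelled as a 64-bit mask purely for brevity.

```cpp
#include <cstdint>

// Hypothetical toy model: a set of up to 64 registers as a bit mask.
using RegSet = std::uint64_t;

// Mask with only register number n set (n must be below 64).
constexpr RegSet reg (unsigned n) { return RegSet{1} << n; }

// Textbook backward liveness for one basic block:
//   live_in = use | (live_out & ~def)
// Registers defined (killed) in the block are dropped from live_out
// unless the block also reads them before the definition (use).
constexpr RegSet live_in (RegSet use, RegSet def, RegSet live_out)
{
  return use | (live_out & ~def);
}

// The same computation without the kill step.  Every register live
// at the block exit stays live at the entry, so the sets only grow
// as they propagate backwards through the CFG.
constexpr RegSet live_in_without_kills (RegSet use, RegSet live_out)
{
  return use | live_out;
}
```

With a block that defines r1, reads r2 and has {r1, r3} live on exit, the first function yields {r2, r3} while the second yields {r1, r2, r3}. Sets that are never pruned this way tend toward dense bitmaps, which would be consistent with the memory growth discussed here.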
  

Patch

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 6cf64187349..e257e3bc873 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -34,6 +34,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "df.h"
 #include "print-rtl.h"
 #include "dbgcnt.h"
+#include "diagnostic-core.h"
 
 /* These should probably move into a C++ class.  */
 static vec<bitmap_head> livein;
@@ -1110,6 +1111,21 @@  static bool ext_dce_rd_confluence_n (edge) { return true; }
 void
 ext_dce_execute (void)
 {
+  /* Limit the amount of memory we use for livein, with 4 bits per
+     reg per basic-block including overhead that maps to one byte
+     per reg per basic-block.  */
+  uint64_t memory_request
+    = (uint64_t)n_basic_blocks_for_fn (cfun) * max_reg_num ();
+  if (memory_request / 1024 > (uint64_t)param_max_gcse_memory)
+    {
+      warning (OPT_Wdisabled_optimization,
+	       "ext-dce disabled: %d basic blocks and %d registers; "
+	       "increase %<--param max-gcse-memory%> above %wu",
+	       n_basic_blocks_for_fn (cfun), max_reg_num (),
+	       memory_request / 1024);
+      return;
+    }
+
   /* Some settings of SUBREG_PROMOTED_VAR_P are actively harmful
      to this pass.  Clear it for those cases.  */
   maybe_clear_subreg_promoted_p ();
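
The guard added in this hunk boils down to a small piece of arithmetic. The following is a minimal standalone sketch of it, not GCC code. The function names `estimate_livein_kb` and `ext_dce_would_be_skipped` are hypothetical. The block and register counts are plain parameters standing in for `n_basic_blocks_for_fn (cfun)` and `max_reg_num ()`, and the limit stands in for `param_max_gcse_memory`, which is assumed here to be expressed in kB.

```cpp
#include <cstdint>

// Hypothetical model of the estimate in the patch: one byte per
// pseudo register per basic block, converted to kB by integer
// division, as in `memory_request / 1024`.
constexpr std::uint64_t
estimate_livein_kb (std::uint64_t n_basic_blocks, std::uint64_t n_regs)
{
  return (n_basic_blocks * n_regs) / 1024;
}

// True when the estimate strictly exceeds the limit, mirroring the
// `>` comparison in the patch.  An estimate equal to the limit still
// lets the pass run.
constexpr bool
ext_dce_would_be_skipped (std::uint64_t n_basic_blocks,
                          std::uint64_t n_regs,
                          std::uint64_t max_gcse_memory_kb)
{
  return estimate_livein_kb (n_basic_blocks, n_regs) > max_gcse_memory_kb;
}
```

Under this model a function with 1000 basic blocks and 2000 pseudo registers is estimated at 1953 kB. One with 50000 blocks and 500000 registers comes to roughly 24 million kB and would be skipped under a limit of 131072 kB (assumed here to be the default value of the parameter). These counts are illustrative only and are not taken from the 521.wrf_r testcase.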