IPA: fix reproducibility in IPA MOD REF

Message ID 3b0ef8c4-45df-1d84-158d-f6d98a493a32@suse.cz
State New
Headers
Series IPA: fix reproducibility in IPA MOD REF |

Commit Message

Martin Liška Nov. 18, 2021, 5:58 p.m. UTC
  Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

	* ipa-modref.c (analyze_function): Do not execute the code
	only if dump_file != NULL.
---
  gcc/ipa-modref.c | 14 +++++++-------
  1 file changed, 7 insertions(+), 7 deletions(-)
  

Comments

Jan Hubicka Nov. 18, 2021, 6:11 p.m. UTC | #1
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 	* ipa-modref.c (analyze_function): Do not execute the code
> 	only if dump_file != NULL.
> ---
>  gcc/ipa-modref.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index a3c7c6d6a1f..6cacf9c8ab1 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -2868,15 +2868,15 @@ analyze_function (function *f, bool ipa)
>  	optimization_summaries = modref_summaries::create_ggc (symtab);
>        else /* Remove existing summary if we are re-running the pass.  */
>  	{
> -	  if (dump_file
> -	      && (summary
> -		  = optimization_summaries->get (fnode))
> -		 != NULL
> +	  summary = optimization_summaries->get (fnode);
> +	  if (summary != NULL
How does this affect reproducibility?

Honza
  
Martin Liška Nov. 18, 2021, 6:16 p.m. UTC | #2
On 11/18/21 19:11, Jan Hubicka wrote:
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 	* ipa-modref.c (analyze_function): Do not execute the code
>> 	only if dump_file != NULL.
>> ---
>>   gcc/ipa-modref.c | 14 +++++++-------
>>   1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
>> index a3c7c6d6a1f..6cacf9c8ab1 100644
>> --- a/gcc/ipa-modref.c
>> +++ b/gcc/ipa-modref.c
>> @@ -2868,15 +2868,15 @@ analyze_function (function *f, bool ipa)
>>   	optimization_summaries = modref_summaries::create_ggc (symtab);
>>         else /* Remove existing summary if we are re-running the pass.  */
>>   	{
>> -	  if (dump_file
>> -	      && (summary
>> -		  = optimization_summaries->get (fnode))
>> -		 != NULL
>> +	  summary = optimization_summaries->get (fnode);
>> +	  if (summary != NULL
> How does this affect reproducibility?

In the following way:

$ cat 1.i
__ckd_calloc___n_elem;
     __ckd_calloc___elem_size;
     *__ckd_calloc__() {
    void *mem = calloc(__ckd_calloc___n_elem, __ckd_calloc___elem_size);
    return mem;
  }
     _E__die_error() {
  exit(1);
  }

$ cat 2.i
typedef struct {    int *lmclass }
      lm_t;
      lm_read_dump(int *lmclass) {    lm_t lm;    {     int i;     for (; i; i++)       lm.lmclass[i] = lmclass[i];   }    lm_set_param();  }
      lm_read_ctl_dict_size_n_lmclass_used;
      lm_read_ctl_dict_size() {    int *lmclass = __ckd_calloc__();    while (strcmp()) {     lmclass[lm_read_ctl_dict_size_n_lmclass_used] = 0;     _E__die_error();   }    lm_read_dump(lmclass);    for (; ; )     ;  }

$ cat cmd
rm -f *.o xxx* yyy*
gcc -v
gcc [12].i -O2  -flto=auto -c -w
gcc [12].o -O2  -flto=auto -shared --save-temps -o xxx -w -fdump-tree-optimized

rm -f *.o
gcc [12].i -O2 -flto=auto -c -w
gcc [12].o -O2 -flto=auto -shared --save-temps -o yyy -fdump-tree-modref -w


diff -u -U30 xxx.ltrans0.ltrans*optimized yyy.ltrans0.ltrans*optimized

diff xxx.ltrans0.ltrans.s yyy.ltrans0.ltrans.s
if test $? = 1; then
   exit 0
else
   exit 1
fi


$ ./cmd
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/home/marxin/bin/gcc/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/marxin/Programming/gcc/configure --enable-languages=c,c++,fortran,jit --prefix=/home/marxin/bin/gcc --disable-multilib --enable-host-shared --disable-libsanitizer --enable-valgrind-annotations --disable-bootstrap
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.0 20211118 (experimental) (GCC)
/usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text'
/usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
/usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text'
/usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status
diff: yyy.ltrans0.ltrans*optimized: No such file or directory
54,55d53
< 	movslq	lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax
< 	movl	$0, 0(%rbp,%rax,4)

I tracked that it differs in tree DSE dump.

May I install the patch?

Thanks,
Martin

> 
> Honza
>
  
Jan Hubicka Nov. 18, 2021, 6:22 p.m. UTC | #3
> Supported LTO compression algorithms: zlib zstd
> gcc version 12.0.0 20211118 (experimental) (GCC)
> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text'
> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
> /usr/bin/ld: final link failed: bad value
> collect2: error: ld returned 1 exit status
> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text'
> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
> /usr/bin/ld: final link failed: bad value
> collect2: error: ld returned 1 exit status
> diff: yyy.ltrans0.ltrans*optimized: No such file or directory
> 54,55d53
> < 	movslq	lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax
> < 	movl	$0, 0(%rbp,%rax,4)
> 
> I tracked that it differs in tree DSE dump.
> 
> May I install the patch?

No, I think it is bug of symbol_summary that get is causing a
difference.  It should be pure function. I think it is:
  /* Getter for summary callgraph node pointer.  */
  T* get (cgraph_node *node) ATTRIBUTE_PURE
    {
       return exists (node) ? (*m_vector)[node->get_summary_id ()] : NULL;
    }
It should not be using get_summary_id (which allocates it for no good
reaosn) and simply check that summary_id is non-negative.

Still I wonder how this can make code different - that looks like
another bug somewhere.

Honza
  
Martin Liška Nov. 18, 2021, 6:27 p.m. UTC | #4
On 11/18/21 19:22, Jan Hubicka wrote:
>> Supported LTO compression algorithms: zlib zstd
>> gcc version 12.0.0 20211118 (experimental) (GCC)
>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text'
>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
>> /usr/bin/ld: final link failed: bad value
>> collect2: error: ld returned 1 exit status
>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text'
>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
>> /usr/bin/ld: final link failed: bad value
>> collect2: error: ld returned 1 exit status
>> diff: yyy.ltrans0.ltrans*optimized: No such file or directory
>> 54,55d53
>> < 	movslq	lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax
>> < 	movl	$0, 0(%rbp,%rax,4)
>>
>> I tracked that it differs in tree DSE dump.
>>
>> May I install the patch?
> 
> No, I think it is bug of symbol_summary that get is causing a
> difference.

Isn't problem that the following code

               past_flags.reserve_exact (summary->arg_flags.length ());
               past_flags.splice (summary->arg_flags);
               past_retslot_flags = summary->retslot_flags;

is guarded with if (dump_file && ... )?



   It should be pure function. I think it is:
>    /* Getter for summary callgraph node pointer.  */
>    T* get (cgraph_node *node) ATTRIBUTE_PURE
>      {
>         return exists (node) ? (*m_vector)[node->get_summary_id ()] : NULL;
>      }
> It should not be using get_summary_id (which allocates it for no good
> reaosn) and simply check that summary_id is non-negative.

   /* Get summary id of the node.  */
   inline int get_summary_id ()
   {
     return m_summary_id;
   }

Where do you see the allocation?

Martin

> 
> Still I wonder how this can make code different - that looks like
> another bug somewhere.
> 
> Honza
>
  
Jan Hubicka Nov. 18, 2021, 6:29 p.m. UTC | #5
> On 11/18/21 19:22, Jan Hubicka wrote:
> > > Supported LTO compression algorithms: zlib zstd
> > > gcc version 12.0.0 20211118 (experimental) (GCC)
> > > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text'
> > > /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
> > > /usr/bin/ld: final link failed: bad value
> > > collect2: error: ld returned 1 exit status
> > > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text'
> > > /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
> > > /usr/bin/ld: final link failed: bad value
> > > collect2: error: ld returned 1 exit status
> > > diff: yyy.ltrans0.ltrans*optimized: No such file or directory
> > > 54,55d53
> > > < 	movslq	lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax
> > > < 	movl	$0, 0(%rbp,%rax,4)
> > > 
> > > I tracked that it differs in tree DSE dump.
> > > 
> > > May I install the patch?
> > 
> > No, I think it is bug of symbol_summary that get is causing a
> > difference.
> 
> Isn't problem that the following code
> 
>               past_flags.reserve_exact (summary->arg_flags.length ());
>               past_flags.splice (summary->arg_flags);
>               past_retslot_flags = summary->retslot_flags;

Aha, that makes sense. Sorry.  It used to be saved for dumping only
while we now use it to merge old summaries with new.  So it is indeed a
(quite stupid) bug.

The patch is OK. Thanks for finding it.
Honza
  
Martin Liška Nov. 18, 2021, 6:31 p.m. UTC | #6
On 11/18/21 19:29, Jan Hubicka wrote:
>> On 11/18/21 19:22, Jan Hubicka wrote:
>>>> Supported LTO compression algorithms: zlib zstd
>>>> gcc version 12.0.0 20211118 (experimental) (GCC)
>>>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: warning: relocation against `lm_read_ctl_dict_size_n_lmclass_used' in read-only section `.text'
>>>> /usr/bin/ld: ./xxx.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
>>>> /usr/bin/ld: final link failed: bad value
>>>> collect2: error: ld returned 1 exit status
>>>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: warning: relocation against `__ckd_calloc___n_elem' in read-only section `.text'
>>>> /usr/bin/ld: ./yyy.ltrans0.ltrans.o: relocation R_X86_64_PC32 against symbol `__ckd_calloc___elem_size' can not be used when making a shared object; recompile with -fPIC
>>>> /usr/bin/ld: final link failed: bad value
>>>> collect2: error: ld returned 1 exit status
>>>> diff: yyy.ltrans0.ltrans*optimized: No such file or directory
>>>> 54,55d53
>>>> < 	movslq	lm_read_ctl_dict_size_n_lmclass_used(%rip), %rax
>>>> < 	movl	$0, 0(%rbp,%rax,4)
>>>>
>>>> I tracked that it differs in tree DSE dump.
>>>>
>>>> May I install the patch?
>>>
>>> No, I think it is bug of symbol_summary that get is causing a
>>> difference.
>>
>> Isn't problem that the following code
>>
>>                past_flags.reserve_exact (summary->arg_flags.length ());
>>                past_flags.splice (summary->arg_flags);
>>                past_retslot_flags = summary->retslot_flags;
> 
> Aha, that makes sense. Sorry.  It used to be saved for dumping only
> while we now use it to merge old summaries with new.  So it is indeed a
> (quite stupid) bug.

Good :) Good. I thought I overlooked something.

> 
> The patch is OK. Thanks for finding it.
> Honza
> 

You're welcome. Thanks for fast review.

Martin
  
Jan Hubicka Nov. 18, 2021, 6:34 p.m. UTC | #7
> > > 
> > > Isn't problem that the following code
> > > 
> > >                past_flags.reserve_exact (summary->arg_flags.length ());
> > >                past_flags.splice (summary->arg_flags);
> > >                past_retslot_flags = summary->retslot_flags;
> > 
> > Aha, that makes sense. Sorry.  It used to be saved for dumping only
> > while we now use it to merge old summaries with new.  So it is indeed a
> > (quite stupid) bug.
> 
> Good :) Good. I thought I overlooked something.
Hehe, I overlooked a hunk while breaking the patches into more
independent changes.

I planed a cleanup of this code after pushing out the new features.
Pehraps a trivial parts of the cleanup can be done even in stage3.
Honza
  

Patch

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index a3c7c6d6a1f..6cacf9c8ab1 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -2868,15 +2868,15 @@  analyze_function (function *f, bool ipa)
  	optimization_summaries = modref_summaries::create_ggc (symtab);
        else /* Remove existing summary if we are re-running the pass.  */
  	{
-	  if (dump_file
-	      && (summary
-		  = optimization_summaries->get (fnode))
-		 != NULL
+	  summary = optimization_summaries->get (fnode);
+	  if (summary != NULL
  	      && summary->loads)
  	    {
-	      fprintf (dump_file, "Past summary:\n");
-	      optimization_summaries->get
-		 (fnode)->dump (dump_file);
+	      if (dump_file)
+		{
+		  fprintf (dump_file, "Past summary:\n");
+		  optimization_summaries->get (fnode)->dump (dump_file);
+		}
  	      past_flags.reserve_exact (summary->arg_flags.length ());
  	      past_flags.splice (summary->arg_flags);
  	      past_retslot_flags = summary->retslot_flags;