[RFA] diagnostic: avoid repeating include path

Message ID 20220113220836.3781146-1-jason@redhat.com
State New
Headers
Series [RFA] diagnostic: avoid repeating include path |

Commit Message

Jason Merrill Jan. 13, 2022, 10:08 p.m. UTC
  When a sequence of diagnostic messages bounces back and forth repeatedly
between two includes, as with

 #include <map>
 std::map<const char*, const char*> m ("123", "456");

The output is quite a bit longer than necessary because we dump the include
path each time it changes.  I'd think we could print the include path once
for each header file, and then expect that the user can look earlier in the
output if they're wondering.

Tested x86_64-pc-linux-gnu, OK for trunk?

gcc/ChangeLog:

	* diagnostic.c (includes_seen): New.
	(diagnostic_report_current_module): Use it.
---
 gcc/diagnostic.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)


base-commit: b8ffa71e4271ae562c2d315b9b24c4979bbf8227
prerequisite-patch-id: e45065ef320968d982923dd44da7bed07e3326ef
  

Comments

David Malcolm Jan. 13, 2022, 10:30 p.m. UTC | #1
On Thu, 2022-01-13 at 17:08 -0500, Jason Merrill wrote:
> When a sequence of diagnostic messages bounces back and forth
> repeatedly
> between two includes, as with
> 
>  #include <map>
>  std::map<const char*, const char*> m ("123", "456");
> 
> The output is quite a bit longer than necessary because we dump the
> include
> path each time it changes.  I'd think we could print the include path
> once
> for each header file, and then expect that the user can look earlier
> in the
> output if they're wondering.
> 
> Tested x86_64-pc-linux-gnu, OK for trunk?
> 
> gcc/ChangeLog:
> 
>         * diagnostic.c (includes_seen): New.
>         (diagnostic_report_current_module): Use it.
> ---
>  gcc/diagnostic.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index 58139427d01..e56441a2dbf 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -700,6 +700,16 @@ set_last_module (diagnostic_context *context,
> const line_map_ordinary *map)
>    context->last_module = map;
>  }
>  
> +/* Only dump the "In file included from..." stack once for each
> file.  */
> +
> +static bool
> +includes_seen (const line_map_ordinary *map)
> +{
> +  using hset = hash_set<const line_map_ordinary *>;
> +  static hset *set = new hset;
> +  return set->add (map);
> +}

Overall, I like the idea, but...

- the patch works at the level of line_map_ordinary instances, rather
than header files.  There are various ways in which a single header
file can have multiple line maps e.g. due to very long lines, or
including another file, etc.  I think it makes sense to do it at the
per-file level, assuming we aren't in a horrible situation where a
header is being included repeatedly, with different effects.  So maybe
this ought to look at what include directive led to this map, i.e.
looking at the ord_map->included_from field, and having a
hash_set<location_t> ?

- there's no test coverage, but it's probably not feasible to write
DejaGnu tests for this, given the way prune.exp's prune_gcc_output
strips these strings.  Maybe a dg directive to selectively disable the
pertinent pruning operations in prune_gcc_output???  Gah...

- global state is a pet peeve of mine; can the above state be put
inside the diagnostic_context instead?   (perhaps via a pointer to a
wrapper class to avoid requiring all users of diagnostic.h to include
hash-set.h?).

Hope this is constructive
Dave

> +
>  void
>  diagnostic_report_current_module (diagnostic_context *context,
> location_t where)
>  {
> @@ -721,7 +731,7 @@ diagnostic_report_current_module
> (diagnostic_context *context, location_t where)
>    if (map && last_module_changed_p (context, map))
>      {
>        set_last_module (context, map);
> -      if (! MAIN_FILE_P (map))
> +      if (! MAIN_FILE_P (map) && !includes_seen (map))
>         {
>           bool first = true, need_inc = true, was_module =
> MAP_MODULE_P (map);
>           expanded_location s = {};
> 
> base-commit: b8ffa71e4271ae562c2d315b9b24c4979bbf8227
> prerequisite-patch-id: e45065ef320968d982923dd44da7bed07e3326ef
  
Jason Merrill Jan. 15, 2022, 4:01 a.m. UTC | #2
On 1/13/22 17:30, David Malcolm wrote:
> On Thu, 2022-01-13 at 17:08 -0500, Jason Merrill wrote:
>> When a sequence of diagnostic messages bounces back and forth
>> repeatedly
>> between two includes, as with
>>
>>   #include <map>
>>   std::map<const char*, const char*> m ("123", "456");
>>
>> The output is quite a bit longer than necessary because we dump the
>> include
>> path each time it changes.  I'd think we could print the include path
>> once
>> for each header file, and then expect that the user can look earlier
>> in the
>> output if they're wondering.
>>
>> Tested x86_64-pc-linux-gnu, OK for trunk?
>>
>> gcc/ChangeLog:
>>
>>          * diagnostic.c (includes_seen): New.
>>          (diagnostic_report_current_module): Use it.
>> ---
>>   gcc/diagnostic.c | 12 +++++++++++-
>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
>> index 58139427d01..e56441a2dbf 100644
>> --- a/gcc/diagnostic.c
>> +++ b/gcc/diagnostic.c
>> @@ -700,6 +700,16 @@ set_last_module (diagnostic_context *context,
>> const line_map_ordinary *map)
>>     context->last_module = map;
>>   }
>>   
>> +/* Only dump the "In file included from..." stack once for each
>> file.  */
>> +
>> +static bool
>> +includes_seen (const line_map_ordinary *map)
>> +{
>> +  using hset = hash_set<const line_map_ordinary *>;
>> +  static hset *set = new hset;
>> +  return set->add (map);
>> +}
> 
> Overall, I like the idea, but...
> 
> - the patch works at the level of line_map_ordinary instances, rather
> than header files.  There are various ways in which a single header
> file can have multiple line maps e.g. due to very long lines, or
> including another file, etc.  I think it makes sense to do it at the
> per-file level, assuming we aren't in a horrible situation where a
> header is being included repeatedly, with different effects.  So maybe
> this ought to look at what include directive led to this map, i.e.
> looking at the ord_map->included_from field, and having a
> hash_set<location_t> ?

Done.  This version doesn't affect printing of the module import path 
yet, pending more consideration of whether we always want to identify 
the module it comes from or just the file/line is enough.

> - there's no test coverage, but it's probably not feasible to write
> DejaGnu tests for this, given the way prune.exp's prune_gcc_output
> strips these strings.  Maybe a dg directive to selectively disable the
> pertinent pruning operations in prune_gcc_output???  Gah...
> 
> - global state is a pet peeve of mine; can the above state be put
> inside the diagnostic_context instead?   (perhaps via a pointer to a
> wrapper class to avoid requiring all users of diagnostic.h to include
> hash-set.h?).

It seems that using hash_set directly doesn't break any users.
  
David Malcolm Jan. 17, 2022, 1:27 a.m. UTC | #3
On Fri, 2022-01-14 at 23:01 -0500, Jason Merrill wrote:
> On 1/13/22 17:30, David Malcolm wrote:
> > On Thu, 2022-01-13 at 17:08 -0500, Jason Merrill wrote:
> > > When a sequence of diagnostic messages bounces back and forth
> > > repeatedly
> > > between two includes, as with
> > > 
> > >   #include <map>
> > >   std::map<const char*, const char*> m ("123", "456");
> > > 
> > > The output is quite a bit longer than necessary because we dump
> > > the
> > > include
> > > path each time it changes.  I'd think we could print the include
> > > path
> > > once
> > > for each header file, and then expect that the user can look
> > > earlier
> > > in the
> > > output if they're wondering.
> > > 
> > > Tested x86_64-pc-linux-gnu, OK for trunk?
> > > 
> > > gcc/ChangeLog:
> > > 
> > >          * diagnostic.c (includes_seen): New.
> > >          (diagnostic_report_current_module): Use it.
> > > ---
> > >   gcc/diagnostic.c | 12 +++++++++++-
> > >   1 file changed, 11 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> > > index 58139427d01..e56441a2dbf 100644
> > > --- a/gcc/diagnostic.c
> > > +++ b/gcc/diagnostic.c
> > > @@ -700,6 +700,16 @@ set_last_module (diagnostic_context
> > > *context,
> > > const line_map_ordinary *map)
> > >     context->last_module = map;
> > >   }
> > >   
> > > +/* Only dump the "In file included from..." stack once for each
> > > file.  */
> > > +
> > > +static bool
> > > +includes_seen (const line_map_ordinary *map)
> > > +{
> > > +  using hset = hash_set<const line_map_ordinary *>;
> > > +  static hset *set = new hset;
> > > +  return set->add (map);
> > > +}
> > 
> > Overall, I like the idea, but...
> > 
> > - the patch works at the level of line_map_ordinary instances,
> > rather
> > than header files.  There are various ways in which a single header
> > file can have multiple line maps e.g. due to very long lines, or
> > including another file, etc.  I think it makes sense to do it at
> > the
> > per-file level, assuming we aren't in a horrible situation where a
> > header is being included repeatedly, with different effects.  So
> > maybe
> > this ought to look at what include directive led to this map, i.e.
> > looking at the ord_map->included_from field, and having a
> > hash_set<location_t> ?
> 
> Done.  This version doesn't affect printing of the module import path
> yet, pending more consideration of whether we always want to identify
> the module it comes from or just the file/line is enough.

Seems like a good choice given that everyone's going to be much less
familiar with modules than with include files, for some time.

> 
> > - there's no test coverage, but it's probably not feasible to write
> > DejaGnu tests for this, given the way prune.exp's prune_gcc_output
> > strips these strings.  Maybe a dg directive to selectively disable
> > the
> > pertinent pruning operations in prune_gcc_output???  Gah...
> > 
> > - global state is a pet peeve of mine; can the above state be put
> > inside the diagnostic_context instead?   (perhaps via a pointer to
> > a
> > wrapper class to avoid requiring all users of diagnostic.h to
> > include
> > hash-set.h?).
> 
> It seems that using hash_set directly doesn't break any users.

Thanks.  The updated patch looks good to me.

Dave
  

Patch

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 58139427d01..e56441a2dbf 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -700,6 +700,16 @@  set_last_module (diagnostic_context *context, const line_map_ordinary *map)
   context->last_module = map;
 }
 
+/* Only dump the "In file included from..." stack once for each file.  */
+
+static bool
+includes_seen (const line_map_ordinary *map)
+{
+  using hset = hash_set<const line_map_ordinary *>;
+  static hset *set = new hset;
+  return set->add (map);
+}
+
 void
 diagnostic_report_current_module (diagnostic_context *context, location_t where)
 {
@@ -721,7 +731,7 @@  diagnostic_report_current_module (diagnostic_context *context, location_t where)
   if (map && last_module_changed_p (context, map))
     {
       set_last_module (context, map);
-      if (! MAIN_FILE_P (map))
+      if (! MAIN_FILE_P (map) && !includes_seen (map))
 	{
 	  bool first = true, need_inc = true, was_module = MAP_MODULE_P (map);
 	  expanded_location s = {};