Remove some variables in favor of using gdb::optional

Message ID 829ab63c-55b0-07bc-1517-0efe9aeecc95@polymtl.ca
State New, archived
Headers

Commit Message

Simon Marchi Aug. 23, 2019, 7:33 p.m. UTC
  On 2019-08-23 11:35 a.m., Pedro Alves wrote:
>> Boost is very widely available, so I'd prefer if we could depend on the system-installed
>> boost, or have the user provide it, but I would understand if others didn't want to add
>> that burden to those who build GDB.  
> 
> Yeah, boost would be a too-big dependency, IMO.  It's huge.

True, libbost1.67-dev on Debian unpacks to 150 MB of header files.  It would be nice to be able to
select which features one actually need and check-in just the necessary files (a bit like gnulib).
I bet that for tribool, it would be just a few files.

> 
>> So an alternative would be to carry a version of boost
>> (maybe just the files that we need) in our repo.  This would have the advantage that everybody
>> would build with the same version, hence less chances of build failures with particular
>> combinations of compilers and boost versions, or using features from a too recent boost.
>>
>> What do you think?
> 
> I would really not like to carry boost.  It's very big and intertwined.  I think the
> costs outweighs the benefits here, because we'd be pulling in a huge codebase when we
> only need minimal things.  I'd also like to move more in the direction of sharing more
> code across the toolchain (gdb, gcc, gold, etc.) and depending on boost would
> certainly prevent that.  LLVM doesn't depend on boost either, for example.  Given
> that it's a C++ project from the get go, I think that's telling.  Most certainly for
> the same reasons.
> 
> On reusing parts, when I was working on the original gdb::unique_ptr emulation for
> C++98, before we upgraded to C++11, I looked at reusing boost's version.  What I
> found was a huge horrible mess, with lots of incomprehensible #ifdefery to support
> compilers that no one cares about anymore (old Borland C++, etc.), impossible to
> decouple.  Another issue is that a boost API may have been originally designed
> with C++98 in mind, and it'd be done differently with C++11 and later in mind.
> It might have been the case here, for example note how boost's tribool uses
> the C++98 safe bool idiom, instead of C++11 operator bool().

Ok, all good points.

>> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
>> index de9755f6ce30..915e0b25cea1 100644
>> --- a/gdb/dwarf2read.c
>> +++ b/gdb/dwarf2read.c
>> @@ -90,6 +90,7 @@
>>  #include <forward_list>
>>  #include "rust-lang.h"
>>  #include "gdbsupport/pathstuff.h"
>> +#include <boost/logic/tribool.hpp>
>>
>>  /* When == 1, print basic high level tracing messages.
>>     When > 1, be more verbose.
>> @@ -5843,7 +5844,7 @@ dw2_debug_names_iterator::next ()
>>        return NULL;
>>      }
>>    const mapped_debug_names::index_val &indexval = indexval_it->second;
>> -  gdb::optional<bool> is_static;
>> +  boost::tribool is_static(boost::indeterminate);
> 
> Here I imagine we'd more naturally want to treat a tribool like
> a built-in, similar to an enum class that accepts {true, false, unknown},
> so we'd write:
> 
>   tribool is_static = true;
>   tribool is_static = false;
>   tribool is_static = tribool::unknown;

Yes that sounds good.

> 
> ("indeterminate" is really a mouthful, hence "unknown".)
I also wasn't sure about this, what the third state should be, more on that below.

> with tribool::unknown being e.g., a constexpr global.
>>    dwarf2_per_cu_data *per_cu = NULL;
>>    for (const mapped_debug_names::index_val::attr &attr : indexval.attr_vec)
>>      {
>> @@ -5910,10 +5911,10 @@ dw2_debug_names_iterator::next ()
>>      goto again;
>>
>>    /* Check static vs global.  */
>> -  if (is_static.has_value () && m_block_index.has_value ())
>> +  if (!boost::indeterminate(is_static) && m_block_index.has_value ())
> 
> and here:
> 
> +  if (is_static != tribool::unknown && m_block_index.has_value ())
> 
> Here's a quick-prototype starter implementation.  It's missing sprinkling
> with constexpr, inline, noexcept, relational operators (op|| and op&& ?),
> unit tests, etc., but should show the idea.
> 
> #include <stdio.h>
> 
> class tribool 
> {
> private:
>   struct unknown_t {};
> public:
>   /* Keep it trivial.  */
>   tribool () = default;
> 
>   tribool (bool val)
>     : m_value (val)
>   {}
> 
>   operator bool () const
>   {
>      return m_value == 1;
>   }
> 
>   bool operator== (const tribool &other)
>   {
>     return m_value == other.m_value;
>   }
> 
>   tribool operator! ()
>   {
>     if (m_value == 0)
>       return true;
>     else if (m_value == 1)
>       return false;
>     else
>       return unknown;
>   }
> 
>   static tribool unknown;
> 
> private:
>   tribool (unknown_t dummy)
>     : m_value (-1)
>   {}
> 
>   int m_value; // -1 for unknown.
> };
> 
> tribool tribool::unknown (unknown_t{});
> 
> int
> main ()
> {
>   tribool b1 = true;
>   tribool b2 = false;
>   tribool b3 = tribool::unknown;
>   tribool b = b3;
> 
>   if (b)
>     printf ("then\n");
>   else if (!b)
>     printf ("else\n");
>   else 
>     printf ("unknown\n");
>   
>   return b1;
> }
> 
> Would you like to run with this?

So I wasn't sure about what the third state should be.  I think it depends on
the particular context.  In different contexts, it could mean "unknown", "unspecified",
"auto", "don't care", etc.  There's no one-size word that fits all case, so I don't really
like the idea of having just one word and have it represent poorly what we actually mean.

That lead me to think, if we want to represent three states and if the states are
specific to each use case, why not just define an enum and be explicit about it?

A bit like why I prefer defining an explicit type with two fields rather than using
std::pair: the "first" and "second" members are not very descriptive.

Here's a patch that does that.  What do you think?


From ba180d647f6ceb69b9a01c46807c102439b8cae1 Mon Sep 17 00:00:00 2001
From: Simon Marchi <simon.marchi@efficios.com>
Date: Fri, 23 Aug 2019 14:53:59 -0400
Subject: [PATCH] dwarf2read: replace gdb::optional<bool> with enum

gdb::optional<bool> is dangerous, because it's easy to do:

  if (opt_bool)

when you actually meant

  if (*opt_bool)

or vice-versa.  The first checks if the optional is set, the second
checks if the wrapped bool is true.

Replace it with an enum that explicitly defines the three possible
states.

gdb/ChangeLog:

	* dwarf2read.c (dw2_debug_names_iterator::next): Use enum to
	represent whether the symbol is static, dynamic, or we don't
	know.
---
 gdb/dwarf2read.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
  

Comments

Pedro Alves Aug. 23, 2019, 7:47 p.m. UTC | #1
On 8/23/19 8:33 PM, Simon Marchi wrote:
> On 2019-08-23 11:35 a.m., Pedro Alves wrote:

>> Would you like to run with this?
> 
> So I wasn't sure about what the third state should be.  I think it depends on
> the particular context.  In different contexts, it could mean "unknown", "unspecified",
> "auto", "don't care", etc.  There's no one-size word that fits all case, so I don't really
> like the idea of having just one word and have it represent poorly what we actually mean.
> 
> That lead me to think, if we want to represent three states and if the states are
> specific to each use case, why not just define an enum and be explicit about it?

That's a very good point actually.  I agree and I'm convinced.

Let's shelve the tribool idea until/if we find a better use for it.

> 
> A bit like why I prefer defining an explicit type with two fields rather than using
> std::pair: the "first" and "second" members are not very descriptive.

Right, agreed, the fact that std::map/std::unordered_map searching returns pairs
is one of those things I hate the most about C++.

> Here's a patch that does that.  What do you think?

I think I like it!

Thanks,
Pedro Alves
  
Simon Marchi Aug. 23, 2019, 7:52 p.m. UTC | #2
On 2019-08-23 3:47 p.m., Pedro Alves wrote:
> On 8/23/19 8:33 PM, Simon Marchi wrote:
>> On 2019-08-23 11:35 a.m., Pedro Alves wrote:
> 
>>> Would you like to run with this?
>>
>> So I wasn't sure about what the third state should be.  I think it depends on
>> the particular context.  In different contexts, it could mean "unknown", "unspecified",
>> "auto", "don't care", etc.  There's no one-size word that fits all case, so I don't really
>> like the idea of having just one word and have it represent poorly what we actually mean.
>>
>> That lead me to think, if we want to represent three states and if the states are
>> specific to each use case, why not just define an enum and be explicit about it?
> 
> That's a very good point actually.  I agree and I'm convinced.
> 
> Let's shelve the tribool idea until/if we find a better use for it.
> 
>>
>> A bit like why I prefer defining an explicit type with two fields rather than using
>> std::pair: the "first" and "second" members are not very descriptive.
> 
> Right, agreed, the fact that std::map/std::unordered_map searching returns pairs
> is one of those things I hate the most about C++.
> 
>> Here's a patch that does that.  What do you think?
> 
> I think I like it!

Yay, less work for me!  I'll run it through the buildbot and push it if it's all good.

Thanks,

Simon
  
Pedro Alves Aug. 23, 2019, 8:23 p.m. UTC | #3
On 8/23/19 8:52 PM, Simon Marchi wrote:
> On 2019-08-23 3:47 p.m., Pedro Alves wrote:
>> On 8/23/19 8:33 PM, Simon Marchi wrote:
>>> On 2019-08-23 11:35 a.m., Pedro Alves wrote:
>>
>>>> Would you like to run with this?
>>>
>>> So I wasn't sure about what the third state should be.  I think it depends on
>>> the particular context.  In different contexts, it could mean "unknown", "unspecified",
>>> "auto", "don't care", etc.  There's no one-size word that fits all case, so I don't really
>>> like the idea of having just one word and have it represent poorly what we actually mean.
>>>
>>> That lead me to think, if we want to represent three states and if the states are
>>> specific to each use case, why not just define an enum and be explicit about it?
>>
>> That's a very good point actually.  I agree and I'm convinced.
>>
>> Let's shelve the tribool idea until/if we find a better use for it.
>>
>>>
>>> A bit like why I prefer defining an explicit type with two fields rather than using
>>> std::pair: the "first" and "second" members are not very descriptive.
>>
>> Right, agreed, the fact that std::map/std::unordered_map searching returns pairs
>> is one of those things I hate the most about C++.
>>
>>> Here's a patch that does that.  What do you think?
>>
>> I think I like it!
> 
> Yay, less work for me!  I'll run it through the buildbot and push it if it's all good.

Sorry for the delayed nit, but the first time I didn't really pay attention
the the enumerator names you were using.  Reading back, I notice that you
used "dynamic".  I don't know what that means here?  Don't you mean
"external", as opposed to static?  Like, wouldn't this be more to the point?

  enum {
    symbol_linkage_unknown,
    symbol_linkage_static,
    symbol_linkage_extern,
  } symbol_linkage = symbol_linkage_unknown;

Thanks,
Pedro Alves
  
Simon Marchi Aug. 23, 2019, 10:23 p.m. UTC | #4
On 2019-08-23 4:23 p.m., Pedro Alves wrote:
> Sorry for the delayed nit, but the first time I didn't really pay attention
> the the enumerator names you were using.  Reading back, I notice that you
> used "dynamic".  I don't know what that means here?  Don't you mean
> "external", as opposed to static?  Like, wouldn't this be more to the point?
> 
>   enum {
>     symbol_linkage_unknown,
>     symbol_linkage_static,
>     symbol_linkage_extern,
>   } symbol_linkage = symbol_linkage_unknown;

Oh yes of course, I was confused static/extern with static/dynamic.  I'll use that, thanks.

Simon
  
Simon Marchi Aug. 23, 2019, 10:24 p.m. UTC | #5
On 2019-08-23 6:23 p.m., Simon Marchi wrote:
> Oh yes of course, I was confused static/extern with static/dynamic.  I'll use that, thanks.

Arrrg, "I have confused".
  
Ruslan Kabatsayev Aug. 24, 2019, 11:22 a.m. UTC | #6
Hi,

On Fri, 23 Aug 2019 at 22:33, Simon Marchi <simon.marchi@polymtl.ca> wrote:
>
> On 2019-08-23 11:35 a.m., Pedro Alves wrote:
> >> Boost is very widely available, so I'd prefer if we could depend on the system-installed
> >> boost, or have the user provide it, but I would understand if others didn't want to add
> >> that burden to those who build GDB.
> >
> > Yeah, boost would be a too-big dependency, IMO.  It's huge.
>
> True, libbost1.67-dev on Debian unpacks to 150 MB of header files.  It would be nice to be able to
> select which features one actually need and check-in just the necessary files (a bit like gnulib).
> I bet that for tribool, it would be just a few files.
>
> >
> >> So an alternative would be to carry a version of boost
> >> (maybe just the files that we need) in our repo.  This would have the advantage that everybody
> >> would build with the same version, hence less chances of build failures with particular
> >> combinations of compilers and boost versions, or using features from a too recent boost.
> >>
> >> What do you think?
> >
> > I would really not like to carry boost.  It's very big and intertwined.  I think the
> > costs outweighs the benefits here, because we'd be pulling in a huge codebase when we
> > only need minimal things.  I'd also like to move more in the direction of sharing more
> > code across the toolchain (gdb, gcc, gold, etc.) and depending on boost would
> > certainly prevent that.  LLVM doesn't depend on boost either, for example.  Given
> > that it's a C++ project from the get go, I think that's telling.  Most certainly for
> > the same reasons.
> >
> > On reusing parts, when I was working on the original gdb::unique_ptr emulation for
> > C++98, before we upgraded to C++11, I looked at reusing boost's version.  What I
> > found was a huge horrible mess, with lots of incomprehensible #ifdefery to support
> > compilers that no one cares about anymore (old Borland C++, etc.), impossible to
> > decouple.  Another issue is that a boost API may have been originally designed
> > with C++98 in mind, and it'd be done differently with C++11 and later in mind.
> > It might have been the case here, for example note how boost's tribool uses
> > the C++98 safe bool idiom, instead of C++11 operator bool().
>
> Ok, all good points.
>
> >> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> >> index de9755f6ce30..915e0b25cea1 100644
> >> --- a/gdb/dwarf2read.c
> >> +++ b/gdb/dwarf2read.c
> >> @@ -90,6 +90,7 @@
> >>  #include <forward_list>
> >>  #include "rust-lang.h"
> >>  #include "gdbsupport/pathstuff.h"
> >> +#include <boost/logic/tribool.hpp>
> >>
> >>  /* When == 1, print basic high level tracing messages.
> >>     When > 1, be more verbose.
> >> @@ -5843,7 +5844,7 @@ dw2_debug_names_iterator::next ()
> >>        return NULL;
> >>      }
> >>    const mapped_debug_names::index_val &indexval = indexval_it->second;
> >> -  gdb::optional<bool> is_static;
> >> +  boost::tribool is_static(boost::indeterminate);
> >
> > Here I imagine we'd more naturally want to treat a tribool like
> > a built-in, similar to an enum class that accepts {true, false, unknown},
> > so we'd write:
> >
> >   tribool is_static = true;
> >   tribool is_static = false;
> >   tribool is_static = tribool::unknown;
>
> Yes that sounds good.
>
> >
> > ("indeterminate" is really a mouthful, hence "unknown".)
> I also wasn't sure about this, what the third state should be, more on that below.
>
> > with tribool::unknown being e.g., a constexpr global.
> >>    dwarf2_per_cu_data *per_cu = NULL;
> >>    for (const mapped_debug_names::index_val::attr &attr : indexval.attr_vec)
> >>      {
> >> @@ -5910,10 +5911,10 @@ dw2_debug_names_iterator::next ()
> >>      goto again;
> >>
> >>    /* Check static vs global.  */
> >> -  if (is_static.has_value () && m_block_index.has_value ())
> >> +  if (!boost::indeterminate(is_static) && m_block_index.has_value ())
> >
> > and here:
> >
> > +  if (is_static != tribool::unknown && m_block_index.has_value ())
> >
> > Here's a quick-prototype starter implementation.  It's missing sprinkling
> > with constexpr, inline, noexcept, relational operators (op|| and op&& ?),
> > unit tests, etc., but should show the idea.
> >
> > #include <stdio.h>
> >
> > class tribool
> > {
> > private:
> >   struct unknown_t {};
> > public:
> >   /* Keep it trivial.  */
> >   tribool () = default;
> >
> >   tribool (bool val)
> >     : m_value (val)
> >   {}
> >
> >   operator bool () const
> >   {
> >      return m_value == 1;
> >   }
> >
> >   bool operator== (const tribool &other)
> >   {
> >     return m_value == other.m_value;
> >   }
> >
> >   tribool operator! ()
> >   {
> >     if (m_value == 0)
> >       return true;
> >     else if (m_value == 1)
> >       return false;
> >     else
> >       return unknown;
> >   }
> >
> >   static tribool unknown;
> >
> > private:
> >   tribool (unknown_t dummy)
> >     : m_value (-1)
> >   {}
> >
> >   int m_value; // -1 for unknown.
> > };
> >
> > tribool tribool::unknown (unknown_t{});
> >
> > int
> > main ()
> > {
> >   tribool b1 = true;
> >   tribool b2 = false;
> >   tribool b3 = tribool::unknown;
> >   tribool b = b3;
> >
> >   if (b)
> >     printf ("then\n");
> >   else if (!b)
> >     printf ("else\n");
> >   else
> >     printf ("unknown\n");
> >
> >   return b1;
> > }
> >
> > Would you like to run with this?
>
> So I wasn't sure about what the third state should be.  I think it depends on
> the particular context.  In different contexts, it could mean "unknown", "unspecified",
> "auto", "don't care", etc.  There's no one-size word that fits all case, so I don't really
> like the idea of having just one word and have it represent poorly what we actually mean.
>
> That lead me to think, if we want to represent three states and if the states are
> specific to each use case, why not just define an enum and be explicit about it?
>
> A bit like why I prefer defining an explicit type with two fields rather than using
> std::pair: the "first" and "second" members are not very descriptive.
>
> Here's a patch that does that.  What do you think?
>
>
> From ba180d647f6ceb69b9a01c46807c102439b8cae1 Mon Sep 17 00:00:00 2001
> From: Simon Marchi <simon.marchi@efficios.com>
> Date: Fri, 23 Aug 2019 14:53:59 -0400
> Subject: [PATCH] dwarf2read: replace gdb::optional<bool> with enum
>
> gdb::optional<bool> is dangerous, because it's easy to do:
>
>   if (opt_bool)
>
> when you actually meant
>
>   if (*opt_bool)
>
> or vice-versa.  The first checks if the optional is set, the second
> checks if the wrapped bool is true.
>
> Replace it with an enum that explicitly defines the three possible
> states.
>
> gdb/ChangeLog:
>
>         * dwarf2read.c (dw2_debug_names_iterator::next): Use enum to
>         represent whether the symbol is static, dynamic, or we don't
>         know.
> ---
>  gdb/dwarf2read.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
> index de9755f6ce30..3dc34ab5a339 100644
> --- a/gdb/dwarf2read.c
> +++ b/gdb/dwarf2read.c
> @@ -5843,7 +5843,11 @@ dw2_debug_names_iterator::next ()
>        return NULL;
>      }
>    const mapped_debug_names::index_val &indexval = indexval_it->second;
> -  gdb::optional<bool> is_static;
> +  enum {
> +    symbol_nature_unknown,
> +    symbol_nature_static,
> +    symbol_nature_dynamic,
> +  } symbol_nature = symbol_nature_unknown;

Since GDB uses C++11, and we don't really rely on conversion to int
here, why not use enum class? This would protect from unwanted
conversions. Additionally, we'll be forced to use a "scoped" name
instead of "prefixed" one, like symbol_nature::unknown vs
symbol_nature_unknown, which seems a bit more explicit (will need to
do something with "static" though, since it's a keyword).

>    dwarf2_per_cu_data *per_cu = NULL;
>    for (const mapped_debug_names::index_val::attr &attr : indexval.attr_vec)
>      {
> @@ -5895,12 +5899,12 @@ dw2_debug_names_iterator::next ()
>         case DW_IDX_GNU_internal:
>           if (!m_map.augmentation_is_gdb)
>             break;
> -         is_static = true;
> +         symbol_nature = symbol_nature_static;
>           break;
>         case DW_IDX_GNU_external:
>           if (!m_map.augmentation_is_gdb)
>             break;
> -         is_static = false;
> +         symbol_nature = symbol_nature_dynamic;
>           break;
>         }
>      }
> @@ -5910,10 +5914,11 @@ dw2_debug_names_iterator::next ()
>      goto again;
>
>    /* Check static vs global.  */
> -  if (is_static.has_value () && m_block_index.has_value ())
> +  if (symbol_nature != symbol_nature_unknown && m_block_index.has_value ())
>      {
>         const bool want_static = *m_block_index == STATIC_BLOCK;
> -       if (want_static != *is_static)
> +       const bool symbol_is_static = symbol_nature == symbol_nature_static;
> +       if (want_static != symbol_is_static)
>           goto again;
>      }
>
> --
> 2.23.0
>

Regards,
Ruslan
  
Simon Marchi Aug. 24, 2019, 11:55 p.m. UTC | #7
On 2019-08-24 07:22, Ruslan Kabatsayev wrote:
> Since GDB uses C++11, and we don't really rely on conversion to int
> here, why not use enum class? This would protect from unwanted
> conversions. Additionally, we'll be forced to use a "scoped" name
> instead of "prefixed" one, like symbol_nature::unknown vs
> symbol_nature_unknown, which seems a bit more explicit (will need to
> do something with "static" though, since it's a keyword).

Good point, I'll try this for the new version.

Simon
  

Patch

diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index de9755f6ce30..3dc34ab5a339 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -5843,7 +5843,11 @@  dw2_debug_names_iterator::next ()
       return NULL;
     }
   const mapped_debug_names::index_val &indexval = indexval_it->second;
-  gdb::optional<bool> is_static;
+  enum {
+    symbol_nature_unknown,
+    symbol_nature_static,
+    symbol_nature_dynamic,
+  } symbol_nature = symbol_nature_unknown;
   dwarf2_per_cu_data *per_cu = NULL;
   for (const mapped_debug_names::index_val::attr &attr : indexval.attr_vec)
     {
@@ -5895,12 +5899,12 @@  dw2_debug_names_iterator::next ()
 	case DW_IDX_GNU_internal:
 	  if (!m_map.augmentation_is_gdb)
 	    break;
-	  is_static = true;
+	  symbol_nature = symbol_nature_static;
 	  break;
 	case DW_IDX_GNU_external:
 	  if (!m_map.augmentation_is_gdb)
 	    break;
-	  is_static = false;
+	  symbol_nature = symbol_nature_dynamic;
 	  break;
 	}
     }
@@ -5910,10 +5914,11 @@  dw2_debug_names_iterator::next ()
     goto again;

   /* Check static vs global.  */
-  if (is_static.has_value () && m_block_index.has_value ())
+  if (symbol_nature != symbol_nature_unknown && m_block_index.has_value ())
     {
 	const bool want_static = *m_block_index == STATIC_BLOCK;
-	if (want_static != *is_static)
+	const bool symbol_is_static = symbol_nature == symbol_nature_static;
+	if (want_static != symbol_is_static)
 	  goto again;
     }