[2/9] Introduce and use language_set
Commit Message
I noticed that objfile_per_bfd_storage::demangled_hash_languages is a
std::vector, which seemed quite large for something that,
fundamentally, can be represented as a bit-set. This patch
reimplements it as a bit-set.
gdb/ChangeLog
2019-03-07 Tom Tromey <tom@tromey.com>
* objfiles.h (struct objfile_per_bfd_storage)
<demangled_hash_languages>: Now a language_set.
* minsyms.c (add_minsym_to_demangled_hash_table): Update.
* language.h (class language_set): New.
---
gdb/ChangeLog | 7 +++++
gdb/language.h | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++
gdb/minsyms.c | 7 ++---
gdb/objfiles.h | 6 ++--
4 files changed, 95 insertions(+), 9 deletions(-)
Comments
On 3/7/19 12:57 PM, Tom Tromey wrote:
> I noticed that objfile_per_bfd_storage::demangled_hash_languages is a
> std::vector, which seemed quite large for something that,
> fundamentally, can be represented as a bit-set. This patch
> reimplements it as a bit-set.
Hmm, did you consider using std::bitset<nr_languages> for langauge_set? You'd
still have to write your own iterator class, so it wouldn't save much in terms
of lines of code. It just seems a bit odd to write a bitset class from scratch?
>>>>> "John" == John Baldwin <jhb@FreeBSD.org> writes:
John> On 3/7/19 12:57 PM, Tom Tromey wrote:
>> I noticed that objfile_per_bfd_storage::demangled_hash_languages is a
>> std::vector, which seemed quite large for something that,
>> fundamentally, can be represented as a bit-set. This patch
>> reimplements it as a bit-set.
John> Hmm, did you consider using std::bitset<nr_languages> for langauge_set? You'd
John> still have to write your own iterator class, so it wouldn't save much in terms
John> of lines of code. It just seems a bit odd to write a bitset class from scratch?
I didn't really consider it.
I looked now and on my machine, std::bitset<nr_languages> is 8 bytes,
whereas langauge_set is just 4. So, with the custom implementation, I
could pack it into a hole in objfile_per_bfd_storage.
That said, due to the minsym arrays, objfile_per_bfd_storage is
enormous, so saving 4 bytes probably isn't important. So, I'll redo
this.
Tom
@@ -747,4 +747,88 @@ private:
enum language m_lang;
};
+/* A set object specialized for keeping track of languages. */
+
+class language_set
+{
+public:
+
+ language_set ()
+ {
+ }
+
+ /* Insert LANG into this set. */
+ void insert (enum language lang)
+ {
+ m_mask |= 1 << lang;
+ }
+
+ /* The underlying type used to represent the languages. */
+ typedef unsigned mask_type;
+
+ /* An iterator that iterates over languages contained in a language
+ set. */
+ class iterator
+ {
+ public:
+
+ iterator (unsigned int n, mask_type mask)
+ : m_index (n),
+ m_mask (mask)
+ {
+ gdb_assert (m_index <= nr_languages);
+ update ();
+ }
+
+ enum language operator* () const
+ {
+ return (enum language) m_index;
+ }
+
+ iterator &operator++ ()
+ {
+ ++m_index;
+ update ();
+ return *this;
+ }
+
+ bool operator!= (const iterator &other) const
+ {
+ return m_index != other.m_index;
+ }
+
+ private:
+
+ /* A helper function that will advance M_INDEX to the next
+ language in the set. */
+ void update ()
+ {
+ while (m_index < nr_languages && (m_mask & (1 << m_index)) == 0)
+ ++m_index;
+ }
+
+ /* The current index of the iterator. */
+ unsigned int m_index;
+ /* The mask from the underlying set. */
+ mask_type m_mask;
+ };
+
+ iterator begin () const
+ {
+ return iterator (0, m_mask);
+ }
+
+ iterator end () const
+ {
+ return iterator (nr_languages, m_mask);
+ }
+
+private:
+
+ /* A bit mask of languages which are included in this set. */
+ mask_type m_mask = 0;
+
+ gdb_static_assert (nr_languages <= (sizeof (mask_type) * CHAR_BIT));
+};
+
#endif /* defined (LANGUAGE_H) */
@@ -160,11 +160,8 @@ add_minsym_to_demangled_hash_table (struct minimal_symbol *sym,
unsigned int hash = search_name_hash (MSYMBOL_LANGUAGE (sym),
MSYMBOL_SEARCH_NAME (sym));
- auto &vec = objfile->per_bfd->demangled_hash_languages;
- auto it = std::lower_bound (vec.begin (), vec.end (),
- MSYMBOL_LANGUAGE (sym));
- if (it == vec.end () || *it != MSYMBOL_LANGUAGE (sym))
- vec.insert (it, MSYMBOL_LANGUAGE (sym));
+ objfile->per_bfd->demangled_hash_languages.insert
+ (MSYMBOL_LANGUAGE (sym));
struct minimal_symbol **table
= objfile->per_bfd->msymbol_demangled_hash;
@@ -313,10 +313,8 @@ struct objfile_per_bfd_storage
minimal_symbol *msymbol_demangled_hash[MINIMAL_SYMBOL_HASH_SIZE] {};
/* All the different languages of symbols found in the demangled
- hash table. A flat/vector-based map is more efficient than a map
- or hash table here, since this will only usually contain zero or
- one entries. */
- std::vector<enum language> demangled_hash_languages;
+ hash table. */
+ language_set demangled_hash_languages;
};
/* Master structure for keeping track of each file from which