From patchwork Wed Mar 18 12:12:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Giuliano Procida
X-Patchwork-Id: 39043
From: gprocida@google.com (Giuliano Procida)
Date: Wed, 18 Mar 2020 12:12:41 +0000
Subject: [PATCH v2] dwarf-reader: Use all bits of Bloom filter words.
In-Reply-To: <20200318113754.GI211970@google.com>
References: <20200318113754.GI211970@google.com>
Message-ID: <20200318121241.146259-1-gprocida@google.com>

Most of the bit values used for GNU hash ELF section Bloom filtering
were being ignored due to integer narrowing, reducing missing symbol
filtering efficiency considerably.

This patch fixes this.

	* src/abg-dwarf-reader.cc (lookup_symbol_from_gnu_hash_tab): Don't
	narrow calculated Bloom word to 8 bits before using it to mask the
	fetched Bloom word.

Note on testing.

The .gnu.hash section seems to be present in all the .so but none of
the .o test files. abisym doesn't appear to find dynamic symbols
(nm -D), only normal ones, so it was a little tricky to test this.

I found a Debian .so (libpthread) with both the .gnu.hash section and
normal symbols. abisym behaves identically with my change, looking up
lots of present and non-present (as far as it's concerned) symbols. I
just extracted a full list with nm/sed and looked up each one.

389 symbols looked up, 241 present, 148 absent
8-bit filter: 336 maybe, 53 no (53/148 filtering efficiency)
64-bit filter: 255 maybe, 134 no (134/148 filtering efficiency)

Signed-off-by: Giuliano Procida
---
 src/abg-dwarf-reader.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/abg-dwarf-reader.cc b/src/abg-dwarf-reader.cc
index 3454fcf5..5556bde5 100644
--- a/src/abg-dwarf-reader.cc
+++ b/src/abg-dwarf-reader.cc
@@ -2025,7 +2025,7 @@ lookup_symbol_from_gnu_hash_tab(const environment* env,
   // filter, in bits.
   int c = get_elf_class_size_in_bytes(elf_handle) * 8;
   int n = (h1 / c) % ht.bf_nwords;
-  unsigned char bitmask = (1ul << (h1 % c)) | (1ul << (h2 % c));
+  GElf_Word bitmask = (1ul << (h1 % c)) | (1ul << (h2 % c));
 
   // Test if the symbol is *NOT* present in this ELF file.
   if ((bloom_word_at(elf_handle, ht.bloom_filter, n) & bitmask) != bitmask)
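
Illustration of the narrowing effect (not part of the patch). This is
a minimal standalone sketch, not libabigail code: make_bloom_mask and
maybe_present are hypothetical helpers, and std::uint64_t stands in
for a mask type assumed to be as wide as the Bloom word. It shows why
a mask stored in 8 bits loses every bit position at or above 8, so
the "definitely absent" test almost never fires.

```cpp
#include <cstdint>

// Hypothetical helper: build the two-bit Bloom mask for hashes h1 and
// h2, where c is the Bloom word size in bits (32 or 64). The result is
// stored in Mask, so a narrow Mask silently drops the high bits.
template <typename Mask>
Mask make_bloom_mask(std::uint32_t h1, std::uint32_t h2, unsigned c)
{
  return static_cast<Mask>((std::uint64_t{1} << (h1 % c))
                           | (std::uint64_t{1} << (h2 % c)));
}

// Hypothetical helper: return true if the symbol may be present, false
// if the Bloom word proves that it is absent.
template <typename Mask>
bool maybe_present(std::uint64_t bloom_word,
                   std::uint32_t h1, std::uint32_t h2, unsigned c)
{
  const std::uint64_t mask = make_bloom_mask<Mask>(h1, h2, c);
  return (bloom_word & mask) == mask;
}
```

With an empty Bloom word (0), c = 64, h1 = 40 and h2 = 9, the 8-bit
mask narrows to 0, so maybe_present<std::uint8_t> answers "maybe" even
though no bit is set. maybe_present<std::uint64_t> keeps both bits and
correctly answers "no".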