Message ID | 20180719115216.6a08ad9a@pinnacle.lan |
---|---|
State | New |
Headers | show |
On 2018-07-19 02:52 PM, Kevin Buettner wrote: > The description and patch below are intended as a replacement for my > original patch. It uses the approach outlined by Simon Marchi for > checking the find_pc_partial_function cache. > > - - - - > > This change adds an optional output parameter BLOCK to > find_pc_partial_function. If BLOCK is non-null, then *BLOCK will be > set to the address of the block corresponding to the function symbol > if such a symbol was found during lookup. Otherwise it's set to a > NULL value. Callers may wish to use the block information to > determine whether the block contains any non-contiguous ranges. The > caller may also iterate over or examine those ranges. > > When I first started looking at the broken stepping behavior associated > with functions w/ non-contiguous ranges, I found that I could "fix" > the problem by disabling the find_pc_partial_function cache. It would > sometimes happen that the PC passed in would be between the low and > high cache values, but would be in some other function that happens to > be placed in between the ranges for the cached function. This caused > incorrect values to be returned. > > So dealing with this cache turns out to be very important for fixing > this problem. I explored three different ways of dealing with the cache. > > My first approach was to clear the cache when a block was encountered > with more than one range. This would cause the non-cache pathway to > be executed on the next call to find_pc_partial_function. > > Another approach, which I suspect is slightly faster, checks to see > whether the PC is within one of the ranges associated with the cached > block. If so, then the cached values can be used. It falls back to > the original behavior if there is no cached block. > > The current approach, suggested by Simon Marchi, is to restrict the > low/high pc values recorded for the cache to the beginning and end of > the range containing the PC value under consideration. This allows us > to retain the simple (and fast) test for determining whether the > memoized (cached) values apply to the PC passed to > find_pc_partial_function. > > Another choice that had to be made was whether to have ADDRESS > continue to represent the lowest address associated with the function > or with the entry pc associated with the function. For the moment, > I've decided to keep the current behavior of having it represent the > lowest address. In cases where the entry pc is needed, this can be > found by passing a non-NULL value for BLOCK which will cause the block > (pointer) associated with the function to be returned. BLOCK_ENTRY_PC > can then be used on that block to obtain the entry pc. This LGTM overall, just a few nits/suggestions/questions. > diff --git a/gdb/blockframe.c b/gdb/blockframe.c > index b3c9aa3..a3b2a11 100644 > --- a/gdb/blockframe.c > +++ b/gdb/blockframe.c > @@ -159,6 +159,7 @@ static CORE_ADDR cache_pc_function_low = 0; > static CORE_ADDR cache_pc_function_high = 0; > static const char *cache_pc_function_name = 0; > static struct obj_section *cache_pc_function_section = NULL; > +static const struct block *cache_pc_function_block = nullptr; > > /* Clear cache, e.g. when symbol table is discarded. */ > > @@ -169,24 +170,14 @@ clear_pc_function_cache (void) > cache_pc_function_high = 0; > cache_pc_function_name = (char *) 0; > cache_pc_function_section = NULL; > + cache_pc_function_block = nullptr; We might want to rename cache_pc_function_low and cache_pc_function_high to reflect their new usage (since they don't represent the low/high bounds of the function anymore, but the range). Alternatively, it might make things simpler to keep cache_pc_function_low and cache_pc_function_high as they are right now, and introduce two new variables for the bounds of the matched range. > } > > -/* Finds the "function" (text symbol) that is smaller than PC but > - greatest of all of the potential text symbols in SECTION. Sets > - *NAME and/or *ADDRESS conditionally if that pointer is non-null. > - If ENDADDR is non-null, then set *ENDADDR to be the end of the > - function (exclusive), but passing ENDADDR as non-null means that > - the function might cause symbols to be read. This function either > - succeeds or fails (not halfway succeeds). If it succeeds, it sets > - *NAME, *ADDRESS, and *ENDADDR to real information and returns 1. > - If it fails, it sets *NAME, *ADDRESS and *ENDADDR to zero and > - returns 0. */ > - > -/* Backward compatibility, no section argument. */ > +/* See symtab.h. */ > > int > find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > - CORE_ADDR *endaddr) > + CORE_ADDR *endaddr, const struct block **block) > { > struct obj_section *section; > struct symbol *f; > @@ -232,13 +223,37 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > f = find_pc_sect_function (mapped_pc, section); > if (f != NULL > && (msymbol.minsym == NULL > - || (BLOCK_START (SYMBOL_BLOCK_VALUE (f)) > + || (BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (f)) I don't understand this change, can you explain it briefly? > >= BMSYMBOL_VALUE_ADDRESS (msymbol)))) > { > - cache_pc_function_low = BLOCK_START (SYMBOL_BLOCK_VALUE (f)); > - cache_pc_function_high = BLOCK_END (SYMBOL_BLOCK_VALUE (f)); > + const struct block *b = SYMBOL_BLOCK_VALUE (f); > + > + if (BLOCK_CONTIGUOUS_P (b)) > + { > + cache_pc_function_low = BLOCK_START (b); > + cache_pc_function_high = BLOCK_END (b); > + } > + else > + { > + int i; > + for (i = 0; i < BLOCK_NRANGES (b); i++) > + { > + if (BLOCK_RANGE_START (b, i) <= mapped_pc > + && mapped_pc < BLOCK_RANGE_END (b, i)) > + { > + cache_pc_function_low = BLOCK_RANGE_START (b, i); > + cache_pc_function_high = BLOCK_RANGE_END (b, i); > + break; > + } > + } > + /* Above loop should exit via the break. */ > + gdb_assert (i < BLOCK_NRANGES (b)); > + } > + > cache_pc_function_name = SYMBOL_LINKAGE_NAME (f); > cache_pc_function_section = section; > + cache_pc_function_block = b; > + > goto return_cached_value; > } > } > @@ -268,15 +283,33 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > cache_pc_function_name = MSYMBOL_LINKAGE_NAME (msymbol.minsym); > cache_pc_function_section = section; > cache_pc_function_high = minimal_symbol_upper_bound (msymbol); > + cache_pc_function_block = nullptr; > > return_cached_value: > > + CORE_ADDR f_low, f_high; > + > + /* The low and high addresses for the cache do not necessarily > + correspond to the low and high addresses for the function. > + Extract the function low/high addresses from the cached block > + if there is one; otherwise use the cached low & high values. */ > + if (cache_pc_function_block) if (cache_pc_function_block != nullptr) > + { > + f_low = BLOCK_START (cache_pc_function_block); > + f_high = BLOCK_END (cache_pc_function_block); > + } > + else > + { > + f_low = cache_pc_function_low; > + f_high = cache_pc_function_high; > + } This, for example, could probably be kept simple if new variables were introduced for the matched block range, and cache_pc_function_{low,high} stayed as-is. I haven't actually tried, so it may not be a good idea at all. > + > if (address) > { > if (pc_in_unmapped_range (pc, section)) > - *address = overlay_unmapped_address (cache_pc_function_low, section); > + *address = overlay_unmapped_address (f_low, section); > else > - *address = cache_pc_function_low; > + *address = f_low; > } > > if (name) > @@ -291,13 +324,15 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > the overlay), we must actually convert (high - 1) and > then add one to that. */ > > - *endaddr = 1 + overlay_unmapped_address (cache_pc_function_high - 1, > - section); > + *endaddr = 1 + overlay_unmapped_address (f_high - 1, section); > } > else > - *endaddr = cache_pc_function_high; > + *endaddr = f_high; > } > > + if (block) if (block != nullptr) Thanks, Simon
On Tue, 31 Jul 2018 21:59:57 -0400 Simon Marchi <simon.marchi@ericsson.com> wrote: > > int > > find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > > - CORE_ADDR *endaddr) > > + CORE_ADDR *endaddr, const struct block **block) > > { > > struct obj_section *section; > > struct symbol *f; > > @@ -232,13 +223,37 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, > > f = find_pc_sect_function (mapped_pc, section); > > if (f != NULL > > && (msymbol.minsym == NULL > > - || (BLOCK_START (SYMBOL_BLOCK_VALUE (f)) > > + || (BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (f)) > > I don't understand this change, can you explain it briefly? > > > >= BMSYMBOL_VALUE_ADDRESS (msymbol)))) > > { A function's minimal symbol generally (maybe even always) refers to the entry pc. Therefore, we want to compare the block's entry pc to the minimal symbol address instead of the block start - which might not be the same as the entry pc. BLOCK_START will refer to the lowest address in all of the ranges. I'll add a comment to the code explaining this. Kevin
diff --git a/gdb/blockframe.c b/gdb/blockframe.c index b3c9aa3..a3b2a11 100644 --- a/gdb/blockframe.c +++ b/gdb/blockframe.c @@ -159,6 +159,7 @@ static CORE_ADDR cache_pc_function_low = 0; static CORE_ADDR cache_pc_function_high = 0; static const char *cache_pc_function_name = 0; static struct obj_section *cache_pc_function_section = NULL; +static const struct block *cache_pc_function_block = nullptr; /* Clear cache, e.g. when symbol table is discarded. */ @@ -169,24 +170,14 @@ clear_pc_function_cache (void) cache_pc_function_high = 0; cache_pc_function_name = (char *) 0; cache_pc_function_section = NULL; + cache_pc_function_block = nullptr; } -/* Finds the "function" (text symbol) that is smaller than PC but - greatest of all of the potential text symbols in SECTION. Sets - *NAME and/or *ADDRESS conditionally if that pointer is non-null. - If ENDADDR is non-null, then set *ENDADDR to be the end of the - function (exclusive), but passing ENDADDR as non-null means that - the function might cause symbols to be read. This function either - succeeds or fails (not halfway succeeds). If it succeeds, it sets - *NAME, *ADDRESS, and *ENDADDR to real information and returns 1. - If it fails, it sets *NAME, *ADDRESS and *ENDADDR to zero and - returns 0. */ - -/* Backward compatibility, no section argument. */ +/* See symtab.h. */ int find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, - CORE_ADDR *endaddr) + CORE_ADDR *endaddr, const struct block **block) { struct obj_section *section; struct symbol *f; @@ -232,13 +223,37 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, f = find_pc_sect_function (mapped_pc, section); if (f != NULL && (msymbol.minsym == NULL - || (BLOCK_START (SYMBOL_BLOCK_VALUE (f)) + || (BLOCK_ENTRY_PC (SYMBOL_BLOCK_VALUE (f)) >= BMSYMBOL_VALUE_ADDRESS (msymbol)))) { - cache_pc_function_low = BLOCK_START (SYMBOL_BLOCK_VALUE (f)); - cache_pc_function_high = BLOCK_END (SYMBOL_BLOCK_VALUE (f)); + const struct block *b = SYMBOL_BLOCK_VALUE (f); + + if (BLOCK_CONTIGUOUS_P (b)) + { + cache_pc_function_low = BLOCK_START (b); + cache_pc_function_high = BLOCK_END (b); + } + else + { + int i; + for (i = 0; i < BLOCK_NRANGES (b); i++) + { + if (BLOCK_RANGE_START (b, i) <= mapped_pc + && mapped_pc < BLOCK_RANGE_END (b, i)) + { + cache_pc_function_low = BLOCK_RANGE_START (b, i); + cache_pc_function_high = BLOCK_RANGE_END (b, i); + break; + } + } + /* Above loop should exit via the break. */ + gdb_assert (i < BLOCK_NRANGES (b)); + } + cache_pc_function_name = SYMBOL_LINKAGE_NAME (f); cache_pc_function_section = section; + cache_pc_function_block = b; + goto return_cached_value; } } @@ -268,15 +283,33 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, cache_pc_function_name = MSYMBOL_LINKAGE_NAME (msymbol.minsym); cache_pc_function_section = section; cache_pc_function_high = minimal_symbol_upper_bound (msymbol); + cache_pc_function_block = nullptr; return_cached_value: + CORE_ADDR f_low, f_high; + + /* The low and high addresses for the cache do not necessarily + correspond to the low and high addresses for the function. + Extract the function low/high addresses from the cached block + if there is one; otherwise use the cached low & high values. */ + if (cache_pc_function_block) + { + f_low = BLOCK_START (cache_pc_function_block); + f_high = BLOCK_END (cache_pc_function_block); + } + else + { + f_low = cache_pc_function_low; + f_high = cache_pc_function_high; + } + if (address) { if (pc_in_unmapped_range (pc, section)) - *address = overlay_unmapped_address (cache_pc_function_low, section); + *address = overlay_unmapped_address (f_low, section); else - *address = cache_pc_function_low; + *address = f_low; } if (name) @@ -291,13 +324,15 @@ find_pc_partial_function (CORE_ADDR pc, const char **name, CORE_ADDR *address, the overlay), we must actually convert (high - 1) and then add one to that. */ - *endaddr = 1 + overlay_unmapped_address (cache_pc_function_high - 1, - section); + *endaddr = 1 + overlay_unmapped_address (f_high - 1, section); } else - *endaddr = cache_pc_function_high; + *endaddr = f_high; } + if (block) + *block = cache_pc_function_block; + return 1; } diff --git a/gdb/symtab.h b/gdb/symtab.h index 84fc897..e4de868 100644 --- a/gdb/symtab.h +++ b/gdb/symtab.h @@ -1683,10 +1683,22 @@ extern struct symbol *find_pc_sect_function (CORE_ADDR, struct obj_section *); extern struct symbol *find_symbol_at_address (CORE_ADDR); -/* lookup function from address, return name, start addr and end addr. */ - -extern int find_pc_partial_function (CORE_ADDR, const char **, CORE_ADDR *, - CORE_ADDR *); +/* Finds the "function" (text symbol) that is smaller than PC but + greatest of all of the potential text symbols in SECTION. Sets + *NAME and/or *ADDRESS conditionally if that pointer is non-null. + If ENDADDR is non-null, then set *ENDADDR to be the end of the + function (exclusive). If the optional parameter BLOCK is non-null, + then set *BLOCK to the address of the block corresponding to the + function symbol, if such a symbol could be found during the lookup; + nullptr is used as a return value for *BLOCK if no block is found. + This function either succeeds or fails (not halfway succeeds). If + it succeeds, it sets *NAME, *ADDRESS, and *ENDADDR to real + information and returns 1. If it fails, it sets *NAME, *ADDRESS + and *ENDADDR to zero and returns 0. */ + +extern int find_pc_partial_function (CORE_ADDR pc, const char **name, + CORE_ADDR *address, CORE_ADDR *endaddr, + const struct block **block = nullptr); /* Return the type of a function with its first instruction exactly at the PC address. Return NULL otherwise. */