Removal of uses of MAX_REGISTER_SIZE
Commit Message
On 02/01/2017 12:41 PM, Yao Qi wrote:
> On 17-01-27 12:11:09, Pedro Alves wrote:
>>
>> Makes me wonder whether this is the right approach. :-/
>> (No, I don't have a formed opinion for what that would be.)
>>
>
> Hi Pedro,
> What do you mean by "this" here? We can't take MAX_REGISTER_SIZE for
> ever, as we may have bigger and bigger single register. We do need a
> gdbarch specific max register size. Given there is no standard VLA in
> C++, using alloca is the best way we can do so far.
>
I meant "alloca". And having to worry about loops. Along with other
oddities that come with alloca, like alloca(-1), the most common type
of bad huge allocation (I think), crashing gdb vs
xmalloc/new -> malloc_failure causing a recoverable internal
error, for example.
Some options I was pondering:
#1 - Use VLAs anyway.
I.e., assume we can only build with C++ compilers that support VLAs as extension.
GCC does, clang does, Intel's does I believe. ARM's probably does too?
I don't know whether there's actually any C++ compiler that does not.
Maybe MSVC, though I don't know of anyone trying to build GDB with that.
Though arguably, we shouldn't close the door on "standard" compilers like that.
#2 - Switch to heap allocation
Or std::vector or something like that that hides it. It's not clear whether
that would have a noticeable performance impact.
#3 - Use alloca, move inner loop bodies to functions/lambdas
I.e., avoid having to put the allocas before the loops by moving the body
of loops to separate functions, or to lambdas (effectively
the same), like:
void
function ()
{
for (int regnum = 0; regnum < allregisters; regnum++)
{
[&] {
char *buf = (char *) alloca (register_size (gdbarch, regnum));
/// do something with buf.
} ();
}
}
though that may end up looking a bit ... odd if you're not too
familiar with alloca and lambdas. It also has the problem that
a "return" inside that scope doesn't do what one would expect
if the lambda wasn't there (more on that below).
#4 - Like above, but wrap in macros.
I.e., make #3 a bit more explicit and to the point, by wrapping
the lambda syntax with a couple macros. Like:
#define ALLOCA_SCOPE_START [&] ()
#define ALLOCA_SCOPE_END ();
void
function ()
{
for (int regnum = 0; regnum < allregisters; regnum++)
{
ALLOCA_SCOPE_START
{
char *buf = (char *) alloca (register_size (gdbarch, regnum));
// do something
}
ALLOCA_SCOPE_END
}
}
#5 - Like above, but fix the "return" issue too.
It's doable, but it'll require a bit more "magic". See attached patch.
I picked a few random examples of cases where alloca was being moved
to outside a loops in Alan's patch, and converted them (against master).
I've never seen anyone do something like this. I just thought of it
now, tried it, and it works (tried g++ 4.8 / 5.3 / 7)...
The main advantages of something like ALLOCA_SCOPE_START/ALLOCA_SCOPE_END
would be:
- ALLOCA_SCOPE_START/ALLOCA_SCOPE_END stand out, which in case of alloca
is probably a good thing.
- Allows keeping the alloca buffer allocation closer to where it is used.
Disadvantages:
- Might not be up to everyone's taste?
From ce55d77461a5fa429cea8837b09679b301fa9b40 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Wed, 1 Feb 2017 15:41:14 +0000
Subject: [PATCH] Introduce and use ALLOCA_SCOPE_START / ALLOCA_SCOPE_END
Allows using alloca in loops.
---
gdb/aarch64-tdep.c | 25 ++++++++-----
gdb/common/alloca-scope.h | 89 +++++++++++++++++++++++++++++++++++++++++++++++
gdb/common/common-defs.h | 3 ++
gdb/frame.c | 21 ++++++++---
gdb/ia64-tdep.c | 15 ++++++--
5 files changed, 136 insertions(+), 17 deletions(-)
create mode 100644 gdb/common/alloca-scope.h
Comments
> #2 - Switch to heap allocation
>
> Or std::vector or something like that that hides it. It's not clear
> whether that would have a noticeable performance impact.
I would try that. I think using one of the standard C++ classes
looks a little more attractive to me, and would only consider
the lambda functions if we can show a noticeable performance
impact. Those two are not exclusive, by the way, but in the past,
we've always frowned on calls to alloca in a loop, and using
a xmalloc+cleanup combination has never been an issue in my cases.
I'd imagine that a standard C++ memory management class would be
fast enough for those same situations.
@@ -62,6 +62,7 @@
#include "opcode/aarch64.h"
#include <algorithm>
+#include "alloca-scope.h"
#define submask(x) ((1L << ((x) + 1)) - 1)
#define bit(obj,st) (((obj) >> (st)) & 1)
@@ -1982,18 +1983,24 @@ aarch64_store_return_value (struct type *type, struct regcache *regs,
for (i = 0; i < elements; i++)
{
int regno = AARCH64_V0_REGNUM + i;
- bfd_byte tmpbuf[MAX_REGISTER_SIZE];
- if (aarch64_debug)
+ ALLOCA_SCOPE_START
{
- debug_printf ("write HFA or HVA return value element %d to %s\n",
- i + 1,
- gdbarch_register_name (gdbarch, regno));
- }
+ bfd_byte *tmpbuf
+ = (bfd_byte *) alloca (register_size (gdbarch, regno));
- memcpy (tmpbuf, valbuf, len);
- regcache_cooked_write (regs, regno, tmpbuf);
- valbuf += len;
+ if (aarch64_debug)
+ {
+ debug_printf ("write HFA or HVA return value element %d to %s\n",
+ i + 1,
+ gdbarch_register_name (gdbarch, regno));
+ }
+
+ memcpy (tmpbuf, valbuf, len);
+ regcache_cooked_write (regs, regno, tmpbuf);
+ valbuf += len;
+ }
+ ALLOCA_SCOPE_END
}
}
else if (TYPE_CODE (type) == TYPE_CODE_ARRAY && TYPE_VECTOR (type)
new file mode 100644
@@ -0,0 +1,89 @@
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+
+ This file is part of GDB.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+
+/* Stack allocated with "alloca" is released on function exit, not
+ scope exit. This makes it dangerous to use alloca in loops. To
+ work around that, create an alloca scope using the
+ ALLOCA_SCOPE_START and ALLOCA_SCOPE_END macros like this:
+
+ for (...)
+ {
+ ALLOCA_SCOPE_START
+ {
+ gdb_byte *buf = (gdb_byte *) alloca (some size);
+
+ // use BUF
+ }
+ ALLOCA_SCOPE_END
+ }
+
+ ALLOCA_SCOPE_START / ALLOCA_SCOPE_END define and call a lambda
+ behind the scenes. Any memory allocated with alloca is thus
+ released when the lambda returns, since the lambda is just an
+ (inline) function.
+
+ Note that it's not possible to "return" directly from within an
+ alloca scope, since that would return from the lambda, instead of
+ from the lambda's caller. The macros are defined in such a way
+ that return expressions won't even compile, to avoid such mistakes.
+
+ The scope defined by ALLOCA_SCOPE_START/END has access to all the
+ local variables of the containing scope (because the lambda
+ captures all by reference).
+
+ This wrapping has no run time overhead, since the lambda never
+ escapes out of the local, calling scope.
+ __attribute__((always_inline)) also helps a bit, by instructing the
+ compiler that it should inline everything about the lambda,
+ regardless of optimization level.
+*/
+
+#ifndef COMMON_ALLOCA_SCOPE_H
+#define COMMON_ALLOCA_SCOPE_H
+
+#include <alloca.h>
+
+/* A couple types used to prevent trying to "return" from within an
+ ALLOCA_SCOPE lambda. */
+
+struct alloca_scopes_cannot_return_empty_base {};
+
+/* Inheriting from an empty class prevents "return {};" inside the
+ lambda, by making alloca_scopes_cannot_return a non-aggregate. */
+struct alloca_scopes_cannot_return : alloca_scopes_cannot_return_empty_base
+{
+private:
+ alloca_scopes_cannot_return () = default;
+
+public:
+ static inline alloca_scopes_cannot_return make ()
+ ATTRIBUTE_ALWAYS_INLINE
+ { return alloca_scopes_cannot_return({}); }
+};
+
+/* Wraps a block in a lambda, so that stack memory "allocated" with
+ alloca is released on scope end. */
+#define ALLOCA_SCOPE_START \
+ [&] () ATTRIBUTE_ALWAYS_INLINE -> alloca_scopes_cannot_return \
+ {
+
+#define ALLOCA_SCOPE_END \
+ return alloca_scopes_cannot_return::make(); \
+ } ();
+
+#endif /* COMMON_ALLOCA_SCOPE_H */
@@ -69,6 +69,9 @@
#undef ATTRIBUTE_PRINTF
#define ATTRIBUTE_PRINTF _GL_ATTRIBUTE_FORMAT_PRINTF
+/* Could go to ansidecl.h. */
+#define ATTRIBUTE_ALWAYS_INLINE __attribute__((always_inline))
+
#include "libiberty.h"
#include "pathmax.h"
#include "gdb/signals.h"
@@ -42,6 +42,7 @@
#include "tracepoint.h"
#include "hashtab.h"
#include "valprint.h"
+#include "alloca-scope.h"
/* The sentinel frame terminates the innermost end of the frame chain.
If unwound, it returns the information needed to construct an
@@ -1410,16 +1411,26 @@ get_frame_register_bytes (struct frame_info *frame, int regnum,
}
else
{
- gdb_byte buf[MAX_REGISTER_SIZE];
enum lval_type lval;
CORE_ADDR addr;
int realnum;
+ bool valid;
- frame_register (frame, regnum, optimizedp, unavailablep,
- &lval, &addr, &realnum, buf);
- if (*optimizedp || *unavailablep)
+ ALLOCA_SCOPE_START
+ {
+ gdb_byte *buf
+ = (gdb_byte *) alloca (register_size (gdbarch, regnum));
+
+ frame_register (frame, regnum, optimizedp, unavailablep,
+ &lval, &addr, &realnum, buf);
+ valid = !*optimizedp && !*unavailablep;
+ if (valid)
+ memcpy (myaddr, buf + offset, curr_len);
+ }
+ ALLOCA_SCOPE_END
+
+ if (!valid)
return 0;
- memcpy (myaddr, buf + offset, curr_len);
}
myaddr += curr_len;
@@ -38,6 +38,7 @@
#include "osabi.h"
#include "ia64-tdep.h"
#include "cp-abi.h"
+#include "alloca-scope.h"
#ifdef HAVE_LIBUNWIND_IA64_H
#include "elf/ia64.h" /* for PT_IA_64_UNWIND value */
@@ -1516,7 +1517,6 @@ examine_prologue (CORE_ADDR pc, CORE_ADDR lim_pc,
else if (qp == 0 && rN == 2
&& ((rM == fp_reg && fp_reg != 0) || rM == 12))
{
- gdb_byte buf[MAX_REGISTER_SIZE];
CORE_ADDR saved_sp = 0;
/* adds r2, spilloffset, rFramePointer
or
@@ -1534,8 +1534,17 @@ examine_prologue (CORE_ADDR pc, CORE_ADDR lim_pc,
{
struct gdbarch *gdbarch = get_frame_arch (this_frame);
enum bfd_endian byte_order = gdbarch_byte_order (gdbarch);
- get_frame_register (this_frame, sp_regnum, buf);
- saved_sp = extract_unsigned_integer (buf, 8, byte_order);
+
+ ALLOCA_SCOPE_START
+ {
+ gdb_byte *buf
+ = (gdb_byte *) alloca (register_size (gdbarch,
+ sp_regnum));
+
+ get_frame_register (this_frame, sp_regnum, buf);
+ saved_sp = extract_unsigned_integer (buf, 8, byte_order);
+ }
+ ALLOCA_SCOPE_END
}
spill_addr = saved_sp
+ (rM == 12 ? 0 : mem_stack_frame_size)