[v3,3/5] GDB: Only make data actually retrieved into value history available
Commit Message
While it makes sense to allow accessing out-of-bounds elements in the
debuggee and see whatever there might happen to be there in memory (we
are a debugger and not a programming rules enforcement facility and we
want to make people's life easier in chasing bugs), e.g.:
(gdb) print one_hundred[-1]
$1 = 0
(gdb) print one_hundred[100]
$2 = 0
(gdb)
we shouldn't really pretend that we have any meaningful data around
values recorded in history (what these commands really retrieve are
current debuggee memory contents outside the original data accessed,
really confusing in my opinion). Add code to mark data around values
recorded in history unavailable then:
(gdb) print one_hundred[-1]
$1 = <unavailable>
(gdb) print one_hundred[100]
$2 = <unavailable>
Adjust `value_entirely_available' accordingly; also zero-length types
need special handling in `value_entirely_covered_by_range_vector'.
Add a suitable test case, which also covers integer overflows in data
location calculation.
---
Hi,
While at it I noticed that while values recorded in the history are just
as one would expect immutable, one can poke at the original data object
data in history has originated from by writing to the value in the
history, e.g.:
(gdb) print -elements 3 -- one_hundred
$1 = {0, 1, 2...}
(gdb) print -elements 3 -- $1
$2 = {0, 1, 2...}
(gdb) set $1[0] = -1
(gdb) print -elements 3 -- $1
$3 = {0, 1, 2...}
(gdb) print -elements 3 -- one_hundred
$4 = {-1, 1, 2...}
(gdb)
This is undocumented and I find it surprising to say the least. Shall I
file a bug for this phenomenon?
LONG_MIN is not used, but I have thought we can have it anyway for
consistency and so that people do not try to invent local variants.
Maciej
Changes from v2:
- Correct the copyright year in new files.
New change in v2.
---
gdb/defs.h | 8 ++
gdb/testsuite/gdb.base/value-history-unavailable.c | 29 +++++++
gdb/testsuite/gdb.base/value-history-unavailable.exp | 73 +++++++++++++++++++
gdb/valarith.c | 15 +++
gdb/value.c | 24 ++++--
5 files changed, 144 insertions(+), 5 deletions(-)
gdb-value-history-unavailable.diff
Comments
>>>>> "Maciej" == Maciej W Rozycki <macro@embecosm.com> writes:
Maciej> While it makes sense to allow accessing out-of-bounds elements in the
Maciej> debuggee and see whatever there might happen to be there in memory (we
Maciej> are a debugger and not a programming rules enforcement facility and we
Maciej> want to make people's life easier in chasing bugs), e.g.:
Thank you for the patch.
Maciej> Index: src/gdb/defs.h
Maciej> ===================================================================
Maciej> --- src.orig/gdb/defs.h
Maciej> +++ src/gdb/defs.h
Maciej> @@ -469,6 +469,10 @@ enum val_prettyformat
Maciej> #define LONG_MAX ((long)(ULONG_MAX >> 1)) /* 0x7FFFFFFF for 32-bits */
Maciej> #endif
Maciej> +#if !defined (LONG_MIN) /* 0x80000000 for 32-bits */
Maciej> +#define LONG_MIN ((long) (~(long) 0 ^ LONG_MAX))
Maciej> +#endif
I wasn't aware of this code being here.
For LONG_*, I tend to think we don't need it. <limits.h> should provide
it, and gnulib seems to think there aren't any systems without this:
https://www.gnu.org/software/gnulib/manual/html_node/limits_002eh.html
Maciej> +#if !defined (LONGEST_MIN) /* 0x8000000000000000 for 64-bits */
Maciej> +#define LONGEST_MIN ((LONGEST) (~(LONGEST) 0 ^ LONGEST_MAX))
Maciej> +#endif
This one probably belongs in gdbsupport/common-types.h.
We should clean up this part of defs.h but I can do that.
Maciej> @@ -1950,6 +1955,15 @@ record_latest_value (struct value *val)
Maciej> a value on the value history never changes. */
Maciej> if (value_lazy (val))
Maciej> value_fetch_lazy (val);
Maciej> +
Maciej> + /* Don't pretend we have anything available there in the history beyond
Maciej> + the boundaries of the value recorded. It's not like inferior memory
Maciej> + where there is actual stuff underneath. */
Maciej> + ULONGEST length = value_enclosing_type (val)->length ();
Maciej> + mark_value_bits_unavailable (val, LONGEST_MIN, 0 ^ LONGEST_MIN);
Maciej> + mark_value_bits_unavailable (val, length * TARGET_CHAR_BIT,
Maciej> + LONGEST_MAX - length * TARGET_CHAR_BIT);
I'm struggling to understand why this is needed.
Like it seems to me that a value should always report that bytes outside
of its content range are unavailable.
Tom
On Tue, 31 Jan 2023, Tom Tromey wrote:
> Maciej> +#if !defined (LONG_MIN) /* 0x80000000 for 32-bits */
> Maciej> +#define LONG_MIN ((long) (~(long) 0 ^ LONG_MAX))
> Maciej> +#endif
>
> I wasn't aware of this code being here.
>
> For LONG_*, I tend to think we don't need it. <limits.h> should provide
> it, and gnulib seems to think there aren't any systems without this:
>
> https://www.gnu.org/software/gnulib/manual/html_node/limits_002eh.html
I guess it's still C thinking kicking around here at least on my side.
Since we've switched to C++11 it's that language standard that we have
been driven by, and it mandates the presence of these macros. Except we
need to switch to <climits>.
I don't like to maintain a mess, so I'll repost this patch series with
a suitable update included.
> Maciej> +#if !defined (LONGEST_MIN) /* 0x8000000000000000 for 64-bits */
> Maciej> +#define LONGEST_MIN ((LONGEST) (~(LONGEST) 0 ^ LONGEST_MAX))
> Maciej> +#endif
>
> This one probably belongs in gdbsupport/common-types.h.
>
> We should clean up this part of defs.h but I can do that.
Already done with the upcoming <climits> update.
> Maciej> @@ -1950,6 +1955,15 @@ record_latest_value (struct value *val)
> Maciej> a value on the value history never changes. */
> Maciej> if (value_lazy (val))
> Maciej> value_fetch_lazy (val);
> Maciej> +
> Maciej> + /* Don't pretend we have anything available there in the history beyond
> Maciej> + the boundaries of the value recorded. It's not like inferior memory
> Maciej> + where there is actual stuff underneath. */
> Maciej> + ULONGEST length = value_enclosing_type (val)->length ();
> Maciej> + mark_value_bits_unavailable (val, LONGEST_MIN, 0 ^ LONGEST_MIN);
> Maciej> + mark_value_bits_unavailable (val, length * TARGET_CHAR_BIT,
> Maciej> + LONGEST_MAX - length * TARGET_CHAR_BIT);
>
> I'm struggling to understand why this is needed.
> Like it seems to me that a value should always report that bytes outside
> of its content range are unavailable.
I have investigated whether a check based solely on the length of the
data type associated with the value recorded would do, but that triggered
Python regressions:
FAIL: gdb.python/flexible-array-member.exp: python print(ns['items'][0])
FAIL: gdb.python/flexible-array-member.exp: python print(ns['items'][1])
FAIL: gdb.python/flexible-array-member.exp: python print(ns['items'][2])
FAIL: gdb.python/flexible-array-member.exp: python print(zs['items'][0])
FAIL: gdb.python/flexible-array-member.exp: python print(zs['items'][1])
FAIL: gdb.python/flexible-array-member.exp: python print(zs['items'][2])
FAIL: gdb.python/flexible-array-member.exp: python print(zso['items'][0])
FAIL: gdb.python/flexible-array-member.exp: python print(zso['items'][1])
FAIL: gdb.python/flexible-array-member.exp: python print(zso['items'][2])
where flexible array member elements are accessed like:
(gdb) python ns = gdb.parse_and_eval('ns').dereference()
(gdb) python print(ns['items'][0])
that is using a value associated with a Python variable created on the fly
rather than a value history entry. These now showed as <unavailable> and
it broke the test cases, which expect actual values, i.e.:
python print(ns['items'][0])
<unavailable>
(gdb) FAIL: gdb.python/flexible-array-member.exp: python print(ns['items'][0])
compared to:
python print(ns['items'][0])
101
(gdb) PASS: gdb.python/flexible-array-member.exp: python print(ns['items'][0])
So instead I looked into making value history entries a distinct lvalue
type (it would have a nice side effect of showing "no such vector element"
when accessing out-of-bounds array elements) replacing `lval_memory',
`lval_register', etc. It might also cure the history mutability issue I
have observed (for which I have filed PR exp/30107 to track BTW). But
while possibly a nice clean-up I realised it would be a major task, so
it's something perhaps for another occasion.
So finally I came up with just a boolean flag set for entries recorded in
the value history. It has passed my regression testing now and I'll be
posting it shortly along with the rest of the patches. Unless I hear to
the contrary, I'm going to follow your earlier approval for the changes
already handled, but this gives you or someone else a chance to chime in
just in case an issue gets spotted.
Thank you for your review.
Maciej
>>>>> "Maciej" == Maciej W Rozycki <macro@embecosm.com> writes:
>> I'm struggling to understand why this is needed.
>> Like it seems to me that a value should always report that bytes outside
>> of its content range are unavailable.
Maciej> I have investigated whether a check based solely on the length of the
Maciej> data type associated with the value recorded would do, but that triggered
Maciej> Python regressions:
Interesting, thanks. I guess the issue here is we don't know how big
the value should be in the "struct hack" case.
Tom
===================================================================
@@ -469,6 +469,10 @@ enum val_prettyformat
#define LONG_MAX ((long)(ULONG_MAX >> 1)) /* 0x7FFFFFFF for 32-bits */
#endif
+#if !defined (LONG_MIN) /* 0x80000000 for 32-bits */
+#define LONG_MIN ((long) (~(long) 0 ^ LONG_MAX))
+#endif
+
#if !defined (ULONGEST_MAX)
#define ULONGEST_MAX (~(ULONGEST)0) /* 0xFFFFFFFFFFFFFFFF for 64-bits */
#endif
@@ -477,6 +481,10 @@ enum val_prettyformat
#define LONGEST_MAX ((LONGEST)(ULONGEST_MAX >> 1))
#endif
+#if !defined (LONGEST_MIN) /* 0x8000000000000000 for 64-bits */
+#define LONGEST_MIN ((LONGEST) (~(LONGEST) 0 ^ LONGEST_MAX))
+#endif
+
/* * Convert a LONGEST to an int. This is used in contexts (e.g. number of
arguments to a function, number in a value history, register number, etc.)
where the value must not be larger than can fit in an int. */
===================================================================
@@ -0,0 +1,29 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+ Copyright (C) 2023 Free Software Foundation, Inc.
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see <http://www.gnu.org/licenses/>. */
+
+struct
+{
+ unsigned char x[1024];
+ unsigned char a[1024];
+ unsigned char y[1024];
+} a = { {-1} };
+
+int
+main ()
+{
+ return 0;
+}
===================================================================
@@ -0,0 +1,73 @@
+# Copyright (C) 2023 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Test GDB's value availability ranges.
+
+standard_testfile
+
+if { [prepare_for_testing "failed to prepare" $testfile $srcfile] } {
+ return -1
+}
+
+if ![runto_main] then {
+ perror "couldn't run to breakpoint"
+ continue
+}
+
+set target_char_mask [get_valueof "/u" "a.x\[0]" "255" "get target char mask"]
+set target_char_bit 0
+for {set i $target_char_mask} {$i > 0} {set i [expr $i >> 1]} {
+ incr target_char_bit
+}
+set target_char_rank -1
+for {set i $target_char_bit} {$i > 0} {set i [expr $i >> 1]} {
+ incr target_char_rank
+}
+
+# Verify accesses to original inferior data.
+gdb_test "print a.a" "\\\$2 = '\\\\000' <repeats 1023 times>"
+gdb_test "print a.a\[-1\]" "\\\$3 = 0 '\\\\000'"
+gdb_test "print a.a\[1024\]" "\\\$4 = 0 '\\\\000'"
+
+# Verify in-range value history accesses.
+gdb_test "print \$2" "\\\$5 = '\\\\000' <repeats 1023 times>"
+gdb_test "print \$2\[0\]" "\\\$6 = 0 '\\\\000'"
+gdb_test "print \$2\[1023\]" "\\\$7 = 0 '\\\\000'"
+
+# Values outside the array recorded will have not been retrieved.
+gdb_test "print \$2\[-1\]" "\\\$8 = <unavailable>"
+gdb_test "print \$2\[1024\]" "\\\$9 = <unavailable>"
+gdb_test "print \$2\[-1LL << 63 - $target_char_rank\]" \
+ "\\\$10 = <unavailable>"
+gdb_test "print \$2\[(1LL << 63 - $target_char_rank) - 1\]" \
+ "\\\$11 = <unavailable>"
+
+# Accesses through pointers in history go straight to the inferior though.
+gdb_test "print \$2\[0\]@1" "\\\$12 = \"\""
+gdb_test "print \$2\[-1\]@1" "\\\$13 = \"\""
+gdb_test "print \$2\[1024\]@1" "\\\$14 = \"\""
+
+# Verify out-of-range value history accesses.
+gdb_test "print \$2\[(-1LL << 63 - $target_char_rank) - 1\]" \
+ "Integer overflow in data location calculation"
+gdb_test "print \$2\[(1LL << 63 - $target_char_rank)\]" \
+ "Integer overflow in data location calculation"
+gdb_test "print \$2\[-1LL << 63\]" \
+ "Integer overflow in data location calculation"
+gdb_test "print \$2\[(1ULL << 63) - 1\]" \
+ "Integer overflow in data location calculation"
+
+# Sanity-check a copy of an unavailable value.
+gdb_test "print \$11" "\\\$15 = <unavailable>"
===================================================================
@@ -182,6 +182,21 @@ value_subscript (struct value *array, LO
}
index -= *lowerbound;
+
+ /* Do not try to dereference a pointer to an unavailable value.
+ Instead mock up a new one and give it the original address. */
+ struct type *elt_type = check_typedef (tarray->target_type ());
+ LONGEST elt_size = type_length_units (elt_type);
+ if (!value_lazy (array)
+ && !value_bytes_available (array, elt_size * index, elt_size))
+ {
+ struct value *val = allocate_value (elt_type);
+ mark_value_bytes_unavailable (val, 0, elt_size);
+ VALUE_LVAL (val) = lval_memory;
+ set_value_address (val, value_address (array) + elt_size * index);
+ return val;
+ }
+
array = value_coerce_array (array);
}
===================================================================
@@ -421,7 +421,9 @@ value_entirely_available (struct value *
if (value->lazy)
value_fetch_lazy (value);
- if (value->unavailable.empty ())
+ if (value->unavailable.empty ()
+ || value_bytes_available (value, 0,
+ value_enclosing_type (value)->length ()))
return 1;
return 0;
}
@@ -441,11 +443,12 @@ value_entirely_covered_by_range_vector (
if (ranges.size () == 1)
{
+ ULONGEST length = value_enclosing_type (value)->length ();
const struct range &t = ranges[0];
- if (t.offset == 0
- && t.length == (TARGET_CHAR_BIT
- * value_enclosing_type (value)->length ()))
+ if (length == 0)
+ return t.offset == 0 && t.length == 0;
+ if (t.offset <= 0 && t.offset + t.length >= TARGET_CHAR_BIT * length)
return 1;
}
@@ -1277,7 +1280,9 @@ require_not_optimized_out (const struct
static void
require_available (const struct value *value)
{
- if (!value->unavailable.empty ())
+ if (!value->unavailable.empty ()
+ && !value_bytes_available (value, 0,
+ value_enclosing_type (value)->length ()))
throw_error (NOT_AVAILABLE_ERROR, _("value is not available"));
}
@@ -1950,6 +1955,15 @@ record_latest_value (struct value *val)
a value on the value history never changes. */
if (value_lazy (val))
value_fetch_lazy (val);
+
+ /* Don't pretend we have anything available there in the history beyond
+ the boundaries of the value recorded. It's not like inferior memory
+ where there is actual stuff underneath. */
+ ULONGEST length = value_enclosing_type (val)->length ();
+ mark_value_bits_unavailable (val, LONGEST_MIN, 0 ^ LONGEST_MIN);
+ mark_value_bits_unavailable (val, length * TARGET_CHAR_BIT,
+ LONGEST_MAX - length * TARGET_CHAR_BIT);
+
/* We preserve VALUE_LVAL so that the user can find out where it was fetched
from. This is a bit dubious, because then *&$1 does not just return $1
but the current contents of that location. c'est la vie... */