Force array coercion in c_get_string
Commit Message
Pedro> So with the patch, we coerce the array to inferior memory, in order
Pedro> to read its contents back from the inferior? Is that right?
Pedro> Why not skip the detour and save the memory allocation in the inferior?
Tom> Yeah, that's weird, isn't it.
Plan B is a bit more complicated, but may be nicer because in some cases
it will require fewer memory reads, and also doesn't force coercion.
Let me know what you think.
Tom
commit 9acd65e278c569f47e73f9d455127b3379792c34
Author: Tom Tromey <tromey@adacore.com>
Date: Thu Apr 25 12:14:58 2019 -0600
Correctly handle non-C-style arrays in c_get_string
A user here noticed that the Python Value.string method did not work
for Ada arrays. I tracked this down to an oddity in value_as_address
-- namely, it calls coerce_array, but that function will not force
array coercion when the language has c_style_arrays=false, as Ada
does.
This patch fixes the problem by changing c_get_string so that arrays
take the "in GDB's memory" branch. The actual patch is somewhat more
complicated than you might think, because the caller can request more
array elements than the type allows. This is normal when the type is
using the C struct hack.
Tested on x86-64 Fedora 29.
gdb/ChangeLog
2019-04-25 Tom Tromey <tromey@adacore.com>
* c-lang.c (c_get_string): Handle non-C-style arrays.
gdb/testsuite/ChangeLog
2019-04-25 Tom Tromey <tromey@adacore.com>
* gdb.python/py-value.exp (test_value_in_inferior): Add Ada test.
Comments
On 4/25/19 9:05 PM, Tom Tromey wrote:
>The actual patch is somewhat more
> complicated than you might think, because the caller can request more
> array elements than the type allows. This is normal when the type is
> using the C struct hack.
Which test exposes this issue?
I'm a bit confused here. If we ever coerce an array and then read more
elements than the type allows, won't we be reading garbage passed the
coerced array elements?
Thanks,
Pedro Alves
>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
>> The actual patch is somewhat more
>> complicated than you might think, because the caller can request more
>> array elements than the type allows. This is normal when the type is
>> using the C struct hack.
Pedro> Which test exposes this issue?
From py-value.exp:
# Test fetching a string longer than its declared (in C) size.
# PR 16286
gdb_py_test_silent_cmd "python xstr = gdb.parse_and_eval('xstr')" "get xstr" 1
gdb_test "python print(xstr\['text'\].string (length = xstr\['length'\]))" "x{100}" \
"read string beyond declared size"
Pedro> I'm a bit confused here. If we ever coerce an array and then read more
Pedro> elements than the type allows, won't we be reading garbage passed the
Pedro> coerced array elements?
In the case where we want to read past the array limit, c_get_string
will take the second branch, which does an explicit read_string.
Tom
On 4/26/19 5:49 PM, Tom Tromey wrote:
>>>>>> "Pedro" == Pedro Alves <palves@redhat.com> writes:
>
>>> The actual patch is somewhat more
>>> complicated than you might think, because the caller can request more
>>> array elements than the type allows. This is normal when the type is
>>> using the C struct hack.
>
> Pedro> Which test exposes this issue?
>
> From py-value.exp:
>
> # Test fetching a string longer than its declared (in C) size.
> # PR 16286
> gdb_py_test_silent_cmd "python xstr = gdb.parse_and_eval('xstr')" "get xstr" 1
> gdb_test "python print(xstr\['text'\].string (length = xstr\['length'\]))" "x{100}" \
> "read string beyond declared size"
>
> Pedro> I'm a bit confused here. If we ever coerce an array and then read more
> Pedro> elements than the type allows, won't we be reading garbage passed the
> Pedro> coerced array elements?
>
> In the case where we want to read past the array limit, c_get_string
> will take the second branch, which does an explicit read_string.
Try this:
(gdb) python print (gdb.parse_and_eval("\"ab\"").string (length = 10))
This will go to the "in GDB's memory" branch, but it will
read beyond the value's contents buffer.
That's preexisting, but it's related to what I was saying above.
Looking at this comment:
> + An array is assumed to live in GDB's memory, so we take this path
> + here. However, it's possible for the caller to request more
> + array elements than apparently exist -- this can happen when
> + using the C struct hack. So, only do this for an array if either
> + no length was specified, or the length is within the existing
> + bounds. */
in my previous message, I was thinking that in the second branch we'd
end up copying the array to target memory and then read past its end.
We won't copy the array to memory, because the other hunk skips
value_as_address.
However,
> else
> {
> - CORE_ADDR addr = value_as_address (value);
> + /* value_as_address calls coerce_array, but that won't actually
> + coerce the array if c_style_arrays is false. In this case,
> + get the address of the array directly. */
Note: I found this comment confusing. The first sentence suggests that you
_want_ the coercion, including copying a "GDB memory" array to target
memory, but you don't get it because of c_style_arrays.
But I don't think getting the raw address of an array is coercion,
not in the sense of converting to a different type, like a pointer
to the first element, as in C.
> + CORE_ADDR addr;
> + if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
> + addr = value_address (value);
Can we get here with an LVAL != lval_memory?
I assume so, given the condition:
if ((VALUE_LVAL (value) == not_lval
|| VALUE_LVAL (value) == lval_internalvar)
&& fetchlimit != UINT_MAX)
So if we get to the second branch with:
VALUE_LVAL (value) == not_lval
TYPE_CODE (type) == TYPE_CODE_ARRAY
fetchlimit == UINT_MAX
We'll be calling value_address on a non-lval_memory value.
value_address returns address 0 in that case, does not throw
an error (maybe it should).
value_coerce_array, if it were reached, has this:
if (VALUE_LVAL (arg1) != lval_memory)
error (_("Attempt to take address of value not located in memory."));
So shouldn't we have such a check here too?
Thanks,
Pedro Alves
@@ -1,3 +1,7 @@
+2019-04-25 Tom Tromey <tromey@adacore.com>
+
+ * c-lang.c (c_get_string): Handle non-C-style arrays.
+
2019-04-25 Tom Tromey <tromey@adacore.com>
PR gdb/24475:
@@ -279,9 +279,18 @@ c_get_string (struct value *value, gdb::unique_xmalloc_ptr<gdb_byte> *buffer,
/* If the string lives in GDB's memory instead of the inferior's,
then we just need to copy it to BUFFER. Also, since such strings
are arrays with known size, FETCHLIMIT will hold the size of the
- array. */
+ array.
+
+ An array is assumed to live in GDB's memory, so we take this path
+ here. However, it's possible for the caller to request more
+ array elements than apparently exist -- this can happen when
+ using the C struct hack. So, only do this for an array if either
+ no length was specified, or the length is within the existing
+ bounds. */
if ((VALUE_LVAL (value) == not_lval
- || VALUE_LVAL (value) == lval_internalvar)
+ || VALUE_LVAL (value) == lval_internalvar
+ || (TYPE_CODE (type) == TYPE_CODE_ARRAY
+ && (*length < 0 || *length <= fetchlimit)))
&& fetchlimit != UINT_MAX)
{
int i;
@@ -306,7 +315,14 @@ c_get_string (struct value *value, gdb::unique_xmalloc_ptr<gdb_byte> *buffer,
}
else
{
- CORE_ADDR addr = value_as_address (value);
+ /* value_as_address calls coerce_array, but that won't actually
+ coerce the array if c_style_arrays is false. In this case,
+ get the address of the array directly. */
+ CORE_ADDR addr;
+ if (TYPE_CODE (type) == TYPE_CODE_ARRAY)
+ addr = value_address (value);
+ else
+ addr = value_as_address (value);
/* Prior to the fix for PR 16196 read_string would ignore fetchlimit
if length > 0. The old "broken" behaviour is the behaviour we want:
@@ -1,3 +1,7 @@
+2019-04-25 Tom Tromey <tromey@adacore.com>
+
+ * gdb.python/py-value.exp (test_value_in_inferior): Add Ada test.
+
2019-04-25 Sergio Durigan Junior <sergiodj@redhat.com>
PR corefiles/11608
@@ -315,6 +315,13 @@ proc test_value_in_inferior {} {
gdb_test "python print (\"---\"+st.string (length = 0)+\"---\")" "------" "test string (length = 0) is empty"
gdb_test "python print (len(st.string (length = 0)))" "0" "test length is 0"
+ # We choose Ada here to test a language where c_style_arrays is
+ # false.
+ gdb_test "set lang ada" \
+ "Warning: the current language does not match this frame."
+ gdb_test "python print (st.string ())" "divide et impera" \
+ "Test string with no length in ada"
+ gdb_test_no_output "set lang auto"
# Fetch a string that has embedded nulls.
gdb_test "print nullst" "\"divide\\\\000et\\\\000impera\".*"