[v6,1/1] fortran: Fix arrays of variable length strings for FORTRAN

Message ID 20240105131912.22658-2-abdul.b.ijaz@intel.com
State New
Headers
Series Fix arrays of variable length strings for FORTRAN |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Ijaz, Abdul B Jan. 5, 2024, 1:19 p.m. UTC
  From: "Ijaz, Abdul B" <abdul.b.ijaz@intel.com>

Before this change resolve_dynamic_array_or_string was called for
all TYPE_CODE_ARRAY and TYPE_CODE_STRING types, but, in the end,
this function always called create_array_type_with_stride, which
creates a TYPE_CODE_ARRAY type.

Suppose we have

subroutine vla_array (arr1, arr2)
  character (len=*):: arr1 (:)
  character (len=5):: arr2 (:)

  print *, arr1 ! break-here
  print *, arr2
end subroutine vla_array

The "print arr1" and "print arr2" command at the "break-here" line
gives the following output:

(gdb) print arr1
$1 = <incomplete type>
(gdb) print arr2
$2 = ('abcde', 'abcde', 'abcde')
(gdb) ptype arr1
type = Type
End Type
(gdb) ptype arr2
type = character*5 (3)

Dwarf info using Intel® Fortran Compiler for such case contains following:
 <1><fd>: Abbrev Number: 12 (DW_TAG_string_type)
    <fe>   DW_AT_name        : (indirect string, offset: 0xd2): .str.ARR1
    <102>   DW_AT_string_length: 3 byte block: 97 23 8 (DW_OP_push_object_address; DW_OP_plus_uconst: 8)

After this change resolve_dynamic_array_or_string now calls
create_array_type_with_stride or create_string_type, so if the
incoming dynamic type is a TYPE_CODE_STRING then we'll get back a
TYPE_CODE_STRING type.  Now gdb shows following:

(gdb) p arr1
$1 = ('abddefghij', 'abddefghij', 'abddefghij', 'abddefghij', 'abddefghij')
(gdb) p arr2
$2 = ('abcde', 'abcde', 'abcde')
(gdb) ptype arr1
type = character*10 (5)
(gdb) ptype arr2
type = character*5 (3)

Fixing above issue introduce regression in gdb.fortran/mixed-lang-stack.exp,
i.e. the test forces the language to C/C++ and print a Fortran string value.
The string value is a dynamic type with code TYPE_CODE_STRING.

Before this commit the dynamic type resolution would always convert this to
a TYPE_CODE_ARRAY of characters, which the C value printing could handle.

But now after this commit we get a TYPE_CODE_STRING, which
neither the C value printing, or the generic value printing code can
support.  And so, I've added support for TYPE_CODE_STRING to the C value
printing, strings are handled just like arrays.

Lastly, in gdb.opt/fortran-string.exp and gdb.fortran/string-types.exp
tests it expects type of character array in 'character (3)' format but now
after this change we get 'character*3', so tests are updated accordingly.
---
 gdb/c-valprint.c                           |  1 +
 gdb/gdbtypes.c                             | 42 ++++++++++++++-
 gdb/testsuite/gdb.fortran/string-types.exp |  4 +-
 gdb/testsuite/gdb.fortran/vla-array.exp    | 60 ++++++++++++++++++++++
 gdb/testsuite/gdb.fortran/vla-array.f90    | 45 ++++++++++++++++
 gdb/testsuite/gdb.opt/fortran-string.exp   |  2 +-
 6 files changed, 149 insertions(+), 5 deletions(-)
 create mode 100644 gdb/testsuite/gdb.fortran/vla-array.exp
 create mode 100644 gdb/testsuite/gdb.fortran/vla-array.f90
  

Comments

Tom Tromey Jan. 11, 2024, 6:02 p.m. UTC | #1
>>>>> Abdul Basit Ijaz <abdul.b.ijaz@intel.com> writes:

> From: "Ijaz, Abdul B" <abdul.b.ijaz@intel.com>
> Before this change resolve_dynamic_array_or_string was called for
> all TYPE_CODE_ARRAY and TYPE_CODE_STRING types, but, in the end,
> this function always called create_array_type_with_stride, which
> creates a TYPE_CODE_ARRAY type.

Thank you for the patch.

> +  else if (ary_dim != NULL && ary_dim->code () == TYPE_CODE_STRING)
> +    {
> +      /* The following special case for TYPE_CODE_STRING should not be
> +	 needed, ideally we would defer resolving the dynamic type of the
> +	 array elements until needed later, and indeed, the resolved type
> +	 of each array element might be different, so attempting to resolve
> +	 the type here makes no sense.
> +
> +	 However, in Fortran, for arrays of strings, each element must be
> +	 the same type, as such, the DWARF for the string length relies on
> +	 the object address of the array itself.
> +
> +	 The problem here is that, when we create value's from the dynamic

The apostrophe should be removed here.

> +	 array type, we resolve the data location, and use that as the
> +	 value address, this completely discards the original value
> +	 address, and it is this original value address that is the
> +	 descriptor for the dynamic array, the very address that the DWARF
> +	 needs us to push in order to resolve the dynamic string length.
> +
> +	 What this means then, is that, given the current state of GDB, if
> +	 we don't resolve the string length now, then we will have lost
> +	 access to the address of the dynamic object descriptor, and so we
> +	 will not be able to resolve the dynamic string later.
> +
> +	 For now then, we handle special case TYPE_CODE_STRING on behalf of
> +	 Fortran, and hope that this doesn't cause problems for anyone
> +	 else.  */

I guess this is due to the approach taken by the Intel compiler.  Is
this true for all Fortran compilers, though?  For example, did you try
this with gfortran?

These DWARF approaches are really unfortunate -- there should be a more
standard way to separate the type's meaning and the type's
representation, especially for "slice" types -- but I guess we're stuck
with it for now.

> +  if (prop != nullptr && type->code () == TYPE_CODE_STRING)
> +    {
> +       prop = nullptr;
> +       warning (_("byte stride property ignored on string type"));
> +    }

I'm sort of meh on this.  Like, there are other warnings from this
function, but on the other hand, this is not something a user can do
anything about.  I guess it serves to say "you're about to see garbage".

thanks,
Tom
  
Ijaz, Abdul B Jan. 12, 2024, 6:57 p.m. UTC | #2
Hi Tom,

Thanks for the feedback.

>> +	 The problem here is that, when we create value's from the dynamic

Tom > The apostrophe should be removed here.

Will fix in V7.

Tom > I guess this is due to the approach taken by the Intel compiler.  Is this true for all Fortran compilers, though?  For example, did you try this with gfortran?

I have tested it with ifort/ifx/gfortran. As per DWARF for VLA strings should be encoded like this for all Fortran compiler. After this change GDB  works fine with ifort/ifx. But in case of gfortran it fails though vla-string itself is also encoded using TYPE_CODE_STRING  but there only issue is unexpectedly string is passed as reference to the structure so gdb is not able to handle. I have created following gcc Bugzilla  issue sometime back and XFAIL is used for gfortran in the vla-array.exp newly added test. 

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101826

Tom > I'm sort of meh on this.  Like, there are other warnings from this function, but on the other hand, this is not something a user can do anything about.  I guess it serves to say "you're about to see garbage".

This was added on feedback so just to handle it and notify, given how closely related arrays and strings seem to be. 

Thanks & Best Regards
Abdul Basit

-----Original Message-----
From: Tom Tromey <tom@tromey.com> 
Sent: Thursday, January 11, 2024 7:03 PM
To: Ijaz, Abdul B <abdul.b.ijaz@intel.com>
Cc: gdb-patches@sourceware.org; aburgess@redhat.com
Subject: Re: [PATCH v6 1/1] fortran: Fix arrays of variable length strings for FORTRAN

>>>>> Abdul Basit Ijaz <abdul.b.ijaz@intel.com> writes:

> From: "Ijaz, Abdul B" <abdul.b.ijaz@intel.com> Before this change 
> resolve_dynamic_array_or_string was called for all TYPE_CODE_ARRAY and 
> TYPE_CODE_STRING types, but, in the end, this function always called 
> create_array_type_with_stride, which creates a TYPE_CODE_ARRAY type.

Thank you for the patch.

> +  else if (ary_dim != NULL && ary_dim->code () == TYPE_CODE_STRING)
> +    {
> +      /* The following special case for TYPE_CODE_STRING should not be
> +	 needed, ideally we would defer resolving the dynamic type of the
> +	 array elements until needed later, and indeed, the resolved type
> +	 of each array element might be different, so attempting to resolve
> +	 the type here makes no sense.
> +
> +	 However, in Fortran, for arrays of strings, each element must be
> +	 the same type, as such, the DWARF for the string length relies on
> +	 the object address of the array itself.
> +
> +	 The problem here is that, when we create value's from the dynamic

The apostrophe should be removed here.

> +	 array type, we resolve the data location, and use that as the
> +	 value address, this completely discards the original value
> +	 address, and it is this original value address that is the
> +	 descriptor for the dynamic array, the very address that the DWARF
> +	 needs us to push in order to resolve the dynamic string length.
> +
> +	 What this means then, is that, given the current state of GDB, if
> +	 we don't resolve the string length now, then we will have lost
> +	 access to the address of the dynamic object descriptor, and so we
> +	 will not be able to resolve the dynamic string later.
> +
> +	 For now then, we handle special case TYPE_CODE_STRING on behalf of
> +	 Fortran, and hope that this doesn't cause problems for anyone
> +	 else.  */

I guess this is due to the approach taken by the Intel compiler.  Is this true for all Fortran compilers, though?  For example, did you try this with gfortran?

These DWARF approaches are really unfortunate -- there should be a more standard way to separate the type's meaning and the type's representation, especially for "slice" types -- but I guess we're stuck with it for now.

> +  if (prop != nullptr && type->code () == TYPE_CODE_STRING)
> +    {
> +       prop = nullptr;
> +       warning (_("byte stride property ignored on string type"));
> +    }

I'm sort of meh on this.  Like, there are other warnings from this function, but on the other hand, this is not something a user can do anything about.  I guess it serves to say "you're about to see garbage".

thanks,
Tom
Intel Deutschland GmbH
Registered Address: Am Campeon 10, 85579 Neubiberg, Germany
Tel: +49 89 99 8853-0, www.intel.de <http://www.intel.de>
Managing Directors: Christin Eisenschmid, Sharon Heck, Tiffany Doon Silva  
Chairperson of the Supervisory Board: Nicole Lau
Registered Office: Munich
Commercial Register: Amtsgericht Muenchen HRB 186928
  

Patch

diff --git a/gdb/c-valprint.c b/gdb/c-valprint.c
index 34a9d0f6075..11e0e133271 100644
--- a/gdb/c-valprint.c
+++ b/gdb/c-valprint.c
@@ -428,6 +428,7 @@  c_value_print_inner (struct value *val, struct ui_file *stream, int recurse,
   switch (type->code ())
     {
     case TYPE_CODE_ARRAY:
+    case TYPE_CODE_STRING:
       c_value_print_array (val, stream, recurse, options);
       break;
 
diff --git a/gdb/gdbtypes.c b/gdb/gdbtypes.c
index 2580b4f19ee..a4841f23832 100644
--- a/gdb/gdbtypes.c
+++ b/gdb/gdbtypes.c
@@ -2312,10 +2312,45 @@  resolve_dynamic_array_or_string_1 (struct type *type,
 						    frame, rank - 1,
 						    resolve_p);
     }
+  else if (ary_dim != NULL && ary_dim->code () == TYPE_CODE_STRING)
+    {
+      /* The following special case for TYPE_CODE_STRING should not be
+	 needed, ideally we would defer resolving the dynamic type of the
+	 array elements until needed later, and indeed, the resolved type
+	 of each array element might be different, so attempting to resolve
+	 the type here makes no sense.
+
+	 However, in Fortran, for arrays of strings, each element must be
+	 the same type, as such, the DWARF for the string length relies on
+	 the object address of the array itself.
+
+	 The problem here is that, when we create value's from the dynamic
+	 array type, we resolve the data location, and use that as the
+	 value address, this completely discards the original value
+	 address, and it is this original value address that is the
+	 descriptor for the dynamic array, the very address that the DWARF
+	 needs us to push in order to resolve the dynamic string length.
+
+	 What this means then, is that, given the current state of GDB, if
+	 we don't resolve the string length now, then we will have lost
+	 access to the address of the dynamic object descriptor, and so we
+	 will not be able to resolve the dynamic string later.
+
+	 For now then, we handle special case TYPE_CODE_STRING on behalf of
+	 Fortran, and hope that this doesn't cause problems for anyone
+	 else.  */
+      elt_type = resolve_dynamic_type_internal (type->target_type (),
+						addr_stack, frame, 0);
+    }
   else
     elt_type = type->target_type ();
 
   prop = type->dyn_prop (DYN_PROP_BYTE_STRIDE);
+  if (prop != nullptr && type->code () == TYPE_CODE_STRING)
+    {
+       prop = nullptr;
+       warning (_("byte stride property ignored on string type"));
+    }
   if (prop != NULL && resolve_p)
     {
       if (dwarf2_evaluate_property (prop, frame, addr_stack, &value))
@@ -2336,8 +2371,11 @@  resolve_dynamic_array_or_string_1 (struct type *type,
     bit_stride = type->field (0).bitsize ();
 
   type_allocator alloc (type, type_allocator::SMASH);
-  return create_array_type_with_stride (alloc, elt_type, range_type, NULL,
-					bit_stride);
+  if (type->code () == TYPE_CODE_STRING)
+    return create_string_type (alloc, elt_type, range_type);
+  else
+    return create_array_type_with_stride (alloc, elt_type, range_type, NULL,
+					  bit_stride);
 }
 
 /* Resolve an array or string type with dynamic properties, return a new
diff --git a/gdb/testsuite/gdb.fortran/string-types.exp b/gdb/testsuite/gdb.fortran/string-types.exp
index 3114fbaac78..379857b2138 100644
--- a/gdb/testsuite/gdb.fortran/string-types.exp
+++ b/gdb/testsuite/gdb.fortran/string-types.exp
@@ -52,7 +52,7 @@  with_test_prefix "third breakpoint, first time" {
     # Continue to the third breakpoint.
     gdb_continue_to_breakpoint "continue"
     gdb_test "print s" " = 'foo'"
-    gdb_test "ptype s" "type = character \\(3\\)"
+    gdb_test "ptype s" "type = character\\*3"
 }
 
 with_test_prefix "third breakpoint, second time" {
@@ -65,5 +65,5 @@  with_test_prefix "third breakpoint, second time" {
     # by most users, so seems good enough.
     gdb_continue_to_breakpoint "continue"
     gdb_test "print s" " = 'foo\\\\n\\\\t\\\\r\\\\000bar'"
-    gdb_test "ptype s" "type = character \\(10\\)"
+    gdb_test "ptype s" "type = character\\*10"
 }
diff --git a/gdb/testsuite/gdb.fortran/vla-array.exp b/gdb/testsuite/gdb.fortran/vla-array.exp
new file mode 100644
index 00000000000..55d5a6597f8
--- /dev/null
+++ b/gdb/testsuite/gdb.fortran/vla-array.exp
@@ -0,0 +1,60 @@ 
+# Copyright 2024 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+standard_testfile ".f90"
+load_lib "fortran.exp"
+
+require allow_fortran_tests
+
+if {[prepare_for_testing "failed to prepare" ${testfile} ${srcfile} \
+    {debug f90 quiet}]} {
+    return -1
+}
+
+if ![fortran_runto_main] {
+    untested "could not run to main"
+    return -1
+}
+
+# Try to access vla string / vla string array / string array values.
+gdb_breakpoint [gdb_get_line_number "arr_vla1-print"]
+gdb_continue_to_breakpoint "arr_vla1-print"
+
+# GFortran does not emit DW_TAG_string_type for array of variable length
+# string.
+if [test_compiler_info "gfortran*" f90] { setup_xfail *-*-* gcc/101826 }
+gdb_test "print arr_vla1"  \
+    " = \\\('vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary', 'vlaaryvlaary'\\\)"  \
+    "print vla string array"
+
+if [test_compiler_info "gfortran*" f90] { setup_xfail *-*-* gcc/101826 }
+gdb_test "ptype arr_vla1"  \
+    "type = character\\*12 \\(5\\)"  \
+    "print variable length string array type"
+gdb_test "print arr_vla2"  \
+    " = 'vlaary'"  \
+    "print variable length string"
+gdb_test "ptype arr_vla2"  \
+    "type = character\\*6"  \
+    "print variable length string type"
+gdb_test "print arr2"  \
+    " = \\\('vlaaryvla', 'vlaaryvla', 'vlaaryvla'\\\)"  \
+    "print string array"
+gdb_test "ptype arr2"  \
+    "type = character\\*9 \\(3\\)"  \
+    "print string array type"
+gdb_test "print rank(arr_vla1)"  \
+    "$decimal"  \
+    "print string array rank"
diff --git a/gdb/testsuite/gdb.fortran/vla-array.f90 b/gdb/testsuite/gdb.fortran/vla-array.f90
new file mode 100644
index 00000000000..56dd85b1551
--- /dev/null
+++ b/gdb/testsuite/gdb.fortran/vla-array.f90
@@ -0,0 +1,45 @@ 
+! Copyright 2024 Free Software Foundation, Inc.
+!
+! This program is free software; you can redistribute it and/or modify
+! it under the terms of the GNU General Public License as published by
+! the Free Software Foundation; either version 3 of the License, or
+! (at your option) any later version.
+!
+! This program is distributed in the hope that it will be useful,
+! but WITHOUT ANY WARRANTY; without even the implied warranty of
+! MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+! GNU General Public License for more details.
+!
+! You should have received a copy of the GNU General Public License
+! along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+subroutine vla_array_func (arr_vla1, arr_vla2, arr2)
+  character (len=*):: arr_vla1 (:)
+  character (len=*):: arr_vla2
+  character (len=9):: arr2 (:)
+
+  print *, arr_vla1    ! arr_vla1-print
+  print *, arr_vla2
+  print *, arr2
+  print *, rank(arr_vla1)
+end subroutine vla_array_func
+
+program vla_array_main
+interface
+  subroutine vla_array_func (arr_vla1, arr_vla2, arr2)
+    character (len=*):: arr_vla1 (:)
+    character (len=*):: arr_vla2
+    character (len=9):: arr2 (:)
+  end subroutine vla_array_func
+end interface
+  character (len=9) :: arr1 (3)
+  character (len=6) :: arr2
+  character (len=12) :: arr3 (5)
+
+  arr1 = 'vlaaryvla'
+  arr2 = 'vlaary'
+  arr3 = 'vlaaryvlaary'
+
+  call vla_array_func (arr3, arr2, arr1)
+
+end program vla_array_main
diff --git a/gdb/testsuite/gdb.opt/fortran-string.exp b/gdb/testsuite/gdb.opt/fortran-string.exp
index e42b2905578..01750e359da 100644
--- a/gdb/testsuite/gdb.opt/fortran-string.exp
+++ b/gdb/testsuite/gdb.opt/fortran-string.exp
@@ -33,5 +33,5 @@  if {![runto f]} {
 
 gdb_test_no_output "set print frame-arguments all"
 gdb_test "frame" ".*s='foo'.*"
-gdb_test "ptype s" "type = character \\(3\\)"
+gdb_test "ptype s" "type = character\\*3"
 gdb_test "p s" "\\$\[0-9\]* = 'foo'"