PR fortran/50549 - should detect different type parameters in structure constructors

Message ID trinity-e33ad297-0829-4d0c-b715-502f50f62f5b-1648410290770@3c-app-gmx-bap05
State New
Headers
Series PR fortran/50549 - should detect different type parameters in structure constructors |

Commit Message

Harald Anlauf March 27, 2022, 7:44 p.m. UTC
  Dear all,

when assigning character pointers, we have a check for same length,
which however does not trigger for character pointers within a
structure constructor.

The attached patch extends the character checks slightly to fix
this loophole.  I've verified that NAG and Crayftn behave similarly,
while Intel does not detect the length difference.

Regtested on x86_64-pc-linux-gnu.

OK for mainline?  Would it be still ok for 12, or rather wait until
branching for 13?

Thanks,
Harald
  

Comments

Tobias Burnus March 28, 2022, 10:05 a.m. UTC | #1
Hi Harald,

On 27.03.22 21:44, Harald Anlauf via Fortran wrote:
> when assigning character pointers, we have a check for same length,
> which however does not trigger for character pointers within a
> structure constructor.
>
> The attached patch extends the character checks slightly to fix
> this loophole.  I've verified that NAG and Crayftn behave similarly,
> while Intel does not detect the length difference.
>
> Regtested on x86_64-pc-linux-gnu.
>
> OK for mainline?

Thanks for the patch! LGTM and I think GCC 12 is still okay.

However, I have a nit:

> --- a/gcc/fortran/resolve.cc
> +++ b/gcc/fortran/resolve.cc
> @@ -1375,11 +1375,22 @@ resolve_structure_cons (gfc_expr *expr, int init)
> ...
> +           long len_a, len_b;
> +           len_a = mpz_get_si (comp->ts.u.cl->length->value.integer);
> +           len_b = mpz_get_si (cons->expr->ts.u.cl->length->value.integer);
> +           gfc_error ("Unequal character lengths (%ld/%ld) for pointer "
> +                      "component %qs in constructor at %L",
> +                      len_a, len_b, comp->name, &cons->expr->where);

'long' might be int32_t instead of int64_t (e.g. on Windows, I think both
MinGW32 and MinGW64, but I am not quite sure). Thus, I wonder whether it
makes more sense to use:

   HOST_WIDE_INT, gfc_mpz_get_hwi() and '%wd'

I note that '%wd' (and '%lld') is only supported since last August
(commit https://gcc.gnu.org/r12-3044-g1b507b1e3c5 ), but now that it is,
I think we should use it at places where the value can be larger than INT_MAX.

I think at some point, we should also check the rest of the code and
change those mpz_get_si to gfc_mpz_get_hwi which can exceed INT_MAX.
Likewise, some of the %ld/%lu or %lld/%llu code should be also converted to %wd/%wu.

Tobias

PS: For using HWI with 'sprintf' instead of diagnostic's error/warning,
HOST_WIDE_INT_PRINT_DEC exists and has to be used.

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Harald Anlauf March 28, 2022, 8:03 p.m. UTC | #2
Hi Tobias,

Am 28.03.22 um 12:05 schrieb Tobias Burnus:
> Thanks for the patch! LGTM and I think GCC 12 is still okay.
>
> However, I have a nit:
>
>> --- a/gcc/fortran/resolve.cc
>> +++ b/gcc/fortran/resolve.cc
>> @@ -1375,11 +1375,22 @@ resolve_structure_cons (gfc_expr *expr, int init)
>> ...
>> +           long len_a, len_b;
>> +           len_a = mpz_get_si (comp->ts.u.cl->length->value.integer);
>> +           len_b = mpz_get_si
>> (cons->expr->ts.u.cl->length->value.integer);
>> +           gfc_error ("Unequal character lengths (%ld/%ld) for pointer "
>> +                      "component %qs in constructor at %L",
>> +                      len_a, len_b, comp->name, &cons->expr->where);
>
> 'long' might be int32_t instead of int64_t (e.g. on Windows, I think both
> MinGW32 and MinGW64, but I am not quite sure). Thus, I wonder whether it
> makes more sense to use:
>
>    HOST_WIDE_INT, gfc_mpz_get_hwi() and '%wd'
>
> I note that '%wd' (and '%lld') is only supported since last August
> (commit https://gcc.gnu.org/r12-3044-g1b507b1e3c5 ), but now that it is,
> I think we should use it at places where the value can be larger than
> INT_MAX.

using HOST_WIDE_INT as in the updated patch (sort of) works, but for
some reason I do not yet understand the format check kicks in for
gfc_error, producing:

../../gcc-trunk/gcc/fortran/resolve.cc: In function 'bool
resolve_structure_cons(gfc_expr*, int)':
../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: unknown
conversion type character 'w' in format [-Wformat=]
      la, lb, comp->name, &cons->expr->where);
                                            ^
../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: unknown
conversion type character 'w' in format [-Wformat=]
../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: format '%s'
expects argument of type 'char*', but argument 2 has type 'long int'
[-Wformat=]
../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: format '%L'
expects argument of type 'locus*', but argument 3 has type 'long int'
[-Wformat=]
../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: too many
arguments for format [-Wformat-extra-args]

This would likely lead to a bootstrap error.

Do I need to add some forgotten include?  Or some annotation to
suppress the warning?

Or should I rather convert the character lengths via sprintf first
before generating the error message?  (That would be the quick fix.)

> I think at some point, we should also check the rest of the code and
> change those mpz_get_si to gfc_mpz_get_hwi which can exceed INT_MAX.
> Likewise, some of the %ld/%lu or %lld/%llu code should be also converted
> to %wd/%wu.
>
> Tobias
>
> PS: For using HWI with 'sprintf' instead of diagnostic's error/warning,
> HOST_WIDE_INT_PRINT_DEC exists and has to be used.

All current cases of printing a HOST_WIDE_INT in gcc/fortran/ use
'sprintf', and I did not find any other use of %wd/%wu.  So the
mentioned implementation is not really stressed yet... ;-)

Thanks,
Harald

> -----------------
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201,
> 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer:
> Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München;
> Registergericht München, HRB 106955
>
  
Harald Anlauf March 28, 2022, 9:08 p.m. UTC | #3
Hi Tobias,

sorry for replying to myself now, but

Am 28.03.22 um 22:03 schrieb Harald Anlauf via Fortran:
> All current cases of printing a HOST_WIDE_INT in gcc/fortran/ use
> 'sprintf', and I did not find any other use of %wd/%wu.  So the
> mentioned implementation is not really stressed yet... ;-)

using HOST_WIDE_INT_PRINT_DEC in the format argument to gfc_error
instead of using %wd does not produce a warning and works.
(Also verified with insane character lengths on x86_64).

Harald
  
Joseph Myers March 28, 2022, 9:52 p.m. UTC | #4
On Mon, 28 Mar 2022, Harald Anlauf via Gcc-patches wrote:

> Hi Tobias,
> 
> sorry for replying to myself now, but
> 
> Am 28.03.22 um 22:03 schrieb Harald Anlauf via Fortran:
> > All current cases of printing a HOST_WIDE_INT in gcc/fortran/ use
> > 'sprintf', and I did not find any other use of %wd/%wu.  So the
> > mentioned implementation is not really stressed yet... ;-)
> 
> using HOST_WIDE_INT_PRINT_DEC in the format argument to gfc_error
> instead of using %wd does not produce a warning and works.
> (Also verified with insane character lengths on x86_64).

Using string concatenation with a macro is not appropriate in a message 
argument to a diagnostic function, because it means the full string (which 
has to be host-independent) doesn't get extracted for translation.

HOST_WIDE_INT_PRINT_* are printf formats for the host printf function (for 
example, they might use %I64d on Windows host), and are not generally 
understood by the diagnostic.cc machinery at all; functions using the GCC 
diagnostic machinery need to use GCC diagnostic formats (which are 
independent of the host printf function), such as %wd/%wu.
  
Tobias Burnus March 29, 2022, 7:14 a.m. UTC | #5
Hi Harald,

On 28.03.22 22:03, Harald Anlauf via Fortran wrote:
> All current cases of printing a HOST_WIDE_INT in gcc/fortran/ use
> 'sprintf', and I did not find any other use of %wd/%wu.  So the
> mentioned implementation is not really stressed yet... ;-)

That's all your fault ;-)

(Your commit
https://gcc.gnu.org/r12-3273-ge4cb3bb9ac11b4126ffa718287dd509a4b10a658
did remove the only user.)

> ../../gcc-trunk/gcc/fortran/resolve.cc: In function 'bool
> resolve_structure_cons(gfc_expr*, int)':
> ../../gcc-trunk/gcc/fortran/resolve.cc:1388:43: warning: unknown
> conversion type character 'w' in format [-Wformat=]
>      la, lb, comp->name, &cons->expr->where);
>                                            ^

That's only a warning. Have you tried whether it works at runtime?
My bet is that it does!

Question: Do you build with --disable-bootstrap ? Or do you do a proper bootstrap?

I am asking because:
* Here, it bootstraps *without* warnings/errors (I do a full bootstrap)
* GCC is bootstrapped with -Werror, i.e. I had expected an error and not a warning,
   while with --disable-bootstrap, the -Werror is not used as some random
   compiler may have additional (correct or bogus) warnings.
* %wd is only supported since GCC 12.
   The supported formats + the warning is bound to the compiler version,
   i.e. an older compiler does not support newer flags and, thus, warns for
   them when compiling. (But as the support depends on the current source,
   the compile-time warning of older compilers can be ignored.)

   * * *

Can you check & try again?  I don't mind getting a format warning with
GCC < GCC 12. But with GCC 12 compiled (either installed compiler or
when bootstrapping) it should compile without errors.

If you can confirm my suspicion, the patch LGTM.

Tobias

PS: I played around a bit. And with a GCC 12 bootstrap,
I get as expected an error (not a warning) for something unsupported (%Wd)
while %wd is supported. Additionally, I see a '|' in the output.

The "|" appeared with GCC 9. Thus, I wonder whether you compile w/o
bootstrapping with GCC 8?


../gcc/fortran/resolve.cc:1386:55: error: unknown conversion type character ‘W’ in format [-Werror=format=]
  1386 |               gfc_error ("Unequal character lengths (%Wd/%wd) for pointer "
       |                                                       ^

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
  
Harald Anlauf March 29, 2022, 6:48 p.m. UTC | #6
Hi Tobias,

Am 29.03.22 um 09:14 schrieb Tobias Burnus:
> Hi Harald,
>
> On 28.03.22 22:03, Harald Anlauf via Fortran wrote:
>> All current cases of printing a HOST_WIDE_INT in gcc/fortran/ use
>> 'sprintf', and I did not find any other use of %wd/%wu.  So the
>> mentioned implementation is not really stressed yet... ;-)
>
> That's all your fault ;-)

true; I'm pleading guilty for that one.

> (Your commit
> https://gcc.gnu.org/r12-3273-ge4cb3bb9ac11b4126ffa718287dd509a4b10a658
> did remove the only user.)

I've now made good for it.  ;-)

> That's only a warning. Have you tried whether it works at runtime?
> My bet is that it does!

Yes, it did work, it was just the warning alerting me ...

> Question: Do you build with --disable-bootstrap ? Or do you do a proper
> bootstrap?

... because I did build with --disable-bootstrap to save on time for
building the compiler on my local machine, and the system's default
gcc is older.

> Can you check & try again?  I don't mind getting a format warning with
> GCC < GCC 12. But with GCC 12 compiled (either installed compiler or
> when bootstrapping) it should compile without errors.
>
> If you can confirm my suspicion, the patch LGTM.

I've just pushed that version as

   https://gcc.gnu.org/g:0712f356374c2cf26015cccfa3141537e42cbb12

Sorry for the noise, and thanks for the review!

Harald

> Tobias
  

Patch

From 3b88c941619bc4996ef7d4ba247086f04deb8683 Mon Sep 17 00:00:00 2001
From: Harald Anlauf <anlauf@gmx.de>
Date: Sun, 27 Mar 2022 21:35:15 +0200
Subject: [PATCH] Fortran: character length of pointer assignments in structure
 constructors

gcc/fortran/ChangeLog:

	PR fortran/50549
	* resolve.cc (resolve_structure_cons): Reject pointer assignments
	of character with different lengths in structure constructor.

gcc/testsuite/ChangeLog:

	PR fortran/50549
	* gfortran.dg/char_pointer_assign_7.f90: New test.
---
 gcc/fortran/resolve.cc                        | 13 ++++++-
 .../gfortran.dg/char_pointer_assign_7.f90     | 38 +++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/char_pointer_assign_7.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 5522be75199..b4400a7ab8d 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1375,11 +1375,22 @@  resolve_structure_cons (gfc_expr *expr, int init)
 	  && comp->ts.u.cl->length->expr_type == EXPR_CONSTANT
 	  && cons->expr->ts.u.cl && cons->expr->ts.u.cl->length
 	  && cons->expr->ts.u.cl->length->expr_type == EXPR_CONSTANT
-	  && cons->expr->rank != 0
 	  && mpz_cmp (cons->expr->ts.u.cl->length->value.integer,
 		      comp->ts.u.cl->length->value.integer) != 0)
 	{
+	  if (comp->attr.pointer)
+	    {
+	      long len_a, len_b;
+	      len_a = mpz_get_si (comp->ts.u.cl->length->value.integer);
+	      len_b = mpz_get_si (cons->expr->ts.u.cl->length->value.integer);
+	      gfc_error ("Unequal character lengths (%ld/%ld) for pointer "
+			 "component %qs in constructor at %L",
+			 len_a, len_b, comp->name, &cons->expr->where);
+	      t = false;
+	    }
+
 	  if (cons->expr->expr_type == EXPR_VARIABLE
+	      && cons->expr->rank != 0
 	      && cons->expr->symtree->n.sym->attr.flavor == FL_PARAMETER)
 	    {
 	      /* Wrap the parameter in an array constructor (EXPR_ARRAY)
diff --git a/gcc/testsuite/gfortran.dg/char_pointer_assign_7.f90 b/gcc/testsuite/gfortran.dg/char_pointer_assign_7.f90
new file mode 100644
index 00000000000..08bdf176d8b
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/char_pointer_assign_7.f90
@@ -0,0 +1,38 @@ 
+! { dg-do compile }
+! PR fortran/50549 - should reject pointer assignments of different lengths
+! in structure constructors
+
+program test
+  implicit none
+  type t
+     character(2), pointer ::  p2
+  end type t
+  type t2
+     character(2), pointer ::  p(:)
+  end type t2
+  type td
+     character(:), pointer ::  pd
+  end type td
+  interface
+     function f1 ()
+       character(1), pointer :: f1
+     end function f1
+     function f2 ()
+       character(2), pointer :: f2
+     end function f2
+  end interface
+
+  character(1),    target  ::  p1
+  character(1),    pointer ::  q1(:)
+  character(2),    pointer ::  q2(:)
+  type(t)  :: u
+  type(t2) :: u2
+  type(td) :: v
+  u  = t(p1)    ! { dg-error "Unequal character lengths" }
+  u  = t(f1())  ! { dg-error "Unequal character lengths" }
+  u  = t(f2())  ! OK
+  u2 = t2(q1)   ! { dg-error "Unequal character lengths" }
+  u2 = t2(q2)   ! OK
+  v  = td(p1)   ! OK
+  v  = td(f1()) ! OK
+end
--
2.34.1