[RFC,v3,3/3] c: Add __lengthof__() operator

Message ID 20240803231702.89150-4-alx@kernel.org
State New
Series c: Add __lengthof__ operator

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm      success  Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm      success  Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64  success  Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64  success  Test passed

Commit Message

Alejandro Colomar Aug. 3, 2024, 11:17 p.m. UTC
  This operator is similar to sizeof() but can only be applied to an
array, and returns its length (number of elements).
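
For comparison, the value described above is what the traditional sizeof division computes today.  A minimal sketch in standard C follows; the NELEMS macro and the function names are only illustrative and are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative macro (not part of the patch): number of elements of an
   array, computed with the usual sizeof division.  Unlike the proposed
   operator, it silently accepts a pointer and yields a wrong result.  */
#define NELEMS(a)  (sizeof(a) / sizeof((a)[0]))

/* Fixed-size array: the length is known at compile time.  */
static size_t
fixed_len(void)
{
	int a[42];

	return NELEMS(a);
}

/* VLA: the length is only known at run time.  */
static size_t
vla_len(size_t n)
{
	int v[n];

	return NELEMS(v);
}

/* Only the top-level dimension counts: 3, not 3 * 5.  */
static size_t
nested_len(void)
{
	char c[3][5];

	return NELEMS(c);
}

static void
check_lengths(void)
{
	assert(fixed_len() == 42);
	assert(vla_len(7) == 7);
	assert(nested_len() == 3);
}
```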

TO BE DECIDED BEFORE MERGING:

	It would be better to not evaluate the operand if the top-level
	array is not a VLA.

FUTURE DIRECTIONS:

	We could make it work with array parameters to functions, and
	somehow magically return the length designator of the array,
	regardless of it being really a pointer.
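
The reason this needs "magic" can be shown in standard C11: a parameter declared with array syntax is adjusted to a pointer, so the length designator is not part of its type.  A small sketch, with illustrative function names that do not come from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* The parameter is declared with array syntax and a length designator,
   but its type is adjusted to 'int *'; '&a' is therefore 'int **'.  */
static int
param_is_pointer(size_t n, int a[n])
{
	(void) n;
	return _Generic(&a, int **: 1, default: 0);
}

/* A real array object keeps its length in its type.  */
static int
local_is_array(void)
{
	int b[3] = {0};

	return _Generic(&b, int (*)[3]: 1, default: 0);
}

static void
check_decay(void)
{
	int x[3] = {1, 2, 3};

	assert(param_is_pointer(3, x) == 1);
	assert(local_is_array() == 1);
}
```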

Link: <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2529.pdf>
Cc: Xavier Del Campo Romero <xavi.dcr@tutanota.com>
Cc: Gabriel Ravier <gabravier@gmail.com>
Cc: Martin Uecker <uecker@tugraz.at>
Cc: Joseph Myers <josmyers@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 gcc/c-family/c-common.cc  | 26 +++++++++++
 gcc/c-family/c-common.def |  3 ++
 gcc/c-family/c-common.h   |  2 +
 gcc/c/c-decl.cc           | 20 ++++++---
 gcc/c/c-parser.cc         | 61 +++++++++++++++++++------
 gcc/c/c-tree.h            |  4 ++
 gcc/c/c-typeck.cc         | 95 +++++++++++++++++++++++++++++++++++++--
 gcc/cp/operators.def      |  1 +
 gcc/doc/extend.texi       | 12 +++++
 gcc/target.h              |  3 ++
 10 files changed, 203 insertions(+), 24 deletions(-)
  

Comments

Martin Uecker Aug. 4, 2024, 5:38 a.m. UTC | #1
On Sunday, 04.08.2024 at 01:17 +0200, Alejandro Colomar wrote:


> 
> FUTURE DIRECTIONS:
> 
> 	We could make it work with array parameters to functions, and
> 	somehow magically return the length designator of the array,
> 	regardless of it being really a pointer.

And maybe flexible array members with "counted_by" attribute.



> +
> +/* Implement the lengthof keyword: Return the length of an array,
> +   that is, the number of elements in the array.  */
> +
> +tree
> +c_lengthof_type (location_t loc, tree type)
> +{
> +  enum tree_code type_code;
> +
> +  type_code = TREE_CODE (type);
> +  if (!COMPLETE_TYPE_P (type))
> +    {
> +      error_at (loc,
> +		"invalid application of %<lengthof%> to incomplete type %qT",
> +		type);
> +      return error_mark_node;
> +    }
> +  if (type_code != ARRAY_TYPE)
> +    {
> +      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> +      return error_mark_node;
> +    }

I would swap these two errors, because the first is more specific and
less helpful if you pass an incomplete struct, where it would be better
to get the second error.
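
The effect of the ordering can be seen in a toy model.  Everything below (the toy_type struct, the enums, the function names) is hypothetical and only mimics the control flow of the two checks; it is not GCC code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for tree codes and diagnostics.  */
enum toy_code { TOY_ARRAY, TOY_STRUCT, TOY_INT };
enum toy_diag { DIAG_NONE, DIAG_INCOMPLETE, DIAG_NOT_ARRAY };

struct toy_type {
	enum toy_code code;
	bool          complete;
};

/* Order as posted in v3: completeness first, then array-ness.  */
static enum toy_diag
check_v3(struct toy_type t)
{
	if (!t.complete)
		return DIAG_INCOMPLETE;
	if (t.code != TOY_ARRAY)
		return DIAG_NOT_ARRAY;
	return DIAG_NONE;
}

/* Swapped order: array-ness first, then completeness.  */
static enum toy_diag
check_swapped(struct toy_type t)
{
	if (t.code != TOY_ARRAY)
		return DIAG_NOT_ARRAY;
	if (!t.complete)
		return DIAG_INCOMPLETE;
	return DIAG_NONE;
}

static void
check_order(void)
{
	struct toy_type incomplete_struct = { TOY_STRUCT, false };
	struct toy_type incomplete_array  = { TOY_ARRAY, false };

	/* An incomplete struct is reported as "not an array" only with
	   the swapped order.  */
	assert(check_v3(incomplete_struct) == DIAG_INCOMPLETE);
	assert(check_swapped(incomplete_struct) == DIAG_NOT_ARRAY);

	/* An incomplete array gets the same diagnostic either way.  */
	assert(check_v3(incomplete_array) == DIAG_INCOMPLETE);
	assert(check_swapped(incomplete_array) == DIAG_INCOMPLETE);
}
```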

Martin

> +
> +  return array_type_nelts_top (type);
> +}
  
Alejandro Colomar Aug. 4, 2024, 8:25 a.m. UTC | #2
Hi Martin,

On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> On Sunday, 04.08.2024 at 01:17 +0200, Alejandro Colomar wrote:
> > 
> > FUTURE DIRECTIONS:
> > 
> > 	We could make it work with array parameters to functions, and
> > 	somehow magically return the length designator of the array,
> > 	regardless of it being really a pointer.
> 
> And maybe flexible array members with "counted_by" attribute.

Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
that in this patch set.

> > +
> > +/* Implement the lengthof keyword: Return the length of an array,
> > +   that is, the number of elements in the array.  */
> > +
> > +tree
> > +c_lengthof_type (location_t loc, tree type)
> > +{
> > +  enum tree_code type_code;
> > +
> > +  type_code = TREE_CODE (type);
> > +  if (!COMPLETE_TYPE_P (type))
> > +    {
> > +      error_at (loc,
> > +		"invalid application of %<lengthof%> to incomplete type %qT",
> > +		type);
> > +      return error_mark_node;
> > +    }
> > +  if (type_code != ARRAY_TYPE)
> > +    {
> > +      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> > +      return error_mark_node;
> > +    }
> 
> I would swap these two errors, because the first is more specific and
> less helpful if you pass an incomplete struct, where it would be better
> to get the second error.

Agree.

BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
within array_type_nelts_minus_one().  What code triggers that condition?
Am I missing error handling for that?  Thanks!

Have a lovely day!
Alex

> 
> Martin
> 
> > +
> > +  return array_type_nelts_top (type);
> > +}
>
  
Martin Uecker Aug. 4, 2024, 9:39 a.m. UTC | #3
On Sunday, 04.08.2024 at 10:25 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > On Sunday, 04.08.2024 at 01:17 +0200, Alejandro Colomar wrote:
> > > 
> > > FUTURE DIRECTIONS:
> > > 
> > > 	We could make it work with array parameters to functions, and
> > > 	somehow magically return the length designator of the array,
> > > 	regardless of it being really a pointer.
> > 
> > And maybe flexible array members with "counted_by" attribute.
> 
> Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> that in this patch set.

But maybe wait for this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

> 
> > > +
> > > +/* Implement the lengthof keyword: Return the length of an array,
> > > +   that is, the number of elements in the array.  */
> > > +
> > > +tree
> > > +c_lengthof_type (location_t loc, tree type)
> > > +{
> > > +  enum tree_code type_code;
> > > +
> > > +  type_code = TREE_CODE (type);
> > > +  if (!COMPLETE_TYPE_P (type))
> > > +    {
> > > +      error_at (loc,
> > > +		"invalid application of %<lengthof%> to incomplete type %qT",
> > > +		type);
> > > +      return error_mark_node;
> > > +    }
> > > +  if (type_code != ARRAY_TYPE)
> > > +    {
> > > +      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> > > +      return error_mark_node;
> > > +    }
> > 
> > I would swap these two errors, because the first is more specific and
> > less helpful if you pass an incomplete struct, where it would be better
> > to get the second error.
> 
> Agree.
> 
> BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> within array_type_nelts_minus_one().  What code triggers that condition?
> Am I missing error handling for that?  Thanks!

That condition is for incomplete arrays.  Basically, we have the
following different variants for arrays:

T[ ] incomplete: !TYPE_DOMAIN 
T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
  (ISO version T[0] has TYPE_SIZE == NULL_TREE)
T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

The first should give an error. The next two should work giving an
integer constant expression or run-time size, respectively. 
The ISO FAM case should also give an error, while the GNU FAM case
could return zero to be consistent with sizeof (not sure). The last
case should return a non-constant.
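
The difference between the constant-size and variable-size cases can be observed with sizeof in standard C11.  This is only a sketch with illustrative names; it uses the sizeof division as a stand-in for the new operator:

```c
#include <assert.h>
#include <stddef.h>

/* T[4]: the length is an integer constant expression, so it can be
   used in an enumerator and in a static assertion.  */
static size_t
constant_len(void)
{
	int a[4];
	enum { LEN = sizeof(a) / sizeof(a[0]) };

	_Static_assert(LEN == 4, "length is a constant expression");
	return LEN;
}

/* T[n]: the length is computed at run time and cannot be used where
   an integer constant expression is required.  */
static size_t
variable_len(size_t n)
{
	long v[n];

	return sizeof(v) / sizeof(v[0]);
}

/* T[]: an incomplete array such as 'extern int x[];' has no size, so
   applying sizeof to it is rejected at compile time.  */

static void
check_variants(void)
{
	assert(constant_len() == 4);
	assert(variable_len(9) == 9);
}
```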

If you reuse the sizeof code, it should be mostly correct, but
maybe the last case needs to be revisited. In the following
examples

void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);

the array '*x' should be a regular fixed size array in foo
but a VLA in 'bar'.


Martin
  
Alejandro Colomar Aug. 4, 2024, 4:40 p.m. UTC | #4
Hi Martin,

On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> On Sunday, 04.08.2024 at 10:25 +0200, Alejandro Colomar wrote:
> > Hi Martin,
> > 
> > On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > > On Sunday, 04.08.2024 at 01:17 +0200, Alejandro Colomar wrote:
> > > > 
> > > > FUTURE DIRECTIONS:
> > > > 
> > > > 	We could make it work with array parameters to functions, and
> > > > 	somehow magically return the length designator of the array,
> > > > 	regardless of it being really a pointer.
> > > 
> > > And maybe flexible array members with "counted_by" attribute.
> > 
> > Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> > that in this patch set.
> 
> But maybe wait for this:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016

Maybe; if I don't find a way to implement it now, their patches may give
me some tool to do it.  I'll still try to do it now, just to try.

I've drafted something by making build_counted_by_ref() an extern
function, and then I wrote

	diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
	index 9f5feb83345..875d471a1f4 100644
	--- a/gcc/c-family/c-common.cc
	+++ b/gcc/c-family/c-common.cc
	@@ -4078,6 +4078,7 @@ c_alignof_expr (location_t loc, tree expr)
	 tree
	 c_lengthof_type (location_t loc, tree type)
	 {
	+  tree counted_by;
	   enum tree_code type_code;
	 
	   type_code = TREE_CODE (type);
	@@ -4086,6 +4087,12 @@ c_lengthof_type (location_t loc, tree type)
	       error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
	       return error_mark_node;
	     }
	+
	+  counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES (type));
	+  if (counted_by)
	+    // XXX: How to build the counted_by ref without the parent struct type?
	+    return build_counted_by_ref (NULL, type, counted_by);
	+
	   if (!COMPLETE_TYPE_P (type))
	     {
	       error_at (loc,

But since I don't have the struct to which the FAM belongs, I don't know
how to get that working.  Do you have any idea?  Or maybe it's better to
just wait for those patches to be merged, and reuse their
infrastructure?
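
The difficulty can be illustrated from the user side: for a counted flexible array member, the length is stored in the enclosing object, not in the array type.  A minimal sketch follows; struct vec, vec_new(), vec_len() and the COUNTED_BY macro are hypothetical names, and the attribute is only applied when the compiler reports support for it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the attribute only where the compiler knows it.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(m)  __attribute__((counted_by(m)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(m)
#endif

struct vec {
	size_t n;                     /* element count of data[] */
	int    data[] COUNTED_BY(n);  /* ISO flexible array member */
};

/* The length can only be found through the parent object.  */
static size_t
vec_len(const struct vec *v)
{
	return v->n;
}

/* Allocate a zero-initialized vector with room for n elements.  */
static struct vec *
vec_new(size_t n)
{
	struct vec *v = calloc(1, sizeof(*v) + n * sizeof(v->data[0]));

	if (v != NULL)
		v->n = n;
	return v;
}

static void
check_vec(void)
{
	struct vec *v = vec_new(7);

	assert(v != NULL);
	assert(vec_len(v) == 7);
	free(v);
}
```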


> > > > +
> > > > +/* Implement the lengthof keyword: Return the length of an array,
> > > > +   that is, the number of elements in the array.  */
> > > > +
> > > > +tree
> > > > +c_lengthof_type (location_t loc, tree type)
> > > > +{
> > > > +  enum tree_code type_code;
> > > > +
> > > > +  type_code = TREE_CODE (type);
> > > > +  if (!COMPLETE_TYPE_P (type))
> > > > +    {
> > > > +      error_at (loc,
> > > > +		"invalid application of %<lengthof%> to incomplete type %qT",
> > > > +		type);
> > > > +      return error_mark_node;
> > > > +    }
> > > > +  if (type_code != ARRAY_TYPE)
> > > > +    {
> > > > +      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> > > > +      return error_mark_node;
> > > > +    }
> > > 
> > > I would swap these two errors, because the first is more specific and
> > > less helpful if you pass an incomplete struct, where it would be better
> > > to get the second error.
> > 
> > Agree.
> > 
> > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > within array_type_nelts_minus_one().  What code triggers that condition?
> > Am I missing error handling for that?  Thanks!
> 
> For incomplete arrays, basically we have the following different
> variants for arrays:

Thanks!  This list is very useful.

> T[ ] incomplete: !TYPE_DOMAIN 
> T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST

I guess that old flexible-array members using [1] are understood as
normal arrays [42], right?

> T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
>   (ISO version T[0] has TYPE_SIZE == NULL_TREE)

ISO version is T[], right?  Did ISO add support for zero-sized arrays?

> T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

This is not allowed outside of function-prototype scope.  And there it
decays to pointer way before reaching __lengthof__.  So we can't handle
that at the moment.  However, we'll have to keep it in mind for when you
do the change to keep the array types of function prototypes.  When that
happens, I guess we'll have to reject these arrays.

> 
> The first should give an error.

Agree (and already implemented []).

> The next two should work giving an
> integer constant expression or run-time size, respectively. 

Agree (and already implemented [42] and [n]).

> The ISO FAM case should also give an error,

Agree (and already implemented []).  (Although I didn't really
distinguish it from an incomplete type.)

Although, if attributed with counted_by, we'd like it to work.  But this
is not yet implemented.

> while the GNU fam case
> could return zero to be consistent with sizeof (not sure).

Agree.  I've made it consistent with sizeof, and it returns 0.

BTW, I'd like to add full support for zero-sized arrays in the language.
I was discussing it with Jens Gustedt last week, and will start some
discussion about it in a separate thread.

> The last 
> case should return a non-constant.

The last case [*] is only allowed in prototypes.  How should we get the
non-constant value?  It's just another way to say [], isn't it?

> If you reuse the sizeof code, it should be mostly correct, but
> maybe the last case needs to be revisited. In the following
> examples
> 
> void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 
> the array '*x' should be a regular fixed size array in foo
> but a VLA in 'bar'.

In the function prototype, it seems to be completely ignoring
__lengthof__, just as it ignores any expression, so I don't know if it's
working (I could try to print some debugging values to stderr from the
compiler, if we care about it).

	$ cat muecker.h 
	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
	void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
	void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
	$

I assume the code above is not reaching my code.

In the function definition, it doesn't accept [*] at all, so I can't
handle it:

	$ cat muecker.c
	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]) {}
	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]) {}
	void f(char (*a)[3][*], int (*x)[sizeof(*a)]) {}
	void b(char (*a)[*][3], int (*x)[sizeof(*a)]) {}
	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.c -S
	muecker.c:1:1: error: ‘[*]’ not allowed in other than function prototype scope
	    1 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]) {}
	      | ^~~~
	muecker.c:2:1: error: ‘[*]’ not allowed in other than function prototype scope
	    2 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]) {}
	      | ^~~~
	muecker.c:3:1: error: ‘[*]’ not allowed in other than function prototype scope
	    3 | void f(char (*a)[3][*], int (*x)[sizeof(*a)]) {}
	      | ^~~~
	muecker.c:4:1: error: ‘[*]’ not allowed in other than function prototype scope
	    4 | void b(char (*a)[*][3], int (*x)[sizeof(*a)]) {}
	      | ^~~~


Have a lovely day!
Alex

> Martin
  
Alejandro Colomar Aug. 4, 2024, 4:43 p.m. UTC | #5
On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > The last 
> > case should return a non-constant.
> 
> The last case [*] is only allowed in prototypes.  How should we get the
> non-constant value?  It's just another way to say [], isn't it?
> 
> > If you reuse the sizeof code, it should be mostly correct, but
> > maybe the last case needs to be revisited. In the following
> > examples
> > 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 
> > the array '*x' should be a regular fixed size array in foo
> > but a VLA in 'bar'.
> 
> In the function prototype, it seems to be completely ignoring
> __lengthof__, just as it ignores any expression, so I don't know if it's
> working (I could try to print some debugging values to stderr from the
> compiler, if we care about it).

Huh, no, my bad.  It must be using the lengthof value.  It needs to
match pointers to arrays of a given size.  I'll test this.

> 
> 	$ cat muecker.h 
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> 	void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> 	$
> 
> I assume the code above is not reaching my code.
  
Martin Uecker Aug. 4, 2024, 5:28 p.m. UTC | #6
Hi Alex,

On Sunday, 04.08.2024 at 18:40 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > On Sunday, 04.08.2024 at 10:25 +0200, Alejandro Colomar wrote:
> > > Hi Martin,
> > > 
> > > On Sun, Aug 04, 2024 at 07:38:49AM GMT, Martin Uecker wrote:
> > > > On Sunday, 04.08.2024 at 01:17 +0200, Alejandro Colomar wrote:
> > > > > 
> > > > > FUTURE DIRECTIONS:
> > > > > 
> > > > > 	We could make it work with array parameters to functions, and
> > > > > 	somehow magically return the length designator of the array,
> > > > > 	regardless of it being really a pointer.
> > > > 
> > > > And maybe flexible array members with "counted_by" attribute.
> > > 
> > > Hmmm; I didn't know that one.  Indeed.  I'll have a look at implementing
> > > that in this patch set.
> > 
> > But maybe wait for this:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016
> 
> Maybe; if I don't find a way to implement it now, their patches may give
> me some tool to do it.  I'll still try to do it now, just to try.
> 
> I've drafted something by making build_counted_by_ref() an extern
> function, and then I wrote
> 
> 	diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
> 	index 9f5feb83345..875d471a1f4 100644
> 	--- a/gcc/c-family/c-common.cc
> 	+++ b/gcc/c-family/c-common.cc
> 	@@ -4078,6 +4078,7 @@ c_alignof_expr (location_t loc, tree expr)
> 	 tree
> 	 c_lengthof_type (location_t loc, tree type)
> 	 {
> 	+  tree counted_by;
> 	   enum tree_code type_code;
> 	 
> 	   type_code = TREE_CODE (type);
> 	@@ -4086,6 +4087,12 @@ c_lengthof_type (location_t loc, tree type)
> 	       error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> 	       return error_mark_node;
> 	     }
> 	+
> 	+  counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES (type));
> 	+  if (counted_by)
> 	+    // XXX: How to build the counted_by ref without the parent struct type?
> 	+    return build_counted_by_ref (NULL, type, counted_by);
> 	+
> 	   if (!COMPLETE_TYPE_P (type))
> 	     {
> 	       error_at (loc,
> 
> But since I don't have the struct to which the FAM belongs, I don't know
> how to get that working.  Do you have any idea?  Or maybe it's better to
> just wait for those patches to be merged, and reuse their
> infrastructure?

I would wait, also not to duplicate work.

> 
> > > > > +
> > > > > +/* Implement the lengthof keyword: Return the length of an array,
> > > > > +   that is, the number of elements in the array.  */
> > > > > +
> > > > > +tree
> > > > > +c_lengthof_type (location_t loc, tree type)
> > > > > +{
> > > > > +  enum tree_code type_code;
> > > > > +
> > > > > +  type_code = TREE_CODE (type);
> > > > > +  if (!COMPLETE_TYPE_P (type))
> > > > > +    {
> > > > > +      error_at (loc,
> > > > > +		"invalid application of %<lengthof%> to incomplete type %qT",
> > > > > +		type);
> > > > > +      return error_mark_node;
> > > > > +    }
> > > > > +  if (type_code != ARRAY_TYPE)
> > > > > +    {
> > > > > +      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
> > > > > +      return error_mark_node;
> > > > > +    }
> > > > 
> > > > I would swap these two errors, because the first is more specific and
> > > > less helpful if you pass an incomplete struct, where it would be better
> > > > to get the second error.
> > > 
> > > Agree.
> > > 
> > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > Am I missing error handling for that?  Thanks!
> > 
> > For incomplete arrays, basically we have the following different
> > variants for arrays:
> 
> Thanks!  This list is very useful.
> 
> > T[ ] incomplete: !TYPE_DOMAIN 
> > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> 
> I guess that old flexible-array members using [1] are understood as
> normal arrays [42], right?

I think so.

> 
> > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> 
> ISO version is T[], right?  Did ISO add support for zero-sized arrays?

No, my fault. ISO FAMs are T[].

> 
> > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> 
> This is not allowed outside of function-prototype scope.  And there it
> decays to pointer way before reaching __lengthof__.  So we can't handle
> that at the moment.  

But __lengthof__ can occur at function prototype scope,
although it is then not evaluated.

> However, we'll have to keep it in mind for when you
> do the change to keep the array types of function prototypes.  When that
> happens, I guess we'll have to reject these arrays.

I am not yet sure this will work, but let's see.
> 
> > 
> > The first should give an error.
> 
> Agree (and already implemented []).
> 
> > The next two should work giving an
> > integer constant expression or run-time size, respectively. 
> 
> Agree (and already implemented [42] and [n]).
> 
> > The ISO FAM case should also give an error,
> 
> Agree (and already implemented []).  (Although I didn't really
> distinguish it from an incomplete type.)
> 
> Although, if attributed with counted_by, we'd like it to work.  But this
> is not yet implemented.
> 
> > while the GNU fam case
> > could return zero to be consistent with sizeof (not sure).
> 
> Agree.  I've made it consistent with sizeof, and it returns 0.
> 
> BTW, I'd like to add full support for zero-sized arrays in the language.
> I was discussing it with Jens Gustedt last week, and will start some
> discussion about it in a separate thread.

Ok, I would support this, but there are plenty of issues.


> > The last 
> > case should return a non-constant.
> 
> The last case [*] is only allowed in prototypes.  How should we get the
> non-constant value?  It's just another way to say [], isn't it?

No, it is the same as [n] but can be used only where the
length is not needed because it is never evaluated.  So
you do not get a value at run-time, but as shown below it
matters that the value you would get is not an integer constant.

Basically, you need to check whether the first dimension is
variable and this case should be handled correctly.  You
can look at comptypes_internal etc. to see how this could be done.
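
A sketch of how [*] relates to [n] in standard C99: the prototype leaves the length unspecified, the definition supplies it, and pointers to arrays of any length are accepted because the parameter's length is not an integer constant.  The function names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/* Prototype scope: the length of '*a' is variable but unspecified,
   and nothing is evaluated here.  */
static size_t count_zeros(size_t n, int (*a)[*]);

/* Definition: the same parameter with a length that is evaluated
   at run time.  */
static size_t
count_zeros(size_t n, int (*a)[n])
{
	size_t zeros = 0;

	for (size_t i = 0; i < n; i++)
		zeros += ((*a)[i] == 0);
	return zeros;
}

static void
check_star(void)
{
	int i3[3] = {0, 1, 0};
	int i5[5] = {0};

	/* Both pointer types are compatible with 'int (*)[n]' because the
	   length in the parameter type is not an integer constant.  */
	assert(count_zeros(3, &i3) == 2);
	assert(count_zeros(5, &i5) == 5);
}
```

Passing a length that does not match the actual array would still compile, which is the run-time undefined behavior discussed in this thread.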

Martin

> 
> > If you reuse the sizeof code, it should be mostly correct, but
> > maybe the last case needs to be revisited. In the following
> > examples
> > 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 
> > the array '*x' should be a regular fixed size array in foo
> > but a VLA in 'bar'.
> 
> In the function prototype, it seems to be completely ignoring
> __lengthof__, just as it ignores any expression, so I don't know if it's
> working (I could try to print some debugging values to stderr from the
> compiler, if we care about it).
> 
> 	$ cat muecker.h 
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> 	void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> 	$
> 
> I assume the code above is not reaching my code.
> 
> In the function definition, it doesn't accept [*] at all, so I can't
> handle it:
> 
> 	$ cat muecker.c
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]) {}
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]) {}
> 	void f(char (*a)[3][*], int (*x)[sizeof(*a)]) {}
> 	void b(char (*a)[*][3], int (*x)[sizeof(*a)]) {}
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.c -S
> 	muecker.c:1:1: error: ‘[*]’ not allowed in other than function prototype scope
> 	    1 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]) {}
> 	      | ^~~~
> 	muecker.c:2:1: error: ‘[*]’ not allowed in other than function prototype scope
> 	    2 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]) {}
> 	      | ^~~~
> 	muecker.c:3:1: error: ‘[*]’ not allowed in other than function prototype scope
> 	    3 | void f(char (*a)[3][*], int (*x)[sizeof(*a)]) {}
> 	      | ^~~~
> 	muecker.c:4:1: error: ‘[*]’ not allowed in other than function prototype scope
> 	    4 | void b(char (*a)[*][3], int (*x)[sizeof(*a)]) {}
> 	      | ^~~~
> 
> 
> Have a lovely day!
> Alex
> 
> > Martin
>
  
Alejandro Colomar Aug. 4, 2024, 5:49 p.m. UTC | #7
Hi Martin,

On Sun, Aug 04, 2024 at 06:43:46PM GMT, Alejandro Colomar wrote:
> On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > > The last 
> > > case should return a non-constant.
> > 
> > The last case [*] is only allowed in prototypes.  How should we get the
> > non-constant value?  It's just another way to say [], isn't it?
> > 
> > > If you reuse the sizeof code, it should be mostly correct, but
> > > maybe the last case needs to be revisited. In the following
> > > examples
> > > 
> > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > 
> > > the array '*x' should be a regular fixed size array in foo
> > > but a VLA in 'bar'.
> > 
> > In the function prototype, it seems to be completely ignoring
> > __lengthof__, just as it ignores any expression, so I don't know if it's
> > working (I could try to print some debugging values to stderr from the
> > compiler, if we care about it).
> 
> Huh, no, my bad.  It must be using the lengthof value.  It needs to
> match pointers to arrays of a given size.  I'll test this.

Is this missing diagnostics?

	$ cat star.c 
	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
	void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
	void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);

	int
	main(void)
	{
		int  i3[3];
		int  i5[5];
		char c35[3][5];
		char c53[5][3];

		foo(&c35, &i3);
		foo(&c35, &i5);  // I'd expect this to fail
		bar(&c53, &i3);  // I'd expect this to fail
		bar(&c53, &i5);

		foos(&c35, &i3);
		foos(&c35, &i5);  // I'd expect this to fail
		bars(&c53, &i3);  // I'd expect this to fail
		bars(&c53, &i5);
	}
	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
	$

Cheers,
Alex

> 
> > 
> > 	$ cat muecker.h 
> > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 	void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > 	void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > 	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> > 	$
> > 
> > I assume the code above is not reaching my code.
> 
> -- 
> <https://www.alejandro-colomar.es/>
  
Martin Uecker Aug. 4, 2024, 6:02 p.m. UTC | #8
Hi Alex,

On Sunday, 04.08.2024 at 19:49 +0200, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 06:43:46PM GMT, Alejandro Colomar wrote:
> > On Sun, Aug 04, 2024 at 06:40:14PM GMT, Alejandro Colomar wrote:
> > > > The last 
> > > > case should return a non-constant.
> > > 
> > > The last case [*] is only allowed in prototypes.  How should we get the
> > > non-constant value?  It's just another way to say [], isn't it?
> > > 
> > > > If you reuse the sizeof code, it should be mostly correct, but
> > > > maybe the last case needs to be revisited. In the following
> > > > examples
> > > > 
> > > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > > void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > > 
> > > > the array '*x' should be a regular fixed size array in foo
> > > > but a VLA in 'bar'.
> > > 
> > > In the function prototype, it seems to be completely ignoring
> > > __lengthof__, just as it ignores any expression, so I don't know if it's
> > > working (I could try to print some debugging values to stderr from the
> > > compiler, if we care about it).
> > 
> > Huh, no, my bad.  It must be using the lengthof value.  It needs to
> > match pointers to arrays of a given size.  I'll test this.
> 
> Is this missing diagnostics?
> 
> 	$ cat star.c 
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
> 	void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> 
> 	int
> 	main(void)
> 	{
> 		int  i3[3];
> 		int  i5[5];
> 		char c35[3][5];
> 		char c53[5][3];
> 
> 		foo(&c35, &i3);
> 		foo(&c35, &i5);  // I'd expect this to fail

Yes, this should fail. The int (*)[5] is not
compatible with int(*)[3].

> 		bar(&c53, &i3);  // I'd expect this to fail

This is no constraint violation, because int (*)[5] is
compatible with int (*)[*], so this needs to be accepted.

It is then UB at run-time and the patches I posted recently
would catch this.  When possible, a compile time warning 
would be nice and I am also looking into this.

It would also be good if we could allow a compiler to
reject this at compile time... also something I am
thinking about.

> 		bar(&c53, &i5);
> 
> 		foos(&c35, &i3);
> 		foos(&c35, &i5);  // I'd expect this to fail
> 		bars(&c53, &i3);  // I'd expect this to fail

These are both okay, because the sizeof is not an integer
constant expression (both int[*][3] and int[3][*] have
variable size), so the last argument has to be compatible
with int[*] which they both are.  Both would trigger
run-time UB then because the size is then 15.
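
The number 15 can be checked with definitions that replace [*] by a run-time length.  This is only a sketch in standard C; the function names are illustrative and the extra length parameter is an assumption needed to make the definitions valid:

```c
#include <assert.h>
#include <stddef.h>

/* '*a' has type char[3][n]: variable size, so sizeof is not an
   integer constant expression.  */
static size_t
size_3_by_n(size_t n, char (*a)[3][n])
{
	return sizeof(*a);
}

/* '*a' has type char[n][3]: variable size as well.  */
static size_t
size_n_by_3(size_t n, char (*a)[n][3])
{
	return sizeof(*a);
}

static void
check_sizes(void)
{
	char c35[3][5] = {{0}};
	char c53[5][3] = {{0}};

	/* Both evaluate to 15 at run time, so neither int[3] nor int[5]
	   would match the expected int[15].  */
	assert(size_3_by_n(5, &c35) == 15);
	assert(size_n_by_3(5, &c53) == 15);
}
```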

Martin

> 		bars(&c53, &i5);
> 	}
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
> 	$
> 
> Cheers,
> Alex
> 
> > 
> > > 
> > > 	$ cat muecker.h 
> > > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > 	void f(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > > 	void b(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > > 	$ /opt/local/gnu/gcc/lengthof/bin/gcc muecker.h -S
> > > 	$
> > > 
> > > I assume the code above is not reaching my code.
> > 
> > -- 
> > <https://www.alejandro-colomar.es/>
> 
> 
>
  
Alejandro Colomar Aug. 4, 2024, 6:34 p.m. UTC | #9
On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> Hi Alex,

Hi Martin,

> > Is this missing diagnostics?
> > 
> > 	$ cat star.c 
> > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 	void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > 	void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > 
> > 	int
> > 	main(void)
> > 	{
> > 		int  i3[3];
> > 		int  i5[5];
> > 		char c35[3][5];
> > 		char c53[5][3];
> > 
> > 		foo(&c35, &i3);
> > 		foo(&c35, &i5);  // I'd expect this to fail
> 
> Yes, this should fail. The int (*)[5] is not
> compatible with int(*)[3].
> 
> > 		bar(&c53, &i3);  // I'd expect this to fail
> 
> This is no constraint violation, because int (*)[5] is
> compatible with int (*i)[*], so this needs to be accepted.

No constraint, but I'd expect a diagnostic from -Wextra (array-bounds?).

> It is then UB at run-time and the patches I posted recently

Can you please send a link to those patches?

> would catch this.  When possible, a compile time warning 
> would be nice and I am also looking into this.
> 
> It would also be good if we could allow a compiler to
> reject this at compile time... also something I am
> thinking about.

Thanks!

> 
> > 		bar(&c53, &i5);
> > 
> > 		foos(&c35, &i3);
> > 		foos(&c35, &i5);  // I'd expect this to fail
> > 		bars(&c53, &i3);  // I'd expect this to fail
> 
> These are both okay, because the sizeof is not an integer
> constant expression (both int[*][3] and int[3][*] have
> variable size), so the last argument has to be compatible
> with int[*] which they both are.  Both would trigger
> run-time UB then because the size is then 15.

D'oh!  I screwed it.  I wanted to have written this:

	$ cat star.c 
	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
	void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
	void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);

	int
	main(void)
	{
		int  i3[3];
		int  i5[5];
		char c35[3][5];
		char c53[5][3];

		foo(&c35, &i3);
		foo(&c35, &i5);  // I'd expect this to err
		bar(&c53, &i3);  // I'd expect this to warn
		bar(&c53, &i5);

		foo2(&c35, &i3);  // I'd expect this to warn
		foo2(&c35, &i5);
		bar2(&c53, &i3);
		//bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
	}
	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
	$ 
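
For reference, the sizes behind those expectations can be checked with
ordinary C99 VLAs.  This is only a sketch: the helper names are made up,
and the unspecified [*] is modelled by an explicit parameter n.

```c
#include <assert.h>
#include <stddef.h>

/* Models 'char (*a)[3][*]' with the inner length given as n.  */
static size_t
size_of_3_by_n(size_t n, char (*a)[3][n])
{
	(void) a;
	return sizeof(*a);     /* 3 * n bytes, computed at run time */
}

/* Same parameter type, but measuring one row.  */
static size_t
size_of_row_3_by_n(size_t n, char (*a)[3][n])
{
	(void) a;
	return sizeof(**a);    /* n bytes: still variable */
}

/* Models 'char (*a)[*][3]': each row is a plain char[3].  */
static size_t
size_of_row_n_by_3(size_t n, char (*a)[n][3])
{
	(void) a;
	return sizeof(**a);    /* 3 bytes: does not depend on n */
}
```

With c35 and n = 5 the first two give 15 and 5; with c53 the last one
gives 3, which matches bar2() being the only case where the bound is
constant.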


> 
> Martin

Cheers,
Alex
  
Martin Uecker Aug. 4, 2024, 7:10 p.m. UTC | #10
Am Sonntag, dem 04.08.2024 um 20:34 +0200 schrieb Alejandro Colomar:
> On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > Hi Alex,
> 
> Hi Martin,
> 
> > > Is this missing diagnostics?
> > > 
> > > 	$ cat star.c 
> > > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > > 	void foos(char (*a)[3][*], int (*x)[sizeof(*a)]);
> > > 	void bars(char (*a)[*][3], int (*x)[sizeof(*a)]);
> > > 
> > > 	int
> > > 	main(void)
> > > 	{
> > > 		int  i3[3];
> > > 		int  i5[5];
> > > 		char c35[3][5];
> > > 		char c53[5][3];
> > > 
> > > 		foo(&c35, &i3);
> > > 		foo(&c35, &i5);  // I'd expect this to fail
> > 
> > Yes, this should fail. The int (*)[5] is not
> > compatible with int(*)[3].
> > 
> > > 		bar(&c53, &i3);  // I'd expect this to fail
> > 
> > This is no constraint violation, because int (*)[5] is
> > compatible with int (*i)[*], so this needs to be accepted.
> 
> No constraint, but I'd expect a diagnostic from -Wextra (array-bounds?).
> 
> > It is then UB at run-time and the patches I posted recently
> 
> Can you please send a link to those patches?

https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657253.html


Martin


> 
> > would catch this.  When possible, a compile time warning 
> > would be nice and I am also looking into this.
> > 
> > It would also be good if we could allow a compiler to
> > reject this at compile time... also something I am
> > thinking about.
> 
> Thanks!
> 
> > 
> > > 		bar(&c53, &i5);
> > > 
> > > 		foos(&c35, &i3);
> > > 		foos(&c35, &i5);  // I'd expect this to fail
> > > 		bars(&c53, &i3);  // I'd expect this to fail
> > 
> > These are both okay, because the sizeof is not an integer
> > constant expression (both int[*][3] and int[3][*] have
> > variable size), so the last argument has to be compatible
> > with int[*] which they both are.  Both would trigger
> > run-time UB then because the size is then 15.
> 
> D'oh!  I screwed it.  I wanted to have written this:
> 
> 	$ cat star.c 
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
> 	void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);
> 
> 	int
> 	main(void)
> 	{
> 		int  i3[3];
> 		int  i5[5];
> 		char c35[3][5];
> 		char c53[5][3];
> 
> 		foo(&c35, &i3);
> 		foo(&c35, &i5);  // I'd expect this to err
> 		bar(&c53, &i3);  // I'd expect this to warn
> 		bar(&c53, &i5);
> 
> 		foo2(&c35, &i3);  // I'd expect this to warn
> 		foo2(&c35, &i5);
> 		bar2(&c53, &i3);
> 		//bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
> 	}
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
> 	$ 
> 
> 
> > 
> > Martin
> 
> Cheers,
> Alex
>
  
Alejandro Colomar Aug. 5, 2024, 9:45 a.m. UTC | #11
[CC += Kees, Qing]

Hi Joseph,

On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> D'oh!  I screwed it.  I wanted to have written this:
> 
> 	$ cat star.c 
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);

I think this answers your question of whether we want __lengthof__ to
evaluate its operand if the top-level array is non-VLA but an inner
array is VLA.

We clearly want it to not evaluate, because we want this __lengthof__
to be a constant expression, ...

> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
> 	void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);
> 
> 	int
> 	main(void)
> 	{
> 		int  i3[3];
> 		int  i5[5];
> 		char c35[3][5];
> 		char c53[5][3];
> 
> 		foo(&c35, &i3);
> 		foo(&c35, &i5);  // I'd expect this to err

... and thus cause a compile-time error here
(-Wincompatible-pointer-types).

I suspect we need to modify array_type_nelts_minus_one() for that; I'm
going to investigate.
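
As a point of comparison, here is a sketch of the traditional
sizeof-division idiom (NELEMS is a made-up name for it).  It gives the
right number for char[3][n], but only at run time, because sizeof has
to look at the variable inner dimension:

```c
#include <assert.h>
#include <stddef.h>

/* Traditional idiom; NELEMS is an illustrative name.  */
#define NELEMS(a)  (sizeof(a) / sizeof((a)[0]))

static char c35[3][5];

/* No dimension is variable, so this is an integer constant expression.  */
_Static_assert(NELEMS(c35) == 3, "constant when no dimension is variable");

/* n must be greater than zero.  */
static size_t
outer_len_with_inner_vla(size_t n)
{
	char v[3][n];

	/* (3 * n) / n == 3, but both sizeof operands are VLAs, so the
	   division is not an integer constant expression.  */
	return NELEMS(v);
}
```

The value is 3 in both cases; the difference is only whether the
compiler may treat it as a constant in a declarator such as
int (*x)[...].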

Have a lovely day!
Alex

> 		bar(&c53, &i3);  // I'd expect this to warn
> 		bar(&c53, &i5);
> 
> 		foo2(&c35, &i3);  // I'd expect this to warn
> 		foo2(&c35, &i5);
> 		bar2(&c53, &i3);
> 		//bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
> 	}
> 	$ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
> 	$
  
Jakub Jelinek Aug. 5, 2024, 9:50 a.m. UTC | #12
On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> [CC += Kees, Qing]
> 
> Hi Joseph,
> 
> On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > D'oh!  I screwed it.  I wanted to have written this:
> > 
> > 	$ cat star.c 
> > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 
> I think this answers your question of whether we want __lengthof__ to
> evaluate its operand if the top-level array is non-VLA but an inner
> array is VLA.
> 
> We clearly want it to not evaluate, because we want this __lengthof__
> to be a constant expression, ...

But if you don't evaluate the argument, you can't handle counted_by.
Because for counted_by you need the expression (the object on which it is
used).

	Jakub
  
Martin Uecker Aug. 5, 2024, 10:33 a.m. UTC | #13
Am Montag, dem 05.08.2024 um 11:50 +0200 schrieb Jakub Jelinek:
> On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> > [CC += Kees, Qing]
> > 
> > Hi Joseph,
> > 
> > On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > > D'oh!  I screwed it.  I wanted to have written this:
> > > 
> > > 	$ cat star.c 
> > > 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > 
> > I think this answers your question of whether we want __lengthof__ to
> > evaluate its operand if the top-level array is non-VLA but an inner
> > array is VLA.
> > 
> > We clearly want it to not evaluate, because we want this __lengthof__
> > to be a constant expression, ...
> 
> But if you don't evaluate the argument, you can't handle counted_by.
> Because for counted_by you need the expression (the object on which it is
> used).

You would skip the evaluation only when the size is an integer constant
expression, which would not apply to counted_by.
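
For comparison, sizeof already draws a similar line in standard C: its
operand is evaluated only when the operand's type is a VLA.  A small
sketch (the struct and function names are made up for the example):

```c
#include <assert.h>
#include <stddef.h>

struct eval_result {
	int     i_after;
	int     n_after;
	size_t  fixed_size;
	size_t  vla_size;
};

static struct eval_result
sizeof_evaluation(void)
{
	int  fixed[3] = {0};
	int  i = 0;
	int  n = 4;
	struct eval_result  r;

	/* Operand has the non-VLA type int: not evaluated, i stays 0.  */
	r.fixed_size = sizeof(fixed[i++]);

	/* Operand is the VLA type int[n++]: the size expression is
	   evaluated, so n becomes 5 and the size is 4 * sizeof(int).  */
	r.vla_size = sizeof(int[n++]);

	r.i_after = i;
	r.n_after = n;
	return r;
}
```

The difference for __lengthof__ would be that only the outermost
dimension decides, whereas sizeof has to consider the whole type.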

Martin
  
Alejandro Colomar Aug. 5, 2024, 11:55 a.m. UTC | #14
Hi Martin,

On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > within array_type_nelts_minus_one().  What code triggers that condition?
> > Am I missing error handling for that?  Thanks!
> 
> For incomplete arrays, basically we have the following different
> variants for arrays:
> 
> T[ ] incomplete: !TYPE_DOMAIN 
> T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
>   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

Could you describe the following types?  I've repeated the ones you
already described, deduplicated some that have a different meaning in
different contexts, and added some multi-dimensional arrays.

T[ ]     (incomplete type; function parameter)
T[ ]     (flexible array member)
T[0]     (zero-size array)
T[0]     (GNU flexible array member)
T[1]     (old flexible array member)
T[7]     (constant size)
T[7][n]  (constant size with inner variable size)
T[7][*]  (constant size with inner unspecified size)
T[n]     (variable size)
T[*]     (unspecified size)

That would help with the [*] issues I'm investigating.  I think
array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
and I'd like to fix that.

Have a lovely day!
Alex
  
Alejandro Colomar Aug. 5, 2024, 11:57 a.m. UTC | #15
On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > Am I missing error handling for that?  Thanks!
> > 
> > For incomplete arrays, basically we have the following different
> > variants for arrays:
> > 
> > T[ ] incomplete: !TYPE_DOMAIN 
> > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> 
> Could you describe the following types?  I've repeated the ones you
> already described, deduplicated some that have a different meaning in
> different contexts, and added some multi-dimensional arrays.
> 
> T[ ]     (incomplete type; function parameter)
> T[ ]     (flexible array member)
> T[0]     (zero-size array)
> T[0]     (GNU flexible array member)
> T[1]     (old flexible array member)
> T[7]     (constant size)
> T[7][n]  (constant size with inner variable size)
> T[7][*]  (constant size with inner unspecified size)

And please also describe T[7][4], although I expect that to be just the
same as T[7].

> T[n]     (variable size)
> T[*]     (unspecified size)
> 
> That would help with the [*] issues I'm investigating.  I think
> array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> and I'd like to fix that.
> 
> Have a lovely day!
> Alex
  
Alejandro Colomar Aug. 5, 2024, 11:58 a.m. UTC | #16
On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > Hi Martin,
> > 
> > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > > Am I missing error handling for that?  Thanks!
> > > 
> > > For incomplete arrays, basically we have the following different
> > > variants for arrays:
> > > 
> > > T[ ] incomplete: !TYPE_DOMAIN 
> > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > 
> > Could you describe the following types?  I've repeated the ones you
> > already described, deduplicated some that have a different meaning in
> > different contexts, and added some multi-dimensional arrays.
> > 
> > T[ ]     (incomplete type; function parameter)
> > T[ ]     (flexible array member)
> > T[0]     (zero-size array)
> > T[0]     (GNU flexible array member)
> > T[1]     (old flexible array member)
> > T[7]     (constant size)
> > T[7][n]  (constant size with inner variable size)
> > T[7][*]  (constant size with inner unspecified size)
> 
> And please also describe T[7][4], although I expect that to be just the
> same as T[7].

And it would also be interesting to describe T[7][ ].

> 
> > T[n]     (variable size)
> > T[*]     (unspecified size)
> > 
> > That would help with the [*] issues I'm investigating.  I think
> > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > and I'd like to fix that.
> > 
> > Have a lovely day!
> > Alex
  
Alejandro Colomar Aug. 5, 2024, 11:59 a.m. UTC | #17
On Mon, Aug 05, 2024 at 01:58:18PM GMT, Alejandro Colomar wrote:
> On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> > On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > > Hi Martin,
> > > 
> > > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > > > Am I missing error handling for that?  Thanks!
> > > > 
> > > > For incomplete arrays, basically we have the following different
> > > > variants for arrays:
> > > > 
> > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > > 
> > > Could you describe the following types?  I've repeated the ones you
> > > already described, deduplicated some that have a different meaning in
> > > different contexts, and added some multi-dimensional arrays.
> > > 
> > > T[ ]     (incomplete type; function parameter)
> > > T[ ]     (flexible array member)
> > > T[0]     (zero-size array)
> > > T[0]     (GNU flexible array member)
> > > T[1]     (old flexible array member)
> > > T[7]     (constant size)
> > > T[7][n]  (constant size with inner variable size)
> > > T[7][*]  (constant size with inner unspecified size)
> > 
> > And please also describe T[7][4], although I expect that to be just the
> > same as T[7].
> 
> And it would also be interesting to describe T[7][ ].

And maybe also:

T[n][m]
T[n][*]
T[n][ ]
T[n][7]

> 
> > 
> > > T[n]     (variable size)
> > > T[*]     (unspecified size)
> > > 
> > > That would help with the [*] issues I'm investigating.  I think
> > > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > > and I'd like to fix that.
> > > 
> > > Have a lovely day!
> > > Alex
  
Martin Uecker Aug. 5, 2024, 1:35 p.m. UTC | #18
Am Montag, dem 05.08.2024 um 13:59 +0200 schrieb Alejandro Colomar:
> On Mon, Aug 05, 2024 at 01:58:18PM GMT, Alejandro Colomar wrote:
> > On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> > > On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > > > Hi Martin,
> > > > 
> > > > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > > > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > > > > Am I missing error handling for that?  Thanks!
> > > > > 
> > > > > For incomplete arrays, basically we have the following different
> > > > > variants for arrays:
> > > > > 
> > > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > > > 
> > > > Could you describe the following types?  I've repeated the ones you
> > > > already described, deduplicated some that have a different meaning in
> > > > different contexts, and added some multi-dimensional arrays.
> > > > 
> > > > T[ ]     (incomplete type; function parameter)
> > > > T[ ]     (flexible array member)
> > > > T[0]     (zero-size array)
> > > > T[0]     (GNU flexible array member)
> > > > T[1]     (old flexible array member)
> > > > T[7]     (constant size)
> > > > T[7][n]  (constant size with inner variable size)
> > > > T[7][*]  (constant size with inner unspecified size)
> > > 
> > > And please also describe T[7][4], although I expect that to be just the
> > > same as T[7].
> > 
> > And it would also be interesting to describe T[7][ ].
> 
> And maybe also:
> 
> T[n][m]
> T[n][*]
> T[n][ ]
> T[n][7]

I do not understand your question. What do you mean by
"describe the type"?

But I think you might be making it unnecessarily complicated.  It
should be sufficient to look at the outermost size.  You
can completely ignore whatever happens in the inner dimensions.
There should be three cases if I am not mistaken:

- incomplete (includes ISO FAM) -> error
- constant (includes GNU FAM) -> return fixed size
- variable (includes unspecified) -> evaluate the
argument and return the size, while making sure it is 
visibly non-constant.

To check that the array has a variable length, you can use
the same logic as in comptypes_internal (cf. d1_variable).
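
The three cases can be sketched as a toy model in plain C.  The struct
below is invented for illustration; its fields loosely mirror the
properties listed earlier (TYPE_DOMAIN, TYPE_MAX_VALUE,
C_TYPE_VARIABLE_SIZE, TYPE_SIZE) and it is not GCC's real
representation:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the outermost dimension of an array type.  */
struct outer_dim {
	bool  has_domain;     /* false for T[] */
	bool  has_max;        /* false for T[0], ISO FAM and T[*] */
	bool  max_is_const;   /* meaningful only if has_max */
	bool  variable_size;  /* models C_TYPE_VARIABLE_SIZE */
	bool  has_size;       /* false for an ISO flexible array member */
};

enum len_kind { LEN_ERROR, LEN_CONSTANT, LEN_VARIABLE };

static enum len_kind
classify_outer_dim(const struct outer_dim *d)
{
	if (!d->has_domain)
		return LEN_ERROR;               /* incomplete */
	if (d->has_max)                         /* T[7], T[7][n], T[n] */
		return d->max_is_const ? LEN_CONSTANT : LEN_VARIABLE;
	if (d->variable_size)
		return LEN_VARIABLE;            /* T[*], and also T[0][n] */
	return d->has_size ? LEN_CONSTANT : LEN_ERROR;
}
```

In this model T[7][n] is LEN_CONSTANT because the upper bound is checked
before the variable-size flag, and T[0][n] ends up as LEN_VARIABLE.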

It is possible that you can not properly distinguish between

int a[0][n];
int a[*][n];

those two cases. The logic will treat the first as the second.
I think this is ok for now.  All this array stuff should be 
simplified and refactored anyway, but this is for another time.


I am also not sure you even need to use array_type_nelts in C
because there is never a non-zero minimum size.


Martin

> 
> > 
> > > 
> > > > T[n]     (variable size)
> > > > T[*]     (unspecified size)
> > > > 
> > > > That would help with the [*] issues I'm investigating.  I think
> > > > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > > > and I'd like to fix that.
> > > > 
> > > > Have a lovely day!
> > > > Alex
  
Alejandro Colomar Aug. 5, 2024, 3:27 p.m. UTC | #19
Hi Martin,

On Mon, Aug 05, 2024 at 03:35:06PM GMT, Martin Uecker wrote:
> > > > > > For incomplete arrays, basically we have the following different
> > > > > > variants for arrays:
> > > > > > 
> > > > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > > > > 
> > > > > Could you describe the following types?  I've repeated the ones you
> > > > > already described, deduplicated some that have a different meaning in
> > > > > different contexts, and added some multi-dimensional arrays.
> > > > > 
> > > > > T[ ]     (incomplete type; function parameter)
> > > > > T[ ]     (flexible array member)
> > > > > T[0]     (zero-size array)
> > > > > T[0]     (GNU flexible array member)
> > > > > T[1]     (old flexible array member)
> > > > > T[7]     (constant size)
> > > > > T[7][n]  (constant size with inner variable size)
> > > > > T[7][*]  (constant size with inner unspecified size)
> > > > 
> > > > And please also describe T[7][4], although I expect that to be just the
> > > > same as T[7].
> > > 
> > > And it would also be interesting to describe T[7][ ].
> > 
> > And maybe also:
> > 
> > T[n][m]
> > T[n][*]
> > T[n][ ]
> > T[n][7]
> 
> I do not understand your question. What do you mean by
> "describe the type"?

I had in mind what you already did above, (e.g.,
T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST), but with a more
comprehensive list.  comptypes_internal() seems what I wanted.

> But I think you might be making it unnecessarily complicated.  It
> should be sufficient to look at the outermost size.  You
> can completely ignore whatever happens in the inner dimensions.
> There should be three cases if I am not mistaken:
> 
> - incomplete (includes ISO FAM) -> error
> - constant (includes GNU FAM) -> return fixed size
> - variable (includes unspecified) -> evaluate the
> argument and return the size, while making sure it is 
> visibly non-constant.
> 
> To check that the array has a variable length, you can use
> the same logic as in comptypes_internal (cf. d1_variable).

Hmmm, comptypes_internal() has taught me what I was asking here.
However, it seems to not be enough for what I actually need.

Here's my problem:

The array is correctly considered a fixed-length array.  I know it
because the following debugging code:

	+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
	+tree dom = TYPE_DOMAIN (type);
	+int zero = !TYPE_MAX_VALUE (dom);
	+fprintf(stderr, "ALX: zero: %d\n", zero);
	+int var0 = !zero
	+        && (TREE_CODE (TYPE_MIN_VALUE (dom)) != INTEGER_CST
	+               || TREE_CODE (TYPE_MAX_VALUE (dom)) != INTEGER_CST);
	+fprintf(stderr, "ALX: var: %d\n", var0);
	+int var = var0 || (zero && TYPE_LANG_FLAG_1(type));
	+fprintf(stderr, "ALX: var: %d\n", var);
	+  ret = array_type_nelts_top (type);
	+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);

prints:

	ALX: c_lengthof_type() 4098
	ALX: zero: 0
	ALX: var: 0
	ALX: var: 0
	ALX: c_lengthof_type() 4109

for
	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);

That differs from

	ALX: c_lengthof_type() 4098
	ALX: zero: 1
	ALX: var: 0
	ALX: var: 1
	ALX: c_lengthof_type() 4109

for
	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);

However, if I turn on -Wvla, both get a warning:

	len.c: At top level:
	len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
	  288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
	      | ^~~~
	len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
	  289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
	      | ^~~~

I suspect that the problem is in:

	$ grepc -tfd array_type_nelts_minus_one gcc
	gcc/tree.cc:tree
	array_type_nelts_minus_one (const_tree type)
	{
	  tree index_type, min, max;

	  /* If they did it with unspecified bounds, then we should have already
	     given an error about it before we got here.  */
	  if (! TYPE_DOMAIN (type))
	    return error_mark_node;

	  index_type = TYPE_DOMAIN (type);
	  min = TYPE_MIN_VALUE (index_type);
	  max = TYPE_MAX_VALUE (index_type);

	  /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
	  if (!max)
	    {
	      /* zero sized arrays are represented from C FE as complete types with
		 NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
		 them as min 0, max -1.  */
	      if (COMPLETE_TYPE_P (type)
		  && integer_zerop (TYPE_SIZE (type))
		  && integer_zerop (min))
		return build_int_cst (TREE_TYPE (min), -1);

	      return error_mark_node;
	    }

	  return (integer_zerop (min)
		  ? max
		  : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
	}

With some debugging code, I've seen that in the fixed-length case, this
reaches the last return (integer_zerop() is true, so it returns max),
which is exactly the same as with any normal fixed-length array.

In the variable-length case (i.e., [*][3]), it returns build_int_cst().

So, it seems my problem is that 'max' does not represent an integer
constant, even though we know it is.  Can we coerce it to an integer
constant somehow?  Or maybe it's some of the in_lengthof that's messing
with me?
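
To keep the control flow straight, here is a toy model of the function
quoted above, with plain integers standing in for trees.  The struct and
the names are made up; the only thing taken from the real code is the
order of the checks, and the "+ 1" step is my reading of what
array_type_nelts_top() adds on top:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy index domain: plain integers stand in for GCC trees.  */
struct index_domain {
	bool  has_max;      /* models TYPE_MAX_VALUE being non-null */
	bool  zero_sized;   /* models a complete type whose size is zero */
	long  min;
	long  max;
};

/* Returns false where the quoted function returns error_mark_node.
   A null pointer models a missing TYPE_DOMAIN.  */
static bool
nelts_minus_one(const struct index_domain *d, long *out)
{
	if (d == NULL)
		return false;
	if (!d->has_max) {
		if (d->zero_sized && d->min == 0) {
			*out = -1;
			return true;
		}
		return false;
	}
	*out = d->max - d->min;
	return true;
}

/* Number of elements: the value above plus one.  */
static bool
nelts(const struct index_domain *d, long *out)
{
	if (!nelts_minus_one(d, out))
		return false;
	*out += 1;
	return true;
}
```

In the model the result is constant whenever max is, so if the real
function returns the same tree as for a normal fixed-length array, the
loss of constness would have to happen somewhere after it.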

> 
> It is possible that you can not properly distinguish between
> 
> int a[0][n];
> int a[*][n];
> 
> those two cases. The logic will treat the first as the second.

Those can be distinguished.  [0] triggers the zero test, while [*]
triggers the second var test.

> I think this is ok for now.  All this array stuff should be 
> simplified and refactored anyway, but this is for another time.
> 
> 
> I am also not sure you even need to use array_type_nelts in C
> because there is never a non-zero minimum size.

How should I get the number of elements without array_type_nelts()?  Is
there any other existing way to get it?  It just had a good name that
matched my grep, but maybe I'm missing something easier.

Thanks!

Cheers,
Alex

> Martin
  
Martin Uecker Aug. 5, 2024, 4:05 p.m. UTC | #20
Am Montag, dem 05.08.2024 um 17:27 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
...

> > But I think you might be making it unnecessarily complicated.  It
> > should be sufficient to look at the outermost size.  You
> > can completely ignore whatever happens in the inner dimensions.
> > There should be three cases if I am not mistaken:
> > 
> > - incomplete (includes ISO FAM) -> error
> > - constant (includes GNU FAM) -> return fixed size
> > - variable (includes unspecified) -> evaluate the
> > argument and return the size, while making sure it is 
> > visibly non-constant.
> > 
> > To check that the array has a variable length, you can use
> > the same logic as in comptypes_internal (cf. d1_variable).
> 
> Hmmm, comptypes_internal() has taught me what I was asking here.
> However, it seems to not be enough for what I actually need.
> 
> Here's my problem:
> 
> The array is correctly considered a fixed-length array.  I know it
> because the following debugging code:
> 
> 	+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
> 	+tree dom = TYPE_DOMAIN (type);
> 	+int zero = !TYPE_MAX_VALUE (dom);
> 	+fprintf(stderr, "ALX: zero: %d\n", zero);
> 	+int var0 = !zero
> 	+        && (TREE_CODE (TYPE_MIN_VALUE (dom)) != INTEGER_CST
> 	+               || TREE_CODE (TYPE_MAX_VALUE (dom)) != INTEGER_CST);
> 	+fprintf(stderr, "ALX: var: %d\n", var0);
> 	+int var = var0 || (zero && TYPE_LANG_FLAG_1(type));
> 	+fprintf(stderr, "ALX: var: %d\n", var);
> 	+  ret = array_type_nelts_top (type);
> 	+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
> 
> prints:
> 
> 	ALX: c_lengthof_type() 4098
> 	ALX: zero: 0
> 	ALX: var: 0
> 	ALX: var: 0
> 	ALX: c_lengthof_type() 4109
> 
> for
> 	void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 
> That differs from
> 
> 	ALX: c_lengthof_type() 4098
> 	ALX: zero: 1
> 	ALX: var: 0
> 	ALX: var: 1
> 	ALX: c_lengthof_type() 4109
> 
> for
> 	void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);


That looks good.

> 
> However, if I turn on -Wvla, both get a warning:
> 
> 	len.c: At top level:
> 	len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> 	  288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 	      | ^~~~
> 	len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> 	  289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> 	      | ^~~~
> 

You should check that the result you get from __lengthof__
is an integer constant expression in the first case.

> I suspect that the problem is in:
> 
> 	$ grepc -tfd array_type_nelts_minus_one gcc
> 	gcc/tree.cc:tree
> 	array_type_nelts_minus_one (const_tree type)
> 	{
> 	  tree index_type, min, max;
> 
> 	  /* If they did it with unspecified bounds, then we should have already
> 	     given an error about it before we got here.  */
> 	  if (! TYPE_DOMAIN (type))
> 	    return error_mark_node;
> 
> 	  index_type = TYPE_DOMAIN (type);
> 	  min = TYPE_MIN_VALUE (index_type);
> 	  max = TYPE_MAX_VALUE (index_type);
> 
> 	  /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
> 	  if (!max)
> 	    {
> 	      /* zero sized arrays are represented from C FE as complete types with
> 		 NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
> 		 them as min 0, max -1.  */
> 	      if (COMPLETE_TYPE_P (type)
> 		  && integer_zerop (TYPE_SIZE (type))
> 		  && integer_zerop (min))
> 		return build_int_cst (TREE_TYPE (min), -1);
> 
> 	      return error_mark_node;
> 	    }
> 
> 	  return (integer_zerop (min)
> 		  ? max
> 		  : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
> 	}
> 
> With some debugging code, I've seen that in the fixed-length case, this
> reaches the last return (integer_zerop() is true, so it returns max),
> which is exactly the same as with any normal fixed-length array.
> 
> In the variable-length case (i.e., [*][3]), it returns build_int_cst().
> 
> So, it seems my problem is that 'max' does not represent an integer
> constant, even though we know it is.  Can we coerce it to an integer
> constant somehow?  Or maybe it's some of the in_lengthof that's messing
> with me?
> 

I would suspect the logic related to the C_MAYBE_CONST_EXPR.
In your original patch this still used C_TYPE_VARIABLE_SIZE,
which is not what we want for lengthof.

> > 
> > It is possible that you can not properly distinguish between
> > 
> > int a[0][n];
> > int a[*][n];
> > 
> > those two cases. The logic will treat the first as the second.
> 
> Those can be distinguished.  [0] triggers the zero test, while [*]
> triggers the second var test.

Are you sure? Both types should have C_TYPE_VARIABLE_SIZE set to 1.

> 
> > I think this is ok for now.  All this array stuff should be 
> > simplified and refactored anyway, but this is for another time.
> > 
> > 
> > I am also not sure you even need to use array_type_nelts in C
> > because there is never a non-zero minimum size.
> 
> How should I get the number of elements without array_type_nelts()?  Is
> there any other existing way to get it?  It just had a good name that
> matched my grep, but maybe I'm missing something easier.

Maybe it is ok, but there is also code which just adds one
to TYPE_MAX_VALUE.

Martin

> 
> Thanks!
> 
> Cheers,
> Alex
> 
> > Martin
>
  
Alejandro Colomar Aug. 5, 2024, 5:47 p.m. UTC | #21
Hi Martin,

On Mon, Aug 05, 2024 at 06:05:15PM GMT, Martin Uecker wrote:
> > 
> > However, if I turn on -Wvla, both get a warning:
> > 
> > 	len.c: At top level:
> > 	len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> > 	  288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > 	      | ^~~~
> > 	len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> > 	  289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> > 	      | ^~~~
> > 
> 
> You should check that the result you get from __lengthof__
> is an integer constant expression in the first case.
> 
> > I suspect that the problem is in:
> > 
> > 	$ grepc -tfd array_type_nelts_minus_one gcc
> > 	gcc/tree.cc:tree
> > 	array_type_nelts_minus_one (const_tree type)
> > 	{
> > 	  tree index_type, min, max;
> > 
> > 	  /* If they did it with unspecified bounds, then we should have already
> > 	     given an error about it before we got here.  */
> > 	  if (! TYPE_DOMAIN (type))
> > 	    return error_mark_node;
> > 
> > 	  index_type = TYPE_DOMAIN (type);
> > 	  min = TYPE_MIN_VALUE (index_type);
> > 	  max = TYPE_MAX_VALUE (index_type);
> > 
> > 	  /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
> > 	  if (!max)
> > 	    {
> > 	      /* zero sized arrays are represented from C FE as complete types with
> > 		 NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
> > 		 them as min 0, max -1.  */
> > 	      if (COMPLETE_TYPE_P (type)
> > 		  && integer_zerop (TYPE_SIZE (type))
> > 		  && integer_zerop (min))
> > 		return build_int_cst (TREE_TYPE (min), -1);
> > 
> > 	      return error_mark_node;
> > 	    }
> > 
> > 	  return (integer_zerop (min)
> > 		  ? max
> > 		  : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
> > 	}
> > 
> > With some debugging code, I've seen that in the fixed-length case, this
> > reaches the last return (integer_zerop() is true, so it returns max),
> > which is exactly the same as with any normal fixed-length array.
> > 
> > In the variable-length case (i.e., [*][3]), it returns build_int_cst().
> > 
> > So, it seems my problem is that 'max' does not represent an integer
> > constant, even though we know it is.  Can we coerce it to an integer
> > constant somehow?  Or maybe it's some of the in_lengthof that's messing
> > with me?
> > 
> 
> I would suspect the logic related to the C_MAYBE_CONST_EXPR.
> In your original patch this still used C_TYPE_VARIABLE_SIZE,
> which is not what we want for lengthof.

Ahhh, I blindly pasted that from sizeof, IIRC.  I'll check.
Thanks a lot!

> > > It is possible that you can not properly distinguish between
> > > 
> > > int a[0][n];
> > > int a[*][n];
> > > 
> > > those two cases. The logic will treat the first as the second.
> > 
> > Those can be distinguished.  [0] triggers the zero test, while [*]
> > triggers the second var test.
> 
> Are you sure? Both types should have C_TYPE_VARIABLE_SIZE set to 1.

You were right.  They're the same.  I was thinking of [0] vs [*], but
[0][n] is bad.  It gets treated as a VLA.

I won't worry too much about it, since GCC doesn't properly support
0-length arrays.  We'll have to worry about it if we start discussing
full support for 0-length arrays.

> > > I think this is ok for now.  All this array stuff should be
> > > simplified and refactored anyway, but this is for another time.
> > > 
> > > 
> > > I am also not sure you even need to use array_type_nelts in C
> > > because there is never a non-zero minimum size.
> > 
> > How should I get the number of elements without array_type_nelts()?  Is
> > there any other existing way to get it?  It just had a good name that
> > matched my grep, but maybe I'm missing something easier.
> 
> Maybe it is ok, but there is also code which just adds one
> to TYPE_MAX_VALUE.

Hmmm.  I'll check.
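
For reference, the arithmetic behind "adds one to TYPE_MAX_VALUE" can be sketched in plain C.  This is only a toy model under stated assumptions: struct index_domain and the three functions are hypothetical stand-ins, not GCC internals, and they only mirror the min/max bookkeeping that array_type_nelts_minus_one() performs on TYPE_DOMAIN.

```c
#include <assert.h>

/* Hypothetical stand-in for an array's index domain; not a GCC type.  */
struct index_domain {
    long min;   /* lowest valid index; always 0 for C arrays */
    long max;   /* highest valid index; -1 models a zero-length array */
};

/* Mirrors the final return of array_type_nelts_minus_one(): max - min.  */
long nelts_minus_one(struct index_domain d)
{
    return d.max - d.min;
}

/* General form: number of elements is (max - min) + 1.  */
long nelts(struct index_domain d)
{
    return nelts_minus_one(d) + 1;
}

/* C-only shortcut: since min is always 0, the length is simply max + 1.  */
long nelts_c(struct index_domain d)
{
    assert(d.min == 0);
    return d.max + 1;
}
```

With min fixed at 0, both forms give the same answer, which is why the shortcut is sufficient for the C front end in this model.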

> 
> Martin

Cheers,
Alex
  
Qing Zhao Aug. 5, 2024, 8:10 p.m. UTC | #22
On Aug 5, 2024, at 06:33, Martin Uecker <uecker@tugraz.at> wrote:
> 
> Am Montag, dem 05.08.2024 um 11:50 +0200 schrieb Jakub Jelinek:
>> On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
>>> [CC += Kees, Qing]
>>> 
>>> Hi Joseph,
>>> 
>>> On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
>>>> On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
>>>> D'oh!  I screwed it.  I wanted to have written this:
>>>> 
>>>> $ cat star.c 
>>>> void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
>>> 
>>> I think this answers your question of if we want __lengthof__ to
>>> evaluate its operand if the top-level array is non-VLA but an inner
>>> array is VLA.
>>> 
>>> We clearly want it to not evaluate, because we want this __lengthof__
>>> to be a constant expression, ...
>> 
>> But if you don't evaluate the argument, you can't handle counted_by.
>> Because for counted_by you need the expression (the object on which it is
>> used).
> 
> You would not evaluate only when the size is an integer constant
> expression, which would not apply to counted_by.
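
The evaluation rule being discussed can be observed today with sizeof, which evaluates its operand only when the operand has a variable length array type.  The sketch below is an illustration under stated assumptions: NELEMS is a hypothetical helper using the classic sizeof division, not the proposed operator, and the code needs a compiler with VLA support.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the classic sizeof division, not __lengthof__.  */
#define NELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* The top-level bound is the constant 3; only the inner bound varies.  */
size_t top_length_with_inner_vla(size_t n)
{
    assert(n > 0);          /* a zero inner bound would divide by zero */
    char a[3][n];
    return NELEMS(a);       /* computed at run time as (3 * n) / n */
}

/* sizeof evaluates its operand only if the operand's type is a VLA.  */
int sizeof_side_effects(void)
{
    int n = 1;
    int fixed[3];

    (void) sizeof(fixed[n++]);  /* operand type is int: not evaluated */
    (void) sizeof(int[n++]);    /* operand type is a VLA: evaluated */
    return n;                   /* incremented exactly once */
}
```

Note that the division in top_length_with_inner_vla() is a run-time computation even though the answer is always 3; a length operator that skips evaluation when the top-level bound is constant could instead yield an integer constant expression for the same array.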

I still don’t feel very comfortable with the idea of applying the “__lengthof__()” operator to flexible array members with “counted_by” attributes. My major concerns are:

1. I thought that the new operator “__lengthof__(expr)” might need to stick to the TYPE of the expr, just as the operator sizeof (expr) does.  Sizeof will just error when the TYPE of the expr does not include the size information (for incomplete types, including flexible array members); I expect the same behavior for the new operator “__lengthof__(expr)”.

The “counted_by” attribute currently is not in the TYPE system, and we plan to add it into the TYPE system later through the language standard (or a GCC extension).  If that happens, then both the “sizeof” and the “__lengthof__” operators should automatically evaluate the “size” or the “length” for the expr through its TYPE.  (Just as with the current VLA: its size and length are already in the TYPE, therefore both “sizeof” and “__lengthof__” should evaluate VLAs.)

However, at this time, it’s not a good idea to mix the operator “sizeof” or the new operator “__lengthof__” with information outside of the TYPE system (even though the information outside of the TYPE can provide the size or length info).

2. For the purpose of PR116016 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016): add __builtin_set_counted_by(P->FAM, COUNT) or equivalent 

The most important functionality the kernel needs from compilers is a way to SET the counted_by field. Unless the new operator “__lengthof__” returns an lvalue that can be assigned to, it cannot meet the kernel’s requirement.

However, from my understanding of the new “__lengthof__”, it will be similar to “sizeof” and will not be an lvalue.

Therefore, for PR116016, we still need a new builtin, specific for the counted_by attribute. 
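
The read/write distinction above can be sketched in portable C.  Everything named here is an assumption made for illustration: struct msg, msg_new() and msg_len_ref() are hypothetical, and msg_len_ref() merely stands in for the role a dedicated builtin would play; it is not the interface proposed in PR116016.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct msg {
    size_t len;         /* element count, maintained by hand */
    char   payload[];   /* flexible array member: incomplete type */
};

/* Hypothetical accessor yielding an lvalue for the count field.  */
size_t *msg_len_ref(struct msg *m)
{
    assert(m != NULL);
    return &m->len;
}

struct msg *msg_new(size_t n)
{
    struct msg *m = malloc(sizeof(*m) + n);
    if (m == NULL)
        return NULL;
    *msg_len_ref(m) = n;    /* writing the count needs an lvalue */
    return m;
}

size_t fixed_length(void)
{
    int fixed[7];
    /* sizeof yields a value, not an lvalue:
       "sizeof(fixed) = 1;" would not compile.  */
    return sizeof(fixed) / sizeof(fixed[0]);
}
```

An operator modelled on sizeof covers only the read side shown in fixed_length(); the write in msg_new() is the part that still needs separate support.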

Just my 2 cents. 

Qing



> Martin
> 
>
  
Martin Uecker Aug. 5, 2024, 8:41 p.m. UTC | #23
Am Montag, dem 05.08.2024 um 20:10 +0000 schrieb Qing Zhao:
> On Aug 5, 2024, at 06:33, Martin Uecker <uecker@tugraz.at> wrote:
> > 
> > Am Montag, dem 05.08.2024 um 11:50 +0200 schrieb Jakub Jelinek:
> > > On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> > > > [CC += Kees, Qing]
> > > > 
> > > > Hi Joseph,
> > > > 
> > > > On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > > > > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > > > > D'oh!  I screwed it.  I wanted to have written this:
> > > > > 
> > > > > $ cat star.c 
> > > > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > > 
> > > > I think this answers your question of if we want __lengthof__ to
> > > > evaluate its operand if the top-level array is non-VLA but an inner
> > > > array is VLA.
> > > > 
> > > > We clearly want it to not evaluate, because we want this __lengthof__
> > > > to be a constant expression, ...
> > > 
> > > But if you don't evaluate the argument, you can't handle counted_by.
> > > Because for counted_by you need the expression (the object on which it is
> > > used).
> > 
> > You would not evaluate only when the size is an integer constant
> > expression, which would not apply to counted_by.
> 
> I still don’t feel very comfortable on the idea of  applying “__lengthof__()” operator on flexible array members with “counted_by”  attributes. My major concerns are:
> 
> 1. I thought that the new operator “__lengthof__(expr)" might need to stick to the TYPE of the expr  just as  the operator sizeof (expr).  Sizeof will just error when the TYPE of the expr does not include the size information (for Incomplete types including flexible array members), I expect the same behavior for the new operator “__lengthof__(expr)”. 
> 
> The “counted-by” attribute currently is not in the TYPE system, and we plan to add it into the TYPE system later through language standard (or an GCC extension).  If that happens, then both the “sizeof” and the “__lengthof__” operators should be automatically evaluate the “size" or the “length” for the expr through its TYPE.  (Just as the current VLA, its size and length already in the TYPE, therefore both “sizeof” and “__lengthof__” should evaluate VLA. 
> 
> however, at this time, it’s not a good idea to mess up the operator “sizeof” or the new operator “__lengthof__”  with the information outside of the TYPE system. (Even though the information outside of TYPE can provide the size or length info).

I agree with this. I would also much prefer to see a proper language extension;
it should not really be any harder to implement than "counted_by" itself.

Martin

> 
> 2. For the purpose of PR116016 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016): add __builtin_set_counted_by(P->FAM, COUNT) or equivalent 
> 
> The most important functionality the kernel needs from compilers is a way to SET the counted_by field. Unless the new operator “__lengthof__” returns a LVALUE that can be assigned to, it cannot meet the requirement from kernel. 
> 
> However, from my understanding of the new “__lengthof__”, it will be similar as “sizeof”, will not be a Lvalue. 
> 
> Therefore, for PR116016, we still need a new builtin, specific for the counted_by attribute. 

> Just my 2 cents. 
> 
> Qing
> 




Martin
  
Alejandro Colomar Aug. 5, 2024, 8:59 p.m. UTC | #24
Hi Qing,

On Mon, Aug 05, 2024 at 10:41:43PM GMT, Martin Uecker wrote:
> Am Montag, dem 05.08.2024 um 20:10 +0000 schrieb Qing Zhao:
> > On Aug 5, 2024, at 06:33, Martin Uecker <uecker@tugraz.at> wrote:
> > > 
> > > Am Montag, dem 05.08.2024 um 11:50 +0200 schrieb Jakub Jelinek:
> > > > On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> > > > > [CC += Kees, Qing]
> > > > > 
> > > > > Hi Joseph,
> > > > > 
> > > > > On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > > > > > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > > > > > D'oh!  I screwed it.  I wanted to have written this:
> > > > > > 
> > > > > > $ cat star.c 
> > > > > > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > > > > 
> > > > > I think this answers your question of if we want __lengthof__ to
> > > > > evaluate its operand if the top-level array is non-VLA but an inner
> > > > > array is VLA.
> > > > > 
> > > > > We clearly want it to not evaluate, because we want this __lengthof__
> > > > > to be a constant expression, ...
> > > > 
> > > > But if you don't evaluate the argument, you can't handle counted_by.
> > > > Because for counted_by you need the expression (the object on which it is
> > > > used).
> > > 
> > > You would not evaluate only when the size is an integer constant
> > > expression, which would not apply to counted_by.
> > 
> > I still don’t feel very comfortable on the idea of  applying
> > “__lengthof__()” operator on flexible array members with
> > “counted_by”  attributes. My major concerns are:
> > 
> > 1. I thought that the new operator “__lengthof__(expr)" might need
> > to stick to the TYPE of the expr  just as  the operator
> > sizeof (expr).  Sizeof will just error when the TYPE of the expr
> > does not include the size information (for Incomplete types
> > including flexible array members), I expect the same behavior for
> > the new operator “__lengthof__(expr)”. 

Let's be cautious, and start by having an error on FAMs in the first
implementation.  We can discuss support for attributes in a later
iteration.  So, I agree.

> > 
> > The “counted-by” attribute currently is not in the TYPE system, and
> > we plan to add it into the TYPE system later through language
> > standard (or an GCC extension).  If that happens, then both the
> > “sizeof” and the “__lengthof__” operators should be automatically
> > evaluate the “size" or the “length” for the expr through its TYPE.
> >  (Just as the current VLA, its size and length already in the TYPE,
> > therefore both “sizeof” and “__lengthof__” should evaluate VLA. 

I'm curious; how do you plan to make counted_by part of the type
system?  I've read the paper for using a .identifier length designator
(n3188; <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm>),
but that's a constant, and doesn't use an attribute.

> > 
> > however, at this time, it’s not a good idea to mess up the operator
> > “sizeof” or the new operator “__lengthof__”  with the information
> > outside of the TYPE system. (Even though the information outside of
> > TYPE can provide the size or length info).

Okay.

> I agree with this. I would also much prefer to see a proper language extension,
> it should not really be any harder to implement than "counted_by" itself.
> 
> Martin
> 
> > 
> > 2. For the purpose of PR116016
> > (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016): add
> > __builtin_set_counted_by(P->FAM, COUNT) or equivalent 
> > 
> > The most important functionality the kernel needs from compilers is
> > a way to SET the counted_by field. Unless the new operator
> > “__lengthof__” returns a LVALUE that can be assigned to,

No; it won't be an lvalue; and the approach with an lvalue gives you a
way to both read and write in a single feature, so it's quite nice.  ;)

> > it cannot meet the requirement from kernel. 
> > 
> > However, from my understanding of the new “__lengthof__”, it will be
> > similar as “sizeof”, will not be a Lvalue. 

Yep.

> > 
> > Therefore, for PR116016, we still need a new builtin, specific for
> > the counted_by attribute. 

Yep.

Have a lovely night!
Alex
  
Qing Zhao Aug. 6, 2024, 3:23 p.m. UTC | #25
On Aug 5, 2024, at 16:59, Alejandro Colomar <alx@kernel.org> wrote:


>> The “counted-by” attribute currently is not in the TYPE system, and
>> we plan to add it into the TYPE system later through language
>> standard (or an GCC extension).  If that happens, then both the
>> “sizeof” and the “__lengthof__” operators should be automatically
>> evaluate the “size" or the “length” for the expr through its TYPE.
>> (Just as the current VLA, its size and length already in the TYPE,
>> therefore both “sizeof” and “__lengthof__” should evaluate VLA.
> 
> I'm curious; how do you plan to make counted_by as part of the type
> system?  I've read the paper for using a .identifier length designator
> (n3188; <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm>),
> but that's a constant, and doesn't use an attribute.

The “counted_by” attribute is only a temporary and practical solution at this moment
to build a direct relationship between the length of the array and the array itself at the
source code level, without touching the TYPE system at all.
The final plan is similar to the solution in the paper you referred to
above.

I.e., currently, with the “counted_by” attribute:

struct foo {
  unsigned int count;
  char array[] __attribute__ ((counted_by (count)));
};

Later, when the relationship is built into TYPE, the above will become:

struct foo {
  unsigned int count;
  char array [.count];
};

That will be the cleanest solution to this problem.
However, it might take much longer to finally get into the compiler.
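
A minimal sketch of how the attribute form can be used today, assuming a compiler that reports the attribute through __has_attribute.  COUNTED_BY, foo_alloc() and foo_length() are hypothetical names introduced for this example; on compilers without the attribute the macro expands to nothing and the struct degrades to a plain flexible array member.

```c
#include <assert.h>
#include <stdlib.h>

/* Use the attribute only where the compiler advertises it.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct foo {
    unsigned int count;
    char array[] COUNTED_BY(count);
};

/* Allocate room for n elements and record the count right away,
   before anything reads or writes p->array.  */
struct foo *foo_alloc(unsigned int n)
{
    struct foo *p = malloc(sizeof(*p) + n);
    if (p == NULL)
        return NULL;
    p->count = n;
    return p;
}

/* The length lives in the object, not in the type, so it is read
   from the member rather than derived with sizeof.  */
unsigned int foo_length(const struct foo *p)
{
    assert(p != NULL);
    return p->count;
}
```

The [.count] spelling from the second declaration is not accepted by current compilers, so it is not shown in code here.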

Thanks.

Qing
  

Patch

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index e7e371fd26f..91793dfbffc 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -465,6 +465,7 @@  const struct c_common_resword c_common_reswords[] =
   { "__inline",		RID_INLINE,	0 },
   { "__inline__",	RID_INLINE,	0 },
   { "__label__",	RID_LABEL,	0 },
+  { "__lengthof__",	RID_LENGTHOF, 0 },
   { "__null",		RID_NULL,	0 },
   { "__real",		RID_REALPART,	0 },
   { "__real__",		RID_REALPART,	0 },
@@ -4070,6 +4071,31 @@  c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the lengthof keyword: Return the length of an array,
+   that is, the number of elements in the array.  */
+
+tree
+c_lengthof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (!COMPLETE_TYPE_P (type))
+    {
+      error_at (loc,
+		"invalid application of %<lengthof%> to incomplete type %qT",
+		type);
+      return error_mark_node;
+    }
+  if (type_code != ARRAY_TYPE)
+    {
+      error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
+      return error_mark_node;
+    }
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index 5de96e5d4a8..6d162f67104 100644
--- a/gcc/c-family/c-common.def
+++ b/gcc/c-family/c-common.def
@@ -50,6 +50,9 @@  DEFTREECODE (EXCESS_PRECISION_EXPR, "excess_precision_expr", tcc_expression, 1)
    number.  */
 DEFTREECODE (USERDEF_LITERAL, "userdef_literal", tcc_exceptional, 3)
 
+/* Represents a 'lengthof' expression.  */
+DEFTREECODE (LENGTHOF_EXPR, "lengthof_expr", tcc_expression, 1)
+
 /* Represents a 'sizeof' expression during C++ template expansion,
    or for the purpose of -Wsizeof-pointer-memaccess warning.  */
 DEFTREECODE (SIZEOF_EXPR, "sizeof_expr", tcc_expression, 1)
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ccaea27c2b9..f815a4cf3bc 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -105,6 +105,7 @@  enum rid
 
   /* C extensions */
   RID_ASM,       RID_TYPEOF,   RID_TYPEOF_UNQUAL, RID_ALIGNOF,  RID_ATTRIBUTE,
+  RID_LENGTHOF,
   RID_VA_ARG,
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,    RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,      RID_BUILTIN_COMPLEX,	   RID_BUILTIN_SHUFFLE,
@@ -885,6 +886,7 @@  extern tree c_common_truthvalue_conversion (location_t, tree);
 extern void c_apply_type_quals_to_decl (int, tree);
 extern tree c_sizeof_or_alignof_type (location_t, tree, bool, bool, int);
 extern tree c_alignof_expr (location_t, tree);
+extern tree c_lengthof_type (location_t, tree);
 /* Print an error message for invalid operands to arith operation CODE.
    NOP_EXPR is used as a special case (see truthvalue_conversion).  */
 extern void binary_op_error (rich_location *, enum tree_code, tree, tree);
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 4dced430d1f..790c58b2558 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -8937,12 +8937,16 @@  start_struct (location_t loc, enum tree_code code, tree name,
      within a statement expr used within sizeof, et. al.  This is not
      terribly serious as C++ doesn't permit statement exprs within
      sizeof anyhow.  */
-  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
+  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof || in_lengthof))
     warning_at (loc, OPT_Wc___compat,
 		"defining type in %qs expression is invalid in C++",
 		(in_sizeof
 		 ? "sizeof"
-		 : (in_typeof ? "typeof" : "alignof")));
+		 : (in_typeof
+		    ? "typeof"
+		    : (in_alignof
+		       ? "alignof"
+		       : "lengthof"))));
 
   if (in_underspecified_init)
     error_at (loc, "%qT defined in underspecified object initializer", ref);
@@ -9897,7 +9901,7 @@  finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
 	 struct_types.  */
       if (warn_cxx_compat
 	  && struct_parse_info != NULL
-	  && !in_sizeof && !in_typeof && !in_alignof)
+	  && !in_sizeof && !in_typeof && !in_alignof && !in_lengthof)
 	struct_parse_info->struct_types.safe_push (t);
      }
 
@@ -10071,12 +10075,16 @@  start_enum (location_t loc, struct c_enum_contents *the_enum, tree name,
   /* FIXME: This will issue a warning for a use of a type defined
      within sizeof in a statement expr.  This is not terribly serious
      as C++ doesn't permit statement exprs within sizeof anyhow.  */
-  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
+  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof || in_lengthof))
     warning_at (loc, OPT_Wc___compat,
 		"defining type in %qs expression is invalid in C++",
 		(in_sizeof
 		 ? "sizeof"
-		 : (in_typeof ? "typeof" : "alignof")));
+		 : (in_typeof
+		    ? "typeof"
+		    : (in_alignof
+		       ? "alignof"
+		       : "lengthof"))));
 
   if (in_underspecified_init)
     error_at (loc, "%qT defined in underspecified object initializer",
@@ -10270,7 +10278,7 @@  finish_enum (tree enumtype, tree values, tree attributes)
      struct_types.  */
   if (warn_cxx_compat
       && struct_parse_info != NULL
-      && !in_sizeof && !in_typeof && !in_alignof)
+      && !in_sizeof && !in_typeof && !in_alignof && !in_lengthof)
     struct_parse_info->struct_types.safe_push (enumtype);
 
   /* Check for consistency with previous definition */
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 12c5ed5d92c..09bb19f9299 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -74,7 +74,17 @@  along with GCC; see the file COPYING3.  If not see
 #include "bitmap.h"
 #include "analyzer/analyzer-language.h"
 #include "toplev.h"
+
+#define c_parser_sizeof_expression(parser)                                    \
+(                                                                             \
+  c_parser_sizeof_or_lengthof_expression (parser, RID_SIZEOF)                 \
+)
 
+#define c_parser_lengthof_expression(parser)                                  \
+(                                                                             \
+  c_parser_sizeof_or_lengthof_expression (parser, RID_LENGTHOF)               \
+)
+
 /* We need to walk over decls with incomplete struct/union/enum types
    after parsing the whole translation unit.
    In finish_decl(), if the decl is static, has incomplete
@@ -1687,7 +1697,7 @@  static struct c_expr c_parser_binary_expression (c_parser *, struct c_expr *,
 						 tree);
 static struct c_expr c_parser_cast_expression (c_parser *, struct c_expr *);
 static struct c_expr c_parser_unary_expression (c_parser *);
-static struct c_expr c_parser_sizeof_expression (c_parser *);
+static struct c_expr c_parser_sizeof_or_lengthof_expression (c_parser *, enum rid);
 static struct c_expr c_parser_alignof_expression (c_parser *);
 static struct c_expr c_parser_postfix_expression (c_parser *);
 static struct c_expr c_parser_postfix_expression_after_paren_type (c_parser *,
@@ -9864,6 +9874,8 @@  c_parser_unary_expression (c_parser *parser)
     case CPP_KEYWORD:
       switch (c_parser_peek_token (parser)->keyword)
 	{
+	case RID_LENGTHOF:
+	  return c_parser_lengthof_expression (parser);
 	case RID_SIZEOF:
 	  return c_parser_sizeof_expression (parser);
 	case RID_ALIGNOF:
@@ -9903,12 +9915,13 @@  c_parser_unary_expression (c_parser *parser)
 /* Parse a sizeof expression.  */
 
 static struct c_expr
-c_parser_sizeof_expression (c_parser *parser)
+c_parser_sizeof_or_lengthof_expression (c_parser *parser, enum rid rid)
 {
+  const char *op_name = (rid == RID_LENGTHOF) ? "lengthof" : "sizeof";
   struct c_expr expr;
   struct c_expr result;
   location_t expr_loc;
-  gcc_assert (c_parser_next_token_is_keyword (parser, RID_SIZEOF));
+  gcc_assert (c_parser_next_token_is_keyword (parser, rid));
 
   location_t start;
   location_t finish = UNKNOWN_LOCATION;
@@ -9917,7 +9930,10 @@  c_parser_sizeof_expression (c_parser *parser)
 
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
-  in_sizeof++;
+  if (rid == RID_LENGTHOF)
+    in_lengthof++;
+  else
+    in_sizeof++;
   if (c_parser_next_token_is (parser, CPP_OPEN_PAREN)
       && c_token_starts_compound_literal (c_parser_peek_2nd_token (parser)))
     {
@@ -9936,7 +9952,10 @@  c_parser_sizeof_expression (c_parser *parser)
 	{
 	  struct c_expr ret;
 	  c_inhibit_evaluation_warnings--;
-	  in_sizeof--;
+	  if (rid == RID_LENGTHOF)
+	    in_lengthof--;
+	  else
+	    in_sizeof--;
 	  ret.set_error ();
 	  ret.original_code = ERROR_MARK;
 	  ret.original_type = NULL;
@@ -9948,31 +9967,45 @@  c_parser_sizeof_expression (c_parser *parser)
 							       type_name,
 							       expr_loc);
 	  finish = expr.get_finish ();
-	  goto sizeof_expr;
+	  goto Xof_expr;
 	}
       /* sizeof ( type-name ).  */
       if (scspecs)
-	error_at (expr_loc, "storage class specifier in %<sizeof%>");
+	error_at (expr_loc, "storage class specifier in %qs", op_name);
       if (type_name->specs->alignas_p)
 	error_at (type_name->specs->locations[cdw_alignas],
-		  "alignment specified for type name in %<sizeof%>");
+		  "alignment specified for type name in %qs", op_name);
       c_inhibit_evaluation_warnings--;
-      in_sizeof--;
-      result = c_expr_sizeof_type (expr_loc, type_name);
+      if (rid == RID_LENGTHOF)
+	{
+	  in_lengthof--;
+	  result = c_expr_lengthof_type (expr_loc, type_name);
+	}
+      else
+	{
+	  in_sizeof--;
+	  result = c_expr_sizeof_type (expr_loc, type_name);
+	}
     }
   else
     {
       expr_loc = c_parser_peek_token (parser)->location;
       expr = c_parser_unary_expression (parser);
       finish = expr.get_finish ();
-    sizeof_expr:
+    Xof_expr:
       c_inhibit_evaluation_warnings--;
-      in_sizeof--;
+      if (rid == RID_LENGTHOF)
+	in_lengthof--;
+      else
+	in_sizeof--;
       mark_exp_read (expr.value);
       if (TREE_CODE (expr.value) == COMPONENT_REF
 	  && DECL_C_BIT_FIELD (TREE_OPERAND (expr.value, 1)))
-	error_at (expr_loc, "%<sizeof%> applied to a bit-field");
-      result = c_expr_sizeof_expr (expr_loc, expr);
+	error_at (expr_loc, "%qs applied to a bit-field", op_name);
+      if (rid == RID_LENGTHOF)
+	result = c_expr_lengthof_expr (expr_loc, expr);
+      else
+	result = c_expr_sizeof_expr (expr_loc, expr);
     }
   if (finish == UNKNOWN_LOCATION)
     finish = start;
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index 15da875a029..102fcfefea6 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -736,6 +736,7 @@  extern int c_type_dwarf_attribute (const_tree, int);
 /* in c-typeck.cc */
 extern int in_alignof;
 extern int in_sizeof;
+extern int in_lengthof;
 extern int in_typeof;
 extern bool c_in_omp_for;
 extern bool c_omp_array_section_p;
@@ -786,6 +787,9 @@  extern tree build_external_ref (location_t, tree, bool, tree *);
 extern void pop_maybe_used (bool);
 extern struct c_expr c_expr_sizeof_expr (location_t, struct c_expr);
 extern struct c_expr c_expr_sizeof_type (location_t, struct c_type_name *);
+extern struct c_expr c_expr_lengthof_expr (location_t, struct c_expr);
+extern struct c_expr c_expr_lengthof_type (location_t loc,
+                                           struct c_type_name *);
 extern struct c_expr parser_build_unary_op (location_t, enum tree_code,
     					    struct c_expr);
 extern struct c_expr parser_build_binary_op (location_t,
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7e0f01ed22b..121e74de25d 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -71,6 +71,9 @@  int in_alignof;
 /* The level of nesting inside "sizeof".  */
 int in_sizeof;
 
+/* The level of nesting inside "lengthof".  */
+int in_lengthof;
+
 /* The level of nesting inside "typeof".  */
 int in_typeof;
 
@@ -3255,7 +3258,7 @@  build_external_ref (location_t loc, tree id, bool fun, tree *type)
 
   if (TREE_CODE (ref) == FUNCTION_DECL && !in_alignof)
     {
-      if (!in_sizeof && !in_typeof)
+      if (!in_sizeof && !in_typeof && !in_lengthof)
 	C_DECL_USED (ref) = 1;
       else if (DECL_INITIAL (ref) == NULL_TREE
 	       && DECL_EXTERNAL (ref)
@@ -3311,7 +3314,7 @@  struct maybe_used_decl
 {
   /* The decl.  */
   tree decl;
-  /* The level seen at (in_sizeof + in_typeof).  */
+  /* The level seen at (in_sizeof + in_typeof + in_lengthof).  */
   int level;
   /* The next one at this level or above, or NULL.  */
   struct maybe_used_decl *next;
@@ -3329,7 +3332,7 @@  record_maybe_used_decl (tree decl)
 {
   struct maybe_used_decl *t = XOBNEW (&parser_obstack, struct maybe_used_decl);
   t->decl = decl;
-  t->level = in_sizeof + in_typeof;
+  t->level = in_sizeof + in_typeof + in_lengthof;
   t->next = maybe_used_decls;
   maybe_used_decls = t;
 }
@@ -3343,7 +3346,7 @@  void
 pop_maybe_used (bool used)
 {
   struct maybe_used_decl *p = maybe_used_decls;
-  int cur_level = in_sizeof + in_typeof;
+  int cur_level = in_sizeof + in_typeof + in_lengthof;
   while (p && p->level > cur_level)
     {
       if (used)
@@ -3453,6 +3456,90 @@  c_expr_sizeof_type (location_t loc, struct c_type_name *t)
   return ret;
 }
 
+/* Return the result of lengthof applied to EXPR.  */
+
+struct c_expr
+c_expr_lengthof_expr (location_t loc, struct c_expr expr)
+{
+  struct c_expr ret;
+  if (expr.value == error_mark_node)
+    {
+      ret.value = error_mark_node;
+      ret.original_code = ERROR_MARK;
+      ret.original_type = NULL;
+      ret.m_decimal = 0;
+      pop_maybe_used (false);
+    }
+  else
+    {
+      bool expr_const_operands = true;
+
+      tree folded_expr = c_fully_fold (expr.value, require_constant_value,
+				       &expr_const_operands);
+      ret.value = c_lengthof_type (loc, TREE_TYPE (folded_expr));
+      c_last_sizeof_arg = expr.value;
+      c_last_sizeof_loc = loc;
+      ret.original_code = LENGTHOF_EXPR;
+      ret.original_type = NULL;
+      ret.m_decimal = 0;
+      if (C_TYPE_VARIABLE_SIZE (TREE_TYPE (folded_expr)))
+	{
+	  /* lengthof is evaluated when given a vla.  */
+	  ret.value = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (ret.value),
+			      folded_expr, ret.value);
+	  C_MAYBE_CONST_EXPR_NON_CONST (ret.value) = !expr_const_operands;
+	  SET_EXPR_LOCATION (ret.value, loc);
+	}
+      pop_maybe_used (C_TYPE_VARIABLE_SIZE (TREE_TYPE (folded_expr)));
+    }
+  return ret;
+}
+
+/* Return the result of lengthof applied to T, a structure for the type
+   name passed to lengthof (rather than the type itself).  LOC is the
+   location of the original expression.  */
+
+struct c_expr
+c_expr_lengthof_type (location_t loc, struct c_type_name *t)
+{
+  tree type;
+  struct c_expr ret;
+  tree type_expr = NULL_TREE;
+  bool type_expr_const = true;
+  type = groktypename (t, &type_expr, &type_expr_const);
+  ret.value = c_lengthof_type (loc, type);
+  c_last_sizeof_arg = type;
+  c_last_sizeof_loc = loc;
+  ret.original_code = LENGTHOF_EXPR;
+  ret.original_type = NULL;
+  ret.m_decimal = 0;
+  if (type == error_mark_node)
+    {
+      ret.value = error_mark_node;
+      ret.original_code = ERROR_MARK;
+    }
+  else
+  if ((type_expr || TREE_CODE (ret.value) == INTEGER_CST)
+      && C_TYPE_VARIABLE_SIZE (type))
+    {
+      /* If the type is a [*] array, it is a VLA but is represented as
+	 having a size of zero.  In such a case we must ensure that
+	 the result of lengthof does not get folded to a constant by
+	 c_fully_fold, because if the length is evaluated the result is
+	 not constant and so constraints on zero or negative size
+	 arrays must not be applied when this lengthof call is inside
+	 another array declarator.  */
+      if (!type_expr)
+	type_expr = integer_zero_node;
+      ret.value = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (ret.value),
+			  type_expr, ret.value);
+      C_MAYBE_CONST_EXPR_NON_CONST (ret.value) = !type_expr_const;
+    }
+  pop_maybe_used (type != error_mark_node
+		  ? C_TYPE_VARIABLE_SIZE (type) : false);
+  return ret;
+}
+
 /* Build a function call to function FUNCTION with parameters PARAMS.
    The function call is at LOC.
    PARAMS is a list--a chain of TREE_LIST nodes--in which the
diff --git a/gcc/cp/operators.def b/gcc/cp/operators.def
index d8878923602..d640ed8bd91 100644
--- a/gcc/cp/operators.def
+++ b/gcc/cp/operators.def
@@ -91,6 +91,7 @@  DEF_OPERATOR ("co_await", CO_AWAIT_EXPR, "aw", OVL_OP_FLAG_UNARY)
 
 /* These are extensions.  */
 DEF_OPERATOR ("alignof", ALIGNOF_EXPR, "az", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("__lengthof__", LENGTHOF_EXPR, "lz", OVL_OP_FLAG_UNARY)
 DEF_OPERATOR ("__imag__", IMAGPART_EXPR, "v18__imag__", OVL_OP_FLAG_UNARY)
 DEF_OPERATOR ("__real__", REALPART_EXPR, "v18__real__", OVL_OP_FLAG_UNARY)
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 0b572afca72..4d6066f8acd 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -10391,6 +10391,18 @@  If the operand of the @code{__alignof__} expression is a function,
 the expression evaluates to the alignment of the function which may
 be specified by attribute @code{aligned} (@pxref{Common Function Attributes}).
 
+@node Length
+@section Determining the Length of Arrays
+@cindex length
+@cindex array length
+
+The keyword @code{__lengthof__} determines the length of an array operand,
+that is, the number of elements in the array.
+Its syntax is just like @code{sizeof},
+and the operand is evaluated following the same rules.
+(TODO: We probably want to restrict evaluation to top-level VLAs only.
+       This documentation describes the current implementation.)
+
 @node Inline
 @section An Inline Function is As Fast As a Macro
 @cindex inline functions
diff --git a/gcc/target.h b/gcc/target.h
index c1f99b97b86..79890ae9944 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -245,6 +245,9 @@  enum type_context_kind {
   /* Directly measuring the alignment of T.  */
   TCTX_ALIGNOF,
 
+  /* Directly measuring the length of array T.  */
+  TCTX_LENGTHOF,
+
   /* Creating objects of type T with static storage duration.  */
   TCTX_STATIC_STORAGE,