On Tue, Aug 06, 2024 at 10:38:55PM GMT, Alejandro Colomar wrote:
> Hi Qing,
>
> On Tue, Aug 06, 2024 at 08:15:50PM GMT, Qing Zhao wrote:
> > Some comments on the documentation part.
> >
> > (Hopefully my quoting format is good this time. I checked the email
> > sent back to myself and saw no formatting issue, but when I checked the
> > archive, https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659593.html,
> > I did see the quoting format issue you mentioned. I adjusted my mail
> > client settings, so hopefully this time it’s good.)
>
> Yup; it's good now. Thanks! ;)
>
> > > On Aug 6, 2024, at 08:22, Alejandro Colomar <alx@kernel.org> wrote:
> > >
> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index 0b572afca72..6e1d302150d 100644
> > > --- a/gcc/doc/extend.texi
> > > +++ b/gcc/doc/extend.texi
> > > @@ -10391,6 +10391,33 @@ If the operand of the @code{__alignof__} expression is a function,
> > > the expression evaluates to the alignment of the function which may
> > > be specified by attribute @code{aligned} (@pxref{Common Function Attributes}).
> > >
> > > +@node Length
> > > +@section Determining the Length of Arrays
> > > +@cindex lengthof
> > > +@cindex length
> > > +@cindex array length
> > > +
> > > +The keyword @code{__lengthof__} determines the length of an array operand,
> > > +that is, the number of elements in the array.
> > > +Its syntax is just like @code{sizeof}.
> > > +The operand must be a complete array type.
> >
> > What’s the behavior when the array operand is a VLA,
>
> The number of elements in the array is returned.
> The value is a run-time value; not a constant expression.
> The behavior is identical to sizeof(a)/sizeof(*a), except that obviously
> the operand is only written once, so it's only evaluated once.
>
> size_t n = arc4random();
> assert(n == __lengthof__(int [n]));
>
> I think this is covered by
>
> +The keyword @code{__lengthof__} determines the length of an array operand,
> +that is, the number of elements in the array.
>
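To make the VLA case concrete, here is a minimal sketch. It assumes a
compiler without __lengthof__, so it uses the sizeof quotient mentioned
above as a stand-in. NELEMS and vla_length are hypothetical names used
only for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for __lengthof__, built on the sizeof quotient.
   Unlike __lengthof__, it writes its operand twice.  */
#define NELEMS(a)  (sizeof (a) / sizeof ((a)[0]))

/* Return the number of elements of a VLA declared with N elements.
   N must be greater than zero.  The result is a run-time value,
   not a constant expression.  */
static size_t
vla_length (size_t n)
{
  int a[n];

  return NELEMS (a);
}
```

With this, vla_length (7) yields 7, which matches what
__lengthof__(int [n]) is meant to return for n == 7.
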
> > or a flexible array member?
>
> A flexible array member is an incomplete type, and so it is a constraint
> violation. It results in a compile-time error.
>
> I think this is covered by
>
> +The operand must be a complete array type.
>
> > Should these be clarified in the documentation part?
>
> I don't know. I think it's already documented. I prefer keeping it
> concise, as long as it's not ambiguous.
>
> > > +The operand is not evaluated
> > > +if the top-level length designator is an integer constant expression;
> > > +and it is evaluated
> > > +if the top-level length designator is not an integer constant expression.
> >
> > Can you add small examples to clarify the above?
>
> The examples would be rather confusing.
Sorry, I meant to remove the line above after I came up with the
example below. :)
>
> __lengthof__(int [4][n]); // constant expression
> __lengthof__(int [n][4]); // run-time value
>
> I think these would be interesting in the docs. I'll check what would
> be the appropriate formatting for these. Thanks!
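
A minimal sketch of the two cases, again using the sizeof quotient as a
stand-in because __lengthof__ is not available outside this patch. The
function names are hypothetical. Note that the sketch only shows the
resulting values; sizeof evaluates its operand whenever the type is
variably modified, while __lengthof__ is meant to look only at the
top-level length designator.

```c
#include <assert.h>
#include <stddef.h>

/* Top-level length designator is the constant 4; only the inner
   dimension is variable.  N must be greater than zero.  */
static size_t
outer_length_constant (size_t n)
{
  int a[4][n];

  return sizeof (a) / sizeof (a[0]);
}

/* Top-level length designator is N, so the length is a run-time value.
   N must be greater than zero.  */
static size_t
outer_length_variable (size_t n)
{
  int a[n][4];

  return sizeof (a) / sizeof (a[0]);
}
```

For n == 5, the first function returns 4 and the second returns 5.
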
>
> > > +XXX: Do we want to document the following? I think so.
> > > +XXX: It would prevent users from relying on __lengthof__
> > > +XXX: for distinguishing arrays from pointers.
> > > +XXX: I don't want users to complain in the future
> > > +XXX: if this doesn't report errors on function parameters anymore
> > > +XXX: and that breaks their assumptions.
> > > +In the future,
> > > +it might also accept a function parameter with array notation,
> > > +an incomplete array whose length is specified by other means,
> > > +such as attributes,
> > > +or other similar cases.
> >
> > For incomplete array (for example, flexible array members), as I
> > mentioned in another email:
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659478.html
> >
> > I think that it’s better to wait until the length information is
> > finally integrated into TYPE.
>
> Yes, I agree for a start. But I also don't want to allow any user to
> rely on the fact that __lengthof__ rejects pointers to determine that
> something is not a pointer, because we reserve the possibility of
> extending the operator to work on certain pointers. (I'm not saying
> we'll do it; but we reserve the right.)
>
> Maybe we can just trust that nobody will do that, and complain if we see
> anyone using it in that way.
>
> I think I'll just remove that paragraph.
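
For code that really needs to tell arrays from pointers, an explicit
check seems safer than relying on __lengthof__ rejecting pointers. A
minimal sketch, assuming GNU C (__typeof__ and
__builtin_types_compatible_p); IS_ARRAY and array_vs_pointer are
hypothetical names, not part of the patch:

```c
#include <assert.h>

/* Hypothetical helper, GNU C only.  An array type is not compatible
   with the type of a pointer to its first element; a pointer type is.  */
#define IS_ARRAY(a) \
  (!__builtin_types_compatible_p (__typeof__ (a), __typeof__ (&(a)[0])))

/* Return 1 if the macro classifies an array and a pointer correctly.  */
static int
array_vs_pointer (void)
{
  int a[3] = {0};
  int *p = a;

  return IS_ARRAY (a) && !IS_ARRAY (p);
}
```

Such a check keeps working even if the operator is later extended to
accept certain pointers.
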
>
> Have a lovely night!
> Alex
>
> >
> > thanks.
> >
> > Qing
>
> --
> <https://www.alejandro-colomar.es/>
@@ -465,6 +465,7 @@ const struct c_common_resword c_common_reswords[] =
{ "__inline", RID_INLINE, 0 },
{ "__inline__", RID_INLINE, 0 },
{ "__label__", RID_LABEL, 0 },
+ { "__lengthof__", RID_LENGTHOF, 0 },
{ "__null", RID_NULL, 0 },
{ "__real", RID_REALPART, 0 },
{ "__real__", RID_REALPART, 0 },
@@ -4070,6 +4071,31 @@ c_alignof_expr (location_t loc, tree expr)
return fold_convert_loc (loc, size_type_node, t);
}
+
+/* Implement the lengthof keyword: Return the length of an array,
+ that is, the number of elements in the array. */
+
+tree
+c_lengthof_type (location_t loc, tree type)
+{
+ enum tree_code type_code;
+
+ type_code = TREE_CODE (type);
+ if (type_code != ARRAY_TYPE)
+ {
+ error_at (loc, "invalid application of %<lengthof%> to type %qT", type);
+ return error_mark_node;
+ }
+ if (!COMPLETE_TYPE_P (type))
+ {
+ error_at (loc,
+ "invalid application of %<lengthof%> to incomplete type %qT",
+ type);
+ return error_mark_node;
+ }
+
+ return array_type_nelts_top (type);
+}
/* Handle C and C++ default attributes. */
@@ -50,6 +50,9 @@ DEFTREECODE (EXCESS_PRECISION_EXPR, "excess_precision_expr", tcc_expression, 1)
number. */
DEFTREECODE (USERDEF_LITERAL, "userdef_literal", tcc_exceptional, 3)
+/* Represents a 'lengthof' expression. */
+DEFTREECODE (LENGTHOF_EXPR, "lengthof_expr", tcc_expression, 1)
+
/* Represents a 'sizeof' expression during C++ template expansion,
or for the purpose of -Wsizeof-pointer-memaccess warning. */
DEFTREECODE (SIZEOF_EXPR, "sizeof_expr", tcc_expression, 1)
@@ -105,6 +105,7 @@ enum rid
/* C extensions */
RID_ASM, RID_TYPEOF, RID_TYPEOF_UNQUAL, RID_ALIGNOF, RID_ATTRIBUTE,
+ RID_LENGTHOF,
RID_VA_ARG,
RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL, RID_CHOOSE_EXPR,
RID_TYPES_COMPATIBLE_P, RID_BUILTIN_COMPLEX, RID_BUILTIN_SHUFFLE,
@@ -885,6 +886,7 @@ extern tree c_common_truthvalue_conversion (location_t, tree);
extern void c_apply_type_quals_to_decl (int, tree);
extern tree c_sizeof_or_alignof_type (location_t, tree, bool, bool, int);
extern tree c_alignof_expr (location_t, tree);
+extern tree c_lengthof_type (location_t, tree);
/* Print an error message for invalid operands to arith operation CODE.
NOP_EXPR is used as a special case (see truthvalue_conversion). */
extern void binary_op_error (rich_location *, enum tree_code, tree, tree);
@@ -8937,12 +8937,16 @@ start_struct (location_t loc, enum tree_code code, tree name,
within a statement expr used within sizeof, et. al. This is not
terribly serious as C++ doesn't permit statement exprs within
sizeof anyhow. */
- if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
+ if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof || in_lengthof))
warning_at (loc, OPT_Wc___compat,
"defining type in %qs expression is invalid in C++",
(in_sizeof
? "sizeof"
- : (in_typeof ? "typeof" : "alignof")));
+ : (in_typeof
+ ? "typeof"
+ : (in_alignof
+ ? "alignof"
+ : "lengthof"))));
if (in_underspecified_init)
error_at (loc, "%qT defined in underspecified object initializer", ref);
@@ -9897,7 +9901,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, tree attributes,
struct_types. */
if (warn_cxx_compat
&& struct_parse_info != NULL
- && !in_sizeof && !in_typeof && !in_alignof)
+ && !in_sizeof && !in_typeof && !in_alignof && !in_lengthof)
struct_parse_info->struct_types.safe_push (t);
}
@@ -10071,12 +10075,16 @@ start_enum (location_t loc, struct c_enum_contents *the_enum, tree name,
/* FIXME: This will issue a warning for a use of a type defined
within sizeof in a statement expr. This is not terribly serious
as C++ doesn't permit statement exprs within sizeof anyhow. */
- if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
+ if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof || in_lengthof))
warning_at (loc, OPT_Wc___compat,
"defining type in %qs expression is invalid in C++",
(in_sizeof
? "sizeof"
- : (in_typeof ? "typeof" : "alignof")));
+ : (in_typeof
+ ? "typeof"
+ : (in_alignof
+ ? "alignof"
+ : "lengthof"))));
if (in_underspecified_init)
error_at (loc, "%qT defined in underspecified object initializer",
@@ -10270,7 +10278,7 @@ finish_enum (tree enumtype, tree values, tree attributes)
struct_types. */
if (warn_cxx_compat
&& struct_parse_info != NULL
- && !in_sizeof && !in_typeof && !in_alignof)
+ && !in_sizeof && !in_typeof && !in_alignof && !in_lengthof)
struct_parse_info->struct_types.safe_push (enumtype);
/* Check for consistency with previous definition */
@@ -74,7 +74,17 @@ along with GCC; see the file COPYING3. If not see
#include "bitmap.h"
#include "analyzer/analyzer-language.h"
#include "toplev.h"
+
+#define c_parser_sizeof_expression(parser) \
+( \
+ c_parser_sizeof_or_lengthof_expression (parser, RID_SIZEOF) \
+)
+#define c_parser_lengthof_expression(parser) \
+( \
+ c_parser_sizeof_or_lengthof_expression (parser, RID_LENGTHOF) \
+)
+
/* We need to walk over decls with incomplete struct/union/enum types
after parsing the whole translation unit.
In finish_decl(), if the decl is static, has incomplete
@@ -1687,7 +1697,7 @@ static struct c_expr c_parser_binary_expression (c_parser *, struct c_expr *,
tree);
static struct c_expr c_parser_cast_expression (c_parser *, struct c_expr *);
static struct c_expr c_parser_unary_expression (c_parser *);
-static struct c_expr c_parser_sizeof_expression (c_parser *);
+static struct c_expr c_parser_sizeof_or_lengthof_expression (c_parser *, enum rid);
static struct c_expr c_parser_alignof_expression (c_parser *);
static struct c_expr c_parser_postfix_expression (c_parser *);
static struct c_expr c_parser_postfix_expression_after_paren_type (c_parser *,
@@ -9864,6 +9874,8 @@ c_parser_unary_expression (c_parser *parser)
case CPP_KEYWORD:
switch (c_parser_peek_token (parser)->keyword)
{
+ case RID_LENGTHOF:
+ return c_parser_lengthof_expression (parser);
case RID_SIZEOF:
return c_parser_sizeof_expression (parser);
case RID_ALIGNOF:
@@ -9903,12 +9915,13 @@ c_parser_unary_expression (c_parser *parser)
/* Parse a sizeof expression. */
static struct c_expr
-c_parser_sizeof_expression (c_parser *parser)
+c_parser_sizeof_or_lengthof_expression (c_parser *parser, enum rid rid)
{
+ const char *op_name = (rid == RID_LENGTHOF) ? "lengthof" : "sizeof";
struct c_expr expr;
struct c_expr result;
location_t expr_loc;
- gcc_assert (c_parser_next_token_is_keyword (parser, RID_SIZEOF));
+ gcc_assert (c_parser_next_token_is_keyword (parser, rid));
location_t start;
location_t finish = UNKNOWN_LOCATION;
@@ -9917,7 +9930,10 @@ c_parser_sizeof_expression (c_parser *parser)
c_parser_consume_token (parser);
c_inhibit_evaluation_warnings++;
- in_sizeof++;
+ if (rid == RID_LENGTHOF)
+ in_lengthof++;
+ else
+ in_sizeof++;
if (c_parser_next_token_is (parser, CPP_OPEN_PAREN)
&& c_token_starts_compound_literal (c_parser_peek_2nd_token (parser)))
{
@@ -9936,7 +9952,10 @@ c_parser_sizeof_expression (c_parser *parser)
{
struct c_expr ret;
c_inhibit_evaluation_warnings--;
- in_sizeof--;
+ if (rid == RID_LENGTHOF)
+ in_lengthof--;
+ else
+ in_sizeof--;
ret.set_error ();
ret.original_code = ERROR_MARK;
ret.original_type = NULL;
@@ -9948,31 +9967,45 @@ c_parser_sizeof_expression (c_parser *parser)
type_name,
expr_loc);
finish = expr.get_finish ();
- goto sizeof_expr;
+ goto Xof_expr;
}
/* sizeof ( type-name ). */
if (scspecs)
- error_at (expr_loc, "storage class specifier in %<sizeof%>");
+ error_at (expr_loc, "storage class specifier in %qs", op_name);
if (type_name->specs->alignas_p)
error_at (type_name->specs->locations[cdw_alignas],
- "alignment specified for type name in %<sizeof%>");
+ "alignment specified for type name in %qs", op_name);
c_inhibit_evaluation_warnings--;
- in_sizeof--;
- result = c_expr_sizeof_type (expr_loc, type_name);
+ if (rid == RID_LENGTHOF)
+ {
+ in_lengthof--;
+ result = c_expr_lengthof_type (expr_loc, type_name);
+ }
+ else
+ {
+ in_sizeof--;
+ result = c_expr_sizeof_type (expr_loc, type_name);
+ }
}
else
{
expr_loc = c_parser_peek_token (parser)->location;
expr = c_parser_unary_expression (parser);
finish = expr.get_finish ();
- sizeof_expr:
+ Xof_expr:
c_inhibit_evaluation_warnings--;
- in_sizeof--;
+ if (rid == RID_LENGTHOF)
+ in_lengthof--;
+ else
+ in_sizeof--;
mark_exp_read (expr.value);
if (TREE_CODE (expr.value) == COMPONENT_REF
&& DECL_C_BIT_FIELD (TREE_OPERAND (expr.value, 1)))
- error_at (expr_loc, "%<sizeof%> applied to a bit-field");
- result = c_expr_sizeof_expr (expr_loc, expr);
+ error_at (expr_loc, "%qs applied to a bit-field", op_name);
+ if (rid == RID_LENGTHOF)
+ result = c_expr_lengthof_expr (expr_loc, expr);
+ else
+ result = c_expr_sizeof_expr (expr_loc, expr);
}
if (finish == UNKNOWN_LOCATION)
finish = start;
@@ -736,6 +736,7 @@ extern int c_type_dwarf_attribute (const_tree, int);
/* in c-typeck.cc */
extern int in_alignof;
extern int in_sizeof;
+extern int in_lengthof;
extern int in_typeof;
extern bool c_in_omp_for;
extern bool c_omp_array_section_p;
@@ -786,6 +787,9 @@ extern tree build_external_ref (location_t, tree, bool, tree *);
extern void pop_maybe_used (bool);
extern struct c_expr c_expr_sizeof_expr (location_t, struct c_expr);
extern struct c_expr c_expr_sizeof_type (location_t, struct c_type_name *);
+extern struct c_expr c_expr_lengthof_expr (location_t, struct c_expr);
+extern struct c_expr c_expr_lengthof_type (location_t loc,
+ struct c_type_name *);
extern struct c_expr parser_build_unary_op (location_t, enum tree_code,
struct c_expr);
extern struct c_expr parser_build_binary_op (location_t,
@@ -71,6 +71,9 @@ int in_alignof;
/* The level of nesting inside "sizeof". */
int in_sizeof;
+/* The level of nesting inside "lengthof". */
+int in_lengthof;
+
/* The level of nesting inside "typeof". */
int in_typeof;
@@ -3255,7 +3258,7 @@ build_external_ref (location_t loc, tree id, bool fun, tree *type)
if (TREE_CODE (ref) == FUNCTION_DECL && !in_alignof)
{
- if (!in_sizeof && !in_typeof)
+ if (!in_sizeof && !in_typeof && !in_lengthof)
C_DECL_USED (ref) = 1;
else if (DECL_INITIAL (ref) == NULL_TREE
&& DECL_EXTERNAL (ref)
@@ -3311,7 +3314,7 @@ struct maybe_used_decl
{
/* The decl. */
tree decl;
- /* The level seen at (in_sizeof + in_typeof). */
+ /* The level seen at (in_sizeof + in_typeof + in_lengthof). */
int level;
/* The next one at this level or above, or NULL. */
struct maybe_used_decl *next;
@@ -3329,7 +3332,7 @@ record_maybe_used_decl (tree decl)
{
struct maybe_used_decl *t = XOBNEW (&parser_obstack, struct maybe_used_decl);
t->decl = decl;
- t->level = in_sizeof + in_typeof;
+ t->level = in_sizeof + in_typeof + in_lengthof;
t->next = maybe_used_decls;
maybe_used_decls = t;
}
@@ -3343,7 +3346,7 @@ void
pop_maybe_used (bool used)
{
struct maybe_used_decl *p = maybe_used_decls;
- int cur_level = in_sizeof + in_typeof;
+ int cur_level = in_sizeof + in_typeof + in_lengthof;
while (p && p->level > cur_level)
{
if (used)
@@ -3453,6 +3456,109 @@ c_expr_sizeof_type (location_t loc, struct c_type_name *t)
return ret;
}
+static bool
+is_top_array_vla (tree type)
+{
+ bool zero, var;
+ tree d;
+
+ if (TREE_CODE (type) != ARRAY_TYPE)
+ return false;
+ if (!COMPLETE_TYPE_P (type))
+ return false;
+
+ d = TYPE_DOMAIN (type);
+ zero = !TYPE_MAX_VALUE (d);
+ var = (!zero
+ && (TREE_CODE (TYPE_MIN_VALUE (d)) != INTEGER_CST
+ || TREE_CODE (TYPE_MAX_VALUE (d)) != INTEGER_CST));
+ var = var || (zero && C_TYPE_VARIABLE_SIZE (type));
+ return var;
+}
+
+/* Return the result of lengthof applied to EXPR. */
+
+struct c_expr
+c_expr_lengthof_expr (location_t loc, struct c_expr expr)
+{
+ struct c_expr ret;
+ if (expr.value == error_mark_node)
+ {
+ ret.value = error_mark_node;
+ ret.original_code = ERROR_MARK;
+ ret.original_type = NULL;
+ ret.m_decimal = 0;
+ pop_maybe_used (false);
+ }
+ else
+ {
+ bool expr_const_operands = true;
+
+ tree folded_expr = c_fully_fold (expr.value, require_constant_value,
+ &expr_const_operands);
+ ret.value = c_lengthof_type (loc, TREE_TYPE (folded_expr));
+ c_last_sizeof_arg = expr.value;
+ c_last_sizeof_loc = loc;
+ ret.original_code = LENGTHOF_EXPR;
+ ret.original_type = NULL;
+ ret.m_decimal = 0;
+ if (is_top_array_vla (TREE_TYPE (folded_expr)))
+ {
+ /* lengthof is evaluated when given a VLA. */
+ ret.value = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (ret.value),
+ folded_expr, ret.value);
+ C_MAYBE_CONST_EXPR_NON_CONST (ret.value) = !expr_const_operands;
+ SET_EXPR_LOCATION (ret.value, loc);
+ }
+ pop_maybe_used (is_top_array_vla (TREE_TYPE (folded_expr)));
+ }
+ return ret;
+}
+
+/* Return the result of lengthof applied to T, a structure for the type
+ name passed to lengthof (rather than the type itself). LOC is the
+ location of the original expression. */
+
+struct c_expr
+c_expr_lengthof_type (location_t loc, struct c_type_name *t)
+{
+ tree type;
+ struct c_expr ret;
+ tree type_expr = NULL_TREE;
+ bool type_expr_const = true;
+ type = groktypename (t, &type_expr, &type_expr_const);
+ ret.value = c_lengthof_type (loc, type);
+ c_last_sizeof_arg = type;
+ c_last_sizeof_loc = loc;
+ ret.original_code = LENGTHOF_EXPR;
+ ret.original_type = NULL;
+ ret.m_decimal = 0;
+ if (type == error_mark_node)
+ {
+ ret.value = error_mark_node;
+ ret.original_code = ERROR_MARK;
+ }
+ else
+ if ((type_expr || TREE_CODE (ret.value) == INTEGER_CST)
+ && is_top_array_vla (type))
+ {
+ /* If the type is a [*] array, it is a VLA but is represented as
+ having a size of zero. In such a case we must ensure that
+ the result of lengthof does not get folded to a constant by
+ c_fully_fold, because if the length is evaluated the result is
+ not constant and so constraints on zero or negative size
+ arrays must not be applied when this lengthof call is inside
+ another array declarator. */
+ if (!type_expr)
+ type_expr = integer_zero_node;
+ ret.value = build2 (C_MAYBE_CONST_EXPR, TREE_TYPE (ret.value),
+ type_expr, ret.value);
+ C_MAYBE_CONST_EXPR_NON_CONST (ret.value) = !type_expr_const;
+ }
+ pop_maybe_used (type != error_mark_node ? is_top_array_vla (type) : false);
+ return ret;
+}
+
/* Build a function call to function FUNCTION with parameters PARAMS.
The function call is at LOC.
PARAMS is a list--a chain of TREE_LIST nodes--in which the
@@ -91,6 +91,7 @@ DEF_OPERATOR ("co_await", CO_AWAIT_EXPR, "aw", OVL_OP_FLAG_UNARY)
/* These are extensions. */
DEF_OPERATOR ("alignof", ALIGNOF_EXPR, "az", OVL_OP_FLAG_UNARY)
+DEF_OPERATOR ("__lengthof__", LENGTHOF_EXPR, "lz", OVL_OP_FLAG_UNARY)
DEF_OPERATOR ("__imag__", IMAGPART_EXPR, "v18__imag__", OVL_OP_FLAG_UNARY)
DEF_OPERATOR ("__real__", REALPART_EXPR, "v18__real__", OVL_OP_FLAG_UNARY)
@@ -10391,6 +10391,33 @@ If the operand of the @code{__alignof__} expression is a function,
the expression evaluates to the alignment of the function which may
be specified by attribute @code{aligned} (@pxref{Common Function Attributes}).
+@node Length
+@section Determining the Length of Arrays
+@cindex lengthof
+@cindex length
+@cindex array length
+
+The keyword @code{__lengthof__} determines the length of an array operand,
+that is, the number of elements in the array.
+Its syntax is just like @code{sizeof}.
+The operand must be a complete array type.
+The operand is not evaluated
+if the top-level length designator is an integer constant expression;
+and it is evaluated
+if the top-level length designator is not an integer constant expression.
+
+XXX: Do we want to document the following? I think so.
+XXX: It would prevent users from relying on __lengthof__
+XXX: for distinguishing arrays from pointers.
+XXX: I don't want users to complain in the future
+XXX: if this doesn't report errors on function parameters anymore
+XXX: and that breaks their assumptions.
+In the future,
+it might also accept a function parameter with array notation,
+an incomplete array whose length is specified by other means,
+such as attributes,
+or other similar cases.
+
@node Inline
@section An Inline Function is As Fast As a Macro
@cindex inline functions
@@ -245,6 +245,9 @@ enum type_context_kind {
/* Directly measuring the alignment of T. */
TCTX_ALIGNOF,
+ /* Directly measuring the length of array T. */
+ TCTX_LENGTHOF,
+
/* Creating objects of type T with static storage duration. */
TCTX_STATIC_STORAGE,