c++: Perform immediate invocation evaluation separately from cp_fold_r [PR118068]
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-arm-bootstrap |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-aarch64-bootstrap |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_check--master-arm |
success
|
Test passed
|
Commit Message
Hi!
The r14-4140 change moved consteval evaluation from build_over_call to
cp_fold_r.
The following testcase is a regression caused by that change. There
is a cast around immediate invocation, (int) foo (0x23)
where consteval for returns unsigned char. When the consteval call
has been folded early to 0x23 (with unsigned char type), cp_fold sees
(int) 0x23 and folds that to 0x23 with int type.
But when the immediate invocation is handled only during cp_fold_r,
cp_fold_r first calls cp_fold on the NOP_EXPR, which calls cp_fold
on its operand, it is CALL_EXPR, nothing is folded at that point.
Then cp_fold_r continues to walk the NOP_EXPR's operand, sees it is
an immediate function invocation and cp_fold_immediate_r calls
cxx_constant_value on it and replaces the CALL_EXPR with the INTEGER_CST
0x23. Nothing comes back to folding the containing NOP_EXPR though.
Sure, with optimizations enabled some GIMPLE optimization folds that later,
but e.g. with -O0 nothing does that. I think there could be arbitrarily
complex expressions on top of the immediate invocation(s) that used to be
folded by cp_fold before and aren't folded anymore.
One possibility would be to do the immediate invocation evaluation in
cp_fold rather than cp_fold_r (or in addition to cp_fold_r), but that
could mean we evaluate it multiple times and e.g. if there is an error
emit multiple errors.
The following patch instead first evaluates all immediate invocations and
does cp_fold_r in a separate step. That not only allows the folding of
expressions which contain immediate invocations, but also simplifies some
of the quirks that had to be done when it was in cp_fold_r.
Though, I had to add an extra case to cp_genericize_r RETURN_EXPR handling
to avoid a regression where after emitting errors in RETURN_EXPR argument
we've emitted a -Wreturn-type false positive. Previously we ended up with
RETURN_EXPR with CLEANUP_POINT_EXPR with INIT_EXPR of RESULT_DECL to
error_mark_node, now we fold it more and have RETURN_EXPR with
error_mark_node operand. The former would result during gimplification
into something -Wresult-type was quiet about, the latter doesn't.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
BTW, r14-4140 changed behavior on
consteval bool foo (bool x) { if (x) throw 1; return false; }
constexpr void
foo ()
{
if constexpr (false)
{
bool a = foo (true);
}
}
where GCC 13 emitted
error: expression ‘<throw-expression>’ is not a constant expression
error and GCC 14/trunk including the patch below doesn't reject it.
And clang++ trunk rejects it. It isn't immediately clear to me what
is right, if immediate invocations in discarded statements should
be evaluated or not.
2025-03-07 Jakub Jelinek <jakub@redhat.com>
PR target/118068
gcc/cp/
* cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
cp_walk_tree_without_duplicates.
(cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
stmts call data->pset.add and if it returns true, don't walk subtrees.
(cp_fold_r): Don't call cp_fold_immediate_r here.
(cp_fold_function): For C++20 or later call cp_walk_tree
with cp_fold_immediate_r callback first before calling cp_walk_tree
with cp_fold_r callback and call data.pset.empty () in between.
(cp_fully_fold_init): Likewise.
(cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
if RETURN_EXPR has erroneous argument.
gcc/testsuite/
* g++.target/i386/pr118068.C: New test.
Jakub
Comments
On 3/7/25 11:54 AM, Jakub Jelinek wrote:
> Hi!
>
> The r14-4140 change moved consteval evaluation from build_over_call to
> cp_fold_r.
>
> The following testcase is a regression caused by that change. There
> is a cast around immediate invocation, (int) foo (0x23)
> where consteval for returns unsigned char. When the consteval call
> has been folded early to 0x23 (with unsigned char type), cp_fold sees
> (int) 0x23 and folds that to 0x23 with int type.
> But when the immediate invocation is handled only during cp_fold_r,
> cp_fold_r first calls cp_fold on the NOP_EXPR, which calls cp_fold
> on its operand, it is CALL_EXPR, nothing is folded at that point.
> Then cp_fold_r continues to walk the NOP_EXPR's operand, sees it is
> an immediate function invocation and cp_fold_immediate_r calls
> cxx_constant_value on it and replaces the CALL_EXPR with the INTEGER_CST
> 0x23. Nothing comes back to folding the containing NOP_EXPR though.
> Sure, with optimizations enabled some GIMPLE optimization folds that later,
> but e.g. with -O0 nothing does that. I think there could be arbitrarily
> complex expressions on top of the immediate invocation(s) that used to be
> folded by cp_fold before and aren't folded anymore.
>
> One possibility would be to do the immediate invocation evaluation in
> cp_fold rather than cp_fold_r (or in addition to cp_fold_r), but that
> could mean we evaluate it multiple times and e.g. if there is an error
> emit multiple errors.
>
> The following patch instead first evaluates all immediate invocations and
> does cp_fold_r in a separate step. That not only allows the folding of
> expressions which contain immediate invocations, but also simplifies some
> of the quirks that had to be done when it was in cp_fold_r.
> Though, I had to add an extra case to cp_genericize_r RETURN_EXPR handling
> to avoid a regression where after emitting errors in RETURN_EXPR argument
> we've emitted a -Wreturn-type false positive. Previously we ended up with
> RETURN_EXPR with CLEANUP_POINT_EXPR with INIT_EXPR of RESULT_DECL to
> error_mark_node, now we fold it more and have RETURN_EXPR with
> error_mark_node operand. The former would result during gimplification
> into something -Wresult-type was quiet about, the latter doesn't.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> BTW, r14-4140 changed behavior on
> consteval bool foo (bool x) { if (x) throw 1; return false; }
>
> constexpr void
> foo ()
> {
> if constexpr (false)
> {
> bool a = foo (true);
> }
> }
> where GCC 13 emitted
> error: expression ‘<throw-expression>’ is not a constant expression
> error and GCC 14/trunk including the patch below doesn't reject it.
> And clang++ trunk rejects it. It isn't immediately clear to me what
> is right, if immediate invocations in discarded statements should
> be evaluated or not.
>
> 2025-03-07 Jakub Jelinek <jakub@redhat.com>
>
> PR target/118068
> gcc/cp/
> * cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
> cp_walk_tree_without_duplicates.
> (cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
> into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
> stmts call data->pset.add and if it returns true, don't walk subtrees.
> (cp_fold_r): Don't call cp_fold_immediate_r here.
> (cp_fold_function): For C++20 or later call cp_walk_tree
> with cp_fold_immediate_r callback first before calling cp_walk_tree
> with cp_fold_r callback and call data.pset.empty () in between.
> (cp_fully_fold_init): Likewise.
> (cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
> if RETURN_EXPR has erroneous argument.
> gcc/testsuite/
> * g++.target/i386/pr118068.C: New test.
>
> --- gcc/cp/cp-gimplify.cc.jj 2025-03-06 17:26:00.547975990 +0100
> +++ gcc/cp/cp-gimplify.cc 2025-03-07 12:07:41.578078222 +0100
> @@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
>
> cp_fold_data data (flags);
> int save_errorcount = errorcount;
> - tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
> + tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
> if (errorcount > save_errorcount)
> return integer_one_node;
> return r;
> @@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
> return init;
> }
>
> -/* A subroutine of cp_fold_r to handle immediate functions. */
> +/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
> + immediate functions. */
>
> static tree
> cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
> {
> auto data = static_cast<cp_fold_data *>(data_);
> tree stmt = *stmt_p;
> /* The purpose of this is not to emit errors for mce_unknown. */
> const tsubst_flags_t complain = (data->flags & ff_mce_false
> ? tf_error : tf_none);
> @@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
> if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
> decl = TREE_OPERAND (stmt, 0);
> break;
> + case IF_STMT:
> + if (IF_STMT_CONSTEVAL_P (stmt))
> + {
> + if (!data->pset.add (stmt))
> + cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
> + NULL);
> + *walk_subtrees = 0;
> + return NULL_TREE;
> + }
> + /* FALLTHRU */
> default:
> + if (data->pset.add (stmt))
> + *walk_subtrees = 0;
> return NULL_TREE;
> }
>
> @@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
> tree stmt = *stmt_p;
> enum tree_code code = TREE_CODE (stmt);
>
> - if (cxx_dialect >= cxx20)
> - {
> - /* Unfortunately we must handle code like
> - false ? bar () : 42
> - where we have to check bar too. The cp_fold call below could
> - fold the ?: into a constant before we've checked it. */
> - if (code == COND_EXPR)
> - {
> - auto then_fn = cp_fold_r, else_fn = cp_fold_r;
> - /* See if we can figure out if either of the branches is dead. If it
> - is, we don't need to do everything that cp_fold_r does. */
> - cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
> - if (integer_zerop (TREE_OPERAND (stmt, 0)))
> - then_fn = cp_fold_immediate_r;
> - else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
> - else_fn = cp_fold_immediate_r;
> -
> - if (TREE_OPERAND (stmt, 1))
> - cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
> - nullptr);
> - if (TREE_OPERAND (stmt, 2))
> - cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
> - nullptr);
> - *walk_subtrees = 0;
> - /* Don't return yet, still need the cp_fold below. */
> - }
> - else
> - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> - }
If we're removing the call to cp_fold_immediate_r, we should copy over
the early "this affects cp_fold_r" exit from that function that was
previously removed as redundant.
> *stmt_p = stmt = cp_fold (*stmt_p, data->flags);
>
> - /* For certain trees, like +foo(), the cp_fold above will remove the +,
> - and the subsequent tree walk would go straight down to the CALL_EXPR's
> - operands, meaning that cp_fold_immediate_r would never see the
> - CALL_EXPR. Ew :(. */
> - if (TREE_CODE (stmt) == CALL_EXPR && code != CALL_EXPR)
> - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> -
> if (data->pset.add (stmt))
> {
> /* Don't walk subtrees of stmts we've already walked once, otherwise
> @@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
> been constant-evaluated already if possible, so we can safely
> pass ff_mce_false. */
> cp_fold_data data (ff_genericize | ff_mce_false);
> + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
> + cp_fold_r, as otherwise expressions using results of immediate functions
> + might not be folded as cp_fold is called on those before cp_fold_r is
> + called on their argument. And calling cp_fold_immediate_r during
> + cp_fold can mean evaluation of the immediate functions many times. */
Hmm, fold_cache should prevent multiple evaluation?
> + if (cxx_dialect >= cxx20)
> + {
> + cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_immediate_r,
> + &data, NULL);
> + data.pset.empty ();
> + }
> cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_r, &data, NULL);
>
> /* This is merely an optimization: if FNDECL has no i-e expressions,
> @@ -1717,6 +1704,11 @@ cp_genericize_r (tree *stmt_p, int *walk
> case RETURN_EXPR:
> if (TREE_OPERAND (stmt, 0))
> {
> + if (error_operand_p (TREE_OPERAND (stmt, 0))
> + && warn_return_type)
> + /* Suppress -Wreturn-type for this function. */
> + suppress_warning (current_function_decl, OPT_Wreturn_type);
> +
> if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
> /* Don't dereference an invisiref RESULT_DECL inside a
> RETURN_EXPR. */
> @@ -2910,6 +2902,11 @@ cp_fully_fold_init (tree x)
> return x;
> x = cp_fully_fold (x, mce_false);
> cp_fold_data data (ff_mce_false);
> + if (cxx_dialect >= cxx20)
> + {
> + cp_walk_tree (&x, cp_fold_immediate_r, &data, NULL);
> + data.pset.empty ();
> + }
> cp_walk_tree (&x, cp_fold_r, &data, NULL);
> return x;
> }
> --- gcc/testsuite/g++.target/i386/pr118068.C.jj 2025-03-07 10:38:15.551662990 +0100
> +++ gcc/testsuite/g++.target/i386/pr118068.C 2025-03-07 10:38:15.551662990 +0100
> @@ -0,0 +1,17 @@
> +// PR target/118068
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O0 -mavx" }
> +
> +typedef float V __attribute__((vector_size (32)));
> +
> +consteval unsigned char
> +foo (int x)
> +{
> + return x;
> +}
> +
> +V
> +bar (V x, V y)
> +{
> + return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
> +}
>
> Jakub
>
On Wed, Mar 12, 2025 at 05:39:46PM -0400, Jason Merrill wrote:
> On 3/7/25 11:54 AM, Jakub Jelinek wrote:
> > Hi!
> >
> > The r14-4140 change moved consteval evaluation from build_over_call to
> > cp_fold_r.
> >
> > The following testcase is a regression caused by that change. There
> > is a cast around immediate invocation, (int) foo (0x23)
> > where consteval for returns unsigned char. When the consteval call
> > has been folded early to 0x23 (with unsigned char type), cp_fold sees
> > (int) 0x23 and folds that to 0x23 with int type.
> > But when the immediate invocation is handled only during cp_fold_r,
> > cp_fold_r first calls cp_fold on the NOP_EXPR, which calls cp_fold
> > on its operand, it is CALL_EXPR, nothing is folded at that point.
> > Then cp_fold_r continues to walk the NOP_EXPR's operand, sees it is
> > an immediate function invocation and cp_fold_immediate_r calls
> > cxx_constant_value on it and replaces the CALL_EXPR with the INTEGER_CST
> > 0x23. Nothing comes back to folding the containing NOP_EXPR though.
> > Sure, with optimizations enabled some GIMPLE optimization folds that later,
> > but e.g. with -O0 nothing does that. I think there could be arbitrarily
> > complex expressions on top of the immediate invocation(s) that used to be
> > folded by cp_fold before and aren't folded anymore.
> >
> > One possibility would be to do the immediate invocation evaluation in
> > cp_fold rather than cp_fold_r (or in addition to cp_fold_r), but that
> > could mean we evaluate it multiple times and e.g. if there is an error
> > emit multiple errors.
> >
> > The following patch instead first evaluates all immediate invocations and
> > does cp_fold_r in a separate step. That not only allows the folding of
> > expressions which contain immediate invocations, but also simplifies some
> > of the quirks that had to be done when it was in cp_fold_r.
> > Though, I had to add an extra case to cp_genericize_r RETURN_EXPR handling
> > to avoid a regression where after emitting errors in RETURN_EXPR argument
> > we've emitted a -Wreturn-type false positive. Previously we ended up with
> > RETURN_EXPR with CLEANUP_POINT_EXPR with INIT_EXPR of RESULT_DECL to
> > error_mark_node, now we fold it more and have RETURN_EXPR with
> > error_mark_node operand. The former would result during gimplification
> > into something -Wresult-type was quiet about, the latter doesn't.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > BTW, r14-4140 changed behavior on
> > consteval bool foo (bool x) { if (x) throw 1; return false; }
> >
> > constexpr void
> > foo ()
> > {
> > if constexpr (false)
> > {
> > bool a = foo (true);
> > }
> > }
> > where GCC 13 emitted
> > error: expression ‘<throw-expression>’ is not a constant expression
> > error and GCC 14/trunk including the patch below doesn't reject it.
> > And clang++ trunk rejects it. It isn't immediately clear to me what
> > is right, if immediate invocations in discarded statements should
> > be evaluated or not.
> >
> > 2025-03-07 Jakub Jelinek <jakub@redhat.com>
> >
> > PR target/118068
> > gcc/cp/
> > * cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
> > cp_walk_tree_without_duplicates.
> > (cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
> > into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
> > stmts call data->pset.add and if it returns true, don't walk subtrees.
> > (cp_fold_r): Don't call cp_fold_immediate_r here.
> > (cp_fold_function): For C++20 or later call cp_walk_tree
> > with cp_fold_immediate_r callback first before calling cp_walk_tree
> > with cp_fold_r callback and call data.pset.empty () in between.
> > (cp_fully_fold_init): Likewise.
> > (cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
> > if RETURN_EXPR has erroneous argument.
> > gcc/testsuite/
> > * g++.target/i386/pr118068.C: New test.
> >
> > --- gcc/cp/cp-gimplify.cc.jj 2025-03-06 17:26:00.547975990 +0100
> > +++ gcc/cp/cp-gimplify.cc 2025-03-07 12:07:41.578078222 +0100
> > @@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
> > cp_fold_data data (flags);
> > int save_errorcount = errorcount;
> > - tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
> > + tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
> > if (errorcount > save_errorcount)
> > return integer_one_node;
> > return r;
> > @@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
> > return init;
> > }
> > -/* A subroutine of cp_fold_r to handle immediate functions. */
> > +/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
> > + immediate functions. */
> > static tree
> > cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
> > {
> > auto data = static_cast<cp_fold_data *>(data_);
> > tree stmt = *stmt_p;
> > /* The purpose of this is not to emit errors for mce_unknown. */
> > const tsubst_flags_t complain = (data->flags & ff_mce_false
> > ? tf_error : tf_none);
> > @@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
> > if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
> > decl = TREE_OPERAND (stmt, 0);
> > break;
> > + case IF_STMT:
> > + if (IF_STMT_CONSTEVAL_P (stmt))
> > + {
> > + if (!data->pset.add (stmt))
> > + cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
> > + NULL);
> > + *walk_subtrees = 0;
> > + return NULL_TREE;
> > + }
> > + /* FALLTHRU */
> > default:
> > + if (data->pset.add (stmt))
> > + *walk_subtrees = 0;
> > return NULL_TREE;
> > }
> > @@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
> > tree stmt = *stmt_p;
> > enum tree_code code = TREE_CODE (stmt);
> > - if (cxx_dialect >= cxx20)
> > - {
> > - /* Unfortunately we must handle code like
> > - false ? bar () : 42
> > - where we have to check bar too. The cp_fold call below could
> > - fold the ?: into a constant before we've checked it. */
> > - if (code == COND_EXPR)
> > - {
> > - auto then_fn = cp_fold_r, else_fn = cp_fold_r;
> > - /* See if we can figure out if either of the branches is dead. If it
> > - is, we don't need to do everything that cp_fold_r does. */
> > - cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
> > - if (integer_zerop (TREE_OPERAND (stmt, 0)))
> > - then_fn = cp_fold_immediate_r;
> > - else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
> > - else_fn = cp_fold_immediate_r;
> > -
> > - if (TREE_OPERAND (stmt, 1))
> > - cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
> > - nullptr);
> > - if (TREE_OPERAND (stmt, 2))
> > - cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
> > - nullptr);
> > - *walk_subtrees = 0;
> > - /* Don't return yet, still need the cp_fold below. */
> > - }
> > - else
> > - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> > - }
>
> If we're removing the call to cp_fold_immediate_r, we should copy over the
> early "this affects cp_fold_r" exit from that function that was previously
> removed as redundant.
You mean
/* No need to look into types or unevaluated operands.
NB: This affects cp_fold_r as well. */
if (TYPE_P (stmt) || unevaluated_p (TREE_CODE (stmt)))
{
*walk_subtrees = 0;
return NULL_TREE;
}
? Note, GCC 13 didn't have anything like that in cp_fold_r.
> > @@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
> > been constant-evaluated already if possible, so we can safely
> > pass ff_mce_false. */
> > cp_fold_data data (ff_genericize | ff_mce_false);
> > + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
> > + cp_fold_r, as otherwise expressions using results of immediate functions
> > + might not be folded as cp_fold is called on those before cp_fold_r is
> > + called on their argument. And calling cp_fold_immediate_r during
> > + cp_fold can mean evaluation of the immediate functions many times. */
>
> Hmm, fold_cache should prevent multiple evaluation?
So would you prefer something like the patch below instead?
Currently it regresses
FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++20 (test for excess errors)
FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++23 (test for excess errors)
FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++26 (test for excess errors)
FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++20 (test for warnings, line 10)
FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++23 (test for warnings, line 10)
FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++26 (test for warnings, line 10)
and fixes the new testcase in the patch.
The excess errors are
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:104:6: error: call to consteval function 'f9<int>(non_const)' is not a constant expression
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:104:6: error: 'non_const' is not a constant expression
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:106:7: error: call to consteval function 'f10<int>(non_const)' is not a constant expression
/usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:106:7: error: 'non_const' is not a constant expression
And in the other tests the differences on stderr are
-consteval-prop2.C:10:16: note: ‘constexpr int f(T) [with T = int]’ was promoted to an immediate function because its body contains an immediate-escalating expression ‘id(t)’
- 10 | return t + id(t); // { dg-message "immediate-escalating expression .id\\(t\\)." }
- | ~~^~~
+consteval-prop2.C:14:10: note: ‘constexpr int f(T) [with T = int]’ was promoted to an immediate function
and
-consteval-prop2.C:71:21: error: ‘i’ is not a constant expression
+consteval-prop2.C:71:22: error: ‘i’ is not a constant expression
+ 71 | + k<int>(i); // { dg-bogus "call to|constant" "" { xfail *-*-* } }
+ | ^
2025-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/118068
gcc/cp/
* cp-gimplify.cc (get_fold_cache): Add prototype.
(cp_fold_immediate_r): For CALL_EXPR or ADDR_EXPR, query fold_cache
and if stmt is there already and not mapped to self, exit early.
(cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
if RETURN_EXPR has erroneous argument.
(cp_fold) <case ADDR_EXPR>: For C++20 call cp_fold_immediate_r.
(cp_fold) <case CALL_EXPR>: Likewise.
gcc/testsuite/
* g++.target/i386/pr118068.C: New test.
--- gcc/cp/cp-gimplify.cc.jj 2025-03-14 15:31:40.170197353 +0100
+++ gcc/cp/cp-gimplify.cc 2025-03-17 16:51:14.644040172 +0100
@@ -1204,6 +1204,8 @@ cp_build_init_expr_for_ctor (tree call,
return init;
}
+static hash_map<tree, tree> *&get_fold_cache (fold_flags_t);
+
/* A subroutine of cp_fold_r to handle immediate functions. */
static tree
@@ -1257,6 +1259,22 @@ cp_fold_immediate_r (tree *stmt_p, int *
if (!decl || TREE_CODE (decl) != FUNCTION_DECL)
return NULL_TREE;
+ /* If this call has been evaluated before by cp_fold and
+ stored into the fold cache, just use earlier computed replacement. */
+ if (code == CALL_EXPR || code == ADDR_EXPR)
+ {
+ auto &fold_cache = get_fold_cache (data->flags);
+ if (fold_cache != NULL)
+ {
+ if (tree *cached = fold_cache->get (stmt))
+ if (*cached != stmt)
+ {
+ *stmt_p = *cached;
+ return NULL_TREE;
+ }
+ }
+ }
+
/* Fully escalate once all templates have been instantiated. What we're
calling is not a consteval function but it may become one. This
requires recursing; DECL may be promoted to consteval because it
@@ -1717,6 +1735,11 @@ cp_genericize_r (tree *stmt_p, int *walk
case RETURN_EXPR:
if (TREE_OPERAND (stmt, 0))
{
+ if (error_operand_p (TREE_OPERAND (stmt, 0))
+ && warn_return_type)
+ /* Suppress -Wreturn-type for this function. */
+ suppress_warning (current_function_decl, OPT_Wreturn_type);
+
if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
/* Don't dereference an invisiref RESULT_DECL inside a
RETURN_EXPR. */
@@ -3082,6 +3105,19 @@ cp_fold (tree x, fold_flags_t flags)
goto unary;
case ADDR_EXPR:
+ if (cxx_dialect >= cxx20
+ && !ADDR_EXPR_DENOTES_CALL_P (x)
+ && TREE_CODE (TREE_OPERAND (x, 0)) == FUNCTION_DECL)
+ {
+ int walk_subtrees = 0;
+ cp_fold_data data (flags);
+ cp_fold_immediate_r (&x, &walk_subtrees, &data);
+ if (x != org_x)
+ {
+ x = cp_fold (x, flags);
+ break;
+ }
+ }
loc = EXPR_LOCATION (x);
op0 = cp_fold_maybe_rvalue (TREE_OPERAND (x, 0), false, flags);
@@ -3320,6 +3356,17 @@ cp_fold (tree x, fold_flags_t flags)
case CALL_EXPR:
{
+ if (cxx_dialect >= cxx20)
+ {
+ int walk_subtrees = 0;
+ cp_fold_data data (flags);
+ cp_fold_immediate_r (&x, &walk_subtrees, &data);
+ if (x != org_x)
+ {
+ x = cp_fold (x, flags);
+ break;
+ }
+ }
tree callee = get_callee_fndecl (x);
/* "Inline" calls to std::move/forward and other cast-like functions
--- gcc/testsuite/g++.target/i386/pr118068.C.jj 2025-03-07 10:38:15.551662990 +0100
+++ gcc/testsuite/g++.target/i386/pr118068.C 2025-03-07 10:38:15.551662990 +0100
@@ -0,0 +1,17 @@
+// PR target/118068
+// { dg-do compile { target c++20 } }
+// { dg-options "-O0 -mavx" }
+
+typedef float V __attribute__((vector_size (32)));
+
+consteval unsigned char
+foo (int x)
+{
+ return x;
+}
+
+V
+bar (V x, V y)
+{
+ return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
+}
Jakub
On 3/17/25 12:09 PM, Jakub Jelinek wrote:
> On Wed, Mar 12, 2025 at 05:39:46PM -0400, Jason Merrill wrote:
>> On 3/7/25 11:54 AM, Jakub Jelinek wrote:
>>> Hi!
>>>
>>> The r14-4140 change moved consteval evaluation from build_over_call to
>>> cp_fold_r.
>>>
>>> The following testcase is a regression caused by that change. There
>>> is a cast around immediate invocation, (int) foo (0x23)
>>> where consteval for returns unsigned char. When the consteval call
>>> has been folded early to 0x23 (with unsigned char type), cp_fold sees
>>> (int) 0x23 and folds that to 0x23 with int type.
>>> But when the immediate invocation is handled only during cp_fold_r,
>>> cp_fold_r first calls cp_fold on the NOP_EXPR, which calls cp_fold
>>> on its operand, it is CALL_EXPR, nothing is folded at that point.
>>> Then cp_fold_r continues to walk the NOP_EXPR's operand, sees it is
>>> an immediate function invocation and cp_fold_immediate_r calls
>>> cxx_constant_value on it and replaces the CALL_EXPR with the INTEGER_CST
>>> 0x23. Nothing comes back to folding the containing NOP_EXPR though.
>>> Sure, with optimizations enabled some GIMPLE optimization folds that later,
>>> but e.g. with -O0 nothing does that. I think there could be arbitrarily
>>> complex expressions on top of the immediate invocation(s) that used to be
>>> folded by cp_fold before and aren't folded anymore.
>>>
>>> One possibility would be to do the immediate invocation evaluation in
>>> cp_fold rather than cp_fold_r (or in addition to cp_fold_r), but that
>>> could mean we evaluate it multiple times and e.g. if there is an error
>>> emit multiple errors.
>>>
>>> The following patch instead first evaluates all immediate invocations and
>>> does cp_fold_r in a separate step. That not only allows the folding of
>>> expressions which contain immediate invocations, but also simplifies some
>>> of the quirks that had to be done when it was in cp_fold_r.
>>> Though, I had to add an extra case to cp_genericize_r RETURN_EXPR handling
>>> to avoid a regression where after emitting errors in RETURN_EXPR argument
>>> we've emitted a -Wreturn-type false positive. Previously we ended up with
>>> RETURN_EXPR with CLEANUP_POINT_EXPR with INIT_EXPR of RESULT_DECL to
>>> error_mark_node, now we fold it more and have RETURN_EXPR with
>>> error_mark_node operand. The former would result during gimplification
>>> into something -Wresult-type was quiet about, the latter doesn't.
>>>
>>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>>
>>> BTW, r14-4140 changed behavior on
>>> consteval bool foo (bool x) { if (x) throw 1; return false; }
>>>
>>> constexpr void
>>> foo ()
>>> {
>>> if constexpr (false)
>>> {
>>> bool a = foo (true);
>>> }
>>> }
>>> where GCC 13 emitted
>>> error: expression ‘<throw-expression>’ is not a constant expression
>>> error and GCC 14/trunk including the patch below doesn't reject it.
>>> And clang++ trunk rejects it. It isn't immediately clear to me what
>>> is right, if immediate invocations in discarded statements should
>>> be evaluated or not.
>>>
>>> 2025-03-07 Jakub Jelinek <jakub@redhat.com>
>>>
>>> PR target/118068
>>> gcc/cp/
>>> * cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
>>> cp_walk_tree_without_duplicates.
>>> (cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
>>> into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
>>> stmts call data->pset.add and if it returns true, don't walk subtrees.
>>> (cp_fold_r): Don't call cp_fold_immediate_r here.
>>> (cp_fold_function): For C++20 or later call cp_walk_tree
>>> with cp_fold_immediate_r callback first before calling cp_walk_tree
>>> with cp_fold_r callback and call data.pset.empty () in between.
>>> (cp_fully_fold_init): Likewise.
>>> (cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
>>> if RETURN_EXPR has erroneous argument.
>>> gcc/testsuite/
>>> * g++.target/i386/pr118068.C: New test.
>>>
>>> --- gcc/cp/cp-gimplify.cc.jj 2025-03-06 17:26:00.547975990 +0100
>>> +++ gcc/cp/cp-gimplify.cc 2025-03-07 12:07:41.578078222 +0100
>>> @@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
>>> cp_fold_data data (flags);
>>> int save_errorcount = errorcount;
>>> - tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
>>> + tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
>>> if (errorcount > save_errorcount)
>>> return integer_one_node;
>>> return r;
>>> @@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
>>> return init;
>>> }
>>> -/* A subroutine of cp_fold_r to handle immediate functions. */
>>> +/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
>>> + immediate functions. */
>>> static tree
>>> cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
>>> {
>>> auto data = static_cast<cp_fold_data *>(data_);
>>> tree stmt = *stmt_p;
>>> /* The purpose of this is not to emit errors for mce_unknown. */
>>> const tsubst_flags_t complain = (data->flags & ff_mce_false
>>> ? tf_error : tf_none);
>>> @@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
>>> if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
>>> decl = TREE_OPERAND (stmt, 0);
>>> break;
>>> + case IF_STMT:
>>> + if (IF_STMT_CONSTEVAL_P (stmt))
>>> + {
>>> + if (!data->pset.add (stmt))
>>> + cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
>>> + NULL);
>>> + *walk_subtrees = 0;
>>> + return NULL_TREE;
>>> + }
>>> + /* FALLTHRU */
>>> default:
>>> + if (data->pset.add (stmt))
>>> + *walk_subtrees = 0;
>>> return NULL_TREE;
>>> }
>>> @@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
>>> tree stmt = *stmt_p;
>>> enum tree_code code = TREE_CODE (stmt);
>>> - if (cxx_dialect >= cxx20)
>>> - {
>>> - /* Unfortunately we must handle code like
>>> - false ? bar () : 42
>>> - where we have to check bar too. The cp_fold call below could
>>> - fold the ?: into a constant before we've checked it. */
>>> - if (code == COND_EXPR)
>>> - {
>>> - auto then_fn = cp_fold_r, else_fn = cp_fold_r;
>>> - /* See if we can figure out if either of the branches is dead. If it
>>> - is, we don't need to do everything that cp_fold_r does. */
>>> - cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
>>> - if (integer_zerop (TREE_OPERAND (stmt, 0)))
>>> - then_fn = cp_fold_immediate_r;
>>> - else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
>>> - else_fn = cp_fold_immediate_r;
>>> -
>>> - if (TREE_OPERAND (stmt, 1))
>>> - cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
>>> - nullptr);
>>> - if (TREE_OPERAND (stmt, 2))
>>> - cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
>>> - nullptr);
>>> - *walk_subtrees = 0;
>>> - /* Don't return yet, still need the cp_fold below. */
>>> - }
>>> - else
>>> - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
>>> - }
>>
>> If we're removing the call to cp_fold_immediate_r, we should copy over the
>> early "this affects cp_fold_r" exit from that function that was previously
>> removed as redundant.
>
> You mean
> /* No need to look into types or unevaluated operands.
> NB: This affects cp_fold_r as well. */
> if (TYPE_P (stmt) || unevaluated_p (TREE_CODE (stmt)))
> {
> *walk_subtrees = 0;
> return NULL_TREE;
> }
> ? Note, GCC 13 didn't have anything like that in cp_fold_r.
Hmm, indeed, I was misremembering. I still think it would be good to
add it.
>>> @@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
>>> been constant-evaluated already if possible, so we can safely
>>> pass ff_mce_false. */
>>> cp_fold_data data (ff_genericize | ff_mce_false);
>>> + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
>>> + cp_fold_r, as otherwise expressions using results of immediate functions
>>> + might not be folded as cp_fold is called on those before cp_fold_r is
>>> + called on their argument. And calling cp_fold_immediate_r during
>>> + cp_fold can mean evaluation of the immediate functions many times. */
>>
>> Hmm, fold_cache should prevent multiple evaluation?
>
> So would you prefer something like the patch below instead?
Hmm, no, I think your earlier approach is cleaner. I'd just like to
remove the last sentence of the comment.
> Currently it regresses
> FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++20 (test for excess errors)
> FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++23 (test for excess errors)
> FAIL: g++.dg/cpp2a/consteval-prop15.C -std=gnu++26 (test for excess errors)
> FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++20 (test for warnings, line 10)
> FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++23 (test for warnings, line 10)
> FAIL: g++.dg/cpp2a/consteval-prop2.C -std=c++26 (test for warnings, line 10)
> and fixes the new testcase in the patch.
> The excess errors are
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:104:6: error: call to consteval function 'f9<int>(non_const)' is not a constant expression
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:104:6: error: 'non_const' is not a constant expression
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:106:7: error: call to consteval function 'f10<int>(non_const)' is not a constant expression
> /usr/src/gcc/gcc/testsuite/g++.dg/cpp2a/consteval-prop15.C:106:7: error: 'non_const' is not a constant expression
> And in the other tests the differences on stderr are
> -consteval-prop2.C:10:16: note: ‘constexpr int f(T) [with T = int]’ was promoted to an immediate function because its body contains an immediate-escalating expression ‘id(t)’
> - 10 | return t + id(t); // { dg-message "immediate-escalating expression .id\\(t\\)." }
> - | ~~^~~
> +consteval-prop2.C:14:10: note: ‘constexpr int f(T) [with T = int]’ was promoted to an immediate function
> and
> -consteval-prop2.C:71:21: error: ‘i’ is not a constant expression
> +consteval-prop2.C:71:22: error: ‘i’ is not a constant expression
> + 71 | + k<int>(i); // { dg-bogus "call to|constant" "" { xfail *-*-* } }
> + | ^
>
> 2025-03-17 Jakub Jelinek <jakub@redhat.com>
>
> PR target/118068
> gcc/cp/
> * cp-gimplify.cc (get_fold_cache): Add prototype.
> (cp_fold_immediate_r): For CALL_EXPR or ADDR_EXPR, query fold_cache
> and if stmt is there already and not mapped to self, exit early.
> (cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
> if RETURN_EXPR has erroneous argument.
> (cp_fold) <case ADDR_EXPR>: For C++20 call cp_fold_immediate_r.
> (cp_fold) <case CALL_EXPR>: Likewise.
> gcc/testsuite/
> * g++.target/i386/pr118068.C: New test.
>
> --- gcc/cp/cp-gimplify.cc.jj 2025-03-14 15:31:40.170197353 +0100
> +++ gcc/cp/cp-gimplify.cc 2025-03-17 16:51:14.644040172 +0100
> @@ -1204,6 +1204,8 @@ cp_build_init_expr_for_ctor (tree call,
> return init;
> }
>
> +static hash_map<tree, tree> *&get_fold_cache (fold_flags_t);
> +
> /* A subroutine of cp_fold_r to handle immediate functions. */
>
> static tree
> @@ -1257,6 +1259,22 @@ cp_fold_immediate_r (tree *stmt_p, int *
> if (!decl || TREE_CODE (decl) != FUNCTION_DECL)
> return NULL_TREE;
>
> + /* If this call has been evaluated before by cp_fold and
> + stored into the fold cache, just use earlier computed replacement. */
> + if (code == CALL_EXPR || code == ADDR_EXPR)
> + {
> + auto &fold_cache = get_fold_cache (data->flags);
> + if (fold_cache != NULL)
> + {
> + if (tree *cached = fold_cache->get (stmt))
> + if (*cached != stmt)
> + {
> + *stmt_p = *cached;
> + return NULL_TREE;
> + }
> + }
> + }
> +
> /* Fully escalate once all templates have been instantiated. What we're
> calling is not a consteval function but it may become one. This
> requires recursing; DECL may be promoted to consteval because it
> @@ -1717,6 +1735,11 @@ cp_genericize_r (tree *stmt_p, int *walk
> case RETURN_EXPR:
> if (TREE_OPERAND (stmt, 0))
> {
> + if (error_operand_p (TREE_OPERAND (stmt, 0))
> + && warn_return_type)
> + /* Suppress -Wreturn-type for this function. */
> + suppress_warning (current_function_decl, OPT_Wreturn_type);
> +
> if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
> /* Don't dereference an invisiref RESULT_DECL inside a
> RETURN_EXPR. */
> @@ -3082,6 +3105,19 @@ cp_fold (tree x, fold_flags_t flags)
> goto unary;
>
> case ADDR_EXPR:
> + if (cxx_dialect >= cxx20
> + && !ADDR_EXPR_DENOTES_CALL_P (x)
> + && TREE_CODE (TREE_OPERAND (x, 0)) == FUNCTION_DECL)
> + {
> + int walk_subtrees = 0;
> + cp_fold_data data (flags);
> + cp_fold_immediate_r (&x, &walk_subtrees, &data);
> + if (x != org_x)
> + {
> + x = cp_fold (x, flags);
> + break;
> + }
> + }
> loc = EXPR_LOCATION (x);
> op0 = cp_fold_maybe_rvalue (TREE_OPERAND (x, 0), false, flags);
>
> @@ -3320,6 +3356,17 @@ cp_fold (tree x, fold_flags_t flags)
>
> case CALL_EXPR:
> {
> + if (cxx_dialect >= cxx20)
> + {
> + int walk_subtrees = 0;
> + cp_fold_data data (flags);
> + cp_fold_immediate_r (&x, &walk_subtrees, &data);
> + if (x != org_x)
> + {
> + x = cp_fold (x, flags);
> + break;
> + }
> + }
> tree callee = get_callee_fndecl (x);
>
> /* "Inline" calls to std::move/forward and other cast-like functions
> --- gcc/testsuite/g++.target/i386/pr118068.C.jj 2025-03-07 10:38:15.551662990 +0100
> +++ gcc/testsuite/g++.target/i386/pr118068.C 2025-03-07 10:38:15.551662990 +0100
> @@ -0,0 +1,17 @@
> +// PR target/118068
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O0 -mavx" }
> +
> +typedef float V __attribute__((vector_size (32)));
> +
> +consteval unsigned char
> +foo (int x)
> +{
> + return x;
> +}
> +
> +V
> +bar (V x, V y)
> +{
> + return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
> +}
>
>
> Jakub
>
On Tue, Mar 18, 2025 at 12:26:53PM -0400, Jason Merrill wrote:
> > You mean
> > /* No need to look into types or unevaluated operands.
> > NB: This affects cp_fold_r as well. */
> > if (TYPE_P (stmt) || unevaluated_p (TREE_CODE (stmt)))
> > {
> > *walk_subtrees = 0;
> > return NULL_TREE;
> > }
> > ? Note, GCC 13 didn't have anything like that in cp_fold_r.
>
> Hmm, indeed, I was misremembering. I still think it would be good to add
> it.
Ok. Shall I do it in this patch or incrementally?
All of
/* No need to look into types or unevaluated operands.
NB: This affects cp_fold_r as well. */
if (TYPE_P (stmt)
|| unevaluated_p (code)
/* We do not use in_immediate_context here because it checks
more than is desirable, e.g., sk_template_parms. */
|| cp_unevaluated_operand
|| (current_function_decl
&& DECL_IMMEDIATE_FUNCTION_P (current_function_decl)))
{
*walk_subtrees = 0;
return NULL_TREE;
}
and unconditionally or just for C++ >= 20?
The current behavior before the patch is that because cp_fold_immediate_r
does that, with the exception of COND_EXPRE we do essentially
if (cxx_dialect >= cxx20
&& (TYPE_P (stmt)
|| unevaluated_p (code)
/* We do not use in_immediate_context here because it checks
more than is desirable, e.g., sk_template_parms. */
|| cp_unevaluated_operand
|| (current_function_decl
&& DECL_IMMEDIATE_FUNCTION_P (current_function_decl))))
*walk_subtrees = 0;
(so we clear *walk_subtrees but still continue to call cp_fold on it
even in those cases, and do it only for C++20).
> > > > @@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
> > > > been constant-evaluated already if possible, so we can safely
> > > > pass ff_mce_false. */
> > > > cp_fold_data data (ff_genericize | ff_mce_false);
> > > > + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
> > > > + cp_fold_r, as otherwise expressions using results of immediate functions
> > > > + might not be folded as cp_fold is called on those before cp_fold_r is
> > > > + called on their argument. And calling cp_fold_immediate_r during
> > > > + cp_fold can mean evaluation of the immediate functions many times. */
> > >
> > > Hmm, fold_cache should prevent multiple evaluation?
> >
> > So would you prefer something like the patch below instead?
>
> Hmm, no, I think your earlier approach is cleaner. I'd just like to remove
> the last sentence of the comment.
Patch with just that removed sentence is below.
2025-03-18 Jakub Jelinek <jakub@redhat.com>
PR target/118068
gcc/cp/
* cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
cp_walk_tree_without_duplicates.
(cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
stmts call data->pset.add and if it returns true, don't walk subtrees.
(cp_fold_r): Don't call cp_fold_immediate_r here.
(cp_fold_function): For C++20 or later call cp_walk_tree
with cp_fold_immediate_r callback first before calling cp_walk_tree
with cp_fold_r callback and call data.pset.empty () in between.
(cp_fully_fold_init): Likewise.
(cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
if RETURN_EXPR has erroneous argument.
gcc/testsuite/
* g++.target/i386/pr118068.C: New test.
--- gcc/cp/cp-gimplify.cc.jj 2025-03-06 17:26:00.547975990 +0100
+++ gcc/cp/cp-gimplify.cc 2025-03-07 12:07:41.578078222 +0100
@@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
cp_fold_data data (flags);
int save_errorcount = errorcount;
- tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
+ tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
if (errorcount > save_errorcount)
return integer_one_node;
return r;
@@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
return init;
}
-/* A subroutine of cp_fold_r to handle immediate functions. */
+/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
+ immediate functions. */
static tree
cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
{
auto data = static_cast<cp_fold_data *>(data_);
tree stmt = *stmt_p;
/* The purpose of this is not to emit errors for mce_unknown. */
const tsubst_flags_t complain = (data->flags & ff_mce_false
? tf_error : tf_none);
@@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
decl = TREE_OPERAND (stmt, 0);
break;
+ case IF_STMT:
+ if (IF_STMT_CONSTEVAL_P (stmt))
+ {
+ if (!data->pset.add (stmt))
+ cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
+ NULL);
+ *walk_subtrees = 0;
+ return NULL_TREE;
+ }
+ /* FALLTHRU */
default:
+ if (data->pset.add (stmt))
+ *walk_subtrees = 0;
return NULL_TREE;
}
@@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
tree stmt = *stmt_p;
enum tree_code code = TREE_CODE (stmt);
- if (cxx_dialect >= cxx20)
- {
- /* Unfortunately we must handle code like
- false ? bar () : 42
- where we have to check bar too. The cp_fold call below could
- fold the ?: into a constant before we've checked it. */
- if (code == COND_EXPR)
- {
- auto then_fn = cp_fold_r, else_fn = cp_fold_r;
- /* See if we can figure out if either of the branches is dead. If it
- is, we don't need to do everything that cp_fold_r does. */
- cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
- if (integer_zerop (TREE_OPERAND (stmt, 0)))
- then_fn = cp_fold_immediate_r;
- else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
- else_fn = cp_fold_immediate_r;
-
- if (TREE_OPERAND (stmt, 1))
- cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
- nullptr);
- if (TREE_OPERAND (stmt, 2))
- cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
- nullptr);
- *walk_subtrees = 0;
- /* Don't return yet, still need the cp_fold below. */
- }
- else
- cp_fold_immediate_r (stmt_p, walk_subtrees, data);
- }
-
*stmt_p = stmt = cp_fold (*stmt_p, data->flags);
- /* For certain trees, like +foo(), the cp_fold above will remove the +,
- and the subsequent tree walk would go straight down to the CALL_EXPR's
- operands, meaning that cp_fold_immediate_r would never see the
- CALL_EXPR. Ew :(. */
- if (TREE_CODE (stmt) == CALL_EXPR && code != CALL_EXPR)
- cp_fold_immediate_r (stmt_p, walk_subtrees, data);
-
if (data->pset.add (stmt))
{
/* Don't walk subtrees of stmts we've already walked once, otherwise
@@ -1537,6 +1513,16 @@ cp_fold_function (tree fndecl)
been constant-evaluated already if possible, so we can safely
pass ff_mce_false. */
cp_fold_data data (ff_genericize | ff_mce_false);
+ /* Do cp_fold_immediate_r in separate whole IL walk instead of during
+ cp_fold_r, as otherwise expressions using results of immediate functions
+ might not be folded as cp_fold is called on those before cp_fold_r is
+ called on their argument. */
+ if (cxx_dialect >= cxx20)
+ {
+ cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_immediate_r,
+ &data, NULL);
+ data.pset.empty ();
+ }
cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_r, &data, NULL);
/* This is merely an optimization: if FNDECL has no i-e expressions,
@@ -1717,6 +1703,11 @@ cp_genericize_r (tree *stmt_p, int *walk
case RETURN_EXPR:
if (TREE_OPERAND (stmt, 0))
{
+ if (error_operand_p (TREE_OPERAND (stmt, 0))
+ && warn_return_type)
+ /* Suppress -Wreturn-type for this function. */
+ suppress_warning (current_function_decl, OPT_Wreturn_type);
+
if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
/* Don't dereference an invisiref RESULT_DECL inside a
RETURN_EXPR. */
@@ -2910,6 +2901,11 @@ cp_fully_fold_init (tree x)
return x;
x = cp_fully_fold (x, mce_false);
cp_fold_data data (ff_mce_false);
+ if (cxx_dialect >= cxx20)
+ {
+ cp_walk_tree (&x, cp_fold_immediate_r, &data, NULL);
+ data.pset.empty ();
+ }
cp_walk_tree (&x, cp_fold_r, &data, NULL);
return x;
}
--- gcc/testsuite/g++.target/i386/pr118068.C.jj 2025-03-07 10:38:15.551662990 +0100
+++ gcc/testsuite/g++.target/i386/pr118068.C 2025-03-07 10:38:15.551662990 +0100
@@ -0,0 +1,17 @@
+// PR target/118068
+// { dg-do compile { target c++20 } }
+// { dg-options "-O0 -mavx" }
+
+typedef float V __attribute__((vector_size (32)));
+
+consteval unsigned char
+foo (int x)
+{
+ return x;
+}
+
+V
+bar (V x, V y)
+{
+ return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
+}
Jakub
On 3/18/25 12:48 PM, Jakub Jelinek wrote:
> On Tue, Mar 18, 2025 at 12:26:53PM -0400, Jason Merrill wrote:
>>> You mean
>>> /* No need to look into types or unevaluated operands.
>>> NB: This affects cp_fold_r as well. */
>>> if (TYPE_P (stmt) || unevaluated_p (TREE_CODE (stmt)))
>>> {
>>> *walk_subtrees = 0;
>>> return NULL_TREE;
>>> }
>>> ? Note, GCC 13 didn't have anything like that in cp_fold_r.
>>
>> Hmm, indeed, I was misremembering. I still think it would be good to add
>> it.
>
> Ok. Shall I do it in this patch or incrementally?
Incrementally is fine. And I can take care of it.
>>>>> @@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
>>>>> been constant-evaluated already if possible, so we can safely
>>>>> pass ff_mce_false. */
>>>>> cp_fold_data data (ff_genericize | ff_mce_false);
>>>>> + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
>>>>> + cp_fold_r, as otherwise expressions using results of immediate functions
>>>>> + might not be folded as cp_fold is called on those before cp_fold_r is
>>>>> + called on their argument. And calling cp_fold_immediate_r during
>>>>> + cp_fold can mean evaluation of the immediate functions many times. */
>>>>
>>>> Hmm, fold_cache should prevent multiple evaluation?
>>>
>>> So would you prefer something like the patch below instead?
>>
>> Hmm, no, I think your earlier approach is cleaner. I'd just like to remove
>> the last sentence of the comment.
>
> Patch with just that removed sentence is below.
OK, thanks.
> 2025-03-18 Jakub Jelinek <jakub@redhat.com>
>
> PR target/118068
> gcc/cp/
> * cp-gimplify.cc (cp_fold_immediate): Use cp_walk_tree rather than
> cp_walk_tree_without_duplicates.
> (cp_fold_immediate_r): For IF_STMT_CONSTEVAL_P IF_STMT don't walk
> into THEN_CLAUSE subtree, only ELSE_CLAUSE. For non-call related
> stmts call data->pset.add and if it returns true, don't walk subtrees.
> (cp_fold_r): Don't call cp_fold_immediate_r here.
> (cp_fold_function): For C++20 or later call cp_walk_tree
> with cp_fold_immediate_r callback first before calling cp_walk_tree
> with cp_fold_r callback and call data.pset.empty () in between.
> (cp_fully_fold_init): Likewise.
> (cp_genericize_r) <case RETURN_EXPR>: Suppress -Wreturn-type warning
> if RETURN_EXPR has erroneous argument.
> gcc/testsuite/
> * g++.target/i386/pr118068.C: New test.
>
> --- gcc/cp/cp-gimplify.cc.jj 2025-03-06 17:26:00.547975990 +0100
> +++ gcc/cp/cp-gimplify.cc 2025-03-07 12:07:41.578078222 +0100
> @@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
>
> cp_fold_data data (flags);
> int save_errorcount = errorcount;
> - tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
> + tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
> if (errorcount > save_errorcount)
> return integer_one_node;
> return r;
> @@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
> return init;
> }
>
> -/* A subroutine of cp_fold_r to handle immediate functions. */
> +/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
> + immediate functions. */
>
> static tree
> cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
> {
> auto data = static_cast<cp_fold_data *>(data_);
> tree stmt = *stmt_p;
> /* The purpose of this is not to emit errors for mce_unknown. */
> const tsubst_flags_t complain = (data->flags & ff_mce_false
> ? tf_error : tf_none);
> @@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
> if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
> decl = TREE_OPERAND (stmt, 0);
> break;
> + case IF_STMT:
> + if (IF_STMT_CONSTEVAL_P (stmt))
> + {
> + if (!data->pset.add (stmt))
> + cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
> + NULL);
> + *walk_subtrees = 0;
> + return NULL_TREE;
> + }
> + /* FALLTHRU */
> default:
> + if (data->pset.add (stmt))
> + *walk_subtrees = 0;
> return NULL_TREE;
> }
>
> @@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
> tree stmt = *stmt_p;
> enum tree_code code = TREE_CODE (stmt);
>
> - if (cxx_dialect >= cxx20)
> - {
> - /* Unfortunately we must handle code like
> - false ? bar () : 42
> - where we have to check bar too. The cp_fold call below could
> - fold the ?: into a constant before we've checked it. */
> - if (code == COND_EXPR)
> - {
> - auto then_fn = cp_fold_r, else_fn = cp_fold_r;
> - /* See if we can figure out if either of the branches is dead. If it
> - is, we don't need to do everything that cp_fold_r does. */
> - cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
> - if (integer_zerop (TREE_OPERAND (stmt, 0)))
> - then_fn = cp_fold_immediate_r;
> - else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
> - else_fn = cp_fold_immediate_r;
> -
> - if (TREE_OPERAND (stmt, 1))
> - cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
> - nullptr);
> - if (TREE_OPERAND (stmt, 2))
> - cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
> - nullptr);
> - *walk_subtrees = 0;
> - /* Don't return yet, still need the cp_fold below. */
> - }
> - else
> - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> - }
> -
> *stmt_p = stmt = cp_fold (*stmt_p, data->flags);
>
> - /* For certain trees, like +foo(), the cp_fold above will remove the +,
> - and the subsequent tree walk would go straight down to the CALL_EXPR's
> - operands, meaning that cp_fold_immediate_r would never see the
> - CALL_EXPR. Ew :(. */
> - if (TREE_CODE (stmt) == CALL_EXPR && code != CALL_EXPR)
> - cp_fold_immediate_r (stmt_p, walk_subtrees, data);
> -
> if (data->pset.add (stmt))
> {
> /* Don't walk subtrees of stmts we've already walked once, otherwise
> @@ -1537,6 +1513,16 @@ cp_fold_function (tree fndecl)
> been constant-evaluated already if possible, so we can safely
> pass ff_mce_false. */
> cp_fold_data data (ff_genericize | ff_mce_false);
> + /* Do cp_fold_immediate_r in separate whole IL walk instead of during
> + cp_fold_r, as otherwise expressions using results of immediate functions
> + might not be folded as cp_fold is called on those before cp_fold_r is
> + called on their argument. */
> + if (cxx_dialect >= cxx20)
> + {
> + cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_immediate_r,
> + &data, NULL);
> + data.pset.empty ();
> + }
> cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_r, &data, NULL);
>
> /* This is merely an optimization: if FNDECL has no i-e expressions,
> @@ -1717,6 +1703,11 @@ cp_genericize_r (tree *stmt_p, int *walk
> case RETURN_EXPR:
> if (TREE_OPERAND (stmt, 0))
> {
> + if (error_operand_p (TREE_OPERAND (stmt, 0))
> + && warn_return_type)
> + /* Suppress -Wreturn-type for this function. */
> + suppress_warning (current_function_decl, OPT_Wreturn_type);
> +
> if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
> /* Don't dereference an invisiref RESULT_DECL inside a
> RETURN_EXPR. */
> @@ -2910,6 +2901,11 @@ cp_fully_fold_init (tree x)
> return x;
> x = cp_fully_fold (x, mce_false);
> cp_fold_data data (ff_mce_false);
> + if (cxx_dialect >= cxx20)
> + {
> + cp_walk_tree (&x, cp_fold_immediate_r, &data, NULL);
> + data.pset.empty ();
> + }
> cp_walk_tree (&x, cp_fold_r, &data, NULL);
> return x;
> }
> --- gcc/testsuite/g++.target/i386/pr118068.C.jj 2025-03-07 10:38:15.551662990 +0100
> +++ gcc/testsuite/g++.target/i386/pr118068.C 2025-03-07 10:38:15.551662990 +0100
> @@ -0,0 +1,17 @@
> +// PR target/118068
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O0 -mavx" }
> +
> +typedef float V __attribute__((vector_size (32)));
> +
> +consteval unsigned char
> +foo (int x)
> +{
> + return x;
> +}
> +
> +V
> +bar (V x, V y)
> +{
> + return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
> +}
>
>
> Jakub
>
@@ -519,7 +519,7 @@ cp_fold_immediate (tree *tp, mce_value m
cp_fold_data data (flags);
int save_errorcount = errorcount;
- tree r = cp_walk_tree_without_duplicates (tp, cp_fold_immediate_r, &data);
+ tree r = cp_walk_tree (tp, cp_fold_immediate_r, &data, NULL);
if (errorcount > save_errorcount)
return integer_one_node;
return r;
@@ -1204,13 +1204,14 @@ cp_build_init_expr_for_ctor (tree call,
return init;
}
-/* A subroutine of cp_fold_r to handle immediate functions. */
+/* A walk_tree callback for cp_fold_function and cp_fully_fold_init to handle
+ immediate functions. */
static tree
cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, void *data_)
{
auto data = static_cast<cp_fold_data *>(data_);
tree stmt = *stmt_p;
/* The purpose of this is not to emit errors for mce_unknown. */
const tsubst_flags_t complain = (data->flags & ff_mce_false
? tf_error : tf_none);
@@ -1250,7 +1251,19 @@ cp_fold_immediate_r (tree *stmt_p, int *
if (!ADDR_EXPR_DENOTES_CALL_P (stmt))
decl = TREE_OPERAND (stmt, 0);
break;
+ case IF_STMT:
+ if (IF_STMT_CONSTEVAL_P (stmt))
+ {
+ if (!data->pset.add (stmt))
+ cp_walk_tree (&ELSE_CLAUSE (stmt), cp_fold_immediate_r, data_,
+ NULL);
+ *walk_subtrees = 0;
+ return NULL_TREE;
+ }
+ /* FALLTHRU */
default:
+ if (data->pset.add (stmt))
+ *walk_subtrees = 0;
return NULL_TREE;
}
@@ -1370,45 +1383,8 @@ cp_fold_r (tree *stmt_p, int *walk_subtr
tree stmt = *stmt_p;
enum tree_code code = TREE_CODE (stmt);
- if (cxx_dialect >= cxx20)
- {
- /* Unfortunately we must handle code like
- false ? bar () : 42
- where we have to check bar too. The cp_fold call below could
- fold the ?: into a constant before we've checked it. */
- if (code == COND_EXPR)
- {
- auto then_fn = cp_fold_r, else_fn = cp_fold_r;
- /* See if we can figure out if either of the branches is dead. If it
- is, we don't need to do everything that cp_fold_r does. */
- cp_walk_tree (&TREE_OPERAND (stmt, 0), cp_fold_r, data, nullptr);
- if (integer_zerop (TREE_OPERAND (stmt, 0)))
- then_fn = cp_fold_immediate_r;
- else if (integer_nonzerop (TREE_OPERAND (stmt, 0)))
- else_fn = cp_fold_immediate_r;
-
- if (TREE_OPERAND (stmt, 1))
- cp_walk_tree (&TREE_OPERAND (stmt, 1), then_fn, data,
- nullptr);
- if (TREE_OPERAND (stmt, 2))
- cp_walk_tree (&TREE_OPERAND (stmt, 2), else_fn, data,
- nullptr);
- *walk_subtrees = 0;
- /* Don't return yet, still need the cp_fold below. */
- }
- else
- cp_fold_immediate_r (stmt_p, walk_subtrees, data);
- }
-
*stmt_p = stmt = cp_fold (*stmt_p, data->flags);
- /* For certain trees, like +foo(), the cp_fold above will remove the +,
- and the subsequent tree walk would go straight down to the CALL_EXPR's
- operands, meaning that cp_fold_immediate_r would never see the
- CALL_EXPR. Ew :(. */
- if (TREE_CODE (stmt) == CALL_EXPR && code != CALL_EXPR)
- cp_fold_immediate_r (stmt_p, walk_subtrees, data);
-
if (data->pset.add (stmt))
{
/* Don't walk subtrees of stmts we've already walked once, otherwise
@@ -1537,6 +1513,17 @@ cp_fold_function (tree fndecl)
been constant-evaluated already if possible, so we can safely
pass ff_mce_false. */
cp_fold_data data (ff_genericize | ff_mce_false);
+ /* Do cp_fold_immediate_r in separate whole IL walk instead of during
+ cp_fold_r, as otherwise expressions using results of immediate functions
+ might not be folded as cp_fold is called on those before cp_fold_r is
+ called on their argument. And calling cp_fold_immediate_r during
+ cp_fold can mean evaluation of the immediate functions many times. */
+ if (cxx_dialect >= cxx20)
+ {
+ cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_immediate_r,
+ &data, NULL);
+ data.pset.empty ();
+ }
cp_walk_tree (&DECL_SAVED_TREE (fndecl), cp_fold_r, &data, NULL);
/* This is merely an optimization: if FNDECL has no i-e expressions,
@@ -1717,6 +1704,11 @@ cp_genericize_r (tree *stmt_p, int *walk
case RETURN_EXPR:
if (TREE_OPERAND (stmt, 0))
{
+ if (error_operand_p (TREE_OPERAND (stmt, 0))
+ && warn_return_type)
+ /* Suppress -Wreturn-type for this function. */
+ suppress_warning (current_function_decl, OPT_Wreturn_type);
+
if (is_invisiref_parm (TREE_OPERAND (stmt, 0)))
/* Don't dereference an invisiref RESULT_DECL inside a
RETURN_EXPR. */
@@ -2910,6 +2902,11 @@ cp_fully_fold_init (tree x)
return x;
x = cp_fully_fold (x, mce_false);
cp_fold_data data (ff_mce_false);
+ if (cxx_dialect >= cxx20)
+ {
+ cp_walk_tree (&x, cp_fold_immediate_r, &data, NULL);
+ data.pset.empty ();
+ }
cp_walk_tree (&x, cp_fold_r, &data, NULL);
return x;
}
@@ -0,0 +1,17 @@
+// PR target/118068
+// { dg-do compile { target c++20 } }
+// { dg-options "-O0 -mavx" }
+
+typedef float V __attribute__((vector_size (32)));
+
+consteval unsigned char
+foo (int x)
+{
+ return x;
+}
+
+V
+bar (V x, V y)
+{
+ return __builtin_ia32_blendps256 (x, y, (int) foo (0x23));
+}