c, ubsan: Instrument even shortened divisions [PR109151]
Commit Message
Hi!
On the following testcase, the C FE decides to shorten the division because
it has a guarantee that INT_MIN / -1 division won't be encountered: the
first operand is widened from a narrower unsigned type and/or the second
operand is a constant other than all ones (in this case both are true).
The problem is that the narrower type in this case is _Bool, and
ubsan_instrument_division only instruments the division if op0's type is
INTEGER_TYPE or REAL_TYPE. Strangely, this doesn't happen in the C++ FE.
Anyway, we only shorten divisions if the INT_MIN / -1 case is impossible,
so I think we should be fine even with -fstrict-enums in C++ in case it
shortens to an ENUMERAL_TYPE.

The following patch just instruments those on the ubsan_instrument_division
side. Perhaps only the first hunk and the testcase are needed, because
we shouldn't shorten if the other case (INT_MIN / -1) could be triggered.

Bootstrapped/regtested on x86_64-linux and i686-linux. OK for trunk,
or shall I omit the second hunk?

2023-03-17  Jakub Jelinek  <jakub@redhat.com>

	PR c/109151
	* c-ubsan.cc (ubsan_instrument_division): Handle all scalar integral
	types rather than just INTEGER_TYPE.

	* c-c++-common/ubsan/div-by-zero-8.c: New test.

Jakub
Comments
On Fri, 17 Mar 2023, Jakub Jelinek wrote:
> Hi!
>
> On the following testcase, the C FE decides to shorten the division because
> it has a guarantee that INT_MIN / -1 division won't be encountered, the
> first operand is widened from narrower unsigned and/or the second operand is
> a constant other than all ones (in this case both are true).
> The problem is that the narrower type in this case is _Bool and
> ubsan_instrument_division only instruments it if op0's type is INTEGER_TYPE
> or REAL_TYPE. Strangely this doesn't happen in C++ FE.
> Anyway, we only shorten divisions if the INT_MIN / -1 case is impossible,
> so I think we should be fine even with -fstrict-enums in C++ in case it
> shortened to ENUMERAL_TYPEs.
>
> The following patch just instruments those on the ubsan_instrument_division
> side. Perhaps only the first hunk and testcase might be needed because
> we shouldn't shorten if the other case could be triggered.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
> or shall I omit the second hunk?
LGTM as-is
Richard.
> 2023-03-17 Jakub Jelinek <jakub@redhat.com>
>
> PR c/109151
> * c-ubsan.cc (ubsan_instrument_division): Handle all scalar integral
> types rather than just INTEGER_TYPE.
>
> * c-c++-common/ubsan/div-by-zero-8.c: New test.
>
> --- gcc/c-family/c-ubsan.cc.jj 2023-02-28 11:38:28.965868044 +0100
> +++ gcc/c-family/c-ubsan.cc 2023-03-16 09:49:51.651126302 +0100
> @@ -53,7 +53,7 @@ ubsan_instrument_division (location_t lo
> op0 = unshare_expr (op0);
> op1 = unshare_expr (op1);
>
> - if (TREE_CODE (type) == INTEGER_TYPE
> + if (INTEGRAL_TYPE_P (type)
> && sanitize_flags_p (SANITIZE_DIVIDE))
> t = fold_build2 (EQ_EXPR, boolean_type_node,
> op1, build_int_cst (type, 0));
> @@ -68,7 +68,7 @@ ubsan_instrument_division (location_t lo
> t = NULL_TREE;
>
> /* We check INT_MIN / -1 only for signed types. */
> - if (TREE_CODE (type) == INTEGER_TYPE
> + if (INTEGRAL_TYPE_P (type)
> && sanitize_flags_p (SANITIZE_SI_OVERFLOW)
> && !TYPE_UNSIGNED (type))
> {
> --- gcc/testsuite/c-c++-common/ubsan/div-by-zero-8.c.jj 2023-03-16 10:01:31.626824994 +0100
> +++ gcc/testsuite/c-c++-common/ubsan/div-by-zero-8.c 2023-03-16 10:03:05.510443440 +0100
> @@ -0,0 +1,14 @@
> +/* PR c/109151 */
> +/* { dg-do run } */
> +/* { dg-options "-fsanitize=integer-divide-by-zero -Wno-div-by-zero -fno-sanitize-recover=integer-divide-by-zero" } */
> +/* { dg-shouldfail "ubsan" } */
> +
> +int d;
> +
> +int
> +main ()
> +{
> + d = ((short) (d == 1 | d > 9)) / 0;
> +}
> +
> +/* { dg-output "division by zero" } */
>
> Jakub
>
>
On Fri, Mar 17, 2023 at 09:14:04AM +0100, Jakub Jelinek wrote:
> Hi!
>
> On the following testcase, the C FE decides to shorten the division because
> it has a guarantee that INT_MIN / -1 division won't be encountered, the
> first operand is widened from narrower unsigned and/or the second operand is
> a constant other than all ones (in this case both are true).
> The problem is that the narrower type in this case is _Bool and
> ubsan_instrument_division only instruments it if op0's type is INTEGER_TYPE
> or REAL_TYPE. Strangely this doesn't happen in C++ FE.
I was curious. The difference is because C++ passes this
(gdb) pge op0
(int) (short int) (VIEW_CONVERT_EXPR<int>(d) == 1 | VIEW_CONVERT_EXPR<int>(d) > 9)
to shorten_binary_op while C passes:
(gdb) pge op0
(int) (<<< Unknown tree: c_maybe_const_expr
d >>> == 1 || <<< Unknown tree: c_maybe_const_expr
d >>> > 9)
so once the '(int)' cast is removed, the types underneath differ: short in
C++, bool in C.
In C, the BIT_IOR_EXPR -> TRUTH_OR_EXPR change happens because we call
c_convert -> convert_to_integer -> do_narrow.
In C++, we never call do_narrow, so shorten_binary_op gets the original tree.
Anyway, thanks for the patch.
Marek
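--- gcc/c-family/c-ubsan.cc.jj 2023-02-28 11:38:28.965868044 +0100
+++ gcc/c-family/c-ubsan.cc 2023-03-16 09:49:51.651126302 +0100
@@ -53,7 +53,7 @@ ubsan_instrument_division (location_t lo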
op0 = unshare_expr (op0);
op1 = unshare_expr (op1);
- if (TREE_CODE (type) == INTEGER_TYPE
+ if (INTEGRAL_TYPE_P (type)
&& sanitize_flags_p (SANITIZE_DIVIDE))
t = fold_build2 (EQ_EXPR, boolean_type_node,
op1, build_int_cst (type, 0));
@@ -68,7 +68,7 @@ ubsan_instrument_division (location_t lo
t = NULL_TREE;
/* We check INT_MIN / -1 only for signed types. */
- if (TREE_CODE (type) == INTEGER_TYPE
+ if (INTEGRAL_TYPE_P (type)
&& sanitize_flags_p (SANITIZE_SI_OVERFLOW)
&& !TYPE_UNSIGNED (type))
{
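--- gcc/testsuite/c-c++-common/ubsan/div-by-zero-8.c.jj 2023-03-16 10:01:31.626824994 +0100
+++ gcc/testsuite/c-c++-common/ubsan/div-by-zero-8.c 2023-03-16 10:03:05.510443440 +0100
@@ -0,0 +1,14 @@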
+/* PR c/109151 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=integer-divide-by-zero -Wno-div-by-zero -fno-sanitize-recover=integer-divide-by-zero" } */
+/* { dg-shouldfail "ubsan" } */
+
+int d;
+
+int
+main ()
+{
+ d = ((short) (d == 1 | d > 9)) / 0;
+}
+
+/* { dg-output "division by zero" } */