[v3,1/2] soft-fp: Use temporary variable in FP_FRAC_SUB_3/FP_FRAC_SUB_4

Message ID bb315fd23d73811b8a391f9fe63dc83f36fa5880.1541048728.git.zongbox@gmail.com
State New, archived
Headers

Commit Message

Zong Li Nov. 1, 2018, 5:10 a.m. UTC
  In FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R,, X, Y), it reference both
the X[N] and X[N] after R[N] have been set. If one of the X and Y is
the same address with R, the result of the calculation is wrong,
because the value of the original X and Y are overwritten.

In glibc, there are two places use FRAC_SUB and occurs the overlap.
The first is _FP_DIV_MEAT_N_loop in op-common.h, it uses the source
_FP_DIV_MEAT_N_loop_u as the destination. This macro only be used
when N is one(_FP_DIV_MEAT_1_loop) and then the _FP_FRAC_SUB_##wc
extend to _FP_FRAC_SUB_1 in this macro. so it also work because
_FP_FRAC_SUB_1 has no overlap problem in its implementation.
The second places is _FP_DIV_MEAT_4_udiv, the original value of X##_f[0]
is overwritten before the calculatation.

In FRAC_SUB_1 and FRAC_SUB_2, there don't refer the source after
destination have been set, so they have no problem.

After this modification, we can pass the soft floating testing of glibc
testsuites on RV32.

	* soft-fp/op-4.h (_FP_FRAC_SUB_3, _FP_FRAC_SUB_4): Use temporary
	variable to avoid overlap arguments.
---
 ChangeLog      |  5 +++++
 soft-fp/op-4.h | 63 ++++++++++++++++++++++++++++++++--------------------------
 2 files changed, 40 insertions(+), 28 deletions(-)
  

Comments

Joseph Myers Nov. 1, 2018, 5:41 p.m. UTC | #1
On Thu, 1 Nov 2018, Zong Li wrote:

> In FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R,, X, Y), it reference both
> the X[N] and X[N] after R[N] have been set. If one of the X and Y is
> the same address with R, the result of the calculation is wrong,
> because the value of the original X and Y are overwritten.
> 
> In glibc, there are two places use FRAC_SUB and occurs the overlap.
> The first is _FP_DIV_MEAT_N_loop in op-common.h, it uses the source
> _FP_DIV_MEAT_N_loop_u as the destination. This macro only be used
> when N is one(_FP_DIV_MEAT_1_loop) and then the _FP_FRAC_SUB_##wc
> extend to _FP_FRAC_SUB_1 in this macro. so it also work because
> _FP_FRAC_SUB_1 has no overlap problem in its implementation.
> The second places is _FP_DIV_MEAT_4_udiv, the original value of X##_f[0]
> is overwritten before the calculatation.
> 
> In FRAC_SUB_1 and FRAC_SUB_2, there don't refer the source after
> destination have been set, so they have no problem.

Thanks, I've committed this patch.

(The #if 0 version of __FP_FRAC_SUB_2 would violate sequence point rules 
if rl and xl were the same - "((rl = xl - yl) > xl)" is not valid in that 
case - but as it's #if 0 that's not a particular concern.  The generic C 
version of sub_ddmmss seems to get this right, and the assembly versions 
mostly use earlyclobbers to deal with this.)
  
Zong Li Nov. 2, 2018, 8:56 a.m. UTC | #2
Joseph Myers <joseph@codesourcery.com> 於 2018年11月2日 週五 上午1:41寫道:
>
> On Thu, 1 Nov 2018, Zong Li wrote:
>
> > In FRAC_SUB_3(R, X, Y) and FRAC_SUB_4(R,, X, Y), it reference both
> > the X[N] and X[N] after R[N] have been set. If one of the X and Y is
> > the same address with R, the result of the calculation is wrong,
> > because the value of the original X and Y are overwritten.
> >
> > In glibc, there are two places use FRAC_SUB and occurs the overlap.
> > The first is _FP_DIV_MEAT_N_loop in op-common.h, it uses the source
> > _FP_DIV_MEAT_N_loop_u as the destination. This macro only be used
> > when N is one(_FP_DIV_MEAT_1_loop) and then the _FP_FRAC_SUB_##wc
> > extend to _FP_FRAC_SUB_1 in this macro. so it also work because
> > _FP_FRAC_SUB_1 has no overlap problem in its implementation.
> > The second places is _FP_DIV_MEAT_4_udiv, the original value of X##_f[0]
> > is overwritten before the calculatation.
> >
> > In FRAC_SUB_1 and FRAC_SUB_2, there don't refer the source after
> > destination have been set, so they have no problem.
>
> Thanks, I've committed this patch.
>
> (The #if 0 version of __FP_FRAC_SUB_2 would violate sequence point rules
> if rl and xl were the same - "((rl = xl - yl) > xl)" is not valid in that
> case - but as it's #if 0 that's not a particular concern.  The generic C
> version of sub_ddmmss seems to get this right, and the assembly versions
> mostly use earlyclobbers to deal with this.)
>

Ok, I see. Sorry for being naive to ask these but I am a little bit
curious about why the #if 0 version is present in glibc?
and what is the time would use it?
  
Joseph Myers Nov. 2, 2018, 1:18 p.m. UTC | #3
On Fri, 2 Nov 2018, Zong Li wrote:

> Ok, I see. Sorry for being naive to ask these but I am a little bit
> curious about why the #if 0 version is present in glibc?
> and what is the time would use it?

You would have to research the state of the code when first added to glibc 
and the Linux kernel, remembering that there may well have been no 
community discussion of it at the time.
  

Patch

diff --git a/ChangeLog b/ChangeLog
index b798b63..0cced2b 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@ 
+2018-10-31  Zong Li  <zong@andestech.com>
+
+	* soft-fp/op-4.h (_FP_FRAC_SUB_3, _FP_FRAC_SUB_4): Use temporary
+	variable to avoid overlap arguments.
+
 2018-10-31  Samuel Thibault  <samuel.thibault@ens-lyon.org>
 
 	* manual/errno.texi (EIEIO): Document how translators should
diff --git a/soft-fp/op-4.h b/soft-fp/op-4.h
index 01b87d0..b429801 100644
--- a/soft-fp/op-4.h
+++ b/soft-fp/op-4.h
@@ -696,39 +696,46 @@ 
 #endif
 
 #ifndef __FP_FRAC_SUB_3
-# define __FP_FRAC_SUB_3(r2, r1, r0, x2, x1, x0, y2, y1, y0)	\
-  do								\
-    {								\
-      _FP_W_TYPE __FP_FRAC_SUB_3_c1, __FP_FRAC_SUB_3_c2;	\
-      r0 = x0 - y0;						\
-      __FP_FRAC_SUB_3_c1 = r0 > x0;				\
-      r1 = x1 - y1;						\
-      __FP_FRAC_SUB_3_c2 = r1 > x1;				\
-      r1 -= __FP_FRAC_SUB_3_c1;					\
-      __FP_FRAC_SUB_3_c2 |= __FP_FRAC_SUB_3_c1 && (y1 == x1);	\
-      r2 = x2 - y2 - __FP_FRAC_SUB_3_c2;			\
-    }								\
+# define __FP_FRAC_SUB_3(r2, r1, r0, x2, x1, x0, y2, y1, y0)    \
+  do                                                            \
+    {                                                           \
+      _FP_W_TYPE __FP_FRAC_SUB_3_tmp[2];                        \
+      _FP_W_TYPE __FP_FRAC_SUB_3_c1, __FP_FRAC_SUB_3_c2;        \
+      __FP_FRAC_SUB_3_tmp[0] = x0 - y0;                         \
+      __FP_FRAC_SUB_3_c1 = __FP_FRAC_SUB_3_tmp[0] > x0;         \
+      __FP_FRAC_SUB_3_tmp[1] = x1 - y1;                         \
+      __FP_FRAC_SUB_3_c2 = __FP_FRAC_SUB_3_tmp[1] > x1;         \
+      __FP_FRAC_SUB_3_tmp[1] -= __FP_FRAC_SUB_3_c1;             \
+      __FP_FRAC_SUB_3_c2 |= __FP_FRAC_SUB_3_c1 && (y1 == x1);   \
+      r2 = x2 - y2 - __FP_FRAC_SUB_3_c2;                        \
+      r1 = __FP_FRAC_SUB_3_tmp[1];                              \
+      r0 = __FP_FRAC_SUB_3_tmp[0];                              \
+    }                                                           \
   while (0)
 #endif
 
 #ifndef __FP_FRAC_SUB_4
 # define __FP_FRAC_SUB_4(r3, r2, r1, r0, x3, x2, x1, x0, y3, y2, y1, y0) \
-  do									\
-    {									\
-      _FP_W_TYPE __FP_FRAC_SUB_4_c1, __FP_FRAC_SUB_4_c2;		\
-      _FP_W_TYPE __FP_FRAC_SUB_4_c3;					\
-      r0 = x0 - y0;							\
-      __FP_FRAC_SUB_4_c1 = r0 > x0;					\
-      r1 = x1 - y1;							\
-      __FP_FRAC_SUB_4_c2 = r1 > x1;					\
-      r1 -= __FP_FRAC_SUB_4_c1;						\
-      __FP_FRAC_SUB_4_c2 |= __FP_FRAC_SUB_4_c1 && (y1 == x1);		\
-      r2 = x2 - y2;							\
-      __FP_FRAC_SUB_4_c3 = r2 > x2;					\
-      r2 -= __FP_FRAC_SUB_4_c2;						\
-      __FP_FRAC_SUB_4_c3 |= __FP_FRAC_SUB_4_c2 && (y2 == x2);		\
-      r3 = x3 - y3 - __FP_FRAC_SUB_4_c3;				\
-    }									\
+  do                                                                     \
+    {                                                                    \
+      _FP_W_TYPE __FP_FRAC_SUB_4_tmp[3];                                 \
+      _FP_W_TYPE __FP_FRAC_SUB_4_c1, __FP_FRAC_SUB_4_c2;                 \
+      _FP_W_TYPE __FP_FRAC_SUB_4_c3;                                     \
+      __FP_FRAC_SUB_4_tmp[0] = x0 - y0;                                  \
+      __FP_FRAC_SUB_4_c1 = __FP_FRAC_SUB_4_tmp[0] > x0;                  \
+      __FP_FRAC_SUB_4_tmp[1] = x1 - y1;                                  \
+      __FP_FRAC_SUB_4_c2 = __FP_FRAC_SUB_4_tmp[1] > x1;                  \
+      __FP_FRAC_SUB_4_tmp[1] -= __FP_FRAC_SUB_4_c1;                      \
+      __FP_FRAC_SUB_4_c2 |= __FP_FRAC_SUB_4_c1 && (y1 == x1);            \
+      __FP_FRAC_SUB_4_tmp[2] = x2 - y2;                                  \
+      __FP_FRAC_SUB_4_c3 = __FP_FRAC_SUB_4_tmp[2] > x2;                  \
+      __FP_FRAC_SUB_4_tmp[2] -= __FP_FRAC_SUB_4_c2;                      \
+      __FP_FRAC_SUB_4_c3 |= __FP_FRAC_SUB_4_c2 && (y2 == x2);            \
+      r3 = x3 - y3 - __FP_FRAC_SUB_4_c3;                                 \
+      r2 = __FP_FRAC_SUB_4_tmp[2];                                       \
+      r1 = __FP_FRAC_SUB_4_tmp[1];                                       \
+      r0 = __FP_FRAC_SUB_4_tmp[0];                                       \
+    }                                                                    \
   while (0)
 #endif