[v2,10/12] soft-fp: Add the lack of implementation for 128 bit
Commit Message
Add the definition of macros for 8 bytes length. I only add the
lack of implementation when building the RISC-V 32 bit port.
---
soft-fp/op-8.h | 176 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 176 insertions(+)
Comments
On 07/16/2018 10:07 PM, Zong Li wrote:
> +#define _FP_FRAC_ADD_8(R, X, Y) \
> + __FP_FRAC_ADD_8 (R##_f[7], R##_f[6], R##_f[5], R##_f[4], \
> + R##_f[3], R##_f[2], R##_f[1], R##_f[0], \
> + X##_f[7], X##_f[6], X##_f[5], X##_f[4], \
> + X##_f[3], X##_f[2], X##_f[1], X##_f[0], \
> + Y##_f[7], Y##_f[6], Y##_f[5], Y##_f[4], \
> + Y##_f[3], Y##_f[2], Y##_f[1], Y##_f[0])
I don't think there's a need for the intermediate macro.
I think it would be better to use a loop for these. E.g.
#define _FP_FRAC_ADD_8(R, X, Y) \
do \
{ \
_FP_W_TYPE fa8_c = 0; \
for (int fa8_i = 0; fa8_i < 8; ++fa8_i) \
{ \
_FP_W_TYPE fa8_x = X##_f[fa8_i]; \
_FP_W_TYPE fa8_r = fa8_x + Y##_f[fa8_i] + fa8_c; \
fa8_c = (fa8_c ? fa8_r <= fa8_x : fa8_r < fa8_x); \
} \
} \
while (0)
> +#define _FP_FRAC_SUB_8(R, X, Y) \
Likewise.
> +#define _FP_FRAC_CLZ_8(R, X) \
Likewise.
Otherwise, LGTM.
r~
On Tue, 17 Jul 2018, Zong Li wrote:
> Add the definition of macros for 8 bytes length. I only add the
> lack of implementation when building the RISC-V 32 bit port.
Please use *self-contained* commit messages (= patch submission messages)
with all relevant information from prior discussions.
My understanding from discussions of the previous patch series version is
that this issue is something to do with the combination of soft-fp fma,
ldbl-128 and 32-bit _FP_W_TYPE_SIZE, where every pair of those three is
already used by some glibc port but none uses all three together. So you
need to explain that properly in your commit message.
In my view, such patches that are not logically part of a port should be
posted separately from any patch series for the port, and before that
patch series, with the patch series then being noted as depending on them
(if it does).
I think consideration of (a revised version of) this patch is most
appropriately deferred until 2.29 (and probably likewise the new port
itself, given the associated risks and the lack of posting before the
freeze - posting before the freeze being particularly important for any
architecture-independent changes involved in a new port).
Richard Henderson <rth@twiddle.net> 於 2018年7月18日 週三 上午2:20寫道:
>
> On 07/16/2018 10:07 PM, Zong Li wrote:
> > +#define _FP_FRAC_ADD_8(R, X, Y) \
> > + __FP_FRAC_ADD_8 (R##_f[7], R##_f[6], R##_f[5], R##_f[4], \
> > + R##_f[3], R##_f[2], R##_f[1], R##_f[0], \
> > + X##_f[7], X##_f[6], X##_f[5], X##_f[4], \
> > + X##_f[3], X##_f[2], X##_f[1], X##_f[0], \
> > + Y##_f[7], Y##_f[6], Y##_f[5], Y##_f[4], \
> > + Y##_f[3], Y##_f[2], Y##_f[1], Y##_f[0])
>
> I don't think there's a need for the intermediate macro.
> I think it would be better to use a loop for these. E.g.
>
> #define _FP_FRAC_ADD_8(R, X, Y) \
> do \
> { \
> _FP_W_TYPE fa8_c = 0; \
> for (int fa8_i = 0; fa8_i < 8; ++fa8_i) \
> { \
> _FP_W_TYPE fa8_x = X##_f[fa8_i]; \
> _FP_W_TYPE fa8_r = fa8_x + Y##_f[fa8_i] + fa8_c; \
> fa8_c = (fa8_c ? fa8_r <= fa8_x : fa8_r < fa8_x); \
> } \
> } \
> while (0)
>
>
> > +#define _FP_FRAC_SUB_8(R, X, Y) \
>
> Likewise.
>
> > +#define _FP_FRAC_CLZ_8(R, X) \
>
> Likewise.
>
> Otherwise, LGTM.
>
>
> r~
I use the intermediate macro by referring to the implementation in op-4.h
I will modify that on next version.
Thanks.
Joseph Myers <joseph@codesourcery.com> 於 2018年7月18日 週三 上午6:09寫道:
>
> On Tue, 17 Jul 2018, Zong Li wrote:
>
> > Add the definition of macros for 8 bytes length. I only add the
> > lack of implementation when building the RISC-V 32 bit port.
>
> Please use *self-contained* commit messages (= patch submission messages)
> with all relevant information from prior discussions.
>
> My understanding from discussions of the previous patch series version is
> that this issue is something to do with the combination of soft-fp fma,
> ldbl-128 and 32-bit _FP_W_TYPE_SIZE, where every pair of those three is
> already used by some glibc port but none uses all three together. So you
> need to explain that properly in your commit message.
>
OK. I will give more description in commit message and add the
*self-contained* in the submission message.
> In my view, such patches that are not logically part of a port should be
> posted separately from any patch series for the port, and before that
> patch series, with the patch series then being noted as depending on them
> (if it does).
>
OK. I will separate the patch from these patch set.
> I think consideration of (a revised version of) this patch is most
> appropriately deferred until 2.29 (and probably likewise the new port
> itself, given the associated risks and the lack of posting before the
> freeze - posting before the freeze being particularly important for any
> architecture-independent changes involved in a new port).
I know this concern and mechanism in glibc. Is there a chance to
include this modification at this moment? Because this new port
is the only one port to use it.
> --
> Joseph S. Myers
> joseph@codesourcery.com
On Wed, 18 Jul 2018, Zong Li wrote:
> > I think consideration of (a revised version of) this patch is most
> > appropriately deferred until 2.29 (and probably likewise the new port
> > itself, given the associated risks and the lack of posting before the
> > freeze - posting before the freeze being particularly important for any
> > architecture-independent changes involved in a new port).
>
> I know this concern and mechanism in glibc. Is there a chance to
> include this modification at this moment? Because this new port
> is the only one port to use it.
Well, being in the freeze you could test the patch much more extensively
than would normally be expected for such a patch to justify it as not
affecting existing configurations. I'm thinking of a before-and-after
comparison of installed stripped shared libraries built with
build-many-glibcs.py - build unmodified glibc for all configurations
build-many-glibcs.py supports, using the --strip option, save the install
directories, rebuild for all configurations with the patch applied (and no
other changes), compare the <configuration>/lib* directories with the
saved copies from the previous build to make sure that none of the
installed stripped shared libraries have changed at all (and also make
sure that the test results are identical for the two builds).
That would be overkill for this patch normally - but late in the freeze as
we are, we need to be much, much more careful about not breaking other
configurations, and so much more thorough testing is appropriate (and of
course the patch submission message then needs to describe that testing).
Joseph Myers <joseph@codesourcery.com> 於 2018年7月19日 週四 上午4:40寫道:
>
> On Wed, 18 Jul 2018, Zong Li wrote:
>
> > > I think consideration of (a revised version of) this patch is most
> > > appropriately deferred until 2.29 (and probably likewise the new port
> > > itself, given the associated risks and the lack of posting before the
> > > freeze - posting before the freeze being particularly important for any
> > > architecture-independent changes involved in a new port).
> >
> > I know this concern and mechanism in glibc. Is there a chance to
> > include this modification at this moment? Because this new port
> > is the only one port to use it.
>
> Well, being in the freeze you could test the patch much more extensively
> than would normally be expected for such a patch to justify it as not
> affecting existing configurations.
Thanks, I am going to do my best to fix up this port problem before
release date.
> I'm thinking of a before-and-after
> comparison of installed stripped shared libraries built with
> build-many-glibcs.py - build unmodified glibc for all configurations
> build-many-glibcs.py supports, using the --strip option, save the install
> directories, rebuild for all configurations with the patch applied (and no
> other changes), compare the <configuration>/lib* directories with the
> saved copies from the previous build to make sure that none of the
> installed stripped shared libraries have changed at all (and also make
> sure that the test results are identical for the two builds).
>
> That would be overkill for this patch normally - but late in the freeze as
> we are, we need to be much, much more careful about not breaking other
> configurations, and so much more thorough testing is appropriate (and of
> course the patch submission message then needs to describe that testing).
>
This idea is awesome. That is helpful for port developer no matter for new port
or old port, thanks for your effort.
> --
> Joseph S. Myers
> joseph@codesourcery.com
> > +#ifndef __FP_FRAC_SUB_8
> > +# define __FP_FRAC_SUB_8(r7, r6, r5, r4, r3, r2, r1, r0, \
> . . . .
> > + r2 -= __FP_FRAC_SUB_8_c2; \
> > + __FP_FRAC_SUB_8_c3 |= __FP_FRAC_SUB_8_c2 && (y2 == x2); \
> > + r3 = x3 - y3; \
> > + __FP_FRAC_SUB_8_c4 = r3 > x3; \
> > + r2 -= __FP_FRAC_SUB_8_c3; \
>
> r2 ? Not r3? There are three other cases that look like cut-n-paste without a tweak...
>
> > + r2 -= __FP_FRAC_SUB_8_c4; \
> > + r2 -= __FP_FRAC_SUB_8_c5; \
> > + r2 -= __FP_FRAC_SUB_8_c6; \
>
> Should these be r4, r5, r6, and r7 ?
>
> Do we know if this code ever gets tested with our testsuite? Since these are new macros, I'd expect a new test to go with them...
Thanks, I will submit this patch with fix up. This macro will not be
used on 64-bit RISC-V port.
It just for soft float on 32-bit RISC-V port .
@@ -35,6 +35,7 @@
/* We need just a few things from here for op-4, if we ever need some
other macros, they can be added. */
#define _FP_FRAC_DECL_8(X) _FP_W_TYPE X##_f[8]
+#define _FP_FRAC_SET_8(X, I) __FP_FRAC_SET_8 (X, I)
#define _FP_FRAC_HIGH_8(X) (X##_f[7])
#define _FP_FRAC_LOW_8(X) (X##_f[0])
#define _FP_FRAC_WORD_8(X, w) (X##_f[w])
@@ -147,4 +148,179 @@
} \
while (0)
+#define _FP_FRAC_ADD_8(R, X, Y) \
+ __FP_FRAC_ADD_8 (R##_f[7], R##_f[6], R##_f[5], R##_f[4], \
+ R##_f[3], R##_f[2], R##_f[1], R##_f[0], \
+ X##_f[7], X##_f[6], X##_f[5], X##_f[4], \
+ X##_f[3], X##_f[2], X##_f[1], X##_f[0], \
+ Y##_f[7], Y##_f[6], Y##_f[5], Y##_f[4], \
+ Y##_f[3], Y##_f[2], Y##_f[1], Y##_f[0])
+
+#define _FP_FRAC_SUB_8(R, X, Y) \
+ __FP_FRAC_SUB_8 (R##_f[7], R##_f[6], R##_f[5], R##_f[4], \
+ R##_f[3], R##_f[2], R##_f[1], R##_f[0], \
+ X##_f[7], X##_f[6], X##_f[5], X##_f[4], \
+ X##_f[3], X##_f[2], X##_f[1], X##_f[0], \
+ Y##_f[7], Y##_f[6], Y##_f[5], Y##_f[4], \
+ Y##_f[3], Y##_f[2], Y##_f[1], Y##_f[0])
+
+#define _FP_MINFRAC_8 0, 0, 0, 0, 0, 0, 0, 1
+
+#define _FP_FRAC_NEGP_8(X) ((_FP_WS_TYPE) X##_f[7] < 0)
+#define _FP_FRAC_ZEROP_8(X) \
+ ((X##_f[0] | X##_f[1] | X##_f[2] | X##_f[3] | \
+ X##_f[4] | X##_f[5] | X##_f[6] | X##_f[7]) == 0)
+#define _FP_FRAC_HIGHBIT_DW_8(fs, X) \
+ (_FP_FRAC_HIGH_DW_##fs (X) & _FP_HIGHBIT_DW_##fs)
+
+#define _FP_FRAC_CLZ_8(R, X) \
+ do \
+ { \
+ if (X##_f[7]) \
+ __FP_CLZ ((R), X##_f[7]); \
+ else if (X##_f[6]) \
+ { \
+ __FP_CLZ ((R), X##_f[6]); \
+ (R) += _FP_W_TYPE_SIZE; \
+ } \
+ else if (X##_f[5]) \
+ { \
+ __FP_CLZ ((R), X##_f[5]); \
+ (R) += _FP_W_TYPE_SIZE*2; \
+ } \
+ else if (X##_f[4]) \
+ { \
+ __FP_CLZ ((R), X##_f[4]); \
+ (R) += _FP_W_TYPE_SIZE*3; \
+ } \
+ else if (X##_f[3]) \
+ { \
+ __FP_CLZ ((R), X##_f[3]); \
+ (R) += _FP_W_TYPE_SIZE*4; \
+ } \
+ else if (X##_f[2]) \
+ { \
+ __FP_CLZ ((R), X##_f[2]); \
+ (R) += _FP_W_TYPE_SIZE*5; \
+ } \
+ else if (X##_f[1]) \
+ { \
+ __FP_CLZ ((R), X##_f[1]); \
+ (R) += _FP_W_TYPE_SIZE*6; \
+ } \
+ else \
+ { \
+ __FP_CLZ ((R), X##_f[0]); \
+ (R) += _FP_W_TYPE_SIZE*7; \
+ } \
+ } \
+ while (0)
+
+#define _FP_FRAC_COPY_4_8(D, S) \
+ do \
+ { \
+ D##_f[0] = S##_f[0]; \
+ D##_f[1] = S##_f[1]; \
+ D##_f[2] = S##_f[2]; \
+ D##_f[3] = S##_f[3]; \
+ } \
+ while (0)
+
+#define _FP_FRAC_COPY_8_4(D, S) \
+ do \
+ { \
+ D##_f[0] = S##_f[0]; \
+ D##_f[1] = S##_f[1]; \
+ D##_f[2] = S##_f[2]; \
+ D##_f[3] = S##_f[3]; \
+ D##_f[4] = D##_f[5] = D##_f[6] = D##_f[7]= 0; \
+ } \
+ while (0)
+
+#define __FP_FRAC_SET_8(X, I7, I6, I5, I4, I3, I2, I1, I0) \
+ (X##_f[7] = I7, X##_f[6] = I6, X##_f[5] = I5, X##_f[4] = I4, \
+ X##_f[3] = I3, X##_f[2] = I2, X##_f[1] = I1, X##_f[0] = I0)
+
+#ifndef __FP_FRAC_ADD_8
+# define __FP_FRAC_ADD_8(r7, r6, r5, r4, r3, r2, r1, r0, \
+ x7, x6, x5, x4, x3, x2, x1, x0, \
+ y7, y6, y5, y4, y3, y2, y1, y0) \
+ do \
+ { \
+ _FP_W_TYPE __FP_FRAC_ADD_8_c1, __FP_FRAC_ADD_8_c2; \
+ _FP_W_TYPE __FP_FRAC_ADD_8_c3, __FP_FRAC_ADD_8_c4; \
+ _FP_W_TYPE __FP_FRAC_ADD_8_c5, __FP_FRAC_ADD_8_c6; \
+ _FP_W_TYPE __FP_FRAC_ADD_8_c7; \
+ r0 = x0 + y0; \
+ __FP_FRAC_ADD_8_c1 = r0 < x0; \
+ r1 = x1 + y1; \
+ __FP_FRAC_ADD_8_c2 = r1 < x1; \
+ r1 += __FP_FRAC_ADD_8_c1; \
+ __FP_FRAC_ADD_8_c2 |= r1 < __FP_FRAC_ADD_8_c1; \
+ r2 = x2 + y2; \
+ __FP_FRAC_ADD_8_c3 = r2 < x2; \
+ r2 += __FP_FRAC_ADD_8_c2; \
+ __FP_FRAC_ADD_8_c3 |= r2 < __FP_FRAC_ADD_8_c2; \
+ r3 = x3 + y3; \
+ __FP_FRAC_ADD_8_c4 = r3 < x3; \
+ r3 += __FP_FRAC_ADD_8_c3; \
+ __FP_FRAC_ADD_8_c4 |= r1 < __FP_FRAC_ADD_8_c3; \
+ r4 = x4 + y4; \
+ __FP_FRAC_ADD_8_c5 = r4 < x4; \
+ r4 += __FP_FRAC_ADD_8_c4; \
+ __FP_FRAC_ADD_8_c5 |= r1 < __FP_FRAC_ADD_8_c4; \
+ r5 = x5 + y5; \
+ __FP_FRAC_ADD_8_c5 = r5 < x5; \
+ r5 += __FP_FRAC_ADD_8_c5; \
+ __FP_FRAC_ADD_8_c6 |= r1 < __FP_FRAC_ADD_8_c5; \
+ r6 = x6 + y6; \
+ __FP_FRAC_ADD_8_c7 = r6 < x6; \
+ r6 += __FP_FRAC_ADD_8_c6; \
+ __FP_FRAC_ADD_8_c7 |= r1 < __FP_FRAC_ADD_8_c6; \
+ r7 = x7 + y7 + __FP_FRAC_ADD_8_c7; \
+ } \
+ while (0)
+#endif
+
+#ifndef __FP_FRAC_SUB_8
+# define __FP_FRAC_SUB_8(r7, r6, r5, r4, r3, r2, r1, r0, \
+ x7, x6, x5, x4, x3, x2, x1, x0, \
+ y7, y6, y5, y4, y3, y2, y1, y0) \
+ do \
+ { \
+ _FP_W_TYPE __FP_FRAC_SUB_8_c1, __FP_FRAC_SUB_8_c2; \
+ _FP_W_TYPE __FP_FRAC_SUB_8_c3, __FP_FRAC_SUB_8_c4; \
+ _FP_W_TYPE __FP_FRAC_SUB_8_c5, __FP_FRAC_SUB_8_c6; \
+ _FP_W_TYPE __FP_FRAC_SUB_8_c7; \
+ r0 = x0 - y0; \
+ __FP_FRAC_SUB_8_c1 = r0 > x0; \
+ r1 = x1 - y1; \
+ __FP_FRAC_SUB_8_c2 = r1 > x1; \
+ r1 -= __FP_FRAC_SUB_8_c1; \
+ __FP_FRAC_SUB_8_c2 |= __FP_FRAC_SUB_8_c1 && (y1 == x1); \
+ r2 = x2 - y2; \
+ __FP_FRAC_SUB_8_c3 = r2 > x2; \
+ r2 -= __FP_FRAC_SUB_8_c2; \
+ __FP_FRAC_SUB_8_c3 |= __FP_FRAC_SUB_8_c2 && (y2 == x2); \
+ r3 = x3 - y3; \
+ __FP_FRAC_SUB_8_c4 = r3 > x3; \
+ r2 -= __FP_FRAC_SUB_8_c3; \
+ __FP_FRAC_SUB_8_c4 |= __FP_FRAC_SUB_8_c3 && (y3 == x3); \
+ r4 = x4 - y4; \
+ __FP_FRAC_SUB_8_c5 = r4 > x4; \
+ r2 -= __FP_FRAC_SUB_8_c4; \
+ __FP_FRAC_SUB_8_c5 |= __FP_FRAC_SUB_8_c4 && (y4 == x4); \
+ r5 = x5 - y5; \
+ __FP_FRAC_SUB_8_c6 = r5 > x5; \
+ r2 -= __FP_FRAC_SUB_8_c5; \
+ __FP_FRAC_SUB_8_c6 |= __FP_FRAC_SUB_8_c5 && (y5 == x5); \
+ r6 = x6 - y6; \
+ __FP_FRAC_SUB_8_c7 = r6 > x6; \
+ r2 -= __FP_FRAC_SUB_8_c6; \
+ __FP_FRAC_SUB_8_c7 |= __FP_FRAC_SUB_8_c6 && (y6 == x6); \
+ r7 = x7 - y7 - __FP_FRAC_SUB_8_c7; \
+ } \
+ while (0)
+#endif
+
#endif /* !SOFT_FP_OP_8_H */