Message ID | 5e51c4ea8cfe99f2faac5e206982658bd6c5d265.1539595555.git.zongbox@gmail.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 10941 invoked by alias); 15 Oct 2018 11:54:25 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 10808 invoked by uid 89); 15 Oct 2018 11:54:24 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.9 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=H*r:sk:q19-v6s, H*f:sk:cover.1, H*r:sk:mail-pl, HX-HELO:sk:mail-pl X-HELO: mail-pl1-f170.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=G7qxmHodgSBbjkb7R4sdL7wje4grXpn947kKfcAgg1M=; b=ibPaS/AcV57Ngp0TARWIi+4d3tsmFuL5mgB8l6fTTQFT0iZ9tjqKHCQD2YgzrJw6Mi X00hBMNrNdo03aIO1fNi7rNz7yFtSaymbfShIDJm9sWp+Z/Bmj6cwptdlGJ9Ybezeb8+ XjTQL+E/GKvbUVbzL9BY29HJB6CfSDvcept8qWHyr18oZ5t32cHMh2qX+eM0efyziO8l nPnOp3PgmDpZZ9Llkftjsrx7o0hWJetIWgoCskeRuHw2Y2/ubI1G3VeqpHq4sd1625LL pvW83+v2bp3FbiK5NzO8jt67NqDWWDYM1QoK/+HGowjA++XCvXQJW+qpQmzGUc+fLRPb YOwQ== Return-Path: <zongbox@gmail.com> From: Zong Li <zongbox@gmail.com> To: joseph@codesourcery.com, palmer@dabbelt.com, darius@bluespec.com, andrew@sifive.com, dj@redhat.com, libc-alpha@sourceware.org Cc: jimw@sifive.com, kito@andestech.com, greentime@andestech.com, zongbox@gmail.com Subject: [PATCH 1/2] soft-fp: Fix overwritten issue for division in op-4.h Date: Mon, 15 Oct 2018 19:52:30 +0800 Message-Id: <5e51c4ea8cfe99f2faac5e206982658bd6c5d265.1539595555.git.zongbox@gmail.com> In-Reply-To: <cover.1539595555.git.zongbox@gmail.com> References: <cover.1539595555.git.zongbox@gmail.com> |
Commit Message
Zong Li
Oct. 15, 2018, 11:52 a.m. UTC
The original value of X##_f[0] is overwritten before the calculatation. After this modification, we can pass the soft floating testing of glibc testsuites on RV32. * soft-fp/op-4.h: Fix wrong calculation of division. --- ChangeLog | 4 ++++ soft-fp/op-4.h | 7 ++++++- 2 files changed, 10 insertions(+), 1 deletion(-)
Comments
On Mon, 15 Oct 2018, Zong Li wrote: > diff --git a/soft-fp/op-4.h b/soft-fp/op-4.h > index 01b87d0..a407168 100644 > --- a/soft-fp/op-4.h > +++ b/soft-fp/op-4.h > @@ -517,10 +517,15 @@ > R##_f[_FP_DIV_MEAT_4_udiv_i] = -1; \ > if (!_FP_DIV_MEAT_4_udiv_i) \ > break; \ > - __FP_FRAC_SUB_4 (X##_f[3], X##_f[2], X##_f[1], X##_f[0], \ > + UWtype __FP_FRAC_tmp; \ > + __FP_FRAC_SUB_4 (X##_f[2], X##_f[1], X##_f[0], __FP_FRAC_tmp, \ > Y##_f[2], Y##_f[1], Y##_f[0], 0, \ > X##_f[2], X##_f[1], X##_f[0], \ > _FP_DIV_MEAT_4_udiv_n_f[_FP_DIV_MEAT_4_udiv_i]); \ > + X##_f[3] = X##_f[2]; \ > + X##_f[2] = X##_f[1]; \ > + X##_f[1] = X##_f[0]; \ > + X##_f[0] = __FP_FRAC_tmp; \ This approach, using the X array in a shifted form to store the result with a single temporary, is not a clean solution. First, we should work out what the interface *should* be to all the FRAC_ADD and FRAC_SUB macros, regarding permitted overlap between the output and the inputs. This requires examining all those macros and their uses to determine what consistent or nearly consistent patterns there are in that interface. Given an appropriate choice of interface, all the definitions that do not fit that choice, and all the uses that do not fit that choice, then need to be fixed. If that choice says the overlap above it not OK, a whole four-word temporary should be defined using _FP_FRAC_DECL_4 for the result, and then copied with _FP_FRAC_COPY_4. The commit message for any proposed change needs to describe the conclusions regarding the interface for permitted overlaps, and the analysis done to determine what needs to change. I suspect that the intended interface is that *exact* overlap is permitted between the output and the *second* input, but not other overlap. In which case at least the definitions of __FP_FRAC_SUB_3 and __FP_FRAC_SUB_4 look buggy and need changing (they reference y1 and y2 after r1 and r2 have been set), as well as the user you're changing above (which should use a full temporary).
On Mon, 15 Oct 2018, Joseph Myers wrote: > to be fixed. If that choice says the overlap above it not OK, a whole > four-word temporary should be defined using _FP_FRAC_DECL_4 for the > result, and then copied with _FP_FRAC_COPY_4. The commit message for any In fact, you don't need that copy from the temporary back to X, because the result is immediately used in a call to _FP_FRAC_SUB_4 which can become _FP_FRAC_SUB_4 (X, Y, _FP_DIV_MEAT_4_tmp).
Joseph Myers <joseph@codesourcery.com> 於 2018年10月15日 週一 下午9:04寫道: > > On Mon, 15 Oct 2018, Joseph Myers wrote: > > > to be fixed. If that choice says the overlap above it not OK, a whole > > four-word temporary should be defined using _FP_FRAC_DECL_4 for the > > result, and then copied with _FP_FRAC_COPY_4. The commit message for any > > In fact, you don't need that copy from the temporary back to X, because > the result is immediately used in a call to _FP_FRAC_SUB_4 which can > become _FP_FRAC_SUB_4 (X, Y, _FP_DIV_MEAT_4_tmp). > Yes, I will use the full temporary replace with the shifted form if overlay is not OK.
Joseph Myers <joseph@codesourcery.com> 於 2018年10月15日 週一 下午9:01寫道: > > On Mon, 15 Oct 2018, Zong Li wrote: > > > diff --git a/soft-fp/op-4.h b/soft-fp/op-4.h > > index 01b87d0..a407168 100644 > > --- a/soft-fp/op-4.h > > +++ b/soft-fp/op-4.h > > @@ -517,10 +517,15 @@ > > R##_f[_FP_DIV_MEAT_4_udiv_i] = -1; \ > > if (!_FP_DIV_MEAT_4_udiv_i) \ > > break; \ > > - __FP_FRAC_SUB_4 (X##_f[3], X##_f[2], X##_f[1], X##_f[0], \ > > + UWtype __FP_FRAC_tmp; \ > > + __FP_FRAC_SUB_4 (X##_f[2], X##_f[1], X##_f[0], __FP_FRAC_tmp, \ > > Y##_f[2], Y##_f[1], Y##_f[0], 0, \ > > X##_f[2], X##_f[1], X##_f[0], \ > > _FP_DIV_MEAT_4_udiv_n_f[_FP_DIV_MEAT_4_udiv_i]); \ > > + X##_f[3] = X##_f[2]; \ > > + X##_f[2] = X##_f[1]; \ > > + X##_f[1] = X##_f[0]; \ > > + X##_f[0] = __FP_FRAC_tmp; \ > > This approach, using the X array in a shifted form to store the result > with a single temporary, is not a clean solution. > > First, we should work out what the interface *should* be to all the > FRAC_ADD and FRAC_SUB macros, regarding permitted overlap between the > output and the inputs. This requires examining all those macros and their > uses to determine what consistent or nearly consistent patterns there are > in that interface. > > Given an appropriate choice of interface, all the definitions that do not > fit that choice, and all the uses that do not fit that choice, then need > to be fixed. If that choice says the overlap above it not OK, a whole > four-word temporary should be defined using _FP_FRAC_DECL_4 for the > result, and then copied with _FP_FRAC_COPY_4. The commit message for any > proposed change needs to describe the conclusions regarding the interface > for permitted overlaps, and the analysis done to determine what needs to > change. > > I suspect that the intended interface is that *exact* overlap is permitted > between the output and the *second* input, but not other overlap. In > which case at least the definitions of __FP_FRAC_SUB_3 and __FP_FRAC_SUB_4 > look buggy and need changing (they reference y1 and y2 after r1 and r2 > have been set), as well as the user you're changing above (which should > use a full temporary). > I spend some time to check all the FRAC_ADD and FRAC_SUB use cases. In FRAC_ADD_3 and FRAC_ADD_4(R, X, Y), it reference the x[N] after r[N] have been set. If the x and r are the same address, the expression r[N] < x[N] is always false(it always be equal), then the carry bit is always zero. In glibc, all FRAC_ADD users aren't specify the r and x with the same address, but there are some users specify the r and y with the same address, and they are still work because y[N] not be used again in FRAC_ADD. In FRAC_SUB_3 and FRAC_SUB_4, it is more strict that it reference both the x[N] and y[N] after r[N] have been set. If one of the x and y is the same address with r, the result of the calculation is wrong because the value of the original x and y are overwritten. In glibc, there are two places use FRAC_SUB and occurs the overlap. The first is _FP_DIV_MEAT_N_loop in op-common.h, it uses the source _FP_DIV_MEAT_N_loop_u as the destination. This macro only be used when N is one(_FP_DIV_MEAT_1_loop) and then the _FP_FRAC_SUB_##wc extend to _FP_FRAC_SUB_1 in this macro. so it also work because _FP_FRAC_SUB_1 has no overlap problem in its implementation. The second places is I had modified in this patch, and I would use the full temporary replace single temporary for next version. In FRAC_ADD_1, FRAC_ADD_2, FRAC_SUB_1 and FRAC_SUB_2, there don't refer the source after destination have been set, so they have no problem. In conclusion, I inclines to keep the current implementation of FRAC_SUB_3, FRAC_SUB_4, FRAC_ADD_3 and FRAC_ADD_4 instead of using the temporary variables like FRAC_ADD_2 and FRAC_SUB_2, and fix the _FP_DIV_MEAT_N_loop together in next version. Because using temporary variables have to additional move the result from temporary to real destination in the most situation. The macros user should be responsible for ensuing no overlap between sources and destination. I want to know whether anyone has some suggestion, please let me know, thanks. I test the macros by hand and glibc testsutie, and it seems that the overlap is unreasonable. The overlap will cause the wrong result and failed on glibc testsuite.
On Mon, 22 Oct 2018, Zong Li wrote: > I spend some time to check all the FRAC_ADD and FRAC_SUB use cases. Thanks for the analysis of the various definitions and users of these macros. > In conclusion, I inclines to keep the current implementation of > FRAC_SUB_3, FRAC_SUB_4, FRAC_ADD_3 and FRAC_ADD_4 instead of using the > temporary variables like FRAC_ADD_2 and FRAC_SUB_2, and fix the > _FP_DIV_MEAT_N_loop together in next version. Because using temporary I think a basic principle here should be that all FRAC_ADD and FRAC_SUB macros have the same interface regarding what overlap is permitted between arguments and results. That is, if that interface is defined not to allow any overlap, then there are several users to change to avoid overlap, even if they are only using FRAC_ADD macros with overlap between R and Y, or only FRAC_SUB_1 and FRAC_SUB_2, and so the overlap does not cause problems with the present implementations. I think it would be less disruptive to allow exact overlap between corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be made to use temporaries so their calls to the FRAC_SUB macros follow that restriction on overlap. (It's true that the current sfp-machine.h files only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop to follow the FRAC_SUB interface properly.) The compiler should have no difficulty optimizing out the temporaries in cases where they end up not being needed. In all cases, names of temporary variables in soft-fp macros should start with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting from shadowing when one macro calls another macro with the same variable name.
On Mon, 22 Oct 2018 05:10:59 PDT (-0700), joseph@codesourcery.com wrote: > On Mon, 22 Oct 2018, Zong Li wrote: > >> I spend some time to check all the FRAC_ADD and FRAC_SUB use cases. > > Thanks for the analysis of the various definitions and users of these > macros. > >> In conclusion, I inclines to keep the current implementation of >> FRAC_SUB_3, FRAC_SUB_4, FRAC_ADD_3 and FRAC_ADD_4 instead of using the >> temporary variables like FRAC_ADD_2 and FRAC_SUB_2, and fix the >> _FP_DIV_MEAT_N_loop together in next version. Because using temporary > > I think a basic principle here should be that all FRAC_ADD and FRAC_SUB > macros have the same interface regarding what overlap is permitted between > arguments and results. > > That is, if that interface is defined not to allow any overlap, then there > are several users to change to avoid overlap, even if they are only using > FRAC_ADD macros with overlap between R and Y, or only FRAC_SUB_1 and > FRAC_SUB_2, and so the overlap does not cause problems with the present > implementations. > > I think it would be less disruptive to allow exact overlap between > corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and > __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that > interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be > made to use temporaries so their calls to the FRAC_SUB macros follow that > restriction on overlap. (It's true that the current sfp-machine.h files > only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or > _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop > to follow the FRAC_SUB interface properly.) > > The compiler should have no difficulty optimizing out the temporaries in > cases where they end up not being needed. > > In all cases, names of temporary variables in soft-fp macros should start > with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries > inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting > from shadowing when one macro calls another macro with the same variable > name. Thanks for digging in to this so deeply, I'm afraid I'm way over my head here. If you need any help feel free to ask, as I'm not really following any of this.
Joseph Myers <joseph@codesourcery.com> 於 2018年10月22日 週一 下午8:11寫道: > > On Mon, 22 Oct 2018, Zong Li wrote: > > > I spend some time to check all the FRAC_ADD and FRAC_SUB use cases. > > Thanks for the analysis of the various definitions and users of these > macros. > > > In conclusion, I inclines to keep the current implementation of > > FRAC_SUB_3, FRAC_SUB_4, FRAC_ADD_3 and FRAC_ADD_4 instead of using the > > temporary variables like FRAC_ADD_2 and FRAC_SUB_2, and fix the > > _FP_DIV_MEAT_N_loop together in next version. Because using temporary > > I think a basic principle here should be that all FRAC_ADD and FRAC_SUB > macros have the same interface regarding what overlap is permitted between > arguments and results. > > That is, if that interface is defined not to allow any overlap, then there > are several users to change to avoid overlap, even if they are only using > FRAC_ADD macros with overlap between R and Y, or only FRAC_SUB_1 and > FRAC_SUB_2, and so the overlap does not cause problems with the present > implementations. > > I think it would be less disruptive to allow exact overlap between > corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and > __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that > interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be > made to use temporaries so their calls to the FRAC_SUB macros follow that > restriction on overlap. (It's true that the current sfp-machine.h files > only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or > _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop > to follow the FRAC_SUB interface properly.) > > The compiler should have no difficulty optimizing out the temporaries in > cases where they end up not being needed. > > In all cases, names of temporary variables in soft-fp macros should start > with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries > inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting > from shadowing when one macro calls another macro with the same variable > name. I see, I would change the __FP_FRAC_SUB_3, __FP_FRAC_SUB_4, __FP_FRAC_ADD_3 and __FP_FRAC_ADD_4 implementation, let the interface is identical for users.
Palmer Dabbelt <palmer@dabbelt.com> 於 2018年10月23日 週二 上午3:31寫道: > > On Mon, 22 Oct 2018 05:10:59 PDT (-0700), joseph@codesourcery.com wrote: > > On Mon, 22 Oct 2018, Zong Li wrote: > > > >> I spend some time to check all the FRAC_ADD and FRAC_SUB use cases. > > > > Thanks for the analysis of the various definitions and users of these > > macros. > > > >> In conclusion, I inclines to keep the current implementation of > >> FRAC_SUB_3, FRAC_SUB_4, FRAC_ADD_3 and FRAC_ADD_4 instead of using the > >> temporary variables like FRAC_ADD_2 and FRAC_SUB_2, and fix the > >> _FP_DIV_MEAT_N_loop together in next version. Because using temporary > > > > I think a basic principle here should be that all FRAC_ADD and FRAC_SUB > > macros have the same interface regarding what overlap is permitted between > > arguments and results. > > > > That is, if that interface is defined not to allow any overlap, then there > > are several users to change to avoid overlap, even if they are only using > > FRAC_ADD macros with overlap between R and Y, or only FRAC_SUB_1 and > > FRAC_SUB_2, and so the overlap does not cause problems with the present > > implementations. > > > > I think it would be less disruptive to allow exact overlap between > > corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and > > __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that > > interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be > > made to use temporaries so their calls to the FRAC_SUB macros follow that > > restriction on overlap. (It's true that the current sfp-machine.h files > > only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or > > _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop > > to follow the FRAC_SUB interface properly.) > > > > The compiler should have no difficulty optimizing out the temporaries in > > cases where they end up not being needed. > > > > In all cases, names of temporary variables in soft-fp macros should start > > with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries > > inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting > > from shadowing when one macro calls another macro with the same variable > > name. > > Thanks for digging in to this so deeply, I'm afraid I'm way over my head here. > If you need any help feel free to ask, as I'm not really following any of this. OK, thanks!
On Tue, 23 Oct 2018, Zong Li wrote: > > I think it would be less disruptive to allow exact overlap between > > corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and > > __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that > > interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be > > made to use temporaries so their calls to the FRAC_SUB macros follow that > > restriction on overlap. (It's true that the current sfp-machine.h files > > only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or > > _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop > > to follow the FRAC_SUB interface properly.) > > > > The compiler should have no difficulty optimizing out the temporaries in > > cases where they end up not being needed. > > > > In all cases, names of temporary variables in soft-fp macros should start > > with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries > > inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting > > from shadowing when one macro calls another macro with the same variable > > name. > > I see, I would change the __FP_FRAC_SUB_3, __FP_FRAC_SUB_4, > __FP_FRAC_ADD_3 and __FP_FRAC_ADD_4 implementation, let the interface > is identical for users. My expectation is that __FP_FRAC_ADD_3 and __FP_FRAC_ADD_4 don't need to change (because they are already valid for exact overlap between the result and the second input).
Joseph Myers <joseph@codesourcery.com> 於 2018年10月24日 週三 上午4:44寫道: > > On Tue, 23 Oct 2018, Zong Li wrote: > > > > I think it would be less disruptive to allow exact overlap between > > > corresponding elements of R and Y. Then __FP_FRAC_SUB_3 and > > > __FP_FRAC_SUB_4 would be fixed to use temporaries to follow that > > > interface, while _FP_DIV_MEAT_4_udiv and _FP_DIV_MEAT_N_loop would also be > > > made to use temporaries so their calls to the FRAC_SUB macros follow that > > > restriction on overlap. (It's true that the current sfp-machine.h files > > > only use _FP_DIV_MEAT_1_loop, not _FP_DIV_MEAT_2_loop or > > > _FP_DIV_MEAT_4_loop, but it's still appropriate to fix _FP_DIV_MEAT_N_loop > > > to follow the FRAC_SUB interface properly.) > > > > > > The compiler should have no difficulty optimizing out the temporaries in > > > cases where they end up not being needed. > > > > > > In all cases, names of temporary variables in soft-fp macros should start > > > with the full name of the macro (so __FP_FRAC_SUB_3_* for temporaries > > > inside __FP_FRAC_SUB_3, for example), to avoid the risk of bugs resulting > > > from shadowing when one macro calls another macro with the same variable > > > name. > > > > I see, I would change the __FP_FRAC_SUB_3, __FP_FRAC_SUB_4, > > __FP_FRAC_ADD_3 and __FP_FRAC_ADD_4 implementation, let the interface > > is identical for users. > > My expectation is that __FP_FRAC_ADD_3 and __FP_FRAC_ADD_4 don't need to > change (because they are already valid for exact overlap between the > result and the second input). > OK, I would change the __FP_FRAC_SUB_3 and __FP_FRAC_SUB_4 and submit next version after I testing glibc testsuite.
diff --git a/ChangeLog b/ChangeLog index 2601765..bb30093 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,7 @@ +2018-10-15 Zong Li <zongbox@gmail.com> + + * soft-fp/op-4.h: Fix wrong calculation of division. + 2018-10-14 Paul Eggert <eggert@cs.ucla.edu> regex: simplify by using intprops.h diff --git a/soft-fp/op-4.h b/soft-fp/op-4.h index 01b87d0..a407168 100644 --- a/soft-fp/op-4.h +++ b/soft-fp/op-4.h @@ -517,10 +517,15 @@ R##_f[_FP_DIV_MEAT_4_udiv_i] = -1; \ if (!_FP_DIV_MEAT_4_udiv_i) \ break; \ - __FP_FRAC_SUB_4 (X##_f[3], X##_f[2], X##_f[1], X##_f[0], \ + UWtype __FP_FRAC_tmp; \ + __FP_FRAC_SUB_4 (X##_f[2], X##_f[1], X##_f[0], __FP_FRAC_tmp, \ Y##_f[2], Y##_f[1], Y##_f[0], 0, \ X##_f[2], X##_f[1], X##_f[0], \ _FP_DIV_MEAT_4_udiv_n_f[_FP_DIV_MEAT_4_udiv_i]); \ + X##_f[3] = X##_f[2]; \ + X##_f[2] = X##_f[1]; \ + X##_f[1] = X##_f[0]; \ + X##_f[0] = __FP_FRAC_tmp; \ _FP_FRAC_SUB_4 (X, Y, X); \ if (X##_f[3] > Y##_f[3]) \ { \