Message ID | 20170815195541.GA19812@gmail.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 128654 invoked by alias); 15 Aug 2017 19:55:49 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 127553 invoked by uid 89); 15 Aug 2017 19:55:45 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, NO_DNS_FOR_FROM, RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=55am, 55AM X-HELO: mga04.intel.com X-ExtLoop1: 1 Date: Tue, 15 Aug 2017 12:55:41 -0700 From: "H.J. Lu" <hongjiu.lu@intel.com> To: GNU C Library <libc-alpha@sourceware.org> Subject: Re: [PATCH] x86-64: Put L(SP_INF_0) in .rodata.cst4 section [BZ #21955] Message-ID: <20170815195541.GA19812@gmail.com> Reply-To: "H.J. Lu" <hjl.tools@gmail.com> References: <20170815173855.GA19119@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170815173855.GA19119@gmail.com> User-Agent: Mutt/1.8.3 (2017-05-23) |
Commit Message
Lu, Hongjiu
Aug. 15, 2017, 7:55 p.m. UTC
On Tue, Aug 15, 2017 at 10:38:55AM -0700, H.J. Lu wrote: > sysdeps/x86_64/fpu/e_expf.S has > > /* Here if |x| is Inf */ > lea L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */ > movss (%rdx,%rax,4), %xmm0 /* return zero or Inf */ > ret > ... > .section .rodata.cst8,"aM",@progbits,8 > ... > .p2align 2 > L(SP_INF_0): > .long 0x7f800000 /* single precision Inf */ > .long 0 /* single precision zero */ > .type L(SP_INF_0), @object > ASM_SIZE_DIRECTIVE(L(SP_INF_0)) > > Since L(SP_INF_0) is accessed as an array of 4-byte elements, it should > be placed in > > .section .rodata.cst4,"aM",@progbits,4 > > Tested on x86-64. Any comments? > > H.J. > --- > [BZ #21955] > * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in > .rodata.cst4 section. L(SP_RANGE) has the same issue. This updated patch fixes both. H.J. --- sysdeps/x86_64/fpu/e_expf.S has lea L(SP_RANGE)(%rip), %rdx /* load over/underflow bound */ cmpl (%rdx,%rax,4), %ecx /* |x|<under/overflow bound ? */ ... /* Here if |x| is Inf */ lea L(SP_INF_0)(%rip), %rdx /* depending on sign of x: */ movss (%rdx,%rax,4), %xmm0 /* return zero or Inf */ ret ... .section .rodata.cst8,"aM",@progbits,8 ... .p2align 2 L(SP_RANGE): /* single precision overflow/underflow bounds */ .long 0x42b17217 /* if x>this bound, then result overflows */ .long 0x42cff1b4 /* if x<this bound, then result underflows */ .type L(SP_RANGE), @object ASM_SIZE_DIRECTIVE(L(SP_RANGE)) .p2align 2 L(SP_INF_0): .long 0x7f800000 /* single precision Inf */ .long 0 /* single precision zero */ .type L(SP_INF_0), @object ASM_SIZE_DIRECTIVE(L(SP_INF_0)) Since L(SP_RANGE) and L(SP_INF_0) are accessed as arrays of 4-byte elements, they should be placed in .rodata.cst4 section. [BZ #21955] * sysdeps/x86_64/fpu/e_expf.S (L(SP_RANGE)): Place it in .rodata.cst4 section. (L(SP_INF_0)): Likewise. --- sysdeps/x86_64/fpu/e_expf.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Tue, 15 Aug 2017, H.J. Lu wrote: > > [BZ #21955] > > * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in > > .rodata.cst4 section. > > L(SP_RANGE) has the same issue. This updated patch fixes both. It's a lot more than just expf. There are various other x86_64 and x86 libm files that could use .rodata.cstN but don't. (Obviously this only works for invididual objects where the code doesn't use offsets from one object to another, not when an array of two or more objects is being used unless you choose the section appropriately to preserve the array as such.)
On Tue, Aug 15, 2017 at 1:02 PM, Joseph Myers <joseph@codesourcery.com> wrote: > On Tue, 15 Aug 2017, H.J. Lu wrote: > >> > [BZ #21955] >> > * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in >> > .rodata.cst4 section. >> >> L(SP_RANGE) has the same issue. This updated patch fixes both. > > It's a lot more than just expf. There are various other x86_64 and x86 > libm files that could use .rodata.cstN but don't. (Obviously this only > works for invididual objects where the code doesn't use offsets from one > object to another, not when an array of two or more objects is being used > unless you choose the section appropriately to preserve the array as > such.) > Aren't .rodata.cstN optimization? In case of e_expf.S, it is a correctness issue.
On Tue, Aug 15, 2017 at 4:14 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Tue, Aug 15, 2017 at 1:02 PM, Joseph Myers <joseph@codesourcery.com> wrote: >> On Tue, 15 Aug 2017, H.J. Lu wrote: >> >>> > [BZ #21955] >>> > * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in >>> > .rodata.cst4 section. >>> >>> L(SP_RANGE) has the same issue. This updated patch fixes both. >> >> It's a lot more than just expf. There are various other x86_64 and x86 >> libm files that could use .rodata.cstN but don't. (Obviously this only >> works for invididual objects where the code doesn't use offsets from one >> object to another, not when an array of two or more objects is being used >> unless you choose the section appropriately to preserve the array as >> such.) > > Aren't .rodata.cstN optimization? In case of e_expf.S, it is a correctness > issue. Please elaborate - I was just about to ask whether this actually makes any difference to anything, other than abstract tidiness of the .rodata segment. The code actually misbehaves without this change? How? zw
On Tue, 15 Aug 2017, H.J. Lu wrote: > On Tue, Aug 15, 2017 at 1:02 PM, Joseph Myers <joseph@codesourcery.com> wrote: > > On Tue, 15 Aug 2017, H.J. Lu wrote: > > > >> > [BZ #21955] > >> > * sysdeps/x86_64/fpu/e_expf.S (L(SP_INF_0)): Place it in > >> > .rodata.cst4 section. > >> > >> L(SP_RANGE) has the same issue. This updated patch fixes both. > > > > It's a lot more than just expf. There are various other x86_64 and x86 > > libm files that could use .rodata.cstN but don't. (Obviously this only > > works for invididual objects where the code doesn't use offsets from one > > object to another, not when an array of two or more objects is being used > > unless you choose the section appropriately to preserve the array as > > such.) > > Aren't .rodata.cstN optimization? In case of e_expf.S, it is a correctness > issue. They are generally optimization. Could you explain this correctness issue in more detail? SP_INF_0 appears to be an array of two 4-byte values. Because it's used as an array, the two values need to stay adjacent. That is, I'd expect it to need, for correctness, to be in .rodata.cst8, as it is at present, and *not* .rodata.cst4 (if in .rodata.cst4, it might get split up as those values get unified with other values in that section).
diff --git a/sysdeps/x86_64/fpu/e_expf.S b/sysdeps/x86_64/fpu/e_expf.S index 4fd2bb1fb5..29e421b4bd 100644 --- a/sysdeps/x86_64/fpu/e_expf.S +++ b/sysdeps/x86_64/fpu/e_expf.S @@ -297,6 +297,7 @@ L(DP_P0): /* double precision polynomial coefficient P0 */ .type L(DP_P0), @object ASM_SIZE_DIRECTIVE(L(DP_P0)) + .section .rodata.cst4,"aM",@progbits,4 .p2align 2 L(SP_RANGE): /* single precision overflow/underflow bounds */ .long 0x42b17217 /* if x>this bound, then result overflows */ @@ -311,7 +312,6 @@ L(SP_INF_0): .type L(SP_INF_0), @object ASM_SIZE_DIRECTIVE(L(SP_INF_0)) - .section .rodata.cst4,"aM",@progbits,4 .p2align 2 L(SP_RS): /* single precision 2^23+2^22 */ .long 0x4b400000