glibc 2.21 - Machine maintainers, please test your machines.

Message ID 20150125215150.GA15033@gmail.com
State Dropped
Headers

Commit Message

H.J. Lu Jan. 25, 2015, 9:51 p.m. UTC
  On Sun, Jan 25, 2015 at 10:12:23AM -0800, H.J. Lu wrote:
> On Sun, Jan 25, 2015 at 8:45 AM, Chris Metcalf <cmetcalf@ezchip.com> wrote:
> > On 1/25/2015 10:18 AM, Chris Metcalf wrote:
> >>
> >> I'm now running with the following change, to see if tilegx32 will
> >> also pass with it; this implements my suggestion of rounding new_sem to
> >> an 8-byte boundary explicitly on ILP32 platforms.
> >
> >
> > With my proposed change, tilegx32 (and tilegx64) pass all the tests.
> > Repeated
> > here with a suitable ChangeLog.  Let me know if this is acceptable to commit
> > for 2.21.
> >
> > 2015-01-25  Chris Metcalf  <cmetcalf@ezchip.com>
> >
> >         * sysdeps/nptl/internaltypes.h (to_new_sem): Define.  Provides new
> >         behavior for [__HAVE_64B_ATOMICS && !defined (_LP64)].
> >         * nptl/sem_getvalue.c (__new_sem_getvalue): Use to_new_sem.
> >         * nptl/sem_init.c (__new_sem_init): Likewise.
> >         * nptl/sem_post.c (__new_sem_post): Likewise.
> >         * nptl/sem_wait.c (__new_sem_wait): Likewise.
> >         (__new_sem_trywait): Likewise.
> >         * nptl/sem_timedwait.c (sem_timedwait): Likewise.
> >         * nptl/sem_open.c (sem_open): Add comment about to_new_sem.
> >
> >
> 
> Can you try something similar to siginfo_t in:
> 
> sysdeps/unix/sysv/linux/x86/bits/siginfo.h
> 
> We had a similar problem on x32 before.
> 
> 

This is for x86.  Each target can do something like it if needed.

H.J.
---
  

Comments

Chris Metcalf Jan. 25, 2015, 10:10 p.m. UTC | #1
On 1/25/2015 4:51 PM, H.J. Lu wrote:

> This is for x86.  Each target can do something like it if needed.

This does change the ABI for existing code, however, which may legitimately
have sem_t objects that are only aligned to 4-byte boundaries.  Your change
is a good idea for new platforms.  I suppose we could also do something
with symbol versioning to use it for existing 32-bit platforms with
64-bit atomics, too.  So there seem to be multiple fixes we can consider.

> H.J.
> ---
> diff --git a/sysdeps/x86/bits/semaphore.h b/sysdeps/x86/bits/semaphore.h
> index 18b2b3c..4890e0d 100644
> --- a/sysdeps/x86/bits/semaphore.h
> +++ b/sysdeps/x86/bits/semaphore.h
> @@ -28,6 +28,11 @@
>   # define __SIZEOF_SEM_T	16
>   #endif
>   
> +#ifdef __x86_64__
> +# define __SEM_ALIGN_T	long long int
> +#else
> +# define __SEM_ALIGN_T	long int
> +#endif
>   
>   /* Value returned if `sem_open' failed.  */
>   #define SEM_FAILED      ((sem_t *) 0)
> @@ -36,5 +41,5 @@
>   typedef union
>   {
>     char __size[__SIZEOF_SEM_T];
> -  long int __align;
> +  __SEM_ALIGN_T __align;
>   } sem_t;
>
>
  
H.J. Lu Jan. 25, 2015, 11:05 p.m. UTC | #2
On Sun, Jan 25, 2015 at 2:10 PM, Chris Metcalf <cmetcalf@ezchip.com> wrote:
> On 1/25/2015 4:51 PM, H.J. Lu wrote:
>
>> This is for x86.  Each target can do something like it if needed.
>
>
> This does change the ABI for existing code, however, which may legitimately
> have sem_t objects that are only aligned to 4-byte boundaries.  Your change
> is a good idea for new platforms.  I suppose we could also do something
> with symbol versioning to use it for existing 32-bit platforms with
> 64-bit atomics, too.  So there seem to be multiple fixes we can consider.
>

It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
It may not work on all targets.  It works for x32.
  
Chris Metcalf Jan. 26, 2015, 12:46 a.m. UTC | #3
On 1/25/2015 6:05 PM, H.J. Lu wrote:
> On Sun, Jan 25, 2015 at 2:10 PM, Chris Metcalf <cmetcalf@ezchip.com> wrote:
>> On 1/25/2015 4:51 PM, H.J. Lu wrote:
>>
>>> This is for x86.  Each target can do something like it if needed.
>>
>> This does change the ABI for existing code, however, which may legitimately
>> have sem_t objects that are only aligned to 4-byte boundaries.  Your change
>> is a good idea for new platforms.  I suppose we could also do something
>> with symbol versioning to use it for existing 32-bit platforms with
>> 64-bit atomics, too.  So there seem to be multiple fixes we can consider.
>>
> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
> It may not work on all targets.  It works for x32.

The size is not the issue I'm referring to; it's the alignment.

Imagine you have some code built against glibc 2.20 or older, where the
user has written:

   sem_t my_sem;

Given the 4-byte alignment constraints, this could result in my_sem
being aligned to 4 bytes but not to 8 bytes.  Now if we were to make the
change you propose for glibc 2.21, the user passing this semaphore to
sem_post() could be passing a pointer not aligned to 8 bytes, despite
the new headers saying that that was required.  Bang, bus error -- or on x32,
presumably, worse performance even if it performs correctly.
  
H.J. Lu Jan. 26, 2015, 1:14 a.m. UTC | #4
On Sun, Jan 25, 2015 at 4:46 PM, Chris Metcalf <cmetcalf@ezchip.com> wrote:
> On 1/25/2015 6:05 PM, H.J. Lu wrote:
>>
>> On Sun, Jan 25, 2015 at 2:10 PM, Chris Metcalf <cmetcalf@ezchip.com>
>> wrote:
>>>
>>> On 1/25/2015 4:51 PM, H.J. Lu wrote:
>>>
>>>> This is for x86.  Each target can do something like it if needed.
>>>
>>>
>>> This does change the ABI for existing code, however, which may
>>> legitimately
>>> have sem_t objects that are only aligned to 4-byte boundaries.  Your
>>> change
>>> is a good idea for new platforms.  I suppose we could also do something
>>> with symbol versioning to use it for existing 32-bit platforms with
>>> 64-bit atomics, too.  So there seem to be multiple fixes we can consider.
>>>
>> It doesn't change the size, only increases alignment from 4 bytes to 8
>> bytes.
>> It may not work on all targets.  It works for x32.
>
>
> The size is not the issue I'm referring to; it's the alignment.
>
> Imagine you have some code built against glibc 2.20 or older, where the
> user has written:
>
>   sem_t my_sem;
>
> Given the 4-byte alignment constraints, this could result in my_sem
> being aligned to 4 bytes but not to 8 bytes.  Now if we were to make the
> change you propose for glibc 2.21, the user passing this semaphore to
> sem_post() could be passing a pointer not aligned to 8 bytes, despite
> the new headers saying that that was required.  Bang, bus error -- or on
> x32,
> presumably, worse performance even if it performs correctly.
>

Realign new_sem requires a few instructions.  For x32, the performance
gain may be a wash.  For x32, I prefer to align sem_t to 8 bytes.
  
Andreas Schwab Jan. 26, 2015, 9:24 a.m. UTC | #5
"H.J. Lu" <hjl.tools@gmail.com> writes:

> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.

Alignment is part of the ABI.

Andreas.
  
H.J. Lu Jan. 26, 2015, 12:44 p.m. UTC | #6
On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
> "H.J. Lu" <hjl.tools@gmail.com> writes:
>
>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>
> Alignment is part of the ABI.
>

For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
existing binaries.
  
Torvald Riegel Jan. 26, 2015, 12:50 p.m. UTC | #7
On Mon, 2015-01-26 at 04:44 -0800, H.J. Lu wrote:
> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
> > "H.J. Lu" <hjl.tools@gmail.com> writes:
> >
> >> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
> >
> > Alignment is part of the ABI.
> >
> 
> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
> existing binaries.

But you'd still have to do the alignment check at runtime, because you
can't expect the application to have used the new headers with proper
alignment.  Or are you saying that the alignment is just a performance
thing in the particular case of x32, because the atomics work either
way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
because then to_new_sem would resolve to a noop in x32 with your
alignment change applied.
  
H.J. Lu Jan. 26, 2015, 1:25 p.m. UTC | #8
On Mon, Jan 26, 2015 at 4:50 AM, Torvald Riegel <triegel@redhat.com> wrote:
> On Mon, 2015-01-26 at 04:44 -0800, H.J. Lu wrote:
>> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>> > "H.J. Lu" <hjl.tools@gmail.com> writes:
>> >
>> >> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>> >
>> > Alignment is part of the ABI.
>> >
>>
>> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>> existing binaries.
>
> But you'd still have to do the alignment check at runtime, because you
> can't expect the application to have used the new headers with proper
> alignment.  Or are you saying that the alignment is just a performance
> thing in the particular case of x32, because the atomics work either

Yes.

> way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
> because then to_new_sem would resolve to a noop in x32 with your
> alignment change applied.

I don't want a noop.  I want compiler not to generate a noop at all.
  
Torvald Riegel Jan. 26, 2015, 1:32 p.m. UTC | #9
On Mon, 2015-01-26 at 05:25 -0800, H.J. Lu wrote:
> On Mon, Jan 26, 2015 at 4:50 AM, Torvald Riegel <triegel@redhat.com> wrote:
> > way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
> > because then to_new_sem would resolve to a noop in x32 with your
> > alignment change applied.
> 
> I don't want a noop.  I want compiler not to generate a noop at all.

Yes, that's what I mean.  Compiler should figure out that no conversion
is necessary and not emit any code for this conversion.
  
H.J. Lu Jan. 26, 2015, 1:35 p.m. UTC | #10
On Mon, Jan 26, 2015 at 5:32 AM, Torvald Riegel <triegel@redhat.com> wrote:
> On Mon, 2015-01-26 at 05:25 -0800, H.J. Lu wrote:
>> On Mon, Jan 26, 2015 at 4:50 AM, Torvald Riegel <triegel@redhat.com> wrote:
>> > way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
>> > because then to_new_sem would resolve to a noop in x32 with your
>> > alignment change applied.
>>
>> I don't want a noop.  I want compiler not to generate a noop at all.
>
> Yes, that's what I mean.  Compiler should figure out that no conversion
> is necessary and not emit any code for this conversion.
>

If it is true, we should drop "&& !defined (_LP64)" in:

+#if __HAVE_64B_ATOMICS && !defined (_LP64)
+# define to_new_sem(s) ((struct new_sem *)(((uintptr_t)(s) + 4) & -8))
+#else
+# define to_new_sem(s) ((struct new_sem *)(s))
+#endif
  
Torvald Riegel Jan. 26, 2015, 1:39 p.m. UTC | #11
On Mon, 2015-01-26 at 05:35 -0800, H.J. Lu wrote:
> On Mon, Jan 26, 2015 at 5:32 AM, Torvald Riegel <triegel@redhat.com> wrote:
> > On Mon, 2015-01-26 at 05:25 -0800, H.J. Lu wrote:
> >> On Mon, Jan 26, 2015 at 4:50 AM, Torvald Riegel <triegel@redhat.com> wrote:
> >> > way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
> >> > because then to_new_sem would resolve to a noop in x32 with your
> >> > alignment change applied.
> >>
> >> I don't want a noop.  I want compiler not to generate a noop at all.
> >
> > Yes, that's what I mean.  Compiler should figure out that no conversion
> > is necessary and not emit any code for this conversion.
> >
> 
> If it is true, we should drop "&& !defined (_LP64)" in:
> 
> +#if __HAVE_64B_ATOMICS && !defined (_LP64)
> +# define to_new_sem(s) ((struct new_sem *)(((uintptr_t)(s) + 4) & -8))
> +#else
> +# define to_new_sem(s) ((struct new_sem *)(s))
> +#endif
> 

Maybe I wasn't clear.  I suggest to keep this, but use
__alignof__(sem_t) there in such a way that if sem_t is 8-byte aligned,
the compiler will figure our at compile time that to_new_sem(s) is
essentially just a cast.

This way, we can keep this code for all archs including tilegx, yet do
not have any overhead for the conversion on archs that have sufficient
alignment of sem_t (eg, x32 with your patch).

Sounds good?
  
H.J. Lu Jan. 26, 2015, 1:42 p.m. UTC | #12
On Mon, Jan 26, 2015 at 5:39 AM, Torvald Riegel <triegel@redhat.com> wrote:
> On Mon, 2015-01-26 at 05:35 -0800, H.J. Lu wrote:
>> On Mon, Jan 26, 2015 at 5:32 AM, Torvald Riegel <triegel@redhat.com> wrote:
>> > On Mon, 2015-01-26 at 05:25 -0800, H.J. Lu wrote:
>> >> On Mon, Jan 26, 2015 at 4:50 AM, Torvald Riegel <triegel@redhat.com> wrote:
>> >> > way?  If so, then using __alignof__(sem_t) in Chris' patch would be good
>> >> > because then to_new_sem would resolve to a noop in x32 with your
>> >> > alignment change applied.
>> >>
>> >> I don't want a noop.  I want compiler not to generate a noop at all.
>> >
>> > Yes, that's what I mean.  Compiler should figure out that no conversion
>> > is necessary and not emit any code for this conversion.
>> >
>>
>> If it is true, we should drop "&& !defined (_LP64)" in:
>>
>> +#if __HAVE_64B_ATOMICS && !defined (_LP64)
>> +# define to_new_sem(s) ((struct new_sem *)(((uintptr_t)(s) + 4) & -8))
>> +#else
>> +# define to_new_sem(s) ((struct new_sem *)(s))
>> +#endif
>>
>
> Maybe I wasn't clear.  I suggest to keep this, but use
> __alignof__(sem_t) there in such a way that if sem_t is 8-byte aligned,
> the compiler will figure our at compile time that to_new_sem(s) is
> essentially just a cast.
>
> This way, we can keep this code for all archs including tilegx, yet do
> not have any overhead for the conversion on archs that have sufficient
> alignment of sem_t (eg, x32 with your patch).
>
> Sounds good?
>

That sounds good to me.  Please include my sysdeps/x86/bits/semaphore.h
change when you make this change.

Thanks.
  
Andreas Schwab Jan. 26, 2015, 1:48 p.m. UTC | #13
"H.J. Lu" <hjl.tools@gmail.com> writes:

> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>
>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>>
>> Alignment is part of the ABI.
>>
>
> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
> existing binaries.

The compiler may generate code to take advantage of the bigger
alignment, which will fail if not fulfilled (this is not just about
unaligned accesses).

Andreas.
  
H.J. Lu Jan. 26, 2015, 1:50 p.m. UTC | #14
On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
> "H.J. Lu" <hjl.tools@gmail.com> writes:
>
>> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>
>>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>>>
>>> Alignment is part of the ABI.
>>>
>>
>> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>> existing binaries.
>
> The compiler may generate code to take advantage of the bigger
> alignment, which will fail if not fulfilled (this is not just about
> unaligned accesses).
>

Failure shouldn't happen on x32 in this case.
  
Torvald Riegel Jan. 26, 2015, 1:51 p.m. UTC | #15
On Mon, 2015-01-26 at 05:50 -0800, H.J. Lu wrote:
> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
> > "H.J. Lu" <hjl.tools@gmail.com> writes:
> >
> >> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
> >>> "H.J. Lu" <hjl.tools@gmail.com> writes:
> >>>
> >>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
> >>>
> >>> Alignment is part of the ABI.
> >>>
> >>
> >> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
> >> existing binaries.
> >
> > The compiler may generate code to take advantage of the bigger
> > alignment, which will fail if not fulfilled (this is not just about
> > unaligned accesses).
> >
> 
> Failure shouldn't happen on x32 in this case.
> 

Then please provide a *detailed* comment why this is the case along with
the alignment change in x86 semaphore.h.  Given that we're discussing
whether this is safe or not, I think we should have detailed
documentation.  And this will also help conclude the discussion.
  
H.J. Lu Jan. 26, 2015, 1:58 p.m. UTC | #16
On Mon, Jan 26, 2015 at 5:51 AM, Torvald Riegel <triegel@redhat.com> wrote:
> On Mon, 2015-01-26 at 05:50 -0800, H.J. Lu wrote:
>> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
>> > "H.J. Lu" <hjl.tools@gmail.com> writes:
>> >
>> >> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>> >>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>> >>>
>> >>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>> >>>
>> >>> Alignment is part of the ABI.
>> >>>
>> >>
>> >> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>> >> existing binaries.
>> >
>> > The compiler may generate code to take advantage of the bigger
>> > alignment, which will fail if not fulfilled (this is not just about
>> > unaligned accesses).
>> >
>>
>> Failure shouldn't happen on x32 in this case.
>>
>
> Then please provide a *detailed* comment why this is the case along with
> the alignment change in x86 semaphore.h.  Given that we're discussing
> whether this is safe or not, I think we should have detailed
> documentation.  And this will also help conclude the discussion.
>

This the part of x86 specification.  From vol 1 of x86 SDM:

4.1.1 Alignment of Words, Doublewords, Quadwords, and Double Quadwords

Words, doublewords, and quadwords do not need to be aligned in memory
on natural boundaries. The natural
boundaries for words, double words, and quadwords are even-numbered
addresses, addresses evenly divisible by
four, and addresses evenly divisible by eight, respectively. However,
to improve the performance of programs, data
structures (especially stacks) should be aligned on natural boundaries
whenever possible. The reason for this is
that the processor requires two memory accesses to make an unaligned
memory access; aligned accesses require
only one memory access. A word or doubleword operand that crosses a
4-byte boundary or a quadword operand
that crosses an 8-byte boundary is considered unaligned and requires
two separate memory bus cycles for access.

Do I really to quote this?
  
H.J. Lu Jan. 26, 2015, 2 p.m. UTC | #17
On Mon, Jan 26, 2015 at 5:58 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Jan 26, 2015 at 5:51 AM, Torvald Riegel <triegel@redhat.com> wrote:
>> On Mon, 2015-01-26 at 05:50 -0800, H.J. Lu wrote:
>>> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
>>> > "H.J. Lu" <hjl.tools@gmail.com> writes:
>>> >
>>> >> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>>> >>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>> >>>
>>> >>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>>> >>>
>>> >>> Alignment is part of the ABI.
>>> >>>
>>> >>
>>> >> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>>> >> existing binaries.
>>> >
>>> > The compiler may generate code to take advantage of the bigger
>>> > alignment, which will fail if not fulfilled (this is not just about
>>> > unaligned accesses).
>>> >
>>>
>>> Failure shouldn't happen on x32 in this case.
>>>
>>
>> Then please provide a *detailed* comment why this is the case along with
>> the alignment change in x86 semaphore.h.  Given that we're discussing
>> whether this is safe or not, I think we should have detailed
>> documentation.  And this will also help conclude the discussion.
>>
>
> This the part of x86 specification.  From vol 1 of x86 SDM:
>
> 4.1.1 Alignment of Words, Doublewords, Quadwords, and Double Quadwords
>
> Words, doublewords, and quadwords do not need to be aligned in memory
> on natural boundaries. The natural
> boundaries for words, double words, and quadwords are even-numbered
> addresses, addresses evenly divisible by
> four, and addresses evenly divisible by eight, respectively. However,
> to improve the performance of programs, data
> structures (especially stacks) should be aligned on natural boundaries
> whenever possible. The reason for this is
> that the processor requires two memory accesses to make an unaligned
> memory access; aligned accesses require
> only one memory access. A word or doubleword operand that crosses a
> 4-byte boundary or a quadword operand
> that crosses an 8-byte boundary is considered unaligned and requires
> two separate memory bus cycles for access.
>
> Do I really to quote this?
>

You can download x86 SDM from:

http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf
  
Torvald Riegel Jan. 26, 2015, 2:35 p.m. UTC | #18
On Mon, 2015-01-26 at 05:58 -0800, H.J. Lu wrote:
> On Mon, Jan 26, 2015 at 5:51 AM, Torvald Riegel <triegel@redhat.com> wrote:
> > On Mon, 2015-01-26 at 05:50 -0800, H.J. Lu wrote:
> >> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
> >> > "H.J. Lu" <hjl.tools@gmail.com> writes:
> >> >
> >> >> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
> >> >>> "H.J. Lu" <hjl.tools@gmail.com> writes:
> >> >>>
> >> >>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
> >> >>>
> >> >>> Alignment is part of the ABI.
> >> >>>
> >> >>
> >> >> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
> >> >> existing binaries.
> >> >
> >> > The compiler may generate code to take advantage of the bigger
> >> > alignment, which will fail if not fulfilled (this is not just about
> >> > unaligned accesses).
> >> >
> >>
> >> Failure shouldn't happen on x32 in this case.
> >>
> >
> > Then please provide a *detailed* comment why this is the case along with
> > the alignment change in x86 semaphore.h.  Given that we're discussing
> > whether this is safe or not, I think we should have detailed
> > documentation.  And this will also help conclude the discussion.
> >
> 
> This the part of x86 specification.  From vol 1 of x86 SDM:
> 
> 4.1.1 Alignment of Words, Doublewords, Quadwords, and Double Quadwords
> 
> Words, doublewords, and quadwords do not need to be aligned in memory
> on natural boundaries. The natural
> boundaries for words, double words, and quadwords are even-numbered
> addresses, addresses evenly divisible by
> four, and addresses evenly divisible by eight, respectively. However,
> to improve the performance of programs, data
> structures (especially stacks) should be aligned on natural boundaries
> whenever possible. The reason for this is
> that the processor requires two memory accesses to make an unaligned
> memory access; aligned accesses require
> only one memory access. A word or doubleword operand that crosses a
> 4-byte boundary or a quadword operand
> that crosses an 8-byte boundary is considered unaligned and requires
> two separate memory bus cycles for access.
> 
> Do I really to quote this?
> 

It made an experienced glibc developer wonder whether it would work, so
personally, I think a comment in the code is useful.  If we need to
exchange emails about this, it's likely that others will wonder too, so
a comment doesn't hurt.
For example, I believe that if such a comment would have been part of
the patch, we wouldn't need to have to discuss this, because it would
have been clear from the start.

IMO, we should want to make glibc simple instead of creating puzzles :)

For the comment, I think that you could point to just this section, or
summarize in a way that clarifies that unaligned is not an issue for
x32, or something like that.
  
H.J. Lu Jan. 26, 2015, 2:38 p.m. UTC | #19
On Mon, Jan 26, 2015 at 6:35 AM, Torvald Riegel <triegel@redhat.com> wrote:
> On Mon, 2015-01-26 at 05:58 -0800, H.J. Lu wrote:
>> On Mon, Jan 26, 2015 at 5:51 AM, Torvald Riegel <triegel@redhat.com> wrote:
>> > On Mon, 2015-01-26 at 05:50 -0800, H.J. Lu wrote:
>> >> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
>> >> > "H.J. Lu" <hjl.tools@gmail.com> writes:
>> >> >
>> >> >> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>> >> >>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>> >> >>>
>> >> >>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>> >> >>>
>> >> >>> Alignment is part of the ABI.
>> >> >>>
>> >> >>
>> >> >> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>> >> >> existing binaries.
>> >> >
>> >> > The compiler may generate code to take advantage of the bigger
>> >> > alignment, which will fail if not fulfilled (this is not just about
>> >> > unaligned accesses).
>> >> >
>> >>
>> >> Failure shouldn't happen on x32 in this case.
>> >>
>> >
>> > Then please provide a *detailed* comment why this is the case along with
>> > the alignment change in x86 semaphore.h.  Given that we're discussing
>> > whether this is safe or not, I think we should have detailed
>> > documentation.  And this will also help conclude the discussion.
>> >
>>
>> This the part of x86 specification.  From vol 1 of x86 SDM:
>>
>> 4.1.1 Alignment of Words, Doublewords, Quadwords, and Double Quadwords
>>
>> Words, doublewords, and quadwords do not need to be aligned in memory
>> on natural boundaries. The natural
>> boundaries for words, double words, and quadwords are even-numbered
>> addresses, addresses evenly divisible by
>> four, and addresses evenly divisible by eight, respectively. However,
>> to improve the performance of programs, data
>> structures (especially stacks) should be aligned on natural boundaries
>> whenever possible. The reason for this is
>> that the processor requires two memory accesses to make an unaligned
>> memory access; aligned accesses require
>> only one memory access. A word or doubleword operand that crosses a
>> 4-byte boundary or a quadword operand
>> that crosses an 8-byte boundary is considered unaligned and requires
>> two separate memory bus cycles for access.
>>
>> Do I really to quote this?
>>
>
> It made an experienced glibc developer wonder whether it would work, so
> personally, I think a comment in the code is useful.  If we need to
> exchange emails about this, it's likely that others will wonder too, so
> a comment doesn't hurt.
> For example, I believe that if such a comment would have been part of
> the patch, we wouldn't need to have to discuss this, because it would
> have been clear from the start.
>
> IMO, we should want to make glibc simple instead of creating puzzles :)
>
> For the comment, I think that you could point to just this section, or
> summarize in a way that clarifies that unaligned is not an issue for
> x32, or something like that.
>

I will prepare a patch to include some comments.
  
Andreas Schwab Jan. 26, 2015, 2:51 p.m. UTC | #20
"H.J. Lu" <hjl.tools@gmail.com> writes:

> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>
>>> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>>
>>>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>>>>
>>>> Alignment is part of the ABI.
>>>>
>>>
>>> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>>> existing binaries.
>>
>> The compiler may generate code to take advantage of the bigger
>> alignment, which will fail if not fulfilled (this is not just about
>> unaligned accesses).
>>
>
> Failure shouldn't happen on x32 in this case.

Which part of "this is not just about unaligned accesses" didn't you
understand?

Andreas.
  
H.J. Lu Jan. 26, 2015, 2:53 p.m. UTC | #21
On Mon, Jan 26, 2015 at 6:51 AM, Andreas Schwab <schwab@suse.de> wrote:
> "H.J. Lu" <hjl.tools@gmail.com> writes:
>
>> On Mon, Jan 26, 2015 at 5:48 AM, Andreas Schwab <schwab@suse.de> wrote:
>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>
>>>> On Mon, Jan 26, 2015 at 1:24 AM, Andreas Schwab <schwab@suse.de> wrote:
>>>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>>>
>>>>>> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
>>>>>
>>>>> Alignment is part of the ABI.
>>>>>
>>>>
>>>> For x32, we can increase alignment from 4 bytes to 8 bytes without breaking
>>>> existing binaries.
>>>
>>> The compiler may generate code to take advantage of the bigger
>>> alignment, which will fail if not fulfilled (this is not just about
>>> unaligned accesses).
>>>
>>
>> Failure shouldn't happen on x32 in this case.
>
> Which part of "this is not just about unaligned accesses" didn't you
> understand?
>

Please elaborate.  An example/testcase will be more helpful.

Thanks.
  
Andreas Schwab Jan. 26, 2015, 3:03 p.m. UTC | #22
The compiler can simplify operations based on the known alignment of
objects.  This is like any other undefined behaviour.

Andreas.
  
Torvald Riegel Jan. 26, 2015, 3:33 p.m. UTC | #23
On Mon, 2015-01-26 at 16:03 +0100, Andreas Schwab wrote:
> The compiler can simplify operations based on the known alignment of
> objects.  This is like any other undefined behaviour.

Is your concern regarding glibc code (ie, sem_*()) or code outside
glibc?

For glibc, the accesses which might not be aligned as promised are to
either the private field, or in the form of atomic operations.  Do you
seem examples of badness happening in both cases, or is this more a case
of not being able to prove absence of badness (which would be a valid
concern too).

For code outside of glibc, I'm not sure something can actually happen.
If an application picks up the new semaphore.h, sem_t will be
8B-aligned.  If not, then not.  I don't think this can be partially
8B/4B-aligned, or can it?
  
Andreas Schwab Jan. 26, 2015, 3:52 p.m. UTC | #24
Torvald Riegel <triegel@redhat.com> writes:

> On Mon, 2015-01-26 at 16:03 +0100, Andreas Schwab wrote:
>> The compiler can simplify operations based on the known alignment of
>> objects.  This is like any other undefined behaviour.
>
> Is your concern regarding glibc code (ie, sem_*()) or code outside
> glibc?

It doesn't matter.  If you have a type with 8 byte alignment, but the
object has only 4 byte alignment you get undefined behaviour.

> For code outside of glibc, I'm not sure something can actually happen.
> If an application picks up the new semaphore.h, sem_t will be
> 8B-aligned.  If not, then not.  I don't think this can be partially
> 8B/4B-aligned, or can it?

You are changing the ABI.  Anything can happen.

Andreas.
  
Torvald Riegel Jan. 26, 2015, 4:14 p.m. UTC | #25
[Note that I don't have a strong opinion in either direction regarding
this.  I just want to make sure there's no misunderstanding here and
everyone knows why we decide either way (and agrees with the decision).]

On Mon, 2015-01-26 at 16:52 +0100, Andreas Schwab wrote:
> Torvald Riegel <triegel@redhat.com> writes:
> 
> > On Mon, 2015-01-26 at 16:03 +0100, Andreas Schwab wrote:
> >> The compiler can simplify operations based on the known alignment of
> >> objects.  This is like any other undefined behaviour.
> >
> > Is your concern regarding glibc code (ie, sem_*()) or code outside
> > glibc?
> 
> It doesn't matter.  If you have a type with 8 byte alignment, but the
> object has only 4 byte alignment you get undefined behaviour.

Well, but we control the glibc code and to some extent it's compilation
(if we assume no LTO, at least).  It seems HJ is claiming that while his
change could be undefined behavior, it is harmless for x32 with the
compiler we currently support.

> > For code outside of glibc, I'm not sure something can actually happen.
> > If an application picks up the new semaphore.h, sem_t will be
> > 8B-aligned.  If not, then not.  I don't think this can be partially
> > 8B/4B-aligned, or can it?
> 
> You are changing the ABI.  Anything can happen.

I agree we change the ABI.  Just to clarify though, your concern is thus
specifically due to ABI changes of sem_t and what they would do in user
programs (e.g., if a sem_t is part of another struct whose alignment
changes in return) -- and not regarding how glibc code could fail if
presented with a non-8B-aligned sem_t?
  
Andreas Schwab Jan. 26, 2015, 4:18 p.m. UTC | #26
Torvald Riegel <triegel@redhat.com> writes:

> I agree we change the ABI.  Just to clarify though, your concern is thus
> specifically due to ABI changes of sem_t and what they would do in user
> programs (e.g., if a sem_t is part of another struct whose alignment
> changes in return) -- and not regarding how glibc code could fail if
> presented with a non-8B-aligned sem_t?

Both can fail.  Just because it doesn't fail today doesn't mean it won't
bite you in the future.

Andreas.
  
H.J. Lu Jan. 26, 2015, 4:21 p.m. UTC | #27
On Mon, Jan 26, 2015 at 8:18 AM, Andreas Schwab <schwab@suse.de> wrote:
> Torvald Riegel <triegel@redhat.com> writes:
>
>> I agree we change the ABI.  Just to clarify though, your concern is thus
>> specifically due to ABI changes of sem_t and what they would do in user
>> programs (e.g., if a sem_t is part of another struct whose alignment
>> changes in return) -- and not regarding how glibc code could fail if
>> presented with a non-8B-aligned sem_t?
>
> Both can fail.  Just because it doesn't fail today doesn't mean it won't
> bite you in the future.
>

The problem is

struct foo
{
  int foo;
  semt_t sem;
  int bar;
}

sem/bar have different offsets and struct foo has different sizes when
alignment of sem_t is is changed.  I withdrew my change to sem_t.
  
Torvald Riegel Jan. 26, 2015, 4:26 p.m. UTC | #28
On Mon, 2015-01-26 at 17:18 +0100, Andreas Schwab wrote:
> Torvald Riegel <triegel@redhat.com> writes:
> 
> > I agree we change the ABI.  Just to clarify though, your concern is thus
> > specifically due to ABI changes of sem_t and what they would do in user
> > programs (e.g., if a sem_t is part of another struct whose alignment
> > changes in return) -- and not regarding how glibc code could fail if
> > presented with a non-8B-aligned sem_t?
> 
> Both can fail.  Just because it doesn't fail today doesn't mean it won't
> bite you in the future.

IMHO this conversation would benefit from some more verbosity in your
replies.  It seems you and HJ disagree, so the only way I see to solve
this is to actually talk about it.

Anyway, I'll let you and HJ hash this out.  In the end, this is
primarily about whether x32's sem_t uses a bigger alignment or not.
  
Carlos O'Donell Jan. 27, 2015, 4:17 a.m. UTC | #29
On 01/25/2015 06:05 PM, H.J. Lu wrote:
> On Sun, Jan 25, 2015 at 2:10 PM, Chris Metcalf <cmetcalf@ezchip.com> wrote:
>> On 1/25/2015 4:51 PM, H.J. Lu wrote:
>>
>>> This is for x86.  Each target can do something like it if needed.
>>
>>
>> This does change the ABI for existing code, however, which may legitimately
>> have sem_t objects that are only aligned to 4-byte boundaries.  Your change
>> is a good idea for new platforms.  I suppose we could also do something
>> with symbol versioning to use it for existing 32-bit platforms with
>> 64-bit atomics, too.  So there seem to be multiple fixes we can consider.
>>
> 
> It doesn't change the size, only increases alignment from 4 bytes to 8 bytes.
> It may not work on all targets.  It works for x32.

How can you change the alignment?

That will break applications that embedded a sem_t in other structures by
changing the size of those structures as they expand to account for the
changed object alignment?

Cheers,
Carlos.
  

Patch

diff --git a/sysdeps/x86/bits/semaphore.h b/sysdeps/x86/bits/semaphore.h
index 18b2b3c..4890e0d 100644
--- a/sysdeps/x86/bits/semaphore.h
+++ b/sysdeps/x86/bits/semaphore.h
@@ -28,6 +28,11 @@ 
 # define __SIZEOF_SEM_T	16
 #endif
 
+#ifdef __x86_64__
+# define __SEM_ALIGN_T	long long int
+#else
+# define __SEM_ALIGN_T	long int
+#endif
 
 /* Value returned if `sem_open' failed.  */
 #define SEM_FAILED      ((sem_t *) 0)
@@ -36,5 +41,5 @@ 
 typedef union
 {
   char __size[__SIZEOF_SEM_T];
-  long int __align;
+  __SEM_ALIGN_T __align;
 } sem_t;