Initialize the entire obstack struct [BZ #17919]

Message ID 20150203145649.GG1528@spoyarek.pnq.redhat.com
State Dropped
Headers

Commit Message

Siddhesh Poyarekar Feb. 3, 2015, 2:56 p.m. UTC
  Hi,

obstack_init does not completely initialize the obstack structure; it
leaves out the padding bits and valgrind complains about it on s390x:

==15793== Memcheck, a memory error detector
==15793== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==15793== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==15793== Command: /root/obstack
==15793==
==15793== Conditional jump or move depends on uninitialised value(s)
==15793==    at 0x403E48CA4E: obstack_free (in /lib64/libc-2.12.so)
==15793==    by 0x8000072D: main (obstack.c:12)
==15793==
==15793==
==15793== HEAP SUMMARY:
==15793==     in use at exit: 0 bytes in 0 blocks
==15793==   total heap usage: 1 allocs, 1 frees, 4,064 bytes allocated
==15793==
==15793== All heap blocks were freed -- no leaks are possible
==15793==
==15793== For counts of detected and suppressed errors, rerun with: -v
==15793== Use --track-origins=yes to see where uninitialised values come from
==15793== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)


The fix below (against gnulib, but is identical for glibc) initializes
all of the obstack struct at once.  Verified that the valgrind warning
is fixed.  OK for 2.22 and gnulib?

Siddhesh

ChangeLog for gnulib:

	obstack: Initialize whole obstack structure.
	* lib/obstack.c (_obstack_begin): Initialize all of H.

ChangeLog for glibc:

	[BZ #17919]
	* malloc/obstack.c (_obstack_begin): Initialize all of H.
  

Comments

Siddhesh Poyarekar Feb. 3, 2015, 3 p.m. UTC | #1
... and I forgot to add bug-gnulib to cc before I hit send.

Siddhesh

On Tue, Feb 03, 2015 at 08:26:49PM +0530, Siddhesh Poyarekar wrote:
> Hi,
> 
> obstack_init does not completely initialize the obstack structure; it
> leaves out the padding bits and valgrind complains about it on s390x:
> 
> ==15793== Memcheck, a memory error detector
> ==15793== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==15793== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> ==15793== Command: /root/obstack
> ==15793==
> ==15793== Conditional jump or move depends on uninitialised value(s)
> ==15793==    at 0x403E48CA4E: obstack_free (in /lib64/libc-2.12.so)
> ==15793==    by 0x8000072D: main (obstack.c:12)
> ==15793==
> ==15793==
> ==15793== HEAP SUMMARY:
> ==15793==     in use at exit: 0 bytes in 0 blocks
> ==15793==   total heap usage: 1 allocs, 1 frees, 4,064 bytes allocated
> ==15793==
> ==15793== All heap blocks were freed -- no leaks are possible
> ==15793==
> ==15793== For counts of detected and suppressed errors, rerun with: -v
> ==15793== Use --track-origins=yes to see where uninitialised values come from
> ==15793== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
> 
> 
> The fix below (against gnulib, but is identical for glibc) initializes
> all of the obstack struct at once.  Verified that the valgrind warning
> is fixed.  OK for 2.22 and gnulib?
> 
> Siddhesh
> 
> ChangeLog for gnulib:
> 
> 	obstack: Initialize whole obstack structure.
> 	* lib/obstack.c (_obstack_begin): Initialize all of H.
> 
> ChangeLog for glibc:
> 
> 	[BZ #17919]
> 	* malloc/obstack.c (_obstack_begin): Initialize all of H.
> 
> diff --git a/malloc/obstack.c b/malloc/obstack.c
> index 5bb3f0d..c1d6ded 100644
> --- a/lib/obstack.c
> +++ b/lib/obstack.c
> @@ -148,6 +148,8 @@ _obstack_begin (struct obstack *h,
>  {
>    struct _obstack_chunk *chunk; /* points to new chunk */
>  
> +  memset (h, 0, sizeof (struct obstack));
> +
>    if (alignment == 0)
>      alignment = DEFAULT_ALIGNMENT;
>    if (size == 0)
  
H.J. Lu Feb. 3, 2015, 3:38 p.m. UTC | #2
On Tue, Feb 3, 2015 at 7:00 AM, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
> ... and I forgot to add bug-gnulib to cc before I hit send.
>
> Siddhesh
>
> On Tue, Feb 03, 2015 at 08:26:49PM +0530, Siddhesh Poyarekar wrote:
>> Hi,
>>
>> obstack_init does not completely initialize the obstack structure; it
>> leaves out the padding bits and valgrind complains about it on s390x:
>>
>> ==15793== Memcheck, a memory error detector
>> ==15793== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>> ==15793== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>> ==15793== Command: /root/obstack
>> ==15793==
>> ==15793== Conditional jump or move depends on uninitialised value(s)
>> ==15793==    at 0x403E48CA4E: obstack_free (in /lib64/libc-2.12.so)
>> ==15793==    by 0x8000072D: main (obstack.c:12)
>> ==15793==
>> ==15793==
>> ==15793== HEAP SUMMARY:
>> ==15793==     in use at exit: 0 bytes in 0 blocks
>> ==15793==   total heap usage: 1 allocs, 1 frees, 4,064 bytes allocated
>> ==15793==
>> ==15793== All heap blocks were freed -- no leaks are possible
>> ==15793==
>> ==15793== For counts of detected and suppressed errors, rerun with: -v
>> ==15793== Use --track-origins=yes to see where uninitialised values come from
>> ==15793== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
>>
>>
>> The fix below (against gnulib, but is identical for glibc) initializes
>> all of the obstack struct at once.  Verified that the valgrind warning
>> is fixed.  OK for 2.22 and gnulib?
>>
>> Siddhesh
>>
>> ChangeLog for gnulib:
>>
>>       obstack: Initialize whole obstack structure.
>>       * lib/obstack.c (_obstack_begin): Initialize all of H.
>>
>> ChangeLog for glibc:
>>
>>       [BZ #17919]
>>       * malloc/obstack.c (_obstack_begin): Initialize all of H.
>>
>> diff --git a/malloc/obstack.c b/malloc/obstack.c
>> index 5bb3f0d..c1d6ded 100644
>> --- a/lib/obstack.c
>> +++ b/lib/obstack.c
>> @@ -148,6 +148,8 @@ _obstack_begin (struct obstack *h,
>>  {
>>    struct _obstack_chunk *chunk; /* points to new chunk */
>>
>> +  memset (h, 0, sizeof (struct obstack));
>> +
>>    if (alignment == 0)
>>      alignment = DEFAULT_ALIGNMENT;
>>    if (size == 0)
>
>

I think you should also remove

h->use_extra_arg = 0;
  
H.J. Lu Feb. 3, 2015, 3:38 p.m. UTC | #3
On Tue, Feb 3, 2015 at 7:38 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Feb 3, 2015 at 7:00 AM, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
>> ... and I forgot to add bug-gnulib to cc before I hit send.
>>
>> Siddhesh
>>
>> On Tue, Feb 03, 2015 at 08:26:49PM +0530, Siddhesh Poyarekar wrote:
>>> Hi,
>>>
>>> obstack_init does not completely initialize the obstack structure; it
>>> leaves out the padding bits and valgrind complains about it on s390x:
>>>
>>> ==15793== Memcheck, a memory error detector
>>> ==15793== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>>> ==15793== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>>> ==15793== Command: /root/obstack
>>> ==15793==
>>> ==15793== Conditional jump or move depends on uninitialised value(s)
>>> ==15793==    at 0x403E48CA4E: obstack_free (in /lib64/libc-2.12.so)
>>> ==15793==    by 0x8000072D: main (obstack.c:12)
>>> ==15793==
>>> ==15793==
>>> ==15793== HEAP SUMMARY:
>>> ==15793==     in use at exit: 0 bytes in 0 blocks
>>> ==15793==   total heap usage: 1 allocs, 1 frees, 4,064 bytes allocated
>>> ==15793==
>>> ==15793== All heap blocks were freed -- no leaks are possible
>>> ==15793==
>>> ==15793== For counts of detected and suppressed errors, rerun with: -v
>>> ==15793== Use --track-origins=yes to see where uninitialised values come from
>>> ==15793== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
>>>
>>>
>>> The fix below (against gnulib, but is identical for glibc) initializes
>>> all of the obstack struct at once.  Verified that the valgrind warning
>>> is fixed.  OK for 2.22 and gnulib?
>>>
>>> Siddhesh
>>>
>>> ChangeLog for gnulib:
>>>
>>>       obstack: Initialize whole obstack structure.
>>>       * lib/obstack.c (_obstack_begin): Initialize all of H.
>>>
>>> ChangeLog for glibc:
>>>
>>>       [BZ #17919]
>>>       * malloc/obstack.c (_obstack_begin): Initialize all of H.
>>>
>>> diff --git a/malloc/obstack.c b/malloc/obstack.c
>>> index 5bb3f0d..c1d6ded 100644
>>> --- a/lib/obstack.c
>>> +++ b/lib/obstack.c
>>> @@ -148,6 +148,8 @@ _obstack_begin (struct obstack *h,
>>>  {
>>>    struct _obstack_chunk *chunk; /* points to new chunk */
>>>
>>> +  memset (h, 0, sizeof (struct obstack));
>>> +
>>>    if (alignment == 0)
>>>      alignment = DEFAULT_ALIGNMENT;
>>>    if (size == 0)
>>
>>
>
> I think you should also remove
>
> h->use_extra_arg = 0;
>

And

  /* The initial chunk now contains no empty object.  */
  h->maybe_empty_object = 0;
  h->alloc_failed = 0;
  
Paul Eggert Feb. 3, 2015, 4:08 p.m. UTC | #4
Does this patch fix any actual bug?  If so, what's the bug?  Is there a 
faster way to fix it, other than zeroing out the entire struct obstack?
  
Siddhesh Poyarekar Feb. 3, 2015, 4:25 p.m. UTC | #5
On Tue, Feb 03, 2015 at 08:08:54AM -0800, Paul Eggert wrote:
> Does this patch fix any actual bug?  If so, what's the bug?  Is there a
> faster way to fix it, other than zeroing out the entire struct obstack?

It fixes a valgrind warning on s390x, but if by bug you mean a user
visible problem, then no, there is no such thing.  I can't think of a
faster way that is not uglier, which is why I took the simpler, more
readable approach.

Siddhesh
  
Carlos O'Donell Feb. 3, 2015, 4:30 p.m. UTC | #6
On 02/03/2015 09:56 AM, Siddhesh Poyarekar wrote:
> Hi,
> 
> obstack_init does not completely initialize the obstack structure; it
> leaves out the padding bits and valgrind complains about it on s390x:
> 
> ==15793== Memcheck, a memory error detector
> ==15793== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==15793== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> ==15793== Command: /root/obstack
> ==15793==
> ==15793== Conditional jump or move depends on uninitialised value(s)
> ==15793==    at 0x403E48CA4E: obstack_free (in /lib64/libc-2.12.so)
> ==15793==    by 0x8000072D: main (obstack.c:12)
> ==15793==
> ==15793==
> ==15793== HEAP SUMMARY:
> ==15793==     in use at exit: 0 bytes in 0 blocks
> ==15793==   total heap usage: 1 allocs, 1 frees, 4,064 bytes allocated
> ==15793==
> ==15793== All heap blocks were freed -- no leaks are possible
> ==15793==
> ==15793== For counts of detected and suppressed errors, rerun with: -v
> ==15793== Use --track-origins=yes to see where uninitialised values come from
> ==15793== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)

Is this a valgrind bug, false positive in valgrind, or bug in glibc?

It clearly says we have a move that depends on an uninitizlied value,
so something read the value and tried to do something with it.

Cheers,
Carlos.
  
Siddhesh Poyarekar Feb. 3, 2015, 4:38 p.m. UTC | #7
[Also looping in bug-gnulib]

On Tue, Feb 03, 2015 at 11:30:17AM -0500, Carlos O'Donell wrote:
> Is this a valgrind bug, false positive in valgrind, or bug in glibc?
> 
> It clearly says we have a move that depends on an uninitizlied value,
> so something read the value and tried to do something with it.

It is a combination of multiple things actually.  The uninitialized
value here is the padding in the structure and we have fixed such
warnings in the past; see the nscd stats warning[1] for example.

This warning doesn't trigger on x86 though due to the code that gcc
generates for it - it explicitly ANDs the bit field in the struct
before testing it.  On s390, the optimizer does away with the AND
operation and tests the entire thing for some reason - I haven't
figured out why but the instruction combiner RTL pass does away with
the AND operation.

This is also why I started out filing a gcc bug, but then changed my
mind and fixed it in glibc instead, since I presumed that the compiler
can assume that the padding is also appropriately initialized.  But
then, I am not completely sure if the compiler is allowed to make that
assumption.

Siddhesh
  
Andreas Schwab Feb. 3, 2015, 4:46 p.m. UTC | #8
Siddhesh Poyarekar <siddhesh@redhat.com> writes:

> This is also why I started out filing a gcc bug, but then changed my
> mind and fixed it in glibc instead, since I presumed that the compiler
> can assume that the padding is also appropriately initialized.

No, it can't.  Padding must be assumed to be random.

Andreas.
  
Jim Meyering Feb. 3, 2015, 4:47 p.m. UTC | #9
On Tue, Feb 3, 2015 at 8:08 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Does this patch fix any actual bug?  If so, what's the bug?  Is there a
> faster way to fix it, other than zeroing out the entire struct obstack?

I believe that this addresses the MSAN used-uninitialized warning
I explored privately a few months ago. Assuming it's the same one,
initializing the entire struct obstack is not necessary.
  
Carlos O'Donell Feb. 3, 2015, 4:49 p.m. UTC | #10
On 02/03/2015 11:46 AM, Andreas Schwab wrote:
> Siddhesh Poyarekar <siddhesh@redhat.com> writes:
> 
>> This is also why I started out filing a gcc bug, but then changed my
>> mind and fixed it in glibc instead, since I presumed that the compiler
>> can assume that the padding is also appropriately initialized.
> 
> No, it can't.  Padding must be assumed to be random.

Fully agreed.

Cheers,
Carlos.
  
Carlos O'Donell Feb. 3, 2015, 4:49 p.m. UTC | #11
On 02/03/2015 11:38 AM, Siddhesh Poyarekar wrote:
> [Also looping in bug-gnulib]
> 
> On Tue, Feb 03, 2015 at 11:30:17AM -0500, Carlos O'Donell wrote:
>> Is this a valgrind bug, false positive in valgrind, or bug in glibc?
>>
>> It clearly says we have a move that depends on an uninitizlied value,
>> so something read the value and tried to do something with it.
> 
> It is a combination of multiple things actually.  The uninitialized
> value here is the padding in the structure and we have fixed such
> warnings in the past; see the nscd stats warning[1] for example.

That case was not as performance sensitive as creating and using
obstacks.
 
> This warning doesn't trigger on x86 though due to the code that gcc
> generates for it - it explicitly ANDs the bit field in the struct
> before testing it.  On s390, the optimizer does away with the AND
> operation and tests the entire thing for some reason - I haven't
> figured out why but the instruction combiner RTL pass does away with
> the AND operation.
> 
> This is also why I started out filing a gcc bug, but then changed my
> mind and fixed it in glibc instead, since I presumed that the compiler
> can assume that the padding is also appropriately initialized.  But
> then, I am not completely sure if the compiler is allowed to make that
> assumption.

Thanks for the clarification.

IMO zero-initialized padding, for this case, isn't something you can
expect. Therefore I think it's a compiler bug.

I think it's OK to work around this in glibc, but it needs a comment
with a reference to the filed gcc bug. I do not think it is valid
for gcc on s390x to use the entire bit field along with padding and
I believe it could result in incorrect operation.

Cheers,
Carlos.
  
Siddhesh Poyarekar Feb. 3, 2015, 5:05 p.m. UTC | #12
On Tue, Feb 03, 2015 at 11:49:26AM -0500, Carlos O'Donell wrote:
> IMO zero-initialized padding, for this case, isn't something you can
> expect. Therefore I think it's a compiler bug.

Thanks, I've filed a bug now:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64923

> I think it's OK to work around this in glibc, but it needs a comment
> with a reference to the filed gcc bug. I do not think it is valid
> for gcc on s390x to use the entire bit field along with padding and
> I believe it could result in incorrect operation.

Nothing is breaking due to this right now, so we could probably wait
and see what the gcc folks think of this.

Siddhesh
  
Andrew Pinski Feb. 3, 2015, 5:10 p.m. UTC | #13
> On Feb 3, 2015, at 9:05 AM, Siddhesh Poyarekar <siddhesh@redhat.com> wrote:
> 
>> On Tue, Feb 03, 2015 at 11:49:26AM -0500, Carlos O'Donell wrote:
>> IMO zero-initialized padding, for this case, isn't something you can
>> expect. Therefore I think it's a compiler bug.
> 
> Thanks, I've filed a bug now:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64923

It is not a glibc or a gcc bug but rather a valgrind one. We are comparing against zero since this is the sign bit. Valgrind does not realize that and gives a warning. 

Thanks,
Andrew

> 
>> I think it's OK to work around this in glibc, but it needs a comment
>> with a reference to the filed gcc bug. I do not think it is valid
>> for gcc on s390x to use the entire bit field along with padding and
>> I believe it could result in incorrect operation.
> 
> Nothing is breaking due to this right now, so we could probably wait
> and see what the gcc folks think of this.
> 
> Siddhesh
  
Siddhesh Poyarekar Feb. 3, 2015, 5:26 p.m. UTC | #14
On Tue, Feb 03, 2015 at 09:10:50AM -0800, pinskia@gmail.com wrote:
> It is not a glibc or a gcc bug but rather a valgrind one. We are
> comparing against zero since this is the sign bit. Valgrind does not
> realize that and gives a warning.

Of course, thanks for responding.

Siddhesh
  
Carlos O'Donell Feb. 3, 2015, 6:14 p.m. UTC | #15
On 02/03/2015 12:05 PM, Siddhesh Poyarekar wrote:
> On Tue, Feb 03, 2015 at 11:49:26AM -0500, Carlos O'Donell wrote:
>> IMO zero-initialized padding, for this case, isn't something you can
>> expect. Therefore I think it's a compiler bug.
> 
> Thanks, I've filed a bug now:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64923
> 
>> I think it's OK to work around this in glibc, but it needs a comment
>> with a reference to the filed gcc bug. I do not think it is valid
>> for gcc on s390x to use the entire bit field along with padding and
>> I believe it could result in incorrect operation.
> 
> Nothing is breaking due to this right now, so we could probably wait
> and see what the gcc folks think of this.

I would check it into 2.22 and reference the GCC PR.

However, I see that GCC thinks this is a valgrind bug.

If valgrind is simply looking at the comparison to make
the warning then it falls into the 'false positive' category.
In which case I think Valgrind should set up an exception for
this warning on s390.

Cheers,
Carlos.
  
Carlos O'Donell Feb. 3, 2015, 6:14 p.m. UTC | #16
On 02/03/2015 01:14 PM, Carlos O'Donell wrote:
> On 02/03/2015 12:05 PM, Siddhesh Poyarekar wrote:
>> On Tue, Feb 03, 2015 at 11:49:26AM -0500, Carlos O'Donell wrote:
>>> IMO zero-initialized padding, for this case, isn't something you can
>>> expect. Therefore I think it's a compiler bug.
>>
>> Thanks, I've filed a bug now:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64923
>>
>>> I think it's OK to work around this in glibc, but it needs a comment
>>> with a reference to the filed gcc bug. I do not think it is valid
>>> for gcc on s390x to use the entire bit field along with padding and
>>> I believe it could result in incorrect operation.
>>
>> Nothing is breaking due to this right now, so we could probably wait
>> and see what the gcc folks think of this.
> 
> I would check it into 2.22 and reference the GCC PR.
> 
> However, I see that GCC thinks this is a valgrind bug.
> 
> If valgrind is simply looking at the comparison to make
> the warning then it falls into the 'false positive' category.
> In which case I think Valgrind should set up an exception for
> this warning on s390.

To be clear, I think nothing needs to be done now except file
an upstream valgrind PR.

c.
  
Mark Wielaard Feb. 3, 2015, 11:25 p.m. UTC | #17
On Tue, 2015-02-03 at 13:14 -0500, Carlos O'Donell wrote:
> On 02/03/2015 01:14 PM, Carlos O'Donell wrote:
> > However, I see that GCC thinks this is a valgrind bug.
> > 
> > If valgrind is simply looking at the comparison to make
> > the warning then it falls into the 'false positive' category.
> > In which case I think Valgrind should set up an exception for
> > this warning on s390.
> 
> To be clear, I think nothing needs to be done now except file
> an upstream valgrind PR.

Yes please:
https://bugs.kde.org/enter_bug.cgi?product=valgrind&format=guided

This sounds a bit like https://bugs.kde.org/show_bug.cgi?id=308427 "s390
memcheck reports tsearch conditional jump or move depends on
uninitialized value". But that bug got fixed some time ago and should
not be in the latest version of valgrind. It does contain a nice
description of how best to report such an issue.

Just run under valgrind with --vgdb-error=0 and then attach with gdb,
through target remote | vgdb. Please include the disassembly output from
gdb around the code that produces the warning. And if possible use the
valgrind gdb monitor commands to figure out which bits valgrind thinks
are (un)defined.
http://valgrind.org/docs/manual/mc-manual.html#mc-manual.monitor-commands

Thanks,

Mark
  
Siddhesh Poyarekar Feb. 5, 2015, 6:07 a.m. UTC | #18
On Wed, Feb 04, 2015 at 12:25:18AM +0100, Mark Wielaard wrote:
> This sounds a bit like https://bugs.kde.org/show_bug.cgi?id=308427 "s390
> memcheck reports tsearch conditional jump or move depends on
> uninitialized value". But that bug got fixed some time ago and should
> not be in the latest version of valgrind. It does contain a nice
> description of how best to report such an issue.

I've filed:

https://bugs.kde.org/show_bug.cgi?id=343802

which is a variant of 308427, which fixes the lg+jhe combination, but
not the lt+jl combination which the newer gcc produces.

Siddhesh
  

Patch

diff --git a/malloc/obstack.c b/malloc/obstack.c
index 5bb3f0d..c1d6ded 100644
--- a/lib/obstack.c
+++ b/lib/obstack.c
@@ -148,6 +148,8 @@  _obstack_begin (struct obstack *h,
 {
   struct _obstack_chunk *chunk; /* points to new chunk */
 
+  memset (h, 0, sizeof (struct obstack));
+
   if (alignment == 0)
     alignment = DEFAULT_ALIGNMENT;
   if (size == 0)