qsort: Fix a typo causing unnecessary malloc/free
Checks
Context |
Check |
Description |
redhat-pt-bot/TryBot-apply_patch |
success
|
Patch applied to master at the time it was sent
|
redhat-pt-bot/TryBot-32bit |
success
|
Build for i686
|
linaro-tcwg-bot/tcwg_glibc_build--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_glibc_check--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 |
success
|
Testing passed
|
Commit Message
In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
and we intend to use it if all elements can fit into it. But there is a
typo:
if (total_size < sizeof buf)
buf = tmp;
else
/* allocate a buffer on heap and use it ... */
Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
1024.
This bug is detected debugging some strange heap corruption running the
Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
seems Ruby is doing some wild "optimization" by jumping into somewhere
in qsort_r instead of calling it normally, resulting in a double free of
buf if we allocate it on heap. The issue can be reproduced
deterministically with:
LD_PRELOAD=/usr/lib/libc_malloc_debug.so MALLOC_CHECK_=3 \
LD_LIBRARY_PATH=. ./ruby test/runner.rb test/ruby/test_enum.rb
in Ruby-3.3.0 tree after building it. This change would hide the issue
for Ruby, but Ruby is likely still buggy (if using this "optimization"
sorting larger arrays).
[1]:https://kojipkgs.fedoraproject.org//work/tasks/9729/111889729/build.log
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
---
stdlib/qsort.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Mon, Jan 22, 2024 at 11:32 AM Xi Ruoyao <xry111@xry111.site> wrote:
>
> In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
> and we intend to use it if all elements can fit into it. But there is a
> typo:
>
> if (total_size < sizeof buf)
> buf = tmp;
> else
> /* allocate a buffer on heap and use it ... */
>
> Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
> 1024.
This sounds like a real bug. A glibc bug report is needed to track it.
> This bug is detected debugging some strange heap corruption running the
> Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
> Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
> seems Ruby is doing some wild "optimization" by jumping into somewhere
> in qsort_r instead of calling it normally, resulting in a double free of
> buf if we allocate it on heap. The issue can be reproduced
> deterministically with:
>
> LD_PRELOAD=/usr/lib/libc_malloc_debug.so MALLOC_CHECK_=3 \
> LD_LIBRARY_PATH=. ./ruby test/runner.rb test/ruby/test_enum.rb
>
> in Ruby-3.3.0 tree after building it. This change would hide the issue
> for Ruby, but Ruby is likely still buggy (if using this "optimization"
> sorting larger arrays).
>
> [1]:https://kojipkgs.fedoraproject.org//work/tasks/9729/111889729/build.log
>
> Signed-off-by: Xi Ruoyao <xry111@xry111.site>
> ---
> stdlib/qsort.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/stdlib/qsort.c b/stdlib/qsort.c
> index 7f5a00fb33..1c1505e7b2 100644
> --- a/stdlib/qsort.c
> +++ b/stdlib/qsort.c
> @@ -353,7 +353,7 @@ __qsort_r (void *const pbase, size_t total_elems, size_t size,
> if (size > INDIRECT_SORT_SIZE_THRES)
> total_size = 2 * total_elems * sizeof (void *) + size;
>
> - if (total_size < sizeof buf)
> + if (total_size < sizeof tmp)
> buf = tmp;
> else
> {
> --
> 2.43.0
>
On Mon, 2024-01-22 at 11:46 -0800, H.J. Lu wrote:
> On Mon, Jan 22, 2024 at 11:32 AM Xi Ruoyao <xry111@xry111.site> wrote:
> >
> > In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
> > and we intend to use it if all elements can fit into it. But there is a
> > typo:
> >
> > if (total_size < sizeof buf)
> > buf = tmp;
> > else
> > /* allocate a buffer on heap and use it ... */
> >
> > Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
> > 1024.
>
> This sounds like a real bug. A glibc bug report is needed to track it.
https://sourceware.org/bugzilla/show_bug.cgi?id=31276
On Tue, 2024-01-23 at 03:54 +0800, Xi Ruoyao wrote:
> On Mon, 2024-01-22 at 11:46 -0800, H.J. Lu wrote:
> > On Mon, Jan 22, 2024 at 11:32 AM Xi Ruoyao <xry111@xry111.site> wrote:
> > >
> > > In qsort_r we allocate a buffer sized QSORT_STACK_SIZE (1024) on stack
> > > and we intend to use it if all elements can fit into it. But there is a
> > > typo:
> > >
> > > if (total_size < sizeof buf)
Hmm, and it seems this should be "<=" instead of "<". I'm preparing a
V2 with the operator changed too, and the BZ number in commit message...
> > > buf = tmp;
> > > else
> > > /* allocate a buffer on heap and use it ... */
> > >
> > > Here "buf" is a pointer, thus sizeof buf is just 4 or 8, instead of
> > > 1024.
> >
> > This sounds like a real bug. A glibc bug report is needed to track it.
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=31276
>
* Xi Ruoyao:
> This bug is detected debugging some strange heap corruption running the
> Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
> Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
> seems Ruby is doing some wild "optimization" by jumping into somewhere
> in qsort_r instead of calling it normally, resulting in a double free of
> buf if we allocate it on heap. The issue can be reproduced
> deterministically with:
>
> LD_PRELOAD=/usr/lib/libc_malloc_debug.so MALLOC_CHECK_=3 \
> LD_LIBRARY_PATH=. ./ruby test/runner.rb test/ruby/test_enum.rb
>
> in Ruby-3.3.0 tree after building it. This change would hide the issue
> for Ruby, but Ruby is likely still buggy (if using this "optimization"
> sorting larger arrays).
>
> [1]:https://kojipkgs.fedoraproject.org//work/tasks/9729/111889729/build.log
Sorry, missed that part. I suspect it's because the Ruby garbage
collector updates pointers in the array during invocation of the
comparison function, and we later discard those updates when we copy the
sorted runs back into the application-supplied array. Ruby probably
scans the stack conservatively, triggering object pinning, which is
probably the reason why we don't see the issue with the on-stack copy.
The issue is reproducible with the old (glibc 2.37) qsort_r if the Ruby
test case uses a suitable array size.
We've posted our findings to the Ruby bug:
<https://bugs.ruby-lang.org/issues/20203>
Thanks,
Florian
On Wed, 2024-01-24 at 14:45 +0100, Florian Weimer wrote:
> * Xi Ruoyao:
>
> > This bug is detected debugging some strange heap corruption running the
> > Ruby-3.3.0 test suite (on an experimental Linux From Scratch build using
> > Binutils-2.41.90 and Glibc trunk, and also Fedora Rawhide [1]). It
> > seems Ruby is doing some wild "optimization" by jumping into somewhere
> > in qsort_r instead of calling it normally, resulting in a double free of
> > buf if we allocate it on heap. The issue can be reproduced
> > deterministically with:
> >
> > LD_PRELOAD=/usr/lib/libc_malloc_debug.so MALLOC_CHECK_=3 \
> > LD_LIBRARY_PATH=. ./ruby test/runner.rb test/ruby/test_enum.rb
> >
> > in Ruby-3.3.0 tree after building it. This change would hide the issue
> > for Ruby, but Ruby is likely still buggy (if using this "optimization"
> > sorting larger arrays).
> >
> > [1]:https://kojipkgs.fedoraproject.org//work/tasks/9729/111889729/build.log
>
> Sorry, missed that part. I suspect it's because the Ruby garbage
> collector updates pointers in the array during invocation of the
> comparison function, and we later discard those updates when we copy the
> sorted runs back into the application-supplied array. Ruby probably
> scans the stack conservatively, triggering object pinning, which is
> probably the reason why we don't see the issue with the on-stack copy.
>
> The issue is reproducible with the old (glibc 2.37) qsort_r if the Ruby
> test case uses a suitable array size.
>
> We've posted our findings to the Ruby bug:
>
> <https://bugs.ruby-lang.org/issues/20203>
Ooooops. We've disabled the use of qsort_r for Ruby in BLFS for now:
https://wiki.linuxfromscratch.org/blfs/changeset/77b455955c17e22d6aa544c766b30850978960ab.
I guess their own sort should work with their own GC anyway...
@@ -353,7 +353,7 @@ __qsort_r (void *const pbase, size_t total_elems, size_t size,
if (size > INDIRECT_SORT_SIZE_THRES)
total_size = 2 * total_elems * sizeof (void *) + size;
- if (total_size < sizeof buf)
+ if (total_size < sizeof tmp)
buf = tmp;
else
{