Fix tcache count maximum

  Hi Carlos,

> Please create a bug for this.
>
> This is a publicly visible issue with tcache and tunables.

Sure, BZ 24531.

> This patch conflates two issues.
>
> (a) Adding checking to tunable.
>
> (b) Lifting limit.
>
> Please split into two bugs. One to fix tunables, another to raise the
> tcache chunk caching limit.

> If you are going to do (b) and change the sign of counts then you need
> to go through and fix other code that expects to possibly have a
> negative value.

If there is any code that expects it to be negative that's a serious bug...
Char is neither signed nor unsigned, the valid range for char is 0..127.

2939   ++(tcache->counts[tc_idx]);

        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        assert (tcache->counts[tc_idx] != 0);
        See below for discussion.

In all cases we already check tcache->counts[tc_idx] < mp_.tcache_count,
so there can be no overflow iff mp_.tcache_count is the maximum value of
tcache->counts[] entries. So no checks needed.

2947   tcache_entry *e = tcache->entries[tc_idx];
2948   assert (tc_idx < TCACHE_MAX_BINS);
2949   assert (tcache->counts[tc_idx] > 0);

        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        Always true now if counts is only positive.
        Remove?

Yes this assert is redundant since we already checked there is a valid entry
(or just added several new entries). So this assert can never trigger, it only 
fails if tcache_put has an overflow bug.

2950   tcache->entries[tc_idx] = e->next;
2951   --(tcache->counts[tc_idx]);

        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        May wrap on error, should we check that and assert?
        We expect the caller to check for != NULL entry,
        indicating at least one entry. It's possible the list
        is corrupt and 'e' is pointing to garbage, so an
        assert might be good here?

        assert (tcache->counts[tc_idx] != MAX_TCACHE_COUNT);

No this can't underflow after we fix the overflow bug. 

> The manual/memory.texi needs updating because you made the
> count twice the size, and the rough estimates for size of
> tcache should be updated. The manual should also list the
> actual limit being imposed.

Which size do you mean? I can't find anything in manual/memory.texi
refering to tcache. I did update the tunables which incorrectly states
there is no limit on glibc.malloc.tcache_count.

See updated patch below - this should be simple and safe to backport.

Cheers,
Wilco

[PATCH v2] Fix tcache count maximum (BZ #24531)

The tcache counts[] array is a char, which has a very small range and thus
may overflow.  When setting tcache_count tunable, there is no overflow check.
However the tunable must not be larger than the maximum value of the tcache
counts[] array, otherwise it can overflow when filling the tcache.

Eg. export GLIBC_TUNABLES=glibc.malloc.tcache_count=4096
leads to crashes in benchtests:

Running /build/glibc/benchtests/bench-strcoll
bench-strcoll: malloc.c:2949: tcache_get: Assertion `tcache->counts[tc_idx] > 0' failed.
Aborted

ChangeLog:
2019-05-07  Wilco Dijkstra  <wdijkstr@arm.com>

        [BZ #24531]
        * malloc/malloc.c (MAX_TCACHE_COUNT): New define.
        (tcache_put): Remove redundant assert.
        (do_set_tcache_count): Only update if count is small enough.
        * manual/tunables.texi (glibc.malloc.tcache_count): Document max value.

Fix tcache count maximum

Commit Message

Comments

Patch