Created tunable to force small pages on stack allocation.
Checks
Context |
Check |
Description |
dj/TryBot-apply_patch |
success
|
Patch applied to master at the time it was sent
|
dj/TryBot-32bit |
success
|
Build for i686
|
Commit Message
Created tunable glibc.pthread.stack_hugetlb to control when hugepages
can be used for stack allocation.
In case THP are enabled and glibc.pthread.stack_hugetlb is set to
0, glibc will madvise the kernel not to use allow hugepages for stack
allocations.
---
nptl/allocatestack.c | 6 ++++++
nptl/nptl-stack.c | 1 +
nptl/nptl-stack.h | 3 +++
nptl/pthread_mutex_conf.c | 23 +++++++++++++++++++++++
sysdeps/nptl/dl-tunables.list | 6 ++++++
5 files changed, 39 insertions(+)
Comments
* Cupertino Miranda via Libc-alpha:
> +static void
> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
> +{
> + enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
> + /*
> + Only allow to change the default of this tunable if the system
> + does support and has either 'madvise' or 'always' mode. Otherwise
> + the madvise() call is wasteful.
> + */
> + switch(thp_mode)
> + {
> + case malloc_thp_mode_always:
> + case malloc_thp_mode_madvise:
> + __nptl_stack_hugetlb = (int32_t) valp->numval;
> + break;
> + default:
> + break;
> + }
> +}
I suspect that quite a few failing madvise calls are cheaper than the
those three system calls in __malloc_thp_mode. In addition,
__malloc_thp_mode may fail due to future kernel changes, disabling the
tunable by accident.
Thanks,
Florian
Hi Florian
> * Cupertino Miranda via Libc-alpha:
>
>> +static void
>> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
>> +{
>> + enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
>> + /*
>> + Only allow to change the default of this tunable if the system
>> + does support and has either 'madvise' or 'always' mode. Otherwise
>> + the madvise() call is wasteful.
>> + */
>> + switch(thp_mode)
>> + {
>> + case malloc_thp_mode_always:
>> + case malloc_thp_mode_madvise:
>> + __nptl_stack_hugetlb = (int32_t) valp->numval;
>> + break;
>> + default:
>> + break;
>> + }
>> +}
>
> I suspect that quite a few failing madvise calls are cheaper than the
> those three system calls in __malloc_thp_mode. In addition,
> __malloc_thp_mode may fail due to future kernel changes, disabling the
> tunable by accident.
Thanks for your review. Ok, I did not check the inner workings of
malloc_thp_mode. I used it by suggestion from the RFC thread.
I will prepare a version removing it.
>
> Thanks,
> Florian
Cupertino Miranda via Libc-alpha writes:
> Hi Florian
>
>> * Cupertino Miranda via Libc-alpha:
>>
>>> +static void
>>> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
>>> +{
>>> + enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
>>> + /*
>>> + Only allow to change the default of this tunable if the system
>>> + does support and has either 'madvise' or 'always' mode. Otherwise
>>> + the madvise() call is wasteful.
>>> + */
>>> + switch(thp_mode)
>>> + {
>>> + case malloc_thp_mode_always:
>>> + case malloc_thp_mode_madvise:
>>> + __nptl_stack_hugetlb = (int32_t) valp->numval;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> +}
>>
>> I suspect that quite a few failing madvise calls are cheaper than the
>> those three system calls in __malloc_thp_mode. In addition,
>> __malloc_thp_mode may fail due to future kernel changes, disabling the
>> tunable by accident.
> Thanks for your review. Ok, I did not check the inner workings of
> malloc_thp_mode. I used it by suggestion from the RFC thread.
> I will prepare a version removing it.
>
>>
>> Thanks,
>> Florian
kkjjj b
On 13/03/23 07:07, Cupertino Miranda wrote:
>
> Hi Florian
>
>> * Cupertino Miranda via Libc-alpha:
>>
>>> +static void
>>> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
>>> +{
>>> + enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
>>> + /*
>>> + Only allow to change the default of this tunable if the system
>>> + does support and has either 'madvise' or 'always' mode. Otherwise
>>> + the madvise() call is wasteful.
>>> + */
>>> + switch(thp_mode)
>>> + {
>>> + case malloc_thp_mode_always:
>>> + case malloc_thp_mode_madvise:
>>> + __nptl_stack_hugetlb = (int32_t) valp->numval;
>>> + break;
>>> + default:
>>> + break;
>>> + }
>>> +}
>>
>> I suspect that quite a few failing madvise calls are cheaper than the
>> those three system calls in __malloc_thp_mode. In addition,
>> __malloc_thp_mode may fail due to future kernel changes, disabling the
>> tunable by accident.
> Thanks for your review. Ok, I did not check the inner workings of
> malloc_thp_mode. I used it by suggestion from the RFC thread.
> I will prepare a version removing it.
It only make sense to madvise iff __malloc_thp_mode is set as 'always',
and tunable is a opt-in feature. So I think it would make sense to
check the sysfs is tunable is used; the startup code will be amortized
on high threads workloads.
> I suspect that quite a few failing madvise calls are cheaper than the
> those three system calls in __malloc_thp_mode. In addition,
> __malloc_thp_mode may fail due to future kernel changes, disabling the
> tunable by accident.
Please find the new patch which removes the __malloc_thp_mode
verification below.
Looking forward to your comments.
Cupertino
> kkjjj b
Apologies, sent this garbage by mistake when email client was not
responsive. :(
Adhemerval Zanella Netto writes:
> On 13/03/23 07:07, Cupertino Miranda wrote:
>>
>> Hi Florian
>>
>>> * Cupertino Miranda via Libc-alpha:
>>>
>>>> +static void
>>>> +TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
>>>> +{
>>>> + enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
>>>> + /*
>>>> + Only allow to change the default of this tunable if the system
>>>> + does support and has either 'madvise' or 'always' mode. Otherwise
>>>> + the madvise() call is wasteful.
>>>> + */
>>>> + switch(thp_mode)
>>>> + {
>>>> + case malloc_thp_mode_always:
>>>> + case malloc_thp_mode_madvise:
>>>> + __nptl_stack_hugetlb = (int32_t) valp->numval;
>>>> + break;
>>>> + default:
>>>> + break;
>>>> + }
>>>> +}
>>>
>>> I suspect that quite a few failing madvise calls are cheaper than the
>>> those three system calls in __malloc_thp_mode. In addition,
>>> __malloc_thp_mode may fail due to future kernel changes, disabling the
>>> tunable by accident.
>> Thanks for your review. Ok, I did not check the inner workings of
>> malloc_thp_mode. I used it by suggestion from the RFC thread.
>> I will prepare a version removing it.
>
> It only make sense to madvise iff __malloc_thp_mode is set as 'always',
> and tunable is a opt-in feature. So I think it would make sense to
> check the sysfs is tunable is used; the startup code will be amortized
> on high threads workloads.
From my side either solution is Ok. Using the __malloc_thp_mode would
protect the code from calls to madvise in systems that do not have
hugetlb support. This would avoid returning an error on madvise if the
advice argument is not identified as valid.
@@ -369,6 +369,12 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
if (__glibc_unlikely (mem == MAP_FAILED))
return errno;
+ /* Do madvise in case the tunable glibc.pthread.stack_hugetlb is
+ set to 0, disabling hugetlb. */
+ if (__glibc_unlikely (__nptl_stack_hugetlb == 0)
+ && __madvise(mem, size, MADV_NOHUGEPAGE) != 0)
+ return errno;
+
/* SIZE is guaranteed to be greater than zero.
So we can never get a null pointer back from mmap. */
assert (mem != NULL);
@@ -21,6 +21,7 @@
#include <pthreadP.h>
size_t __nptl_stack_cache_maxsize = 40 * 1024 * 1024;
+int32_t __nptl_stack_hugetlb = 1;
void
__nptl_stack_list_del (list_t *elem)
@@ -27,6 +27,9 @@
/* Maximum size of the cache, in bytes. 40 MiB by default. */
extern size_t __nptl_stack_cache_maxsize attribute_hidden;
+/* Should allow stacks to use hugetlb. 1 is default */
+extern int32_t __nptl_stack_hugetlb;
+
/* Check whether the stack is still used or not. */
static inline bool
__nptl_stack_in_use (struct pthread *pd)
@@ -24,6 +24,7 @@
#include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */
#include <elf/dl-tunables.h>
#include <nptl-stack.h>
+#include <malloc-hugepages.h>
struct mutex_config __mutex_aconf =
{
@@ -45,6 +46,26 @@ TUNABLE_CALLBACK (set_stack_cache_size) (tunable_val_t *valp)
__nptl_stack_cache_maxsize = valp->numval;
}
+static void
+TUNABLE_CALLBACK (set_stack_hugetlb) (tunable_val_t *valp)
+{
+ enum malloc_thp_mode_t thp_mode = __malloc_thp_mode ();
+ /*
+ Only allow to change the default of this tunable if the system
+ does support and has either 'madvise' or 'always' mode. Otherwise
+ the madvise() call is wasteful.
+ */
+ switch(thp_mode)
+ {
+ case malloc_thp_mode_always:
+ case malloc_thp_mode_madvise:
+ __nptl_stack_hugetlb = (int32_t) valp->numval;
+ break;
+ default:
+ break;
+ }
+}
+
void
__pthread_tunables_init (void)
{
@@ -52,5 +73,7 @@ __pthread_tunables_init (void)
TUNABLE_CALLBACK (set_mutex_spin_count));
TUNABLE_GET (stack_cache_size, size_t,
TUNABLE_CALLBACK (set_stack_cache_size));
+ TUNABLE_GET (stack_hugetlb, int32_t,
+ TUNABLE_CALLBACK (set_stack_hugetlb));
}
#endif
@@ -33,5 +33,11 @@ glibc {
maxval: 1
default: 1
}
+ stack_hugetlb {
+ type: INT_32
+ minval: 0
+ maxval: 1
+ default: 0
+ }
}
}