[BZ,#11787] Fix stack guard size accounting

Message ID 5A2FE5ED.1080300@arm.com
State New, archived
Headers

Commit Message

Szabolcs Nagy Dec. 12, 2017, 2:21 p.m. UTC
  Previously if user requested S stack and G guard when creating a
thread, the total mapping was S and the actual available stack was
S - G - static_tls, which is not what the user requested.

This patch fixes the guard size accounting by pretending the user
requested S + G stack.  This way all later logic works out except
when reporting the user requested stack size (pthread_getattr_np)
or when computing the minimal stack size (__pthread_get_minstack).

Normally this will increase thread stack allocations by one page.
TLS accounting is not affected, that will require a separate fix.

2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	[BZ #11787]
	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
	stacksize.
	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
  

Comments

Szabolcs Nagy Dec. 18, 2017, 10:28 a.m. UTC | #1
On 12/12/17 14:21, Szabolcs Nagy wrote:
> Previously if user requested S stack and G guard when creating a
> thread, the total mapping was S and the actual available stack was
> S - G - static_tls, which is not what the user requested.
> 
> This patch fixes the guard size accounting by pretending the user
> requested S + G stack.  This way all later logic works out except
> when reporting the user requested stack size (pthread_getattr_np)
> or when computing the minimal stack size (__pthread_get_minstack).
> 
> Normally this will increase thread stack allocations by one page.
> TLS accounting is not affected, that will require a separate fix.
> 
> 2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>
> 
> 	[BZ #11787]
> 	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
> 	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
> 	stacksize.
> 	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
> 

ping.
  
Florian Weimer Dec. 20, 2017, 1:09 p.m. UTC | #2
On 12/12/2017 03:21 PM, Szabolcs Nagy wrote:
> Previously if user requested S stack and G guard when creating a
> thread, the total mapping was S and the actual available stack was
> S - G - static_tls, which is not what the user requested.
> 
> This patch fixes the guard size accounting by pretending the user
> requested S + G stack.  This way all later logic works out except
> when reporting the user requested stack size (pthread_getattr_np)
> or when computing the minimal stack size (__pthread_get_minstack).
> 
> Normally this will increase thread stack allocations by one page.
> TLS accounting is not affected, that will require a separate fix.

Should this fix use a separate bug number?

> 2017-12-12  Szabolcs Nagy<szabolcs.nagy@arm.com>
> 
> 	[BZ #11787]
> 	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
> 	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
> 	stacksize.
> 	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.

Patch looks good in general.  The computation for stackblock_size looks 
right to me.

> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
> index 1cc789319564b468cf07bdb1304b27dc5a91e86f..9525322b1f92bb34aa21dcab28566aecd7434e90 100644
> --- a/nptl/allocatestack.c
> +++ b/nptl/allocatestack.c
> @@ -532,6 +532,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
>         /* Make sure the size of the stack is enough for the guard and
>   	 eventually the thread descriptor.  */
>         guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
> +      size += guardsize;
>         if (__builtin_expect (size < ((guardsize + __static_tls_size
>   				     + MINIMAL_REST_STACK + pagesize_m1)
>   				    & ~pagesize_m1),

I wonder if we should add an overflow check there and return EINVAL if

   guardsize < attr->guardsize || size + guardsize < guardsize

Thanks,
Florian
  
Szabolcs Nagy Dec. 20, 2017, 2:55 p.m. UTC | #3
On 20/12/17 13:09, Florian Weimer wrote:
> On 12/12/2017 03:21 PM, Szabolcs Nagy wrote:
>> Previously if user requested S stack and G guard when creating a
>> thread, the total mapping was S and the actual available stack was
>> S - G - static_tls, which is not what the user requested.
>>
>> This patch fixes the guard size accounting by pretending the user
>> requested S + G stack.  This way all later logic works out except
>> when reporting the user requested stack size (pthread_getattr_np)
>> or when computing the minimal stack size (__pthread_get_minstack).
>>
>> Normally this will increase thread stack allocations by one page.
>> TLS accounting is not affected, that will require a separate fix.
> 
> Should this fix use a separate bug number?
> 

i can open a separate bug.

>> 2017-12-12  Szabolcs Nagy<szabolcs.nagy@arm.com>
>>
>>     [BZ #11787]
>>     * nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
>>     * nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
>>     stacksize.
>>     * nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
> 
> Patch looks good in general.  The computation for stackblock_size looks right to me.
> 
>> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
>> index 1cc789319564b468cf07bdb1304b27dc5a91e86f..9525322b1f92bb34aa21dcab28566aecd7434e90 100644
>> --- a/nptl/allocatestack.c
>> +++ b/nptl/allocatestack.c
>> @@ -532,6 +532,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
>>         /* Make sure the size of the stack is enough for the guard and
>>        eventually the thread descriptor.  */
>>         guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
>> +      size += guardsize;
>>         if (__builtin_expect (size < ((guardsize + __static_tls_size
>>                        + MINIMAL_REST_STACK + pagesize_m1)
>>                       & ~pagesize_m1),
> 
> I wonder if we should add an overflow check there and return EINVAL if
> 
>   guardsize < attr->guardsize || size + guardsize < guardsize
> 

makes sense.

but i saw several overflowing arithmetics around stack computation.
  
Carlos O'Donell Dec. 20, 2017, 5:46 p.m. UTC | #4
On 12/12/2017 06:21 AM, Szabolcs Nagy wrote:
> Previously if user requested S stack and G guard when creating a
> thread, the total mapping was S and the actual available stack was
> S - G - static_tls, which is not what the user requested.
> 
> This patch fixes the guard size accounting by pretending the user
> requested S + G stack.  This way all later logic works out except
> when reporting the user requested stack size (pthread_getattr_np)
> or when computing the minimal stack size (__pthread_get_minstack).
> 
> Normally this will increase thread stack allocations by one page.
> TLS accounting is not affected, that will require a separate fix.
> 
> 2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>
> 
> 	[BZ #11787]
> 	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
> 	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
> 	stacksize.
> 	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.

LGTM with the refactoring fixed. 

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> 
> 
> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
> index 1cc789319564b468cf07bdb1304b27dc5a91e86f..9525322b1f92bb34aa21dcab28566aecd7434e90 100644
> --- a/nptl/allocatestack.c
> +++ b/nptl/allocatestack.c
> @@ -532,6 +532,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
>        /* Make sure the size of the stack is enough for the guard and
>  	 eventually the thread descriptor.  */
>        guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
> +      size += guardsize;

OK.

>        if (__builtin_expect (size < ((guardsize + __static_tls_size
>  				     + MINIMAL_REST_STACK + pagesize_m1)
>  				    & ~pagesize_m1),
> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
> index 869e926f17aa218ab0b122eeda935069ef419ae5..e5c0bdfbebbcc501e4135280f09d55d021943363 100644
> --- a/nptl/nptl-init.c
> +++ b/nptl/nptl-init.c
> @@ -473,8 +473,5 @@ strong_alias (__pthread_initialize_minimal_internal,
>  size_t
>  __pthread_get_minstack (const pthread_attr_t *attr)
>  {
> -  struct pthread_attr *iattr = (struct pthread_attr *) attr;
> -
> -  return (GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN
> -	  + iattr->guardsize);
> +  return GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN;
>  }

We no longer need pthread_attr_t. Please refactor.

> diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
> index 06093b3d9270c2fcaa5af191852c9eca25dd506f..c5f82f8d22c74fd9e2433dff0a0d98d78e272484 100644
> --- a/nptl/pthread_getattr_np.c
> +++ b/nptl/pthread_getattr_np.c
> @@ -57,9 +57,10 @@ pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
>    /* The sizes are subject to alignment.  */
>    if (__glibc_likely (thread->stackblock != NULL))
>      {
> -      iattr->stacksize = thread->stackblock_size;
> +      iattr->stacksize = thread->stackblock_size - thread->guardsize;

OK.

>  #if _STACK_GROWS_DOWN
> -      iattr->stackaddr = (char *) thread->stackblock + iattr->stacksize;
> +      iattr->stackaddr = (char *) thread->stackblock
> +			 + thread->stackblock_size;

OK.

>  #else
>        iattr->stackaddr = (char *) thread->stackblock;
>  #endif
  
Florian Weimer Dec. 20, 2017, 9:24 p.m. UTC | #5
On 12/20/2017 06:46 PM, Carlos O'Donell wrote:
>> diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
>> index 869e926f17aa218ab0b122eeda935069ef419ae5..e5c0bdfbebbcc501e4135280f09d55d021943363 100644
>> --- a/nptl/nptl-init.c
>> +++ b/nptl/nptl-init.c
>> @@ -473,8 +473,5 @@ strong_alias (__pthread_initialize_minimal_internal,
>>   size_t
>>   __pthread_get_minstack (const pthread_attr_t *attr)
>>   {
>> -  struct pthread_attr *iattr = (struct pthread_attr *) attr;
>> -
>> -  return (GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN
>> -	  + iattr->guardsize);
>> +  return GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN;
>>   }
> We no longer need pthread_attr_t. Please refactor.

Please move this into a separate patch.  The function also needs to be 
renamed because there is code out there which attempts to call it if it 
exists (notably, in Firefox and Rust—you might not see it because some 
refactoring broke it and switched from C linkage to Rust linkage, so the 
function is never found).

Thanks,
Florian
  
Carlos O'Donell Jan. 8, 2018, 3:20 p.m. UTC | #6
On 12/12/2017 06:21 AM, Szabolcs Nagy wrote:
> Previously if user requested S stack and G guard when creating a
> thread, the total mapping was S and the actual available stack was
> S - G - static_tls, which is not what the user requested.
> 
> This patch fixes the guard size accounting by pretending the user
> requested S + G stack.  This way all later logic works out except
> when reporting the user requested stack size (pthread_getattr_np)
> or when computing the minimal stack size (__pthread_get_minstack).
> 
> Normally this will increase thread stack allocations by one page.
> TLS accounting is not affected, that will require a separate fix.
> 
> 2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>
> 
> 	[BZ #11787]
> 	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
> 	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
> 	stacksize.
> 	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
> 

Any status on this?

I would like to get this in as soon as possible because the current
Intel fxsave/xsave/xsavec consume more stack than before, and we have
had at least one report of an application failing because of the additional
stack usage (breaks ntpd helper threads running with PTHREAD_STACK_MIN, see
the other discussion about what good is PTHREADS_STACK_MIN).

It looks like a next step would be:

* Split into two patches.
* First patch is as you intended (and can be checked in right away)
* Second patch removed the unused pthread_attr_r and renames __pthread_get_minstack
  to avoid usage by other programs.
  
Florian Weimer Jan. 8, 2018, 3:22 p.m. UTC | #7
On 01/08/2018 04:20 PM, Carlos O'Donell wrote:
> On 12/12/2017 06:21 AM, Szabolcs Nagy wrote:
>> Previously if user requested S stack and G guard when creating a
>> thread, the total mapping was S and the actual available stack was
>> S - G - static_tls, which is not what the user requested.
>>
>> This patch fixes the guard size accounting by pretending the user
>> requested S + G stack.  This way all later logic works out except
>> when reporting the user requested stack size (pthread_getattr_np)
>> or when computing the minimal stack size (__pthread_get_minstack).
>>
>> Normally this will increase thread stack allocations by one page.
>> TLS accounting is not affected, that will require a separate fix.
>>
>> 2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>
>>
>> 	[BZ #11787]
>> 	* nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
>> 	* nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
>> 	stacksize.
>> 	* nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
>>
> 
> Any status on this?
> 
> I would like to get this in as soon as possible because the current
> Intel fxsave/xsave/xsavec consume more stack than before, and we have
> had at least one report of an application failing because of the additional
> stack usage (breaks ntpd helper threads running with PTHREAD_STACK_MIN, see
> the other discussion about what good is PTHREADS_STACK_MIN).
> 
> It looks like a next step would be:
> 
> * Split into two patches.
> * First patch is as you intended (and can be checked in right away)

Agreed.

> * Second patch removed the unused pthread_attr_r and renames __pthread_get_minstack
>    to avoid usage by other programs.

The proposed second patch *still* breaks Rust programs with large static 
TLS data because it will switch to PTHREAD_STACK_MIN.  We should do this 
only after __pthread_get_minstack turns constant (that is, returns 
PTHREAD_STACK_MIN).

Thanks,
Florian
  
Carlos O'Donell Jan. 8, 2018, 3:37 p.m. UTC | #8
On 01/08/2018 07:22 AM, Florian Weimer wrote:
>> * Second patch removed the unused pthread_attr_r and renames __pthread_get_minstack
>>    to avoid usage by other programs.
> 
> The proposed second patch *still* breaks Rust programs with large
> static TLS data because it will switch to PTHREAD_STACK_MIN.  We
> should do this only after __pthread_get_minstack turns constant (that
> is, returns PTHREAD_STACK_MIN).

I had not considered that.

Yes, applications are basically working around bug 11787 by calling
__ptrhead_get_minstack.

Perhaps you are right, until we disentangle TLS, such applications
need the workaround.

OK, I'm happy with the patch going in as-is then, and leaving the
second half of the cleanup until later.
  
Szabolcs Nagy Jan. 8, 2018, 3:41 p.m. UTC | #9
On 08/01/18 15:22, Florian Weimer wrote:
> On 01/08/2018 04:20 PM, Carlos O'Donell wrote:
>> On 12/12/2017 06:21 AM, Szabolcs Nagy wrote:
>>> Previously if user requested S stack and G guard when creating a
>>> thread, the total mapping was S and the actual available stack was
>>> S - G - static_tls, which is not what the user requested.
>>>
>>> This patch fixes the guard size accounting by pretending the user
>>> requested S + G stack.  This way all later logic works out except
>>> when reporting the user requested stack size (pthread_getattr_np)
>>> or when computing the minimal stack size (__pthread_get_minstack).
>>>
>>> Normally this will increase thread stack allocations by one page.
>>> TLS accounting is not affected, that will require a separate fix.
>>>
>>> 2017-12-12  Szabolcs Nagy  <szabolcs.nagy@arm.com>
>>>
>>>     [BZ #11787]
>>>     * nptl/allocatestack.c (allocate_stack): Add guardsize to stacksize.
>>>     * nptl/nptl-init.c (__pthread_get_minstack): Remove guardsize from
>>>     stacksize.
>>>     * nptl/pthread_getattr_np.c (pthread_getattr_np): Likewise.
>>>
>>
>> Any status on this?
>>
>> I would like to get this in as soon as possible because the current
>> Intel fxsave/xsave/xsavec consume more stack than before, and we have
>> had at least one report of an application failing because of the additional
>> stack usage (breaks ntpd helper threads running with PTHREAD_STACK_MIN, see
>> the other discussion about what good is PTHREADS_STACK_MIN).
>>
>> It looks like a next step would be:
>>
>> * Split into two patches.
>> * First patch is as you intended (and can be checked in right away)
> 
> Agreed.
> 

sorry i did not work on this while on holiday
will submit an updated patch

>> * Second patch removed the unused pthread_attr_r and renames __pthread_get_minstack
>>    to avoid usage by other programs.
> 
> The proposed second patch *still* breaks Rust programs with large static TLS data because it will switch to
> PTHREAD_STACK_MIN.  We should do this only after __pthread_get_minstack turns constant (that is, returns
> PTHREAD_STACK_MIN).

i knew __pthread_get_minstack was used outside of glibc
but thought removing the guardsize was safe since we
add it back later.

the patch increases stack allocations by +guardsize,
so if user computes its stack via __pthread_get_minstack
that may now return a smaller value to get the same stack
allocation as before in the end.
  
Florian Weimer Jan. 8, 2018, 3:43 p.m. UTC | #10
On 01/08/2018 04:41 PM, Szabolcs Nagy wrote:
> i knew __pthread_get_minstack was used outside of glibc
> but thought removing the guardsize was safe since we
> add it back later.

I agree that the patch as posted should be safe because the return value 
of __pthread_get_minstack does not change.

However, renaming the function would trigger the Rust default 
implementation, which is PTHREAD_STACK_MIN, and that doesn't take the 
static TLS size into account.  This is why I think we shouldn't do the 
renaming right now.

Thanks,
Florian
  

Patch

diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 1cc789319564b468cf07bdb1304b27dc5a91e86f..9525322b1f92bb34aa21dcab28566aecd7434e90 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -532,6 +532,7 @@  allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
       /* Make sure the size of the stack is enough for the guard and
 	 eventually the thread descriptor.  */
       guardsize = (attr->guardsize + pagesize_m1) & ~pagesize_m1;
+      size += guardsize;
       if (__builtin_expect (size < ((guardsize + __static_tls_size
 				     + MINIMAL_REST_STACK + pagesize_m1)
 				    & ~pagesize_m1),
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 869e926f17aa218ab0b122eeda935069ef419ae5..e5c0bdfbebbcc501e4135280f09d55d021943363 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -473,8 +473,5 @@  strong_alias (__pthread_initialize_minimal_internal,
 size_t
 __pthread_get_minstack (const pthread_attr_t *attr)
 {
-  struct pthread_attr *iattr = (struct pthread_attr *) attr;
-
-  return (GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN
-	  + iattr->guardsize);
+  return GLRO(dl_pagesize) + __static_tls_size + PTHREAD_STACK_MIN;
 }
diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
index 06093b3d9270c2fcaa5af191852c9eca25dd506f..c5f82f8d22c74fd9e2433dff0a0d98d78e272484 100644
--- a/nptl/pthread_getattr_np.c
+++ b/nptl/pthread_getattr_np.c
@@ -57,9 +57,10 @@  pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
   /* The sizes are subject to alignment.  */
   if (__glibc_likely (thread->stackblock != NULL))
     {
-      iattr->stacksize = thread->stackblock_size;
+      iattr->stacksize = thread->stackblock_size - thread->guardsize;
 #if _STACK_GROWS_DOWN
-      iattr->stackaddr = (char *) thread->stackblock + iattr->stacksize;
+      iattr->stackaddr = (char *) thread->stackblock
+			 + thread->stackblock_size;
 #else
       iattr->stackaddr = (char *) thread->stackblock;
 #endif