test-container: Use nftw instead of rm -rf

Message ID 20230927192013.2071605-1-adhemerval.zanella@linaro.org
State Committed
Commit aea4ddb87168d0475777e605f3bb576b0f62b3a2
Series test-container: Use nftw instead of rm -rf

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Testing passed
redhat-pt-bot/TryBot-32bit success Build for i686
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Testing passed

Commit Message

Adhemerval Zanella Netto Sept. 27, 2023, 7:20 p.m. UTC
If the binary to run is 'env', test-container skips it and adds
any required environment variables to the process environment.
This simplifies the code required to spawn a new process (no need
to build an env-like program).

However, this is an issue for recursive_remove if there is any
LD_PRELOAD, since test-container will not prepend the loader command
along with required paths.  If the required preloaded library can
not be loaded by the system glibc, the 'post-clean rsync' will
eventually fail.

One example is if system glibc does not support DT_RELR and the
built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
with:

../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)

Instead of trying to figure out the loader arguments required
to spawn 'rm -rf', replace the command with an nftw call.

Checked on x86_64-linux-gnu.
---
 support/test-container.c | 34 +++++++++++-----------------------
 1 file changed, 11 insertions(+), 23 deletions(-)
  

Comments

Stefan Liebler Sept. 28, 2023, 9:40 a.m. UTC | #1
On 27.09.23 21:20, Adhemerval Zanella wrote:
> If the binary to run is 'env', test-container skips it and adds
> any required environment variables to the process environment.
> This simplifies the code required to spawn a new process (no need
> to build an env-like program).
> 
> However, this is an issue for recursive_remove if there is any
> LD_PRELOAD, since test-container will not prepend the loader command
> along with required paths.  If the required preloaded library can
> not be loaded by the system glibc, the 'post-clean rsync' will
> eventually fail.
> 
> One example is if system glibc does not support DT_RELR and the
> built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
> with:
> 
> ../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
> 86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
> rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
> found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
> 
> Instead of trying to figure out the loader arguments required
> to spawn 'rm -rf', replace the command with an nftw call.
> 
Just for information, I have also run into this issue. For me (on
x86_64/s390x), I got "*** stack smashing detected ***: terminated" in
this rm due to preloading the freshly built libc_malloc_debug.so.
In malloc_hook_ini -> generic_hook_ini -> initialize_malloc_check, we have:
TUNABLE_GET (check, int32_t, TUNABLE_CALLBACK (set_mallopt_check));
In __tunable_get_val, the tunable ID for glibc.malloc.check differs
between the current build and the system glibc. In my case it is 30,
which is beyond the range of the system's tunable_list[]. This leads to
storing an 8-byte value to the passed valp. Unfortunately, it points to a
4-byte int32_t, and the other 4 bytes belong to the stack smashing canary.

Your patch solves the issue for me.
> Checked on x86_64-linux-gnu.
> ---
>  support/test-container.c | 34 +++++++++++-----------------------
>  1 file changed, 11 insertions(+), 23 deletions(-)
> 
> diff --git a/support/test-container.c b/support/test-container.c
> index 788b091ea0..95dfef1a99 100644
> --- a/support/test-container.c
> +++ b/support/test-container.c
> @@ -37,6 +37,7 @@
>  #include <errno.h>
>  #include <error.h>
>  #include <libc-pointer-arith.h>
> +#include <ftw.h>
> 
>  #ifdef __linux__
>  #include <sys/mount.h>
> @@ -405,32 +406,19 @@ file_exists (char *path)
>    return 0;
>  }
> 
> +static int
> +unlink_cb (const char *fpath, const struct stat *sb, int typeflag,
> +	   struct FTW *ftwbuf)
> +{
> +  return remove (fpath);
> +}
> +
>  static void
>  recursive_remove (char *path)
>  {
> -  pid_t child;
> -  int status;
> -
> -  child = fork ();
> -
> -  switch (child) {
> -  case -1:
> -    perror("fork");
> -    FAIL_EXIT1 ("Unable to fork");
> -  case 0:
> -    /* Child.  */
> -    execlp ("rm", "rm", "-rf", path, NULL);
> -    FAIL_EXIT1 ("exec rm: %m");
> -  default:
> -    /* Parent.  */
> -    waitpid (child, &status, 0);
> -    /* "rm" would have already printed a suitable error message.  */
> -    if (! WIFEXITED (status)
> -	|| WEXITSTATUS (status) != 0)
> -      FAIL_EXIT1 ("exec child returned status: %d", status);
> -
> -    break;
> -  }
> +  int r = nftw (path, unlink_cb, 1000, FTW_DEPTH | FTW_PHYS);
> +  if (r == -1)
> +    FAIL_EXIT1 ("recursive_remove failed");
>  }
> 
>  /* Used for both rsync and the mytest.script "cp" command.  */
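
The stack-smashing failure described in this comment can be modelled with a small sketch. This is not glibc code: `struct frame`, `store_as_64bit`, and `canary_clobbered` are hypothetical names, and the struct only stands in for a 4-byte variable that happens to sit next to the stack canary. It shows how a store that assumes an 8-byte value spills into the adjacent 4 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a stack frame: a 4-byte result slot followed
   directly by 4 bytes that play the role of the stack canary.  */
struct frame
{
  int32_t value;
  uint32_t canary;
};

_Static_assert (sizeof (struct frame) == 8, "no padding expected");

/* Hypothetical stand-in for a getter that believes the value is
   8 bytes wide and copies that many bytes to VALP.  */
void
store_as_64bit (void *valp, uint64_t v)
{
  memcpy (valp, &v, sizeof v);
}

/* Returns 1 if the oversized store overwrote the canary.  */
int
canary_clobbered (void)
{
  struct frame fr = { .value = 0, .canary = 0xdeadbeefu };

  /* &fr has the same address as &fr.value.  Passing the whole struct
     keeps this demo well-defined; in the reported failure only the
     first 4 bytes belonged to the caller's int32_t.  */
  store_as_64bit (&fr, UINT64_C (0x1111111122222222));

  return fr.canary != 0xdeadbeefu;
}
```

Both halves of the stored value differ from the initial canary, so the check does not depend on byte order.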
  
Adhemerval Zanella Netto Sept. 28, 2023, 11:11 a.m. UTC | #2
On 28/09/23 06:40, Stefan Liebler wrote:
> On 27.09.23 21:20, Adhemerval Zanella wrote:
>> If the binary to run is 'env', test-container skips it and adds
>> any required environment variables to the process environment.
>> This simplifies the code required to spawn a new process (no need
>> to build an env-like program).
>>
>> However, this is an issue for recursive_remove if there is any
>> LD_PRELOAD, since test-container will not prepend the loader command
>> along with required paths.  If the required preloaded library can
>> not be loaded by the system glibc, the 'post-clean rsync' will
>> eventually fail.
>>
>> One example is if system glibc does not support DT_RELR and the
>> built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
>> with:
>>
>> ../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
>> 86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
>> rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
>> found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
>>
>> Instead of trying to figure out the loader arguments required
>> to spawn 'rm -rf', replace the command with an nftw call.
>>
> Just for information, I have also run into this issue. For me (on
> x86_64/s390x), I got "*** stack smashing detected ***: terminated" in
> this rm due to preloading the freshly built libc_malloc_debug.so.
> In malloc_hook_ini -> generic_hook_ini -> initialize_malloc_check, we have:
> TUNABLE_GET (check, int32_t, TUNABLE_CALLBACK (set_mallopt_check));
> In __tunable_get_val, the tunable ID for glibc.malloc.check differs
> between the current build and the system glibc. In my case it is 30,
> which is beyond the range of the system's tunable_list[]. This leads to
> storing an 8-byte value to the passed valp. Unfortunately, it points to a
> 4-byte int32_t, and the other 4 bytes belong to the stack smashing canary.

Indeed, the tunables might be another problem; thanks for checking it.

> 
> Your patch solves the issue for me.

Is it a reviewed-by ;) ?

>> Checked on x86_64-linux-gnu.
>> ---
>>  support/test-container.c | 34 +++++++++++-----------------------
>>  1 file changed, 11 insertions(+), 23 deletions(-)
>>
>> diff --git a/support/test-container.c b/support/test-container.c
>> index 788b091ea0..95dfef1a99 100644
>> --- a/support/test-container.c
>> +++ b/support/test-container.c
>> @@ -37,6 +37,7 @@
>>  #include <errno.h>
>>  #include <error.h>
>>  #include <libc-pointer-arith.h>
>> +#include <ftw.h>
>>
>>  #ifdef __linux__
>>  #include <sys/mount.h>
>> @@ -405,32 +406,19 @@ file_exists (char *path)
>>    return 0;
>>  }
>>
>> +static int
>> +unlink_cb (const char *fpath, const struct stat *sb, int typeflag,
>> +	   struct FTW *ftwbuf)
>> +{
>> +  return remove (fpath);
>> +}
>> +
>>  static void
>>  recursive_remove (char *path)
>>  {
>> -  pid_t child;
>> -  int status;
>> -
>> -  child = fork ();
>> -
>> -  switch (child) {
>> -  case -1:
>> -    perror("fork");
>> -    FAIL_EXIT1 ("Unable to fork");
>> -  case 0:
>> -    /* Child.  */
>> -    execlp ("rm", "rm", "-rf", path, NULL);
>> -    FAIL_EXIT1 ("exec rm: %m");
>> -  default:
>> -    /* Parent.  */
>> -    waitpid (child, &status, 0);
>> -    /* "rm" would have already printed a suitable error message.  */
>> -    if (! WIFEXITED (status)
>> -	|| WEXITSTATUS (status) != 0)
>> -      FAIL_EXIT1 ("exec child returned status: %d", status);
>> -
>> -    break;
>> -  }
>> +  int r = nftw (path, unlink_cb, 1000, FTW_DEPTH | FTW_PHYS);
>> +  if (r == -1)
>> +    FAIL_EXIT1 ("recursive_remove failed");
>>  }
>>
>>  /* Used for both rsync and the mytest.script "cp" command.  */
>
  
Siddhesh Poyarekar Sept. 28, 2023, 11:30 a.m. UTC | #3
On 2023-09-27 20:20, Adhemerval Zanella wrote:
> If the binary to run is 'env', test-container skips it and adds
> any required environment variables to the process environment.
> This simplifies the code required to spawn a new process (no need
> to build an env-like program).
> 
> However, this is an issue for recursive_remove if there is any
> LD_PRELOAD, since test-container will not prepend the loader command
> along with required paths.  If the required preloaded library can
> not be loaded by the system glibc, the 'post-clean rsync' will
> eventually fail.
> 
> One example is if system glibc does not support DT_RELR and the
> built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
> with:
> 
> ../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
> 86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
> rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
> found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
> 
> Instead of trying to figure out the loader arguments required
> to spawn 'rm -rf', replace the command with an nftw call.
> 
> Checked on x86_64-linux-gnu.
> ---

Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>

>   support/test-container.c | 34 +++++++++++-----------------------
>   1 file changed, 11 insertions(+), 23 deletions(-)
> 
> diff --git a/support/test-container.c b/support/test-container.c
> index 788b091ea0..95dfef1a99 100644
> --- a/support/test-container.c
> +++ b/support/test-container.c
> @@ -37,6 +37,7 @@
>   #include <errno.h>
>   #include <error.h>
>   #include <libc-pointer-arith.h>
> +#include <ftw.h>
>   
>   #ifdef __linux__
>   #include <sys/mount.h>
> @@ -405,32 +406,19 @@ file_exists (char *path)
>     return 0;
>   }
>   
> +static int
> +unlink_cb (const char *fpath, const struct stat *sb, int typeflag,
> +	   struct FTW *ftwbuf)
> +{
> +  return remove (fpath);
> +}
> +
>   static void
>   recursive_remove (char *path)
>   {
> -  pid_t child;
> -  int status;
> -
> -  child = fork ();
> -
> -  switch (child) {
> -  case -1:
> -    perror("fork");
> -    FAIL_EXIT1 ("Unable to fork");
> -  case 0:
> -    /* Child.  */
> -    execlp ("rm", "rm", "-rf", path, NULL);
> -    FAIL_EXIT1 ("exec rm: %m");
> -  default:
> -    /* Parent.  */
> -    waitpid (child, &status, 0);
> -    /* "rm" would have already printed a suitable error message.  */
> -    if (! WIFEXITED (status)
> -	|| WEXITSTATUS (status) != 0)
> -      FAIL_EXIT1 ("exec child returned status: %d", status);
> -
> -    break;
> -  }
> +  int r = nftw (path, unlink_cb, 1000, FTW_DEPTH | FTW_PHYS);
> +  if (r == -1)
> +    FAIL_EXIT1 ("recursive_remove failed");
>   }
>   
>   /* Used for both rsync and the mytest.script "cp" command.  */
  
Stefan Liebler Sept. 28, 2023, 11:44 a.m. UTC | #4
On 28.09.23 13:11, Adhemerval Zanella Netto wrote:
> 
> 
> On 28/09/23 06:40, Stefan Liebler wrote:
>> On 27.09.23 21:20, Adhemerval Zanella wrote:
>>> If the binary to run is 'env', test-container skips it and adds
>>> any required environment variables to the process environment.
>>> This simplifies the code required to spawn a new process (no need
>>> to build an env-like program).
>>>
>>> However, this is an issue for recursive_remove if there is any
>>> LD_PRELOAD, since test-container will not prepend the loader command
>>> along with required paths.  If the required preloaded library can
>>> not be loaded by the system glibc, the 'post-clean rsync' will
>>> eventually fail.
>>>
>>> One example is if system glibc does not support DT_RELR and the
>>> built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
>>> with:
>>>
>>> ../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
>>> 86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
>>> rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
>>> found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
>>>
>>> Instead of trying to figure out the loader arguments required
>>> to spawn 'rm -rf', replace the command with an nftw call.
>>>
>> Just for information, I have also run into this issue. For me (on
>> x86_64/s390x), I got "*** stack smashing detected ***: terminated" in
>> this rm due to preloading the freshly built libc_malloc_debug.so.
>> In malloc_hook_ini -> generic_hook_ini -> initialize_malloc_check, we have:
>> TUNABLE_GET (check, int32_t, TUNABLE_CALLBACK (set_mallopt_check));
>> In __tunable_get_val, the tunable ID for glibc.malloc.check differs
>> between the current build and the system glibc. In my case it is 30,
>> which is beyond the range of the system's tunable_list[]. This leads to
>> storing an 8-byte value to the passed valp. Unfortunately, it points to a
>> 4-byte int32_t, and the other 4 bytes belong to the stack smashing canary.
> 
> Indeed, the tunables might be another problem; thanks for checking it.
> 
>>
>> Your patch solves the issue for me.
> 
> Is it a reviewed-by ;) ?
Yes, the patch is fine.
Reviewed-by: Stefan Liebler <stli@linux.ibm.com>
> 
>>> Checked on x86_64-linux-gnu.
>>> ---
>>>  support/test-container.c | 34 +++++++++++-----------------------
>>>  1 file changed, 11 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/support/test-container.c b/support/test-container.c
>>> index 788b091ea0..95dfef1a99 100644
>>> --- a/support/test-container.c
>>> +++ b/support/test-container.c
>>> @@ -37,6 +37,7 @@
>>>  #include <errno.h>
>>>  #include <error.h>
>>>  #include <libc-pointer-arith.h>
>>> +#include <ftw.h>
>>>
>>>  #ifdef __linux__
>>>  #include <sys/mount.h>
>>> @@ -405,32 +406,19 @@ file_exists (char *path)
>>>    return 0;
>>>  }
>>>
>>> +static int
>>> +unlink_cb (const char *fpath, const struct stat *sb, int typeflag,
>>> +	   struct FTW *ftwbuf)
>>> +{
>>> +  return remove (fpath);
>>> +}
>>> +
>>>  static void
>>>  recursive_remove (char *path)
>>>  {
>>> -  pid_t child;
>>> -  int status;
>>> -
>>> -  child = fork ();
>>> -
>>> -  switch (child) {
>>> -  case -1:
>>> -    perror("fork");
>>> -    FAIL_EXIT1 ("Unable to fork");
>>> -  case 0:
>>> -    /* Child.  */
>>> -    execlp ("rm", "rm", "-rf", path, NULL);
>>> -    FAIL_EXIT1 ("exec rm: %m");
>>> -  default:
>>> -    /* Parent.  */
>>> -    waitpid (child, &status, 0);
>>> -    /* "rm" would have already printed a suitable error message.  */
>>> -    if (! WIFEXITED (status)
>>> -	|| WEXITSTATUS (status) != 0)
>>> -      FAIL_EXIT1 ("exec child returned status: %d", status);
>>> -
>>> -    break;
>>> -  }
>>> +  int r = nftw (path, unlink_cb, 1000, FTW_DEPTH | FTW_PHYS);
>>> +  if (r == -1)
>>> +    FAIL_EXIT1 ("recursive_remove failed");
>>>  }
>>>
>>>  /* Used for both rsync and the mytest.script "cp" command.  */
>>
  
Andreas K. Huettel Oct. 1, 2023, 5:27 p.m. UTC | #5
> 
> One example is if system glibc does not support DT_RELR and the
> built glibc does, the nss/tst-nss-gai-hv2-canonname test fails
> with:
> 
> ../scripts/evaluate-test.sh nss/tst-nss-gai-hv2-canonname $? false false
> 86_64-linux-gnu/nss/tst-nss-gai-hv2-canonname.test-result
> rm: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_ABI_DT_RELR' not
> found (required by x86_64-linux-gnu/malloc/libc_malloc_debug.so)
> 
> Instead of trying to figure out the loader arguments required
> to spawn 'rm -rf', replace the command with an nftw call.
> 

Great. Ran into this and was just checking if somebody else... :)
  

Patch

diff --git a/support/test-container.c b/support/test-container.c
index 788b091ea0..95dfef1a99 100644
--- a/support/test-container.c
+++ b/support/test-container.c
@@ -37,6 +37,7 @@ 
 #include <errno.h>
 #include <error.h>
 #include <libc-pointer-arith.h>
+#include <ftw.h>
 
 #ifdef __linux__
 #include <sys/mount.h>
@@ -405,32 +406,19 @@  file_exists (char *path)
   return 0;
 }
 
+static int
+unlink_cb (const char *fpath, const struct stat *sb, int typeflag,
+	   struct FTW *ftwbuf)
+{
+  return remove (fpath);
+}
+
 static void
 recursive_remove (char *path)
 {
-  pid_t child;
-  int status;
-
-  child = fork ();
-
-  switch (child) {
-  case -1:
-    perror("fork");
-    FAIL_EXIT1 ("Unable to fork");
-  case 0:
-    /* Child.  */
-    execlp ("rm", "rm", "-rf", path, NULL);
-    FAIL_EXIT1 ("exec rm: %m");
-  default:
-    /* Parent.  */
-    waitpid (child, &status, 0);
-    /* "rm" would have already printed a suitable error message.  */
-    if (! WIFEXITED (status)
-	|| WEXITSTATUS (status) != 0)
-      FAIL_EXIT1 ("exec child returned status: %d", status);
-
-    break;
-  }
+  int r = nftw (path, unlink_cb, 1000, FTW_DEPTH | FTW_PHYS);
+  if (r == -1)
+    FAIL_EXIT1 ("recursive_remove failed");
 }
 
 /* Used for both rsync and the mytest.script "cp" command.  */
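
For readers who want to try the technique outside the glibc tree, here is a minimal standalone sketch of an nftw-based recursive removal. It is not the committed code: `remove_cb`, `remove_tree`, and `make_sample_tree` are hypothetical names, the descriptor limit of 16 is an arbitrary choice (the patch passes 1000), and errors are returned to the caller instead of going through FAIL_EXIT1.

```c
/* nftw is an XSI interface; request it before including any header.  */
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Called by nftw for every entry.  remove () unlinks files and
   removes empty directories.  */
static int
remove_cb (const char *fpath, const struct stat *sb, int typeflag,
           struct FTW *ftwbuf)
{
  (void) sb;
  (void) typeflag;
  (void) ftwbuf;
  if (remove (fpath) != 0)
    {
      perror (fpath);
      return -1;  /* A non-zero return stops the walk.  */
    }
  return 0;
}

/* Hypothetical helper: delete PATH and everything below it.
   FTW_DEPTH visits a directory's contents before the directory itself;
   FTW_PHYS does not follow symbolic links.
   Returns 0 on success, non-zero on failure.  */
int
remove_tree (const char *path)
{
  return nftw (path, remove_cb, 16, FTW_DEPTH | FTW_PHYS);
}

/* Hypothetical demo helper: create TMPL/dir/file, where TMPL is a
   writable mkdtemp template such as "/tmp/demo_XXXXXX".  */
void
make_sample_tree (char *tmpl)
{
  char buf[256];

  if (mkdtemp (tmpl) == NULL)
    abort ();
  snprintf (buf, sizeof buf, "%s/dir", tmpl);
  if (mkdir (buf, 0700) != 0)
    abort ();
  snprintf (buf, sizeof buf, "%s/dir/file", tmpl);
  FILE *f = fopen (buf, "w");
  if (f == NULL)
    abort ();
  fclose (f);
  assert (access (buf, F_OK) == 0);
}
```

Because the walk runs in the calling process, no external 'rm' is spawned, so LD_PRELOAD and the system loader never come into play.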