Message ID | 20180425192637.5977-1-tuliom@linux.ibm.com |
---|---|
State | Superseded |
Delegated to: | Carlos O'Donell |
Headers |
Received: (qmail 39200 invoked by alias); 25 Apr 2018 19:27:12 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 39188 invoked by uid 89); 25 Apr 2018 19:27:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-27.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx0a-001b2d01.pphosted.com From: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> To: "Carlos O'Donell" <carlos@redhat.com>, libc-alpha@sourceware.org, szabolcs.nagy@arm.com, Florian Weimer <fweimer@redhat.com> Subject: [PATCHv2] Increase robustness of internal dlopen() by using RTLD_NOW [BZ #22766] Date: Wed, 25 Apr 2018 16:26:37 -0300 In-Reply-To: <72bf4b7b-3129-21a7-af89-24aea13c0b04@redhat.com> References: <72bf4b7b-3129-21a7-af89-24aea13c0b04@redhat.com> X-TM-AS-GCONF: 00 x-cbid: 18042519-0024-0000-0000-0000184B11CC X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00008920; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000257; SDB=6.01023186; UDB=6.00522311; IPR=6.00802419; MB=3.00020780; MTD=3.00000008; XFM=3.00000015; UTC=2018-04-25 19:27:05 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18042519-0025-0000-0000-00004FAF2B8C Message-Id: <20180425192637.5977-1-tuliom@linux.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-04-25_06:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1011 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1804250177 |
Commit Message
Tulio Magno Quites Machado Filho
April 25, 2018, 7:26 p.m. UTC
Carlos O'Donell <carlos@redhat.com> writes: > On 04/25/2018 07:42 AM, Tulio Magno Quites Machado Filho wrote: >> Prevent random runtime crashes due to missing symbols caused by mixed >> libnss_* versions. > > You are arguing that this should fail at the first call into the NSS > service, which causes the initial dlopen? That is OK with me. Yes. I'm agreeing with Szabolcs' proposal [1]. [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22766 > An even further patch would be to load all NSS modules early, and that's > something Florian and I have discussed. However, I'm not recommending > you do that now, but I wanted to be clear about a direction that this > code might take. Ack. Makes sense. > We should cleanup more. > > sysdeps/gnu/unwind-resume.c: handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); > sysdeps/nptl/unwind-forcedunwind.c: handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); Done. > Since we are now using RTLD_NOW: > > * Switch back to __libc_dlopen (effectively reverting part of > Florian's 08c6e95234c, but leave the comments) Done. > * Add a big comment in dlfcn.h to explain why RTLD_NOW is needed: > * Same comment as in commit 08c6e95234c, plus NSS case. Done. --- 8< --- Prevent random runtime crashes due to missing symbols caused by mixed libnss_* versions. 2018-04-25 Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> [BZ #22766] * include/dlfcn.h [__libc_dl_open]: Replace RTLD_LAZY with RTLD_NOW. * sysdeps/gnu/unwind-resume.c (__lib_gcc_s_init): Replace __libc_dlopen_mode() using RTLD_NOW with __libc_dlopen. * sysdeps/nptl/unwind-forcedunwind.c: Likewise. Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> --- include/dlfcn.h | 7 ++++++- sysdeps/gnu/unwind-resume.c | 2 +- sysdeps/nptl/unwind-forcedunwind.c | 3 ++- 3 files changed, 9 insertions(+), 3 deletions(-)
Comments
On 04/25/2018 03:26 PM, Tulio Magno Quites Machado Filho wrote: > Carlos O'Donell <carlos@redhat.com> writes: > >> On 04/25/2018 07:42 AM, Tulio Magno Quites Machado Filho wrote: >>> Prevent random runtime crashes due to missing symbols caused by mixed >>> libnss_* versions. >> >> You are arguing that this should fail at the first call into the NSS >> service, which causes the initial dlopen? That is OK with me. > > Yes. I'm agreeing with Szabolcs' proposal [1]. > > [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22766 > >> An even further patch would be to load all NSS modules early, and that's >> something Florian and I have discussed. However, I'm not recommending >> you do that now, but I wanted to be clear about a direction that this >> code might take. > > Ack. Makes sense. > >> We should cleanup more. >> >> sysdeps/gnu/unwind-resume.c: handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); >> sysdeps/nptl/unwind-forcedunwind.c: handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); > > Done. > >> Since we are now using RTLD_NOW: >> >> * Switch back to __libc_dlopen (effectively reverting part of >> Florian's 08c6e95234c, but leave the comments) > > Done. > >> * Add a big comment in dlfcn.h to explain why RTLD_NOW is needed: >> * Same comment as in commit 08c6e95234c, plus NSS case. > > Done. > > --- 8< --- > > Prevent random runtime crashes due to missing symbols caused by mixed > libnss_* versions. > > 2018-04-25 Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> > > [BZ #22766] > * include/dlfcn.h [__libc_dl_open]: Replace RTLD_LAZY with RTLD_NOW. > * sysdeps/gnu/unwind-resume.c (__lib_gcc_s_init): Replace > __libc_dlopen_mode() using RTLD_NOW with __libc_dlopen. > * sysdeps/nptl/unwind-forcedunwind.c: Likewise. > > Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> OK with the changes below. Reviewed-by: Carlos O'Donell <carlos@redhat.com> > --- > include/dlfcn.h | 7 ++++++- > sysdeps/gnu/unwind-resume.c | 2 +- > sysdeps/nptl/unwind-forcedunwind.c | 3 ++- > 3 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/include/dlfcn.h b/include/dlfcn.h > index 12ef913e19..558c9bfe17 100644 > --- a/include/dlfcn.h > +++ b/include/dlfcn.h > @@ -31,8 +31,13 @@ extern char **__libc_argv attribute_hidden; > > /* Now define the internal interfaces. */ > > +/* Use RTLD_NOW here because: > + 1. For consistency between __libgcc_s_init and pthread_cancel_init; > + 2. It allows libnss_* to provide a fallback code at dlopen() time in order > + to prevent missing symbols caused by mixed glibc versions. Using > + RTLD_LAZY here could cause random failures at runtime. */ Suggest: /* Use RTLD_NOW here because: 1. In pthread_cancel_init we want to use RTLD_NOW to reduce the stack usage of future cancellation operations, particularly when the target thread is running with a small stack. Likewise for consistency we do the same thing in __libgcc_s_init. RTLD_NOW will rarely make a difference for __libgcc_s_init because unwinding is already in progress, so libgcc_s.so has already been loaded if its unwinder is used (Bug 22636). 2. It allows us to provide robust fallback code at dlopen time for incorrectly configured systems that mix old libnss_* modules with newly installed libraries e.g. old libnss_nis.so.2 with new libnsl.so.1. Using RTLD_LAZY here causes a failure at the time the symbol is called and at that point it is much harder to safely return an error (Bug 22766). The use of RTLD_NOW also impacts gconv module loading, backtracing (where the unwinder form libgcc_s.so is used), and IDNA functions (which load libidn), all of which load their respective DSOs on demand, and so should not impact program startup. That is to say that the DSOs are loaded as part of an API call and therefore we will be calling that family of API functions shortly so RTLD_NOW or RTLD_LAZY is not a big difference in performance, but RTLD_NOW has better error handling semantics for the library. */ > #define __libc_dlopen(name) \ > - __libc_dlopen_mode (name, RTLD_LAZY | __RTLD_DLOPEN) > + __libc_dlopen_mode (name, RTLD_NOW | __RTLD_DLOPEN) > extern void *__libc_dlopen_mode (const char *__name, int __mode); > extern void *__libc_dlsym (void *__map, const char *__name); > extern void *__libc_dlvsym (void *map, const char *name, const char *version); > diff --git a/sysdeps/gnu/unwind-resume.c b/sysdeps/gnu/unwind-resume.c > index 7f9a1bf2c7..72a04c7969 100644 > --- a/sysdeps/gnu/unwind-resume.c > +++ b/sysdeps/gnu/unwind-resume.c > @@ -39,7 +39,7 @@ __libgcc_s_init (void) > RTLD_NOW will rarely make a difference here because unwinding is > already in progress, so libgcc_s.so has already been loaded if > its unwinder is used. */ ^^^ This should be adjusted to: /* See include/dlfcn.h. Use of __libc_dlopen requires RTLD_NOW. */ > - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); > + handle = __libc_dlopen (LIBGCC_S_SO); > > if (handle == NULL > || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL > diff --git a/sysdeps/nptl/unwind-forcedunwind.c b/sysdeps/nptl/unwind-forcedunwind.c > index 67b8e74b53..dbd8d66cda 100644 > --- a/sysdeps/nptl/unwind-forcedunwind.c > +++ b/sysdeps/nptl/unwind-forcedunwind.c > @@ -49,7 +49,8 @@ pthread_cancel_init (void) > return; > } > > - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); > + /* __libgcc_s_init depends on RTLD_NOW being used here. */ Suggest: /* See include/dlfcn.h. Use of __libc_dlopen requires RTLD_NOW. */ > + handle = __libc_dlopen (LIBGCC_S_SO); > > if (handle == NULL > || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL >
Carlos O'Donell <carlos@redhat.com> writes: > On 04/25/2018 03:26 PM, Tulio Magno Quites Machado Filho wrote: >> >> Prevent random runtime crashes due to missing symbols caused by mixed >> libnss_* versions. >> >> 2018-04-25 Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> >> >> [BZ #22766] >> * include/dlfcn.h [__libc_dl_open]: Replace RTLD_LAZY with RTLD_NOW. >> * sysdeps/gnu/unwind-resume.c (__lib_gcc_s_init): Replace >> __libc_dlopen_mode() using RTLD_NOW with __libc_dlopen. >> * sysdeps/nptl/unwind-forcedunwind.c: Likewise. >> >> Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> > > OK with the changes below. > > Reviewed-by: Carlos O'Donell <carlos@redhat.com> Ack. >> diff --git a/include/dlfcn.h b/include/dlfcn.h >> index 12ef913e19..558c9bfe17 100644 >> --- a/include/dlfcn.h >> +++ b/include/dlfcn.h >> @@ -31,8 +31,13 @@ extern char **__libc_argv attribute_hidden; >> >> /* Now define the internal interfaces. */ >> >> +/* Use RTLD_NOW here because: >> + 1. For consistency between __libgcc_s_init and pthread_cancel_init; >> + 2. It allows libnss_* to provide a fallback code at dlopen() time in order >> + to prevent missing symbols caused by mixed glibc versions. Using >> + RTLD_LAZY here could cause random failures at runtime. */ > > Suggest: > > /* Use RTLD_NOW here because: > 1. In pthread_cancel_init we want to use RTLD_NOW to reduce the stack usage > of future cancellation operations, particularly when the target thread > is running with a small stack. Likewise for consistency we do the same > thing in __libgcc_s_init. RTLD_NOW will rarely make a difference for > __libgcc_s_init because unwinding is already in progress, so libgcc_s.so > has already been loaded if its unwinder is used (Bug 22636). > 2. It allows us to provide robust fallback code at dlopen time for incorrectly > configured systems that mix old libnss_* modules with newly installed > libraries e.g. old libnss_nis.so.2 with new libnsl.so.1. Using RTLD_LAZY > here causes a failure at the time the symbol is called and at that point > it is much harder to safely return an error (Bug 22766). > > The use of RTLD_NOW also impacts gconv module loading, backtracing (where the > unwinder form libgcc_s.so is used), and IDNA functions (which load libidn), all > of which load their respective DSOs on demand, and so should not impact program > startup. That is to say that the DSOs are loaded as part of an API call and > therefore we will be calling that family of API functions shortly so RTLD_NOW or > RTLD_LAZY is not a big difference in performance, but RTLD_NOW has better error > handling semantics for the library. */ Much better! :-D >> #define __libc_dlopen(name) \ >> - __libc_dlopen_mode (name, RTLD_LAZY | __RTLD_DLOPEN) >> + __libc_dlopen_mode (name, RTLD_NOW | __RTLD_DLOPEN) >> extern void *__libc_dlopen_mode (const char *__name, int __mode); >> extern void *__libc_dlsym (void *__map, const char *__name); >> extern void *__libc_dlvsym (void *map, const char *name, const char *version); >> diff --git a/sysdeps/gnu/unwind-resume.c b/sysdeps/gnu/unwind-resume.c >> index 7f9a1bf2c7..72a04c7969 100644 >> --- a/sysdeps/gnu/unwind-resume.c >> +++ b/sysdeps/gnu/unwind-resume.c >> @@ -39,7 +39,7 @@ __libgcc_s_init (void) >> RTLD_NOW will rarely make a difference here because unwinding is >> already in progress, so libgcc_s.so has already been loaded if >> its unwinder is used. */ > > ^^^ This should be adjusted to: > > /* See include/dlfcn.h. Use of __libc_dlopen requires RTLD_NOW. */ Fixed. >> - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); >> + handle = __libc_dlopen (LIBGCC_S_SO); >> >> if (handle == NULL >> || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL >> diff --git a/sysdeps/nptl/unwind-forcedunwind.c b/sysdeps/nptl/unwind-forcedunwind.c >> index 67b8e74b53..dbd8d66cda 100644 >> --- a/sysdeps/nptl/unwind-forcedunwind.c >> +++ b/sysdeps/nptl/unwind-forcedunwind.c >> @@ -49,7 +49,8 @@ pthread_cancel_init (void) >> return; >> } >> >> - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); >> + /* __libgcc_s_init depends on RTLD_NOW being used here. */ > > Suggest: > > /* See include/dlfcn.h. Use of __libc_dlopen requires RTLD_NOW. */ Fixed. I'm pushing it. Thanks!
diff --git a/include/dlfcn.h b/include/dlfcn.h index 12ef913e19..558c9bfe17 100644 --- a/include/dlfcn.h +++ b/include/dlfcn.h @@ -31,8 +31,13 @@ extern char **__libc_argv attribute_hidden; /* Now define the internal interfaces. */ +/* Use RTLD_NOW here because: + 1. For consistency between __libgcc_s_init and pthread_cancel_init; + 2. It allows libnss_* to provide a fallback code at dlopen() time in order + to prevent missing symbols caused by mixed glibc versions. Using + RTLD_LAZY here could cause random failures at runtime. */ #define __libc_dlopen(name) \ - __libc_dlopen_mode (name, RTLD_LAZY | __RTLD_DLOPEN) + __libc_dlopen_mode (name, RTLD_NOW | __RTLD_DLOPEN) extern void *__libc_dlopen_mode (const char *__name, int __mode); extern void *__libc_dlsym (void *__map, const char *__name); extern void *__libc_dlvsym (void *map, const char *name, const char *version); diff --git a/sysdeps/gnu/unwind-resume.c b/sysdeps/gnu/unwind-resume.c index 7f9a1bf2c7..72a04c7969 100644 --- a/sysdeps/gnu/unwind-resume.c +++ b/sysdeps/gnu/unwind-resume.c @@ -39,7 +39,7 @@ __libgcc_s_init (void) RTLD_NOW will rarely make a difference here because unwinding is already in progress, so libgcc_s.so has already been loaded if its unwinder is used. */ - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); + handle = __libc_dlopen (LIBGCC_S_SO); if (handle == NULL || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL diff --git a/sysdeps/nptl/unwind-forcedunwind.c b/sysdeps/nptl/unwind-forcedunwind.c index 67b8e74b53..dbd8d66cda 100644 --- a/sysdeps/nptl/unwind-forcedunwind.c +++ b/sysdeps/nptl/unwind-forcedunwind.c @@ -49,7 +49,8 @@ pthread_cancel_init (void) return; } - handle = __libc_dlopen_mode (LIBGCC_S_SO, RTLD_NOW | __RTLD_DLOPEN); + /* __libgcc_s_init depends on RTLD_NOW being used here. */ + handle = __libc_dlopen (LIBGCC_S_SO); if (handle == NULL || (resume = __libc_dlsym (handle, "_Unwind_Resume")) == NULL