From patchwork Sun Jul 20 21:08:10 2014
X-Patchwork-Submitter: Philippe Waroquiers
X-Patchwork-Id: 2120
Subject: Re: pthread stack cache and valgrind helgrind false positive race error on tls variables
From: Philippe Waroquiers
To: Roland McGrath
Cc: libc-alpha@sourceware.org
In-Reply-To: <20140720000209.2B6322C3976@topped-with-meat.com>
References: <1405714819.2181.30.camel@soleil> <20140720000209.2B6322C3976@topped-with-meat.com>
Date: Sun, 20 Jul 2014 23:08:10 +0200
Message-ID: <1405890490.27752.49.camel@soleil>

On Sat, 2014-07-19 at 17:02 -0700, Roland McGrath wrote:
> Separate from the implementation details of the workaround you propose, I
> think we would like to understand more clearly what is going on to confuse
> helgrind.
I have attached the (under review) valgrind patch. Whatever better solution
we might have in a future glibc (see the discussion below), we still need a
solution for old/current glibc. Note that this patch applies to the valgrind
SVN trunk (it also contains the test program that shows the TLS false
positives).

> Though it's all unclear to me so far, I tend to suspect both
> that this workaround is not what we'd consider the ideal fix and that the
> underlying problem is likely to have other symptoms (for other parts of the
> code, other usage patterns, and/or other code outside of libc that rolls
> its own synchronization primitives).
Yes, helgrind has several tricks (kludges?) to avoid raising false positives
in various "low level" code:
* ld.so is not instrumented; instrumenting it would cause a huge number of
  false positives that would have to be (expensively) suppressed.
* the default valgrind error suppression file contains very general
  suppression entries, suppressing all races that have their top frame in
  libc or libpthread. Search for 'not very clever' or 'ugly' in valgrind's
  default.supp. Among other things, default.supp mentions that helgrind does
  not understand stdio locking. The 'not clever/ugly' is mostly because these
  suppressions can cause false negatives.
* several more precise suppression entries for various false positives in
  glibc and elsewhere.

The above techniques do not work for the false positives caused by the stack
cache/TLS internal structures, as (some of) the races appear in user code.
Also, pthread condition variables cannot be properly handled by helgrind.

For sure, it would be nice to have a general solution for all of that (or at
least a solution that decreases the number of kludges needed in helgrind).
An approach could be to add glibc-specific helgrind client requests
describing the low level synchronisation primitives (e.g. describing the
'create lock/lock/unlock' operations); a rough sketch of what such
annotations could look like is given below.
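To make that idea concrete, here is a minimal sketch of how a hand-rolled
lock could be described to helgrind, assuming the ANNOTATE_RWLOCK_* client
requests provided by valgrind's helgrind.h. The lock type and function names
are invented for the illustration; this is not a proposal for the actual
glibc code, just a picture of the kind of 'create lock/lock/unlock'
description meant above.

   #include <valgrind/helgrind.h>

   typedef struct { int held; } toy_lock_t;   /* made-up lock type */

   static void toy_lock_init(toy_lock_t *lk)
   {
      lk->held = 0;
      ANNOTATE_RWLOCK_CREATE(lk);       /* describe 'create lock' */
   }

   static void toy_lock_acquire(toy_lock_t *lk)
   {
      while (__sync_lock_test_and_set(&lk->held, 1))
         ;                              /* spin until we get the lock */
      ANNOTATE_RWLOCK_ACQUIRED(lk, 1);  /* describe 'lock' (1 = exclusive) */
   }

   static void toy_lock_release(toy_lock_t *lk)
   {
      ANNOTATE_RWLOCK_RELEASED(lk, 1);  /* describe 'unlock' */
      __sync_lock_release(&lk->held);
   }

Something equivalent would have to go into the glibc low level lock code
itself, which is where the cost discussed below comes from.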
This has some drawbacks:
* It introduces a few instructions in the 'hot path' of the low level glibc
  primitives. (This might imply having library versions compiled with and
  without these requests, which makes things more cumbersome for the end
  user, who then has to point at the right library when using helgrind.)
* Such annotations are valgrind specific. They will probably also work for
  the other valgrind race detector (drd), but there are other race detection
  tools which might have similar difficulties.
* It is not certain that all helgrind kludges can be eliminated by using
  these requests.

Someone on the valgrind developers mailing list suggested using specific
debug information to mark synchronisation primitives. Interesting idea, but
it is not clear if/how that can work (and that is for sure a very long term
solution).

In summary, having better support for race detection tools in glibc is
attractive, but does not look trivial. So, several things can be done:
* short term: the valgrind hack patch to disable the stack cache, to support
  old and current glibc
* when glibc provides a tuning infrastructure, replace the hack with a
  cleaner way to disable the cache
* longer term: see how glibc could "describe" its behaviour to race
  detection tools
* maybe at a somewhat shorter term, solve the problem of pthread condition
  variables, for which there is currently no known hack/kludge (see the
  helgrind user manual, section 7.5, for a description of the difficulty)

Thanks
Philippe

Index: coregrind/m_clientstate.c
===================================================================
--- coregrind/m_clientstate.c (revision 14176)
+++ coregrind/m_clientstate.c (working copy)
@@ -107,6 +107,8 @@
    VG_(get_StackTrace) in m_stacktrace.c for further info. */
 Addr VG_(client__dl_sysinfo_int80) = 0;
 
+/* Address of the (internal) glibc nptl pthread stack cache size. */
+SizeT* VG_(client__stack_cache_actsize) = 0;
 
 /*--------------------------------------------------------------------*/
 /*--- end ---*/
Index: coregrind/pub_core_clientstate.h
===================================================================
--- coregrind/pub_core_clientstate.h (revision 14176)
+++ coregrind/pub_core_clientstate.h (working copy)
@@ -93,6 +93,24 @@
 extern Addr VG_(client__dl_sysinfo_int80);
 
+/* glibc pthread systems only, when the sim hint
+   deactivate-pthread-stack-cache-via-hack is given.
+   Used for a (hacky) way to disable the cache of stacks as implemented
+   in nptl glibc.
+   Based on internal knowledge of the pthread glibc code:
+   a huge value in stack_cache_actsize (bigger than the constant
+   stack_cache_maxsize) makes glibc believe the cache is full,
+   so stacks are always released when a pthread terminates.
+   See glibc nptl/allocatestack.c.
+   Several ugly aspects of this hack:
+   * it hardcodes the private glibc var name "stack_cache_actsize"
+     (and relies on the value of the constant stack_cache_maxsize)
+   * it is based on knowledge of the code of the functions
+     queue_stack and __free_stacks
+   * the static symbol "stack_cache_actsize" must be present in
+     the symbol table of libpthread.
+   It would be much cleaner to have a documented and supported
+   way to disable the pthread stack cache.
+*/
+extern SizeT* VG_(client__stack_cache_actsize);
+
 #endif   // __PUB_CORE_CLIENTSTATE_H
 
 /*--------------------------------------------------------------------*/
Index: coregrind/m_options.c
===================================================================
--- coregrind/m_options.c (revision 14176)
+++ coregrind/m_options.c (working copy)
@@ -120,6 +120,7 @@
 const HChar* VG_(clo_req_tsyms)[VG_CLO_MAX_REQ_TSYMS];
 HChar* VG_(clo_require_text_symbol) = NULL;
 Bool   VG_(clo_run_libc_freeres) = True;
+Bool   VG_(sim_hint_deactivate_pthread_stack_cache_via_hack) = False;
 Bool   VG_(clo_track_fds)      = False;
 Bool   VG_(clo_show_below_main)= False;
 Bool   VG_(clo_show_emwarns)   = False;
Index: coregrind/pub_core_options.h
===================================================================
--- coregrind/pub_core_options.h (revision 14176)
+++ coregrind/pub_core_options.h (working copy)
@@ -279,6 +279,15 @@
    cannot be overridden from the command line. */
 extern Bool VG_(clo_run_libc_freeres);
 
+/* Should we disable glibc pthread stack cache (via a hack) ?
+   Currently, we can only disable it via a hack. Maybe one day,
+   glibc will provide a clean way to disable it.
+   => using a Bool for now, but might become an 'enum' then.
+   The boolean is set to True if deactivate-pthread-stack-cache-via-hack
+   is given in --sim-hints. */
+extern Bool VG_(sim_hint_deactivate_pthread_stack_cache_via_hack);
+
+
 /* Should we show VEX emulation warnings?  Default: NO */
 extern Bool VG_(clo_show_emwarns);
Index: coregrind/m_redir.c
===================================================================
--- coregrind/m_redir.c (revision 14176)
+++ coregrind/m_redir.c (working copy)
@@ -403,6 +403,10 @@
    Bool         check_ppcTOCs = False;
    Bool         isText;
    const HChar* newdi_soname;
+   Bool         dehacktivate_pthread_stack_cache_var_search = False;
+   const HChar* const pthread_soname = "libpthread.so.0";
+   const HChar* const pthread_stack_cache_actsize_varname
+      = "stack_cache_actsize";
 
 #  if defined(VG_PLAT_USES_PPCTOC)
    check_ppcTOCs = True;
@@ -497,6 +501,10 @@
 
    specList = NULL; /* the spec list we're building up */
 
+   dehacktivate_pthread_stack_cache_var_search =
+      VG_(sim_hint_deactivate_pthread_stack_cache_via_hack)
+      && 0 == VG_(strcmp)(newdi_soname, pthread_soname);
+
    nsyms = VG_(DebugInfo_syms_howmany)( newdi );
    for (i = 0; i < nsyms; i++) {
       VG_(DebugInfo_syms_getidx)( newdi, i, &sym_addr, &sym_toc,
@@ -513,8 +521,22 @@
                                  demangled_fnpatt, N_DEMANGLED,
                                  &isWrap, &becTag, &becPrio );
 
       /* ignore data symbols */
-      if (!isText)
+      if (!isText) {
+         /* But search for dehacktivate stack cache var if needed. */
+         if (dehacktivate_pthread_stack_cache_var_search
+             && 0 == VG_(strcmp)(*names,
+                                 pthread_stack_cache_actsize_varname)) {
+            if ( VG_(clo_verbosity) > 1 ) {
+               VG_(message)( Vg_DebugMsg,
+                             "deactivate pthread_stack_cache via hack:"
+                             " found symbol %s at addr %p\n",
+                             *names, (void*) sym_addr);
+            }
+            VG_(client__stack_cache_actsize) = (SizeT*) sym_addr;
+            dehacktivate_pthread_stack_cache_var_search = False;
+         }
          continue;
+      }
       if (!ok) {
          /* It's not a full-scale redirect, but perhaps it is a load-notify
            fn?  Let the load-notify department see it.
         */
 
@@ -589,6 +611,13 @@
          }
          free_symname_array(names_init, &twoslots[0]);
       }
+      if (dehacktivate_pthread_stack_cache_var_search) {
+         VG_(message)(Vg_DebugMsg,
+                      "WARNING: could not find symbol for var %s in %s\n",
+                      pthread_stack_cache_actsize_varname, pthread_soname);
+         VG_(message)(Vg_DebugMsg,
+                      "=> pthread stack cache cannot be disabled!\n");
+      }
 
       if (check_ppcTOCs) {
          for (i = 0; i < nsyms; i++) {
Index: coregrind/m_main.c
===================================================================
--- coregrind/m_main.c (revision 14176)
+++ coregrind/m_main.c (working copy)
@@ -175,7 +175,8 @@
 "    --vgdb-prefix=<prefix>    prefix for vgdb FIFOs [%s]\n"
 "    --run-libc-freeres=no|yes free up glibc memory at exit on Linux? [yes]\n"
 "    --sim-hints=hint1,hint2,...  known hints:\n"
-"        lax-ioctls, enable-outer, fuse-compatible [none]\n"
+"        deactivate-pthread-stack-cache-via-hack, lax-ioctls, enable-outer,\n"
+"        no-inner-prefix, fuse-compatible [none]\n"
 "    --fair-sched=no|yes|try   schedule threads fairly on multicore systems [no]\n"
 "    --kernel-variant=variant1,variant2,...  known variants: bproc [none]\n"
 "                              handle non-standard kernel variants\n"
@@ -376,7 +377,12 @@
       // Set up VG_(clo_sim_hints). This is needed a.o. for an inner
       // running in an outer, to have "no-inner-prefix" enabled
       // as early as possible.
-      else if VG_STR_CLO (str, "--sim-hints", VG_(clo_sim_hints)) {}
+      else if VG_STR_CLO (str, "--sim-hints", VG_(clo_sim_hints)) {
+         /* Set initial value if hint selected. */
+         VG_(sim_hint_deactivate_pthread_stack_cache_via_hack) =
+            0 != VG_(strstr)(VG_(clo_sim_hints),
+                             "deactivate-pthread-stack-cache-via-hack");
+      }
    }
 }
@@ -563,6 +569,7 @@
    else if VG_BOOL_CLO(arg, "--show-emwarns",     VG_(clo_show_emwarns)) {}
    else if VG_BOOL_CLO(arg, "--run-libc-freeres", VG_(clo_run_libc_freeres)) {}
+   else if VG_XACT_CLO(arg, "--vgdb=yes",         VG_(clo_vgdb), Vg_VgdbYes) {}
    else if VG_BOOL_CLO(arg, "--show-below-main",  VG_(clo_show_below_main)) {}
    else if VG_BOOL_CLO(arg, "--time-stamp",       VG_(clo_time_stamp)) {}
    else if VG_BOOL_CLO(arg, "--track-fds",        VG_(clo_track_fds)) {}
Index: coregrind/m_scheduler/scheduler.c
===================================================================
--- coregrind/m_scheduler/scheduler.c (revision 14176)
+++ coregrind/m_scheduler/scheduler.c (working copy)
@@ -61,14 +61,15 @@
 #include "pub_core_basics.h"
 #include "pub_core_debuglog.h"
 #include "pub_core_vki.h"
-#include "pub_core_vkiscnums.h"  // __NR_sched_yield
-#include "pub_core_libcsetjmp.h" // to keep _threadstate.h happy
+#include "pub_core_vkiscnums.h"    // __NR_sched_yield
+#include "pub_core_libcsetjmp.h"   // to keep _threadstate.h happy
 #include "pub_core_threadstate.h"
+#include "pub_core_clientstate.h"
 #include "pub_core_aspacemgr.h"
-#include "pub_core_clreq.h"      // for VG_USERREQ__*
+#include "pub_core_clreq.h"        // for VG_USERREQ__*
 #include "pub_core_dispatch.h"
-#include "pub_core_errormgr.h"   // For VG_(get_n_errs_found)()
-#include "pub_core_gdbserver.h"  // for VG_(gdbserver) and VG_(gdbserver_activity)
+#include "pub_core_errormgr.h"     // For VG_(get_n_errs_found)()
+#include "pub_core_gdbserver.h"    // for VG_(gdbserver)/VG_(gdbserver_activity)
 #include "pub_core_libcbase.h"
 #include "pub_core_libcassert.h"
 #include "pub_core_libcprint.h"
@@ -1579,6 +1580,29 @@
    if (VG_(clo_trace_sched))
       print_sched_event(tid, "exiting VG_(scheduler)");
 
+   if (VG_(sim_hint_deactivate_pthread_stack_cache_via_hack)) {
+      if (VG_(client__stack_cache_actsize)) {
+         if (*VG_(client__stack_cache_actsize) == 0) {
+            /* We activate the stack cache disabling just before the first
+               thread exits. At this moment, we are sure the pthread lib
+               loading is terminated and the variable was initialised by
+               the pthread lib. */
+            VG_(debugLog)(1,"sched",
+                          "activating pthread stack cache size disabling"
+                          " via hack\n");
+            *VG_(client__stack_cache_actsize) = 1000 * 1000 * 1000;
+            /* Set a value big enough to be above the hardcoded maximum
+               stack cache size in glibc, small enough to allow a pthread
+               stack size to be added without risk of overflow. */
+         }
+      } else {
+         VG_(debugLog)(0,"sched",
+                       "WARNING: pthread stack cache cannot be disabled!\n");
+         VG_(sim_hint_deactivate_pthread_stack_cache_via_hack) = False;
+         // Assign False to avoid having a msg for each thread termination.
+      }
+   }
+
    vg_assert(VG_(is_exiting)(tid));
 
    return tst->exitreason;
Index: docs/xml/manual-core.xml
===================================================================
--- docs/xml/manual-core.xml (revision 14176)
+++ docs/xml/manual-core.xml (working copy)
@@ -1958,6 +1958,39 @@
    are enabled.  Use with caution!  Currently known hints are:
 
+      deactivate-pthread-stack-cache-via-hack:
+
+      This hint is only relevant when running Valgrind on Linux.
+
+      The GNU glibc pthread library (libpthread.so), which is used by
+      all pthread programs, maintains a cache of pthread stacks.
+      When a pthread terminates, the memory used for the pthread stack
+      and some thread local storage related data structures is not
+      always directly released.  This memory is kept in a cache (up to
+      a certain size), and is re-used if a new thread is started.
+
+      This cache causes the helgrind tool to report some false
+      positive race condition errors on this cached memory, as
+      helgrind does not understand the internal pthread cache
+      synchronisation.  So, when using helgrind, disabling the pthread
+      stack cache helps to avoid false positive race conditions, in
+      particular when using thread local storage variables (e.g.
+      variables using the __thread qualifier).
+
+      Disabling the cache might also help applications that are
+      running out of memory under Valgrind.
+
+      Note: Valgrind disables the cache using a hack based on some
+      internal knowledge of the glibc stack cache implementation and
+      on examining the debug information of the pthread library.
+      This technique is thus somewhat fragile and might not work,
+      depending on the glibc version.  This hack has been tested with
+      glibc versions 2.11 and 2.16.
+
     Be very lax about ioctl handling; the only assumption is that the
     size is correct.  Doesn't require the full buffer to be initialized
Index: helgrind/docs/hg-manual.xml
===================================================================
--- helgrind/docs/hg-manual.xml (revision 14176)
+++ helgrind/docs/hg-manual.xml (working copy)
@@ -972,6 +972,16 @@
 
+  If your application is using thread local variables, helgrind might
+  report false positive race conditions on these variables, despite
+  them being very probably race free.  On Linux, you can use the option
+  --sim-hints=deactivate-pthread-stack-cache-via-hack to avoid such
+  false positive error messages (see the description of this hint in
+  the Valgrind core manual).
+
+
 Round up all finished threads using pthread_join.
 Avoid detaching threads: don't create threads in the detached state, and
Index: helgrind/tests/tls_threads.stdout.exp
===================================================================
Index: helgrind/tests/tls_threads.c
===================================================================
--- helgrind/tests/tls_threads.c (revision 0)
+++ helgrind/tests/tls_threads.c (revision 0)
@@ -0,0 +1,109 @@
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef HAVE_TLS
+
+static int only_touch_stackvar;
+
+/* We should have no error on local and global
+   as these are both thread local variables. */
+static __thread int local;
+__thread int global;
+
+/* We will wrongly share this variable indirectly through a pointer.
+   We should have an error for this. */
+static __thread int badly_shared_local;
+
+/* ptr_to_badly_shared_local allows multiple threads to see
+   the same thread local storage.  This is however really bad sharing,
+   as this can cause a SEGV or whatever: when the thread disappears,
+   the badly_shared_local var pointed to can also disappear.
+   By default, the regtest does not test this really bad sharing. */
+pthread_mutex_t protect_ptr_to_badly_shared_local = PTHREAD_MUTEX_INITIALIZER;
+int *ptr_to_badly_shared_local;
+
+static void local_false_positive(void)
+{
+   local = local + 1; // no error is expected
+}
+
+static void global_false_positive(void)
+{
+   global = global + 1; // no error is expected
+}
+
+static void badly_shared_local_error_expected(void)
+{
+   *ptr_to_badly_shared_local = *ptr_to_badly_shared_local + 1;
+   // an error is expected above; this can also cause a SIGSEGV.
+}
+
+static void *level2(void *p)
+{
+   int stackvar = 0;
+
+   stackvar = stackvar + only_touch_stackvar;
+
+   local_false_positive();
+   global_false_positive();
+   if (only_touch_stackvar != 0) {
+      badly_shared_local_error_expected();
+   }
+   return 0;
+}
+
+#define NLEVEL2 10
+static void *level1(void *p)
+{
+   pthread_t threads[NLEVEL2];
+   int curthread = 0;
+   int i;
+
+   pthread_mutex_lock(&protect_ptr_to_badly_shared_local);
+   if (ptr_to_badly_shared_local == NULL)
+      ptr_to_badly_shared_local = &badly_shared_local;
+   pthread_mutex_unlock(&protect_ptr_to_badly_shared_local);
+
+   for (i = 0; i < NLEVEL2; i++) {
+      pthread_create(&threads[curthread++], NULL, level2, NULL);
+   }
+
+   for (i = 0; i < curthread; i++)
+      pthread_join(threads[i], NULL);
+
+   return 0;
+}
+
+#define NLEVEL1 3
+int main(int argc, char *argv[])
+{
+   pthread_t threads[NLEVEL1];
+   int curthread = 0;
+   int i;
+
+   only_touch_stackvar = argc > 1;
+
+   for (i = 0; i < NLEVEL1; i++) {
+      pthread_create(&threads[curthread++], NULL, level1, NULL);
+   }
+
+   fprintf(stderr, "starting join in main\n");
+   fflush(stderr);
+   for (i = 0; i < curthread; i++)
+      pthread_join(threads[i], NULL);
+   fprintf(stderr, "finished join in main\n");
+   fflush(stderr);
+   return 0;
+}
+#else
+int main()
+{
+   fprintf(stderr, "starting join in main\n");
+   fflush(stderr);
+   /* do nothing */
+   fprintf(stderr, "finished join in main\n");
+   fflush(stderr);
+   return 0;
+}
+#endif
Index: helgrind/tests/Makefile.am
===================================================================
--- helgrind/tests/Makefile.am (revision 14176)
+++ helgrind/tests/Makefile.am (working copy)
@@ -102,7 +102,9 @@
 	tc23_bogus_condwait.stderr.exp \
 	tc23_bogus_condwait.stderr.exp-mips32 \
 	tc24_nonzero_sem.vgtest tc24_nonzero_sem.stdout.exp \
-	tc24_nonzero_sem.stderr.exp
+	tc24_nonzero_sem.stderr.exp \
+	tls_threads.vgtest tls_threads.stdout.exp \
+	tls_threads.stderr.exp
 
 # XXX: tc18_semabuse uses operations that are unsupported on
 # Darwin.  It should be conditionally compiled like tc20_verifywrap is.
 
@@ -144,7 +146,8 @@
 	tc19_shadowmem \
 	tc21_pthonce \
 	tc23_bogus_condwait \
-	tc24_nonzero_sem
+	tc24_nonzero_sem \
+	tls_threads
 
 # DDD: it seg faults, and then the Valgrind exit path hangs
 # JRS 29 July 09: it craps out in the stack unwinder, in
Index: helgrind/tests/tls_threads.stderr.exp
===================================================================
--- helgrind/tests/tls_threads.stderr.exp (revision 0)
+++ helgrind/tests/tls_threads.stderr.exp (revision 0)
@@ -0,0 +1,2 @@
+starting join in main
+finished join in main
Index: helgrind/tests/tls_threads.vgtest
===================================================================
--- helgrind/tests/tls_threads.vgtest (revision 0)
+++ helgrind/tests/tls_threads.vgtest (revision 0)
@@ -0,0 +1,2 @@
+prog: tls_threads
+vgopts: -q --sim-hints=deactivate-pthread-stack-cache-via-hack
Index: none/tests/cmdline2.stdout.exp
===================================================================
--- none/tests/cmdline2.stdout.exp (revision 14176)
+++ none/tests/cmdline2.stdout.exp (working copy)
@@ -88,7 +88,8 @@
     --vgdb-prefix=<prefix>    prefix for vgdb FIFOs [/tmp/vgdb-pipe]
     --run-libc-freeres=no|yes free up glibc memory at exit on Linux? [yes]
     --sim-hints=hint1,hint2,...  known hints:
-        lax-ioctls, enable-outer, fuse-compatible [none]
+        deactivate-pthread-stack-cache-via-hack, lax-ioctls, enable-outer,
+        no-inner-prefix, fuse-compatible [none]
     --fair-sched=no|yes|try   schedule threads fairly on multicore systems [no]
     --kernel-variant=variant1,variant2,...  known variants: bproc [none]
                               handle non-standard kernel variants
Index: none/tests/cmdline1.stdout.exp
===================================================================
--- none/tests/cmdline1.stdout.exp (revision 14176)
+++ none/tests/cmdline1.stdout.exp (working copy)
@@ -88,7 +88,8 @@
     --vgdb-prefix=<prefix>    prefix for vgdb FIFOs [/tmp/vgdb-pipe]
     --run-libc-freeres=no|yes free up glibc memory at exit on Linux? [yes]
     --sim-hints=hint1,hint2,...  known hints:
-        lax-ioctls, enable-outer, fuse-compatible [none]
+        deactivate-pthread-stack-cache-via-hack, lax-ioctls, enable-outer,
+        no-inner-prefix, fuse-compatible [none]
     --fair-sched=no|yes|try   schedule threads fairly on multicore systems [no]
     --kernel-variant=variant1,variant2,...  known variants: bproc [none]
                               handle non-standard kernel variants
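For reference (an example, not part of the patch): with the patch applied,
the hint documented in the manual-core.xml entry above is enabled from the
command line roughly as follows; the program name here is just an example,
and the tls_threads regression test above passes the same option through
its .vgtest file:

   valgrind -q --tool=helgrind \
            --sim-hints=deactivate-pthread-stack-cache-via-hack ./tls_threads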