From patchwork Wed Mar 28 11:58:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stefan Liebler
X-Patchwork-Id: 26500
From: Stefan Liebler
Subject: Questions about failing testcase nptl/test-mutex-printers
To: GNU C Library
Cc: Martin Galvan, Andi Kleen, Adhemerval Zanella, tuliom@linux.vnet.ibm.com, raji@linux.vnet.ibm.com, Andreas Arnez
Date: Wed, 28 Mar 2018 13:58:14 +0200

Hi,

I'm running the testcase nptl/test-mutex-printers with all the needed prerequisites (built with
debug-info, python-pexpect, gdb, ...). The test passes if I start it without lock-elision. If I run it with lock-elision (export GLIBC_TUNABLES=glibc.elision.enable=1) on s390x, it fails with:

Error: Response does not match the expected pattern.
Command: print *mutex
Expected pattern: pthread_mutex_t
Response: No symbol "mutex" in current context.
(gdb)

The program built from test-mutex-printers.c is started in gdb via test-mutex-printers.py, but you don't need the script to reproduce the issue. Just start the test binary with gdb:
- gdb --args nptl/test-mutex-printers --direct
- (gdb) set environment GLIBC_TUNABLES glibc.elision.enable=1
- (gdb) b 79
- (gdb) r
  Now we break at line 79, " && pthread_mutex_lock (mutex) == 0 /* Test status (non-robust). */", just before the call of pthread_mutex_lock().
- "next" shows us the real issue:
  (gdb) n
  Program received signal SIGILL, Illegal instruction.
  0x000003fffdf14f1c in __lll_lock_elision (futex=0x3ffffffe9b0, adapt_count=0x3ffffffe9c6, private=0) at ../sysdeps/unix/sysv/linux/s390/elision-lock.c:59
  (gdb) disassemble
  Dump of assembler code for function __lll_lock_elision:
  ...
     0x000003fffdf14f12 <+170>: tbegin 0,65294
     0x000003fffdf14f18 <+176>: jne 0x3fffdf14f24 <__lll_lock_elision+188>
  => 0x000003fffdf14f1c <+180>: lhi %r0,0
     0x000003fffdf14f20 <+184>: j 0x3fffdf14f66 <__lll_lock_elision+254>
  ...
- Note: If we now follow the commands in the test-mutex-printers.py script, we get the same response:
  (gdb) print *mutex
  No symbol "mutex" in current context.

I'm currently using GNU gdb (GDB) Fedora 8.0.1-30.fc27. And yes, this is definitely a bug in gdb.

As I don't have access to other lock-elision enabled machines, can somebody test this on Power / Intel? What is the behaviour if you step over pthread_mutex_lock() with "next" while lock-elision is enabled? Does the transaction abort, so that we are debugging the fallback path (without a transaction), or does it stop just after pthread_mutex_unlock()?

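For anyone who wants to try this on Power or Intel without typing the session by hand, the manual steps above could be replayed in gdb batch mode. This is only a sketch: the helper name build_gdb_repro_command and its parameters are made up for illustration and are not part of the glibc test scripts, and I have not run it on other architectures. It relies only on the standard gdb options -batch, -ex and --args.

```python
import shlex


def build_gdb_repro_command(test_binary="nptl/test-mutex-printers",
                            breakpoint_line=79, elision=1):
    """Return an argv list that replays the manual gdb session in batch mode.

    Hypothetical helper: the defaults mirror the commands quoted in this mail.
    """
    gdb_commands = [
        # Enable (or disable) lock elision for the debugged process only.
        "set environment GLIBC_TUNABLES glibc.elision.enable={0}".format(elision),
        # Stop right before the pthread_mutex_lock() call in the test.
        "break {0}".format(breakpoint_line),
        "run",
        # Step over pthread_mutex_lock(); this is where SIGILL shows up on s390x.
        "next",
        "print *mutex",
    ]
    argv = ["gdb", "-batch"]
    for command in gdb_commands:
        argv += ["-ex", command]
    # --args must come last: everything after the binary is passed to it.
    argv += ["--args", test_binary, "--direct"]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in build_gdb_repro_command()))
```

Run from the build directory, the printed command line should stop at the same point as the interactive session; passing elision=0 gives the passing variant for comparison.
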
If I step manually to the tbegin instruction (which starts the transaction on s390x) and step over it, gdb steps over the whole transaction and we end up just after the tend instruction.

Does it make sense to disable lock-elision for the pretty-printer tests? E.g. with the following patch:

Bye.
Stefan

diff --git a/scripts/test_printers_common.py b/scripts/test_printers_common.py
index 73ca525556..d74a8b4d4b 100644
--- a/scripts/test_printers_common.py
+++ b/scripts/test_printers_common.py
@@ -171,6 +171,9 @@ def init_test(test_bin, printer_files, printer_names):
 
     # Finally, load the test binary.
     test('file {0}'.format(test_bin))
 
+    # Disable lock elision.
+    test('set environment GLIBC_TUNABLES glibc.elision.enable=0')
+
 def go_to_main():
     """Executes a gdb 'start' command, which takes us to main."""
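
One thing to keep in mind with the patch above: "set environment GLIBC_TUNABLES ..." replaces the whole variable, so any other tunables the caller had set are dropped for the debugged process. If that ever matters, the value could be merged instead. A minimal sketch, assuming the usual colon-separated name=value syntax of GLIBC_TUNABLES; the helper with_tunable is hypothetical and not part of test_printers_common.py:

```python
def with_tunable(current, name, value):
    """Return a GLIBC_TUNABLES string with name=value set.

    Other entries in `current` are kept; an existing setting of `name`
    is replaced.  Hypothetical helper, for illustration only.
    """
    # Split on ':' and ignore empty pieces (e.g. an unset or empty variable).
    entries = [entry for entry in current.split(":") if entry]
    # Drop any previous setting of the same tunable.
    kept = [entry for entry in entries if entry.split("=", 1)[0] != name]
    kept.append("{0}={1}".format(name, value))
    return ":".join(kept)


if __name__ == "__main__":
    import os
    merged = with_tunable(os.environ.get("GLIBC_TUNABLES", ""),
                          "glibc.elision.enable", 0)
    # The merged string could then be passed to gdb's "set environment".
    print("set environment GLIBC_TUNABLES {0}".format(merged))
```

For the pretty-printer tests the simple overwrite in the patch is probably enough; the merge is only worth it if the tests are expected to honour tunables from the calling environment.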