From patchwork Wed Apr 10 02:29:16 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin Buettner <kevinb@redhat.com>
X-Patchwork-Id: 32240
Received: (qmail 32036 invoked by alias); 10 Apr 2019 02:29:21 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 31715 invoked by uid 89); 10 Apr 2019 02:29:20 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-16.4 required=5.0 tests=AWL, BAYES_00,
	GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,
	SPF_HELO_PASS autolearn=ham version=3.3.1 spammy=placing,
	showing
X-HELO: mx1.redhat.com
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Wed, 10 Apr 2019 02:29:18 +0000
Received: from smtp.corp.redhat.com
	(int-mx08.intmail.prod.int.phx2.redhat.com
	[10.5.11.23])	(using TLSv1.2 with cipher AECDH-AES256-SHA
	(256/256 bits))	(No client certificate requested)	by
	mx1.redhat.com (Postfix) with ESMTPS id 8D71D30832CB	for
	<gdb-patches@sourceware.org>; Wed, 10 Apr 2019 02:29:17 +0000 (UTC)
Received: from f29-4.lan (ovpn-116-111.phx2.redhat.com [10.3.116.111])	by
	smtp.corp.redhat.com (Postfix) with ESMTPS id 6227619C4F;
	Wed, 10 Apr 2019 02:29:17 +0000 (UTC)
Date: Tue, 9 Apr 2019 19:29:16 -0700
From: Kevin Buettner <kevinb@redhat.com>
To: gdb-patches@sourceware.org
Cc: Pedro Alves <palves@redhat.com>
Subject: Re: [PATCH] Fix amd64->i386 linux syscall restart problem
Message-ID: <20190409192916.79fcb539@f29-4.lan>
In-Reply-To: <975e7120-1b61-2807-5aab-b961e07a80d0@redhat.com>
References: <20190316221341.021f7c62@f29-4.lan>
	<975e7120-1b61-2807-5aab-b961e07a80d0@redhat.com>
MIME-Version: 1.0
X-IsSubscribed: yes

On Tue, 9 Apr 2019 18:46:45 +0100
Pedro Alves <palves@redhat.com> wrote:

> > +	  void *ptr = (gdb_byte *) gregs 
> > +		    + amd64_linux_gregset32_reg_offset[I386_EAX_REGNUM];  
> 
> Need to wrap the expression that spawns two lines in ()s.

I've made this change.  And the other one too.

> > +
> > +  /* Sign extend EAX value to avoid potential syscall restart problems.  */  
> 
> I'd rather see both implementations use the same comment.  Could you
> merge them?

I started to merge them and then decided to write a more detailed
comment based on the text that I wrote for the commit comment.  I
have, for the moment anyway, copied the comment to both locations with
only slight changes which reflect where the comment is located.  The
problem with having copies of the same long comment in two or more
places is making sure that if one gets updated, then the rest do too. 
It might be better to have one refer to the other.

I'm thinking that it might be preferable to have something like this
in gdb/gdbserver/linux-x86-low.c:

  /* Sign extend EAX value to avoid potential syscall restart problems.
  
     See amd64_linux_collect_native_gregset() in gdb/amd64-linux-nat.c
     for a detailed explanation.  */

Below is a diff showing the new comments.  It also includes the
changes which wrap the multi-line expressions in parens.

diff --git a/gdb/amd64-linux-nat.c b/gdb/amd64-linux-nat.c
index 06018c8f1c..8d0e8eb35c 100644
--- a/gdb/amd64-linux-nat.c
+++ b/gdb/amd64-linux-nat.c
@@ -106,14 +106,51 @@ amd64_linux_collect_native_gregset (const struct regcache *regcache,
   struct gdbarch *gdbarch = regcache->arch ();
   if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
     {
-      /* Sign-extend %eax as during return from a syscall it is being checked
-	 for -ERESTART* values -512 being above 0xfffffffffffffe00; tested by
-	 interrupt.exp.	 */
+      /* Sign extend EAX value to avoid potential syscall restart
+	 problems.  
+
+	 On Linux, when a syscall is interrupted by a signal, the
+	 (kernel function implementing the) syscall may return
+	 -ERESTARTSYS when a signal occurs.  Doing so indicates that
+	 the syscall is restartable.  Then, depending on settings
+	 associated with the signal handler, and after the signal
+	 handler is called, the kernel can then either return -EINTR
+	 or it can cause the syscall to be restarted.  We are
+	 concerned with the latter case here.
+	 
+	 On (32-bit) i386, the status (-ERESTARTSYS) is placed in the
+	 EAX register.  When debugging a 32-bit process from a 64-bit
+	 (amd64) GDB, the debugger fetches 64-bit registers even
+	 though the process being debugged is only 32-bit.  The
+	 register cache is only 32 bits wide though; GDB discards the
+	 high 32 bits when placing 64-bit values in the 32-bit
+	 regcache.  Normally, this is not a problem since the 32-bit
+	 process should only care about the lower 32-bit portions of
+	 these registers.  That said, it can happen that the 64-bit
+	 value being restored will be different from the 64-bit value
+	 that was originally retrieved from the kernel.  The one place
+	 (that we know of) where it does matter is in the kernel's
+	 syscall restart code.  The kernel's code for restarting a
+	 syscall after a signal expects to see a negative value
+	 (specifically -ERESTARTSYS) in the 64-bit RAX register in
+	 order to correctly cause a syscall to be restarted.
+	 
+	 The call to amd64_collect_native_gregset, above, is setting
+	 the high 32 bits of RAX (and other registers too) to 0.  For
+	 syscall restart, we need to sign extend EAX so that RAX will
+	 appear as a negative value when EAX is set to -ERESTARTSYS. 
+	 This in turn will cause the signal handling code in the
+	 kernel to recognize -ERESTARTSYS which will in turn cause the
+	 syscall to be restarted.
+
+	 The test case gdb.base/interrupt.exp tests for this problem.
+	 Without this sign extension code in place, it'll show
+	 a number of failures when testing against unix/-m32.  */
 
       if (regnum == -1 || regnum == I386_EAX_REGNUM)
 	{
-	  void *ptr = (gdb_byte *) gregs 
-		    + amd64_linux_gregset32_reg_offset[I386_EAX_REGNUM];
+	  void *ptr = ((gdb_byte *) gregs 
+		       + amd64_linux_gregset32_reg_offset[I386_EAX_REGNUM]);
 
 	  *(int64_t *) ptr = *(int32_t *) ptr;
 	}
diff --git a/gdb/gdbserver/linux-x86-low.c b/gdb/gdbserver/linux-x86-low.c
index 20f369c496..9a97cdf5c0 100644
--- a/gdb/gdbserver/linux-x86-low.c
+++ b/gdb/gdbserver/linux-x86-low.c
@@ -339,11 +339,48 @@ x86_fill_gregset (struct regcache *regcache, void *buf)
   collect_register_by_name (regcache, "orig_eax",
 			    ((char *) buf) + ORIG_EAX * REGSIZE);
 
-  /* Sign extend EAX value to avoid potential syscall restart problems.  */
+  /* Sign extend EAX value to avoid potential syscall restart
+     problems. 
+
+     On Linux, when a syscall is interrupted by a signal, the (kernel
+     function implementing the) syscall may return -ERESTARTSYS when a
+     signal occurs.  Doing so indicates that the syscall is
+     restartable.  Then, depending on settings associated with the
+     signal handler, and after the signal handler is called, the
+     kernel can then either return -EINTR or it can cause the syscall
+     to be restarted.  We are concerned with the latter case here.
+     
+     On (32-bit) i386, the status (-ERESTARTSYS) is placed in the EAX
+     register.  When debugging a 32-bit process from a 64-bit (amd64)
+     GDB, the debugger fetches 64-bit registers even though the
+     process being debugged is only 32-bit.  The register cache is
+     only 32 bits wide though; GDB discards the high 32 bits when
+     placing 64-bit values in the 32-bit regcache.  Normally, this is
+     not a problem since the 32-bit process should only care about the
+     lower 32-bit portions of these registers.  That said, it can
+     happen that the 64-bit value being restored will be different
+     from the 64-bit value that was originally retrieved from the
+     kernel.  The one place (that we know of) where it does matter is
+     in the kernel's syscall restart code.  The kernel's code for
+     restarting a syscall after a signal expects to see a negative
+     value (specifically -ERESTARTSYS) in the 64-bit RAX register in
+     order to correctly cause a syscall to be restarted.
+     
+     The call to collect_register, above, is setting the high 32 bits
+     of RAX (and other registers too) to 0.  For syscall restart, we
+     need to sign extend EAX so that RAX will appear as a negative
+     value when EAX is set to -ERESTARTSYS.  This in turn will cause
+     the signal handling code in the kernel to recognize -ERESTARTSYS
+     which will in turn cause the syscall to be restarted.
+
+     The test case gdb.base/interrupt.exp tests for this problem. 
+     Without this sign extension code in place, it'll show a number of
+     failures when testing against native-gdbserver/-m32.  */
+
   if (register_size (regcache->tdesc, 0) == 4)
     {
-      void *ptr = (gdb_byte *) buf
-                + i386_regmap[find_regno (regcache->tdesc, "eax")];
+      void *ptr = ((gdb_byte *) buf
+                   + i386_regmap[find_regno (regcache->tdesc, "eax")]);
 
       *(int64_t *) ptr = *(int32_t *) ptr;
     }