Message ID | 5461DBC4.9090508@redhat.com |
---|---|
State | Superseded |
Headers |
Received: (qmail 2566 invoked by alias); 11 Nov 2014 09:50:03 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 2459 invoked by uid 89); 11 Nov 2014 09:50:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Message-ID: <5461DBC4.9090508@redhat.com> Date: Tue, 11 Nov 2014 04:49:56 -0500 From: "Carlos O'Donell" <carlos@redhat.com> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Richard Henderson <rth@twiddle.net>, Andreas Krebbel <krebbel@linux.vnet.ibm.com>, Siddhesh Poyarekar <siddhesh@redhat.com>, GNU C Library <libc-alpha@sourceware.org> Subject: Re: Misaligned stack on 32-bit s390? References: <54619F3E.8080306@redhat.com> <5461D6CA.9030902@twiddle.net> In-Reply-To: <5461D6CA.9030902@twiddle.net> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit |
Commit Message
Carlos O'Donell
Nov. 11, 2014, 9:49 a.m. UTC
On 11/11/2014 04:28 AM, Richard Henderson wrote: > On 11/11/2014 06:31 AM, Carlos O'Donell wrote: >> Any clever ideas on how to fix this without copying up a large >> portion of the stack? > > Nope, because other targets do in fact have to do just that. Right. > I'm actually surprised that almost all of them don't. I suppose > that just depends on how the ABI is set up to pass parameters to > the user _start... Yes, on hppa the ABI was such that I didn't depend on the layout, but s390 does, thus I think it is non-trivial. The argv might be trivial to copy, but the size of envp is unknown, and _start from the application side in start.S expects argv and envp to be right up against eachother AFAICT. Worse it wants to scan past them in order to find auxv. The saving grace is that the auxv scan parses an arbitrary number of zeros between envp and auxv. So I can move the gap to be between envp and auxv. > Fortunately, s390 has a block copy instruction, so the move > should be trivial to implement. For argv only. What instruction is the block copy? Are you talking about lm/stm? The most naive fix I have working is as follows (I've handed off to Siddhesh to have a look since I'm out of time this evening/morning). --- Cheers, Carlos.
Comments
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > For argv only. What instruction is the block copy? Are you > talking about lm/stm? I was thinking of EX combined with MVC. But of course you're right about envp and argp, where you'd have to count the number of entries before forming the count. > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ In the alpha port, I just copy auxv down as well. r~
On Tue, Nov 11, 2014 at 04:49:56AM -0500, Carlos O'Donell wrote: > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ I don't think this is needed now, since you're not shifting the stack pointer anymore. Siddhesh
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > On 11/11/2014 04:28 AM, Richard Henderson wrote: >> On 11/11/2014 06:31 AM, Carlos O'Donell wrote: >>> Any clever ideas on how to fix this without copying up a large >>> portion of the stack? >> >> Nope, because other targets do in fact have to do just that. > > Right. > >> I'm actually surprised that almost all of them don't. I suppose >> that just depends on how the ABI is set up to pass parameters to >> the user _start... > > Yes, on hppa the ABI was such that I didn't depend on the layout, > but s390 does, thus I think it is non-trivial. > > The argv might be trivial to copy, but the size of envp is unknown, > and _start from the application side in start.S expects argv and > envp to be right up against eachother AFAICT. Worse it wants to > scan past them in order to find auxv. The saving grace is that > the auxv scan parses an arbitrary number of zeros between envp > and auxv. So I can move the gap to be between envp and auxv. > >> Fortunately, s390 has a block copy instruction, so the move >> should be trivial to implement. > > For argv only. What instruction is the block copy? Are you > talking about lm/stm? > > The most naive fix I have working is as follows (I've handed > off to Siddhesh to have a look since I'm out of time this > evening/morning). Thanks for working on this! Really surprising that this stayed unnoticed for that long. I thought the testsuite invokes the testcases as ld.so parameter as well? As I understand it an alignment fix is only required if _dl_skip_args is odd. So what about checking the least significant bit (tmll) and copying argv/envv down by 4 bytes? That way we would not need to copy anything in 50% of the cases? Would you like to continue working on it or should we try to take over? (Stefan or myself) Bye, -Andreas- > > diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h > index c56185c..b189552 100644 > --- a/sysdeps/s390/s390-32/dl-machine.h > +++ b/sysdeps/s390/s390-32/dl-machine.h > @@ -166,18 +166,47 @@ _dl_start_user:\n\ > # See if we were run as a command with the executable file\n\ > # name as an extra leading argument.\n\ > l %r1,_dl_skip_args@GOT12(0,%r12)\n\ > - l %r1,0(%r1) # load _dl_skip_args\n\ > + l %r1,0(%r1) # load _dl_skip_args\n\ > + ltr %r1,%r1\n\ > + je .L4 # Skip the arg adjustment if there were none.\n\ > # Get the original argument count.\n\ > l %r0,96(%r15)\n\ > # Subtract _dl_skip_args from it.\n\ > sr %r0,%r1\n\ > - # Adjust the stack pointer to skip _dl_skip_args words.\n\ > - sll %r1,2\n\ > - ar %r15,%r1\n\ > - # Set the back chain to zero again\n\ > - xc 0(4,%r15),0(%r15)\n\ > # Store back the modified argument count.\n\ > st %r0,96(%r15)\n\ > + # Copy argv and envp forward to account for skipped argv entries.\n\ > + # We skipped at least one argument or we would not get here.\n\ > + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ > + lr %r5,%r6\n\ > + lr %r0,%r1\n\ > + sll %r0,2\n\ # Number of skipped bytes.\n\ > + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ > + # argv copy loop:\n\ > +.L1: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L1 # Stop after copying the NULL.\n\ > + # envp copy loop:\n\ > +.L2: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L2 # Stop after copying the NULL.\n\ > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ > # The special initializer gets called with the stack just\n\ > # as the application's entry point will see it; it can\n\ > # switch stacks if it moves these contents over.\n\ > --- > > Cheers, > Carlos. >
Andreas Krebbel <krebbel@linux.vnet.ibm.com> writes: > As I understand it an alignment fix is only required if _dl_skip_args is > odd. So what about > checking the least significant bit (tmll) and copying argv/envv down by 4 > bytes? That way we would > not need to copy anything in 50% of the cases? I don't think it makes sense optimizing for this case, which is rare outside of runing the glibc testsuite. Andreas.
On 11/11/2014 10:49 AM, Carlos O'Donell wrote: > On 11/11/2014 04:28 AM, Richard Henderson wrote: >> On 11/11/2014 06:31 AM, Carlos O'Donell wrote: >>> Any clever ideas on how to fix this without copying up a large >>> portion of the stack? >> >> Nope, because other targets do in fact have to do just that. > > Right. > >> I'm actually surprised that almost all of them don't. I suppose >> that just depends on how the ABI is set up to pass parameters to >> the user _start... > > Yes, on hppa the ABI was such that I didn't depend on the layout, > but s390 does, thus I think it is non-trivial. > > The argv might be trivial to copy, but the size of envp is unknown, > and _start from the application side in start.S expects argv and > envp to be right up against eachother AFAICT. Worse it wants to > scan past them in order to find auxv. The saving grace is that > the auxv scan parses an arbitrary number of zeros between envp > and auxv. So I can move the gap to be between envp and auxv. > >> Fortunately, s390 has a block copy instruction, so the move >> should be trivial to implement. > > For argv only. What instruction is the block copy? Are you > talking about lm/stm? > > The most naive fix I have working is as follows (I've handed > off to Siddhesh to have a look since I'm out of time this > evening/morning). > > diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h > index c56185c..b189552 100644 > --- a/sysdeps/s390/s390-32/dl-machine.h > +++ b/sysdeps/s390/s390-32/dl-machine.h > @@ -166,18 +166,47 @@ _dl_start_user:\n\ > # See if we were run as a command with the executable file\n\ > # name as an extra leading argument.\n\ > l %r1,_dl_skip_args@GOT12(0,%r12)\n\ > - l %r1,0(%r1) # load _dl_skip_args\n\ > + l %r1,0(%r1) # load _dl_skip_args\n\ > + ltr %r1,%r1\n\ > + je .L4 # Skip the arg adjustment if there were none.\n\ > # Get the original argument count.\n\ > l %r0,96(%r15)\n\ > # Subtract _dl_skip_args from it.\n\ > sr %r0,%r1\n\ > - # Adjust the stack pointer to skip _dl_skip_args words.\n\ > - sll %r1,2\n\ > - ar %r15,%r1\n\ > - # Set the back chain to zero again\n\ > - xc 0(4,%r15),0(%r15)\n\ > # Store back the modified argument count.\n\ > st %r0,96(%r15)\n\ > + # Copy argv and envp forward to account for skipped argv entries.\n\ > + # We skipped at least one argument or we would not get here.\n\ > + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ > + lr %r5,%r6\n\ > + lr %r0,%r1\n\ > + sll %r0,2\n\ # Number of skipped bytes.\n\ > + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ > + # argv copy loop:\n\ > +.L1: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L1 # Stop after copying the NULL.\n\ > + # envp copy loop:\n\ > +.L2: l %r7,0(%r5) # Load a word from the source.\n\ > + st %r7,0(%r6) # Store the word in the destination.\n\ > + ahi %r5,4\n\ > + ahi %r6,4\n\ > + ltr %r7,%r7\n\ > + jne .L2 # Stop after copying the NULL.\n\ > + # Now we have to zero out the envp entries after NULL to allow\n\ > + # start.S to properly find auxv by skipping zeroes.\n\ > + # zero out loop:\n\ > + lhi %r7,0\n\ > +.L3: st %r7,0(%r6) # Store zero.\n\ > + ahi %r6,4 # Advance dest pointer.\n\ > + ahi %r1,-1 # Subtract one from the word count.\n\ > + ltr %r1,%r1\n\ > + jne .L3 # Keep copying if the word count is non-zero.\n\ > + # Set the back chain to zero again\n\ > +.L4: xc 0(4,%r15),0(%r15)\n\ > # The special initializer gets called with the stack just\n\ > # as the application's entry point will see it; it can\n\ > # switch stacks if it moves these contents over.\n\ The patch is ok to apply with the xc removed. Thanks! A minor improvement might be to make use of an index reg in the loops. Perhaps something like this (untested): + # argv copy loop:\n\ + lhi %r8,0 +.L1: l %r7,0(%r8,%r5) # Load a word from the source.\n\ + st %r7,0(%r8,%r6) # Store the word in the destination.\n\ + ahi %r8,4\n\ + ltr %r7,%r7\n\ + jne .L1 # Stop after copying the NULL.\n\ + # envp copy loop:\n\ +.L2: l %r7,0(%r8,%r5) # Load a word from the source.\n\ + st %r7,0(%r8,%r6) # Store the word in the destination.\n\ + ahi %r8,4\n\ + ltr %r7,%r7\n\ + jne .L2 # Stop after copying the NULL.\n\ + # Now we have to zero out the envp entries after NULL to allow\n\ + # start.S to properly find auxv by skipping zeroes.\n\ + # zero out loop:\n\ + lhi %r7,0\n\ +.L3: st %r7,0(%r8,%r6) # Store zero.\n\ + ahi %r8,4 # Advance dest pointer.\n\ + ahi %r1,-1 # Subtract one from the word count.\n\ + ltr %r1,%r1\n\ + jne .L3 # Keep copying if the word count is non-zero.\n\ + # Set the back chain to zero again\n\ +.L4: Bye, -Andreas-
On 11/11/2014 09:54 AM, Andreas Krebbel wrote: > Would you like to continue working on it or should we try to take > over? (Stefan or myself) We can get it done. It would be great to have you review the final patch. Please note that the patch I posted is incomplete, it fails to readjust _dl_argv which is cached by the loader and needs to be changed if argv is moved. Simple fix though. I also don't know how long s390 lasted without this breaking something. I guess aligned stacks don't really matter all that much ;-) Cheers, Carlos.
On 11/12/2014 05:00 AM, Carlos O'Donell wrote: > On 11/11/2014 09:54 AM, Andreas Krebbel wrote: >> Would you like to continue working on it or should we try to take >> over? (Stefan or myself) > > We can get it done. It would be great to have you review the final > patch. > > Please note that the patch I posted is incomplete, it fails to > readjust _dl_argv which is cached by the loader and needs to be > changed if argv is moved. Simple fix though. > > I also don't know how long s390 lasted without this breaking > something. I guess aligned stacks don't really matter all that > much ;-) The reason probably is that the broken alignment never reaches the executable. In _start.S we correct the alignment again. So it really is only a problem for code executed between the argv adjustments done by ld.so and the entry point of the executable. -Andreas- > > Cheers, > Carlos. >
On Wed, Nov 12, 2014 at 10:05:45AM +0100, Andreas Krebbel wrote: > On 11/12/2014 05:00 AM, Carlos O'Donell wrote: > > On 11/11/2014 09:54 AM, Andreas Krebbel wrote: > >> Would you like to continue working on it or should we try to take > >> over? (Stefan or myself) > > > > We can get it done. It would be great to have you review the final > > patch. > > > > Please note that the patch I posted is incomplete, it fails to > > readjust _dl_argv which is cached by the loader and needs to be > > changed if argv is moved. Simple fix though. > > > > I also don't know how long s390 lasted without this breaking > > something. I guess aligned stacks don't really matter all that > > much ;-) > > The reason probably is that the broken alignment never reaches the executable. In _start.S we > correct the alignment again. So it really is only a problem for code executed between the argv > adjustments done by ld.so and the entry point of the executable. How/why does any code get executed in that interval anyway? Rich
Rich Felker <dalias@libc.org> writes: > On Wed, Nov 12, 2014 at 10:05:45AM +0100, Andreas Krebbel wrote: >> The reason probably is that the broken alignment never reaches the executable. In _start.S we >> correct the alignment again. So it really is only a problem for code executed between the argv >> adjustments done by ld.so and the entry point of the executable. > > How/why does any code get executed in that interval anyway? It's the dynamic linker itself and the init code. Andreas.
diff --git a/sysdeps/s390/s390-32/dl-machine.h b/sysdeps/s390/s390-32/dl-machine.h index c56185c..b189552 100644 --- a/sysdeps/s390/s390-32/dl-machine.h +++ b/sysdeps/s390/s390-32/dl-machine.h @@ -166,18 +166,47 @@ _dl_start_user:\n\ # See if we were run as a command with the executable file\n\ # name as an extra leading argument.\n\ l %r1,_dl_skip_args@GOT12(0,%r12)\n\ - l %r1,0(%r1) # load _dl_skip_args\n\ + l %r1,0(%r1) # load _dl_skip_args\n\ + ltr %r1,%r1\n\ + je .L4 # Skip the arg adjustment if there were none.\n\ # Get the original argument count.\n\ l %r0,96(%r15)\n\ # Subtract _dl_skip_args from it.\n\ sr %r0,%r1\n\ - # Adjust the stack pointer to skip _dl_skip_args words.\n\ - sll %r1,2\n\ - ar %r15,%r1\n\ - # Set the back chain to zero again\n\ - xc 0(4,%r15),0(%r15)\n\ # Store back the modified argument count.\n\ st %r0,96(%r15)\n\ + # Copy argv and envp forward to account for skipped argv entries.\n\ + # We skipped at least one argument or we would not get here.\n\ + la %r6,100(%r15) # Destination pointer i.e. &argv[0]\n\ + lr %r5,%r6\n\ + lr %r0,%r1\n\ + sll %r0,2\n\ # Number of skipped bytes.\n\ + ar %r5,%r0 # Source pointer = Dest + Skipped args.\n\ + # argv copy loop:\n\ +.L1: l %r7,0(%r5) # Load a word from the source.\n\ + st %r7,0(%r6) # Store the word in the destination.\n\ + ahi %r5,4\n\ + ahi %r6,4\n\ + ltr %r7,%r7\n\ + jne .L1 # Stop after copying the NULL.\n\ + # envp copy loop:\n\ +.L2: l %r7,0(%r5) # Load a word from the source.\n\ + st %r7,0(%r6) # Store the word in the destination.\n\ + ahi %r5,4\n\ + ahi %r6,4\n\ + ltr %r7,%r7\n\ + jne .L2 # Stop after copying the NULL.\n\ + # Now we have to zero out the envp entries after NULL to allow\n\ + # start.S to properly find auxv by skipping zeroes.\n\ + # zero out loop:\n\ + lhi %r7,0\n\ +.L3: st %r7,0(%r6) # Store zero.\n\ + ahi %r6,4 # Advance dest pointer.\n\ + ahi %r1,-1 # Subtract one from the word count.\n\ + ltr %r1,%r1\n\ + jne .L3 # Keep copying if the word count is non-zero.\n\ + # Set the back chain to zero again\n\ +.L4: xc 0(4,%r15),0(%r15)\n\ # The special initializer gets called with the stack just\n\ # as the application's entry point will see it; it can\n\ # switch stacks if it moves these contents over.\n\