From patchwork Wed Aug 19 01:27:31 2015
X-Patchwork-Submitter: Pedro Alves
X-Patchwork-Id: 8282
Message-ID: <55D3DB83.4050204@redhat.com>
Date: Wed, 19 Aug 2015 02:27:31 +0100
From: Pedro Alves
To: Sandra Loosemore, Gary Benson
CC: Joel Brobecker, Doug Evans, Jan Kratochvil, gdb-patches,
 André Pönitz, Paul Koning
Subject: Re: [PATCH 0/2] Better handling of slow remote transfers
References: <001a11c301b0388ac5051d0c5ab8@google.com>
 <20150811185519.GA28644@host1.jankratochvil.net>
 <20150811195943.GC22245@adacore.com>
 <20150812094831.GD11096@blade.nx>
 <20150814182648.GO22245@adacore.com>
 <55CE6AA3.8000300@codesourcery.com>
 <20150816184913.GA2998@adacore.com>
 <20150817085310.GC25320@blade.nx>
 <55D1EE96.9060202@codesourcery.com>
 <20150818095858.GB9815@blade.nx>
 <55D3625B.40101@codesourcery.com>
In-Reply-To: <55D3625B.40101@codesourcery.com>

On 08/18/2015 05:50 PM, Sandra Loosemore wrote:

> I would be happy to try to help debug what is going wrong here, but I'm
> not a GDB expert by any means and I'm totally unfamiliar with this part
> of the code and don't know where to start.  If somebody has a theory on
> this, is there someplace I could put in printfs to trace what is
> happening?  From the previous discussion, it seemed like the flow
> through the different layers of abstraction to what the RSP was
> actually doing was very obscure, in terms of where the chunking is
> happening, etc.

That was just a misunderstanding; it's not really obscure.  GDB has a
file I/O abstraction that wires the open/close/pread/pwrite syscalls to
RSP packets.  Callers just need to use the target_fileio_open /
target_fileio_pread etc. functions instead of the usual syscalls.
Whenever BFD or GDB accesses a file whose name is prefixed with
"target:" or "remote:", the open/read/write requests go through those
functions.  For remote targets, they end up in
remote.c:remote_hostio_pread, etc.  The only chunking going on is the
server trimming each pread result to the maximum packet size it can
return to gdb.
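To make that concrete, fetching a file over vFile boils down to an
exchange along these lines.  This is a hand-written sketch based on the
documented Host I/O packets, not captured "set debug remote 1" output;
the file name, descriptor and sizes are made up:

  -> vFile:open:2f62696e2f6c73,0,0    ("/bin/ls" hex-encoded, O_RDONLY)
  <- F5                               (remote file descriptor 5)
  -> vFile:pread:5,3fff,0             (read up to 0x3fff bytes at offset 0)
  <- F3fff;<escaped binary data>
  -> vFile:pread:5,3fff,3fff          (next chunk)
  <- F3fff;<escaped binary data>
     ... one round trip per chunk, until a short read ...
  -> vFile:close:5
  <- F0

Each pread is a separate request/response round trip, and a reply can
never be larger than the packet buffer.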
Both gdb and gdbserver have that maximum buffer size hardcoded at 16384
for historical reasons.

>> Your PandaBoard takes 8 seconds.  That doesn't seem so bad to me.
>> If this Altera board is the only one with the massive slowdown then
>> I don't think we should delay 7.10 any further on this issue--and I
>> certainly don't think we should lose the functionality that the
>> default sysroot of "target:" brings.
>
> FWIW, I have just tried on a random sampling of our MIPS target boards
> as well, with the same test program and same scenario of continuing to
> a breakpoint on main.  The timings were 1:40 (SEAD-3 LX110 board,
> M14Kc microMips), 1:38 (MALTA 74Kc MIPS16), and 1:20 (MALTA 5Kef
> MIPS64).  I could try some PowerPC boards as well if we need more
> datapoints.

Thanks for doing this.  As yet another data point, it'd be interesting
to know whether Gary's rate-limit patch speeds up any of these other
boards as well.

Did you try taking the NFS mount out of the picture?  That is, copy the
debug-build files to the target, and have gdb pull them automatically.

Reading a file out of the remote filesystem is obviously going to be
slower than not reading it at all; some slowdown is expected, there's
no free lunch.  But the fact that Gary's chunk-size-limiting patch made
things much faster on the nios2 board is still mysterious to me.  I'd
expect the slowness to be latency bound, given the request/response
nature of the RSP, but then I'd expect more chunking to slow things
down, not speed them up.

I'm trying to build a more complete picture of the potential for
trouble.  The more data, the better...

So tonight I played with this.  To amplify latency issues against a
remote target, I built gdbserver on gcc76 on the GCC compile farm
(anyone can request access to the compile farm machines to reproduce
this).  That's an x86-64 machine in France, almost 2000 km away from
me.  I've got a ping time of ~85ms to that machine, and I'm behind a
~16Mbit ADSL line.

As an illustration, here's what it takes to download the gdbserver
binary using ssh/scp.  Not the fastest file transfer method, but the
most convenient:

 $ time scp gcc76.fsffrance.org:/home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver
 gdbserver                                    100% 1451KB   1.4MB/s   00:01

 real   0m1.296s
 user   0m0.015s
 sys    0m0.013s

Now let's compare with GDB's file retrieval:

 [palves@gcc76] $ ./gdbserver --multi :9999

 [palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org

 [palves@home] $ cat /tmp/remote-get
 define getfile
   shell date
   remote get /home/palves/gdb/build/gdb/gdbserver/gdbserver /tmp/gdbserver
   shell date
 end

 [palves@home] $ ./gdb -data-directory=data-directory -ex "tar extended-rem :9999" -x /tmp/remote-get
 (gdb) getfile
 Detaching after fork from child process 31567.
 Wed Aug 19 01:04:13 WEST 2015
 Detaching after fork from child process 31568.
 Wed Aug 19 01:04:23 WEST 2015
 (gdb)

~10s GDB/vFile vs ~1.3s scp/ssh.  Not that impressive.

So then I tried bumping the RSP's maximum packet buffer size.  That
required a patch to gdb and another to gdbserver.  Bumping the buffer
size by a factor of 16, the result was:

 (gdb) getfile
 Wed Aug 19 01:13:01 WEST 2015
 Wed Aug 19 01:13:04 WEST 2015
 (gdb) getfile
 Wed Aug 19 01:13:12 WEST 2015
 Wed Aug 19 01:13:14 WEST 2015
 (gdb) getfile
 Wed Aug 19 01:13:19 WEST 2015
 Wed Aug 19 01:13:21 WEST 2015

So 2s-3s.  Not bad.
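As a back-of-envelope check (using the numbers above, and ignoring the
open/close round trips and per-packet overhead): the gdbserver binary
is ~1451KB, so at roughly 16KB per vFile:pread that's on the order of
90 preads, and 90 round trips at ~85ms is ~7.6s of pure waiting, which
accounts for most of the ~10s above.  Make the chunks 16 times bigger
and you're down to about 6 round trips, i.e. ~0.5s of latency, at which
point bandwidth and per-byte processing dominate.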
Bumping the packet buffer size to 4MB, to make sure a single
vFile:pread covers the whole ~1.5MB file, I consistently get ~1s:

 (gdb) getfile
 Wed Aug 19 01:17:52 WEST 2015
 Wed Aug 19 01:17:53 WEST 2015
 (gdb) getfile
 Wed Aug 19 01:17:55 WEST 2015
 Wed Aug 19 01:17:56 WEST 2015
 (gdb) getfile
 Wed Aug 19 01:17:57 WEST 2015
 Wed Aug 19 01:17:58 WEST 2015
 (gdb)

which is in the ballpark of scp.  It seems clear to me that for
whole-file transfers, the bottleneck (in this use case) is the number
of chunks / round trips.  For file transfer, a stream-based protocol
like scp's, which avoids round-trip hiccups, obviously wins.

For the use case of connecting to gdbserver with "target remote" and
having gdb fetch the binary out of gdbserver with vFile, I tried
something else.  This time, I simply debug gdbserver itself remotely,
letting gdb fetch the gdbserver binary, run to main, and quit once main
is reached:

 [palves@gcc76] $ ./gdbserver :9999 ./gdbserver

 [palves@home] $ ssh -L 9999:localhost:9999 gcc76.fsffrance.org

 [palves@home] $ time ./gdb -data-directory=data-directory -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit"

Pristine gdb 7.10.50.20150816-cvs gets me:

 ...
 Remote debugging using :9999
 Reading symbols from target:/home/palves/gdb/build/gdb/gdbserver/gdbserver...done.
 Reading symbols from target:/lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
 0x00007ffff7ddd190 in ?? () from target:/lib64/ld-linux-x86-64.so.2
 Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635.
 Continuing.

 Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635
 3635   ../../../src/gdb/gdbserver/server.c: No such file or directory.
 /home/palves/gdb/build/gdb/gdbserver/gdbserver: No such file or directory.

 real   0m55.315s
 user   0m0.400s
 sys    0m0.087s

Bumping the remote maximum packet buffer size to (16384*16) helps a
bit, though not impressively:

 real   0m51.600s
 user   0m0.404s
 sys    0m0.085s

Bumping it to 0x100000 helps a bit further:

 real   0m48.335s
 user   0m0.421s
 sys    0m0.091s

Looking at "set debug remote 1" output, I noticed that gdb often issues
a sequence of small vFile:pread requests for the same file.  Ctrl-C'ing
gdb while it was doing that and getting a backtrace shows that those
requests often come from BFD.  So I thought that maybe adding a
readahead buffer/cache to gdb's vFile:pread requests could help.  And
indeed, that dropped the time for the same use case further down to:

 real   0m29.055s
 user   0m2.625s
 sys    0m0.090s

I added a few counters to show cache hits/misses, and got:

 readahead cache miss 44
 readahead cache hit 377

I'm not sure exactly what makes the "user" time go up that much (maybe
reading too much ahead causes too much RSP escaping work?  See the
sketch below.)  If I repeat the same run leaving gdbserver's maximum
buffer size at the original 16384, I still get under 30s:

 real   0m27.169s
 user   0m0.464s
 sys    0m0.061s

though notice the user time goes down again.  Maybe this indicates that
bigger chunk sizes do increase CPU usage, which would have a more
pronounced effect on slower boards than on the x86-64 machines I'm
using.  That would be in line with Sandra's observation of Gary's patch
speeding up the nios2 board.  Or maybe that's just a coincidence.

BTW, the transfers always seem to be interruptible for me, even without
Gary's patch, and even the slow ones.
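On the escaping conjecture above: every byte of a vFile:pread reply is
binary data that gdbserver has to escape and gdb has to unescape again,
so reading ahead more data than callers actually need also multiplies
that per-byte work.  Here's a rough sketch of what the documented RSP
binary escaping amounts to.  This is illustrative only, not gdbserver's
actual code:

 #include <stddef.h>

 /* Escape LEN bytes of binary reply data from IN into OUT (of size
    OUT_SIZE), following the RSP rule: '#', '$', '}' and '*' are sent
    as '}' followed by the original byte XORed with 0x20 ('*' because
    replies may be run-length encoded).  Returns the number of bytes
    written.  A real stub would instead trim the read so the escaped
    result still fits its packet buffer.  */

 static size_t
 escape_binary (const unsigned char *in, size_t len,
                unsigned char *out, size_t out_size)
 {
   size_t out_len = 0;
   size_t i;

   for (i = 0; i < len; i++)
     {
       unsigned char b = in[i];
       int needs_escape = (b == '#' || b == '$' || b == '}' || b == '*');

       if (out_len + (needs_escape ? 2 : 1) > out_size)
         break;
       if (needs_escape)
         {
           out[out_len++] = '}';
           out[out_len++] = b ^ 0x20;
         }
       else
         out[out_len++] = b;
     }

   return out_len;
 }

The receiving side does the mirror-image unescaping, so the CPU cost
scales with the number of bytes actually transferred, not with the
number of packets.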
And finally, here's the time using a local sysroot and specifying a
local program:

 $ time $g -ex "set sysroot /" -ex "tar rem :9999" -ex "b main" -ex "c" -ex "set confirm off" -ex "quit" /tmp/gdbserver
 Reading symbols from /tmp/gdbserver...done.
 Remote debugging using :9999
 Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.18.so.debug...done.
 done.
 0x00007ffff7ddd190 in ?? ()
 Breakpoint 1 at 0x41200c: file ../../../src/gdb/gdbserver/server.c, line 3635.
 Continuing.

 Breakpoint 1, main (argc=1, argv=0x7fffffffe3d8) at ../../../src/gdb/gdbserver/server.c:3635
 3635   ../../../src/gdb/gdbserver/server.c: No such file or directory.

 real   0m7.385s
 user   0m0.132s
 sys    0m0.028s

I'm attaching the 3 patches that I used.  I wouldn't be surprised if
they contain obvious blunders; it's getting quite late here...

Thanks,
Pedro Alves

From d426ce0a36830378a8ec2e2cbdd94d9fcb47b891 Mon Sep 17 00:00:00 2001
From: Pedro Alves
Date: Tue, 18 Aug 2015 23:27:20 +0100
Subject: [PATCH 3/3] add readahead cache to gdb's vFile:pread

---
 gdb/remote.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 99 insertions(+), 3 deletions(-)

diff --git a/gdb/remote.c b/gdb/remote.c
index a839adf..8a739c8 100644
--- a/gdb/remote.c
+++ b/gdb/remote.c
@@ -10311,6 +10311,26 @@ remote_hostio_send_command (int command_bytes, int which_packet,
   return ret;
 }
 
+struct readahead_cache
+{
+  int fd;
+  gdb_byte *buf;
+  ULONGEST offset;
+  size_t bufsize;
+};
+
+static struct readahead_cache *readahead_cache;
+
+static void
+readahead_cache_invalidate (void)
+{
+  if (readahead_cache != NULL)
+    readahead_cache->fd = -1;
+}
+
+static unsigned int readahead_cache_hit_count;
+static unsigned int readahead_cache_miss_count;
+
 /* Set the filesystem remote_hostio functions that take FILENAME
    arguments will use.  Return 0 on success, or -1 if an error
    occurs (and set *REMOTE_ERRNO).  */
@@ -10325,6 +10345,8 @@ remote_hostio_set_filesystem (struct inferior *inf, int *remote_errno)
   char arg[9];
   int ret;
 
+  readahead_cache_invalidate ();
+
   if (packet_support (PACKET_vFile_setfs) == PACKET_DISABLE)
     return 0;
 
@@ -10389,6 +10411,9 @@ remote_hostio_pwrite (struct target_ops *self,
   int left = get_remote_packet_size ();
   int out_len;
 
+  if (readahead_cache != NULL && readahead_cache->fd == fd)
+    readahead_cache_invalidate ();
+
   remote_buffer_add_string (&p, &left, "vFile:pwrite:");
 
   remote_buffer_add_int (&p, &left, fd);
@@ -10407,9 +10432,9 @@ remote_hostio_pwrite (struct target_ops *self,
 /* Implementation of to_fileio_pread.  */
 
 static int
-remote_hostio_pread (struct target_ops *self,
-                     int fd, gdb_byte *read_buf, int len,
-                     ULONGEST offset, int *remote_errno)
+remote_hostio_pread_1 (struct target_ops *self,
+                       int fd, gdb_byte *read_buf, int len,
+                       ULONGEST offset, int *remote_errno)
 {
   struct remote_state *rs = get_remote_state ();
   char *p = rs->buf;
@@ -10443,6 +10468,74 @@ remote_hostio_pread (struct target_ops *self,
   return ret;
 }
 
+static int
+remote_hostio_pread_from_cache (struct readahead_cache *cache,
+                                int fd, gdb_byte *read_buf, int len,
+                                ULONGEST offset)
+{
+  if (cache->fd == fd
+      && cache->offset <= offset
+      && offset < cache->offset + cache->bufsize)
+    {
+      ULONGEST max = cache->offset + cache->bufsize;
+
+      if (offset + len > max)
+        len = max - offset;
+
+      if (remote_debug)
+        fprintf_unfiltered (gdb_stdlog,
+                            "readahead cache hit %d\n",
+                            ++readahead_cache_hit_count);
+      memcpy (read_buf, cache->buf + offset - cache->offset, len);
+      return len;
+    }
+
+  return -1;
+}
+
+static int
+remote_hostio_pread (struct target_ops *self,
+                     int fd, gdb_byte *read_buf, int len,
+                     ULONGEST offset, int *remote_errno)
+{
+  int ret;
+
+  if (readahead_cache != NULL)
+    {
+      ret = remote_hostio_pread_from_cache (readahead_cache, fd,
+                                            read_buf, len, offset);
+      if (ret >= 0)
+        return ret;
+
+      readahead_cache_invalidate ();
+    }
+
+  if (remote_debug)
+    fprintf_unfiltered (gdb_stdlog, "readahead cache miss %d\n",
+                        ++readahead_cache_miss_count);
+
+  if (readahead_cache == NULL)
+    readahead_cache = XCNEW (struct readahead_cache);
+  readahead_cache->fd = fd;
+  readahead_cache->offset = offset;
+  readahead_cache->bufsize = get_remote_packet_size ();
+  readahead_cache->buf = xrealloc (readahead_cache->buf, readahead_cache->bufsize);
+
+  ret = remote_hostio_pread_1 (self,
+                               readahead_cache->fd,
+                               readahead_cache->buf, readahead_cache->bufsize,
+                               readahead_cache->offset, remote_errno);
+  if (ret <= 0)
+    {
+      readahead_cache_invalidate ();
+      return ret;
+    }
+
+  readahead_cache->bufsize = ret;
+  return remote_hostio_pread_from_cache (readahead_cache, fd,
+                                         read_buf, len, offset);
+}
+
 /* Implementation of to_fileio_close.  */
 
 static int
@@ -10452,6 +10545,9 @@ remote_hostio_close (struct target_ops *self, int fd, int *remote_errno)
 {
   struct remote_state *rs = get_remote_state ();
   char *p = rs->buf;
   int left = get_remote_packet_size () - 1;
 
+  if (readahead_cache != NULL && readahead_cache->fd == fd)
+    readahead_cache_invalidate ();
+
   remote_buffer_add_string (&p, &left, "vFile:close:");
 
   remote_buffer_add_int (&p, &left, fd);
-- 
1.9.3