From patchwork Wed May 6 17:46:29 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 6598 Received: (qmail 97158 invoked by alias); 6 May 2015 17:46:38 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 97147 invoked by uid 89); 6 May 2015 17:46:38 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL, BAYES_00, SPF_HELO_PASS, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Wed, 06 May 2015 17:46:36 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t46HkVi4023195 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 6 May 2015 13:46:31 -0400 Received: from [127.0.0.1] (ovpn01.gateway.prod.ext.ams2.redhat.com [10.39.146.11]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t46HkTdd007228; Wed, 6 May 2015 13:46:30 -0400 Message-ID: <554A5375.4040405@redhat.com> Date: Wed, 06 May 2015 18:46:29 +0100 From: Pedro Alves User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Simon Marchi , gdb-patches@sourceware.org Subject: [PATCH] Fix gdb.mi/mi-nsmoribund.exp timeouts (Re: [PATCH] Cleanup num_found usage in gdb_wait_for_event) References: <1430331569-17144-1-git-send-email-simon.marchi@ericsson.com> <55412EB9.7020206@redhat.com> <554133B8.5020004@ericsson.com> In-Reply-To: <554133B8.5020004@ericsson.com> On 04/29/2015 08:40 PM, Simon Marchi wrote: > Oh well, I was investigating the exact same test failure (mi-nsmoribund.exp). It's > failing maybe 90% of the time on our x86 Jenkins boxes. I was actually writing a > patch for that at the moment. > > For poll, I keep a static variable around that tells which pollfd structure to start > iterating from. So the first time we check 0, 1, 2, next time we'll check 1, 2, 0, then > 2, 0, 1, etc. It's really simple. The static variable should be a field in gdb_notifier > though. That's _exactly_ what my patch does too. :-) That was this bit: @@ -751,8 +794,20 @@ gdb_wait_for_event (int block) if (use_poll) { #ifdef HAVE_POLL - for (i = 0; (i < gdb_notifier.num_fds) && (num_found > 0); i++) + /* To level the fairness across event sources, we poll them in a + round-robin-like fashion. The number and order of the polled + file descriptors may change between invocations, but this is + good enough. */ + static int last_notifier = -1; + + while (num_found > 0) { + last_notifier++; + if (last_notifier >= gdb_notifier.num_fds) + last_notifier = 0; + + i = last_notifier; + if ((gdb_notifier.poll_fds + i)->revents) num_found--; else > > Here is my patch in raw form: > https://github.com/simark/binutils-gdb/commit/ec49aaec6963f0aa974d183b8fca669f6261274d > > In any case, I don't see why interacting with num_found here is useful, given that we only > handle one event. If we did handle all the found events, then I would understand, since you > want to know when you are done. The only reason I think is to avoid looking at the pollfds if poll returned no events. So, how about we combine and simplify things, like this? Thanks, Pedro Alves From 1dfb4a7a57179f633b18feb054211257050a22c2 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Wed, 6 May 2015 18:34:32 +0100 Subject: [PATCH] Fix gdb.mi/mi-nsmoribund.exp timeouts The PPC64 buildbot has been showing timeouts in mi-nsmoribund.exp, like this: (...) -thread-info FAIL: gdb.mi/mi-nsmoribund.exp: thread state: all running except the breakpoint thread (timeout) ... and I can reproduce this on gcc110 (PPC64) on the gcc compile farm. That is, the test sends "-thread-info" to GDB, but GDB never replies back. The problem is that these machines are too fast for gdb. :-) That test has a few threads running the same tight loop, and constantly hitting a thread-specific breakpoint that needs to be stepped over. If threads trip on breakpoints fast enough that linux-nat.c's event pipe associated with SIGCHLD is constantly being written to, even if the stdin file descriptor also has an event to handle, gdb never gets to it. because linux-nat.c's pipe comes first in the set of descriptors served by the poll/select code in the event loop. Fix this by having the event loop serve file event sources in round-robin-like fashion, similarly to how its done in gdb_do_one_event. Unfortunately, the poll and the select variants each need their own fixing. Tested on x86_64 Fedora 20 (poll and select variants), and PPC64 Fedora 18. Fixes the timeout in the PPC64 machine in the compile farm that times out without this, and I won't be surprised if it fixes other random timeouts in other tests. (gdbserver's copy of the event-loop doesn't need this (yet), as it still pushes all ready events to an event queue. That is, it hasn't had 70b66289 merged yet. We should really merge both event-loop.c copies into a single shared file, but that's for another day.) gdb/ChangeLog: 2015-05-06 Pedro Alves Simon Marchi * event-loop.c (gdb_notifier) : New fields. (get_next_file_handler_to_handle_and_advance): New function. (delete_file_handler): If deleting the next file handler to handle, advance to the next file handler. (gdb_wait_for_event): Bail early if no event fired. Poll file handlers in round-robin fashion. --- gdb/event-loop.c | 120 ++++++++++++++++++++++++++++++++++++++++--------------- 1 file changed, 88 insertions(+), 32 deletions(-) diff --git a/gdb/event-loop.c b/gdb/event-loop.c index 79e41fd..9ac4908 100644 --- a/gdb/event-loop.c +++ b/gdb/event-loop.c @@ -165,10 +165,24 @@ static struct /* Ptr to head of file handler list. */ file_handler *first_file_handler; + /* Next file handler to handle, for the select variant. To level + the fairness across event sources, we serve file handlers in a + round-robin-like fashion. The number and order of the polled + file handlers may change between invocations, but this is good + enough. */ + file_handler *next_file_handler; + #ifdef HAVE_POLL /* Ptr to array of pollfd structures. */ struct pollfd *poll_fds; + /* Next file descriptor to handle, for the poll variant. To level + the fairness across event sources, we poll the file descriptors + in a round-robin-like fashion. The number and order of the + polled file descriptors may change between invocations, but + this is good enough. */ + int next_poll_fds_index; + /* Timeout in milliseconds for calls to poll(). */ int poll_timeout; #endif @@ -494,6 +508,31 @@ create_file_handler (int fd, int mask, handler_func * proc, file_ptr->mask = mask; } +/* Return the next file handler to handle, and advance to the next + file handler, wrapping around if the end of the list is + reached. */ + +static file_handler * +get_next_file_handler_to_handle_and_advance (void) +{ + file_handler *curr_next; + + /* The first time around, this is still NULL. */ + if (gdb_notifier.next_file_handler == NULL) + gdb_notifier.next_file_handler = gdb_notifier.first_file_handler; + + curr_next = gdb_notifier.next_file_handler; + gdb_assert (curr_next != NULL); + + /* Advance. */ + gdb_notifier.next_file_handler = curr_next->next_file; + /* Wrap around, if necessary. */ + if (gdb_notifier.next_file_handler == NULL) + gdb_notifier.next_file_handler = gdb_notifier.first_file_handler; + + return curr_next; +} + /* Remove the file descriptor FD from the list of monitored fd's: i.e. we don't care anymore about events on the FD. */ void @@ -576,6 +615,17 @@ delete_file_handler (int fd) file_ptr->mask = 0; + /* If this file handler was going to be the next one to be handled, + advance to the next's next, if any. */ + if (gdb_notifier.next_file_handler == file_ptr) + { + if (file_ptr->next_file == NULL + && file_ptr == gdb_notifier.first_file_handler) + gdb_notifier.next_file_handler = NULL; + else + get_next_file_handler_to_handle_and_advance (); + } + /* Get rid of the file handler in the file handler list. */ if (file_ptr == gdb_notifier.first_file_handler) gdb_notifier.first_file_handler = file_ptr->next_file; @@ -672,7 +722,6 @@ gdb_wait_for_event (int block) { file_handler *file_ptr; int num_found = 0; - int i; /* Make sure all output is done before getting another event. */ gdb_flush (gdb_stdout); @@ -743,37 +792,47 @@ gdb_wait_for_event (int block) } } + /* Avoid looking at poll_fds[i]->revents if no event fired. */ + if (num_found <= 0) + return 0; + /* Run event handlers. We always run just one handler and go back to polling, in case a handler changes the notifier list. Since events for sources we haven't consumed yet wake poll/select immediately, no event is lost. */ + /* To level the fairness across event descriptors, we handle them in + a round-robin-like fashion. The number and order of descriptors + may change between invocations, but this is good enough. */ if (use_poll) { #ifdef HAVE_POLL - for (i = 0; (i < gdb_notifier.num_fds) && (num_found > 0); i++) - { - if ((gdb_notifier.poll_fds + i)->revents) - num_found--; - else - continue; + int i; + int mask; - for (file_ptr = gdb_notifier.first_file_handler; - file_ptr != NULL; - file_ptr = file_ptr->next_file) - { - if (file_ptr->fd == (gdb_notifier.poll_fds + i)->fd) - break; - } + while (1) + { + if (gdb_notifier.next_poll_fds_index >= gdb_notifier.num_fds) + gdb_notifier.next_poll_fds_index = 0; + i = gdb_notifier.next_poll_fds_index++; - if (file_ptr) - { - int mask = (gdb_notifier.poll_fds + i)->revents; + gdb_assert (i < gdb_notifier.num_fds); + if ((gdb_notifier.poll_fds + i)->revents) + break; + } - handle_file_event (file_ptr, mask); - return 1; - } + for (file_ptr = gdb_notifier.first_file_handler; + file_ptr != NULL; + file_ptr = file_ptr->next_file) + { + if (file_ptr->fd == (gdb_notifier.poll_fds + i)->fd) + break; } + gdb_assert (file_ptr != NULL); + + mask = (gdb_notifier.poll_fds + i)->revents; + handle_file_event (file_ptr, mask); + return 1; #else internal_error (__FILE__, __LINE__, _("use_poll without HAVE_POLL")); @@ -781,11 +840,12 @@ gdb_wait_for_event (int block) } else { - for (file_ptr = gdb_notifier.first_file_handler; - (file_ptr != NULL) && (num_found > 0); - file_ptr = file_ptr->next_file) + /* See comment about even source fairness above. */ + int mask = 0; + + do { - int mask = 0; + file_ptr = get_next_file_handler_to_handle_and_advance (); if (FD_ISSET (file_ptr->fd, &gdb_notifier.ready_masks[0])) mask |= GDB_READABLE; @@ -793,15 +853,11 @@ gdb_wait_for_event (int block) mask |= GDB_WRITABLE; if (FD_ISSET (file_ptr->fd, &gdb_notifier.ready_masks[2])) mask |= GDB_EXCEPTION; - - if (!mask) - continue; - else - num_found--; - - handle_file_event (file_ptr, mask); - return 1; } + while (mask == 0); + + handle_file_event (file_ptr, mask); + return 1; } return 0; }