gai_cancel()

Message ID 1497874716.6717.39.camel@pbcl.net
State New, archived
Headers

Commit Message

Phil Blundell June 19, 2017, 12:18 p.m. UTC
  On Mon, 2017-06-19 at 12:04 +0100, Phil Blundell wrote:
> 
Under conditions that I don't entirely understand yet, we seem to be
> somehow returning from gai_suspend while its waitlist[] entry is
> still
> linked into requestlist->waiting. 

Here's a patch that fixes bug 20874 for me, fwiw.  It still passes
"make subdirs='resolv' xcheck". 

OK to commit?

Phil
  

Comments

Florian Weimer June 26, 2017, 3:35 p.m. UTC | #1
On 06/19/2017 02:18 PM, Phil Blundell wrote:
> Here's a patch that fixes bug 20874 for me, fwiw.  It still passes
> "make subdirs='resolv' xcheck". 
> 
> OK to commit?

How hard would it be to write a test case for this?  We also need a
ChangeLog entry.

Thanks,
Florian
  
Phil Blundell June 26, 2017, 3:41 p.m. UTC | #2
On Mon, 2017-06-26 at 17:35 +0200, Florian Weimer wrote:
> How hard would it be to write a test case for this?  We also need a
> ChangeLog entry.

It's a race, so a testcase is always going to be a bit hit-and-miss. 
But something similar to the example from the original bug report would
probably show it up fairly reliably.  I'll see what I can do.

I did include a changelog entry when I resent the patch this afternoon
though you're right that I forgot it earlier.

I think I also ought to run the testcase under valgrind for a while to
convince myself a bit more thoroughly that it's OK.

Thanks

Phil
  
Carlos O'Donell June 26, 2017, 4:22 p.m. UTC | #3
On 06/26/2017 11:41 AM, Phil Blundell wrote:
> On Mon, 2017-06-26 at 17:35 +0200, Florian Weimer wrote:
>> How hard would it be to write a test case for this?  We also need a
>> ChangeLog entry.
> 
> It's a race, so a testcase is always going to be a bit hit-and-miss. 
> But something similar to the example from the original bug report would
> probably show it up fairly reliably.  I'll see what I can do.

Even if it is unreliably the result is that we have a false-positive and
that IMO is OK from a regression testing standpoint. If we ever see the
test fail it would indicate an incomplete fix and that's important
information.
  

Patch

From 6e5dbbcfc0594dad90dc6f8b4537dba26bceb428 Mon Sep 17 00:00:00 2001
From: Phil Blundell <pb@pbcl.net>
Date: Mon, 19 Jun 2017 13:11:00 +0100
Subject: [PATCH] gai_suspend: Remove bogus check for EAI_INPROGRESS [BZ
 #20874]

If we added an entry to the waitlist for any request, it is important
that we remove it again before returning.  Failing to do so will
cause obscure and hard-to-debug crashes because the linked list
will contain a pointer to a struct that was assigned on the stack
and has since been overwritten.

Although we check that the current "return value" of the request
is EAI_INPROGRESS before adding an entry to its waitlist, this
value may change while we sleep so we cannot assume it will still
be EAI_INPROGRESS when we come to remove the entry afterwards.
---
 resolv/gai_suspend.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/resolv/gai_suspend.c b/resolv/gai_suspend.c
index a86bd4360d..139d636c78 100644
--- a/resolv/gai_suspend.c
+++ b/resolv/gai_suspend.c
@@ -111,8 +111,7 @@  gai_suspend (const struct gaicb *const list[], int ent,
       /* Now remove the entry in the waiting list for all requests
 	 which didn't terminate.  */
       for (cnt = 0; cnt < ent; ++cnt)
-	if (list[cnt] != NULL && list[cnt]->__return == EAI_INPROGRESS
-	    && requestlist[cnt] != NULL)
+	if (list[cnt] != NULL && requestlist[cnt] != NULL)
 	  {
 	    struct waitlist **listp = &requestlist[cnt]->waiting;
 
-- 
2.11.0