Fix PR win32/24284: tcp_auto_retry doesn't work in MinGW

Message ID 20190827155013.31610-1-sergiodj@redhat.com
State New, archived
Headers

Commit Message

Sergio Durigan Junior Aug. 27, 2019, 3:50 p.m. UTC
  This was reported by Bernhard Wodok, along with a patch to fix the
issue.  I adjusted the patch a bit, and I'm submitting the patch on
his behalf.

According to Bernhard, the issue can be reproduced by doing:

  1. start gdb
  2. enter 'target remote :2345'
  3. observe that it throws a "connection refused" error immediately
  instead of waiting and throwing a timeout error

I.e., I believe it can be reproduced by our current tests, which is
why I'm not proposing any extra tests here (well, I don't use nor have
any Windows system to test this, so...).

The problem happens because we call 'gdb_select' passing 0 as its
first argument, which, when using MinGW, ends up using the
'gdb_select' version from mingw-hdep.c, and when the first argument is
0 this means that WaitForMultipleObjects will be called with 0 as its
first argument as well.  According to the MS API docs, this is
forbidden:

  https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitformultipleobjects

The proposed fix is simple: we just call Sleep when N == 0 (and when
TIMEOUT is non-NULL), and return 0.  It makes sense to me.

Both Bernhard and Paul Carroll confirmed that the fix works.  I'm
Cc'ing Bernhard in case you have any questions about the patch.

OK?

gdb/ChangeLog:
2019-08-27  Bernhard Wodok  <barto@gmx.net>
	    Sergio Durigan Junior  <sergiodj@redhat.com>

	PR win32/24284
	* mingw-hdep.c (gdb_select): Handle case when 'n' is zero.
---
 gdb/mingw-hdep.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
  

Comments

Pedro Alves Aug. 29, 2019, 12:15 p.m. UTC | #1
On 8/27/19 4:50 PM, Sergio Durigan Junior wrote:
> This was reported by Bernhard Wodok, along with a patch to fix the
> issue.  I adjusted the patch a bit, and I'm submitting the patch on
> his behalf.
> 
> According to Bernhard, the issue can be reproduced by doing:
> 
>   1. start gdb
>   2. enter 'target remote :2345'
>   3. observe that it throws a "connection refused" error immediately
>   instead of waiting and throwing a timeout error
> 
> I.e., I believe it can be reproduced by our current tests, which is
> why I'm not proposing any extra tests here (well, I don't use nor have
> any Windows system to test this, so...).
> 
> The problem happens because we call 'gdb_select' passing 0 as its
> first argument, 

I assume this is from wait_for_connect?  Please mention that above.

Must be, because I just now noticed that gdb_usleep is now dead code.

The comment in wait_for_connect is quite curious, kind of implies that
this worked on Windows at some point:

    /* Use gdb_select here, since we have no file descriptors, and on
       Windows, plain select doesn't work in that case.  */
    n = gdb_select (0, NULL, NULL, NULL, &t);

> --- a/gdb/mingw-hdep.c
> +++ b/gdb/mingw-hdep.c
> @@ -64,6 +64,17 @@ gdb_select (int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
>    int num_ready;
>    size_t indx;
>  
> +  if (n == 0)
> +    {
> +      /* The MS API says that the first argument to
> +	 WaitForMultipleObjects cannot be zero.  That's we just use
> +	 a regular Sleep here.  */

The sentence:

 "That's we just use a regular Sleep here."

is non-grammatical.  Missing a "why" in "That's why" ?

OK with that fixed.

Thanks,
Pedro Alves
  
Sergio Durigan Junior Aug. 29, 2019, 4:36 p.m. UTC | #2
On Thursday, August 29 2019, Pedro Alves wrote:

> On 8/27/19 4:50 PM, Sergio Durigan Junior wrote:
>> This was reported by Bernhard Wodok, along with a patch to fix the
>> issue.  I adjusted the patch a bit, and I'm submitting the patch on
>> his behalf.
>> 
>> According to Bernhard, the issue can be reproduced by doing:
>> 
>>   1. start gdb
>>   2. enter 'target remote :2345'
>>   3. observe that it throws a "connection refused" error immediately
>>   instead of waiting and throwing a timeout error
>> 
>> I.e., I believe it can be reproduced by our current tests, which is
>> why I'm not proposing any extra tests here (well, I don't use nor have
>> any Windows system to test this, so...).
>> 
>> The problem happens because we call 'gdb_select' passing 0 as its
>> first argument, 

Thanks for the review, Pedro.

> I assume this is from wait_for_connect?  Please mention that above.

Yes.  I'll do that.

> Must be, because I just now noticed that gdb_usleep is now dead code.
>
> The comment in wait_for_connect is quite curious, kind of implies that
> this worked on Windows at some point:
>
>     /* Use gdb_select here, since we have no file descriptors, and on
>        Windows, plain select doesn't work in that case.  */
>     n = gdb_select (0, NULL, NULL, NULL, &t);

Indeed.  This comment was added by:

commit 84603566b73e9ad18d094da3b7510ab480db8170
Author: Sandra Loosemore <sandra@codesourcery.com>
Date:   Tue Jan 6 17:07:08 2009 +0000

It's not entirely clear that calling gdb_select on Windows will actually
call WaitForMultipleObjects.  There is a 'select'c call in the Windows
API:

  https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select

So maybe it was assumed that the code would call this function
instead...

>> --- a/gdb/mingw-hdep.c
>> +++ b/gdb/mingw-hdep.c
>> @@ -64,6 +64,17 @@ gdb_select (int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
>>    int num_ready;
>>    size_t indx;
>>  
>> +  if (n == 0)
>> +    {
>> +      /* The MS API says that the first argument to
>> +	 WaitForMultipleObjects cannot be zero.  That's we just use
>> +	 a regular Sleep here.  */
>
> The sentence:
>
>  "That's we just use a regular Sleep here."
>
> is non-grammatical.  Missing a "why" in "That's why" ?

Yes, fixed.

> OK with that fixed.

Thanks, pushed: 16d01f9cd49f553a958a69ad3c9f781ebd402da8
  

Patch

diff --git a/gdb/mingw-hdep.c b/gdb/mingw-hdep.c
index 44fb22e9a1..1710251c51 100644
--- a/gdb/mingw-hdep.c
+++ b/gdb/mingw-hdep.c
@@ -64,6 +64,17 @@  gdb_select (int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
   int num_ready;
   size_t indx;
 
+  if (n == 0)
+    {
+      /* The MS API says that the first argument to
+	 WaitForMultipleObjects cannot be zero.  That's we just use
+	 a regular Sleep here.  */
+      if (timeout != NULL)
+	Sleep (timeout->tv_sec * 1000 + timeout->tv_usec / 1000);
+
+      return 0;
+    }
+
   num_ready = 0;
   num_handles = 0;
   num_scbs = 0;